A narrowing of AI research?
Joel Klinger, Juan Mateos-Garcia, Konstantinos Stathoulopoulos
November 18, 2020
Abstract
Artificial Intelligence (AI) is being hailed as the latest example of a General Purpose Technology that could transform productivity and help tackle important societal challenges. This outcome is however not guaranteed: technological races and a myopic focus on short-term benefits could lock AI into technologies that turn out to be sub-optimal. For this reason, it may be valuable to preserve diversity in the AI trajectories that are explored until there is more information about their relative merits and dangers. Recent controversies about the dominance of deep learning methods and private labs in AI research suggest that the field may be getting narrower, but the evidence base is lacking. We seek to address this gap with an analysis of the thematic diversity of AI research in arXiv, a widely used pre-prints site. Having identified 110,000 AI papers in this corpus, we use hierarchical topic modelling to estimate the thematic composition of AI research, and this composition to calculate various metrics of research diversity. Our analysis suggests that diversity in AI research has stagnated in recent years, and that AI research involving private sector organisations tends to be less diverse than research in academia. Diversity in academia is bolstered by smaller institutions and research groups with fewer incentives to ‘race’ and lower levels of collaboration with the private sector. We also find that private sector AI researchers tend to specialise in data-hungry and computationally intensive deep learning methods at the expense of research involving other (symbolic and statistical) AI methods, research that considers the societal and ethical implications of AI, and applications in sectors like health. Our results provide a rationale for policy action to prevent a premature narrowing of AI research that could constrain its societal benefits, but we note the informational, incentive and scale hurdles standing in the way of such interventions.
Tenenbaum told me he looks to the story of backprop for inspiration. For decades, backprop was cool math that didn’t really accomplish anything. As computers got faster and the engineering got more sophisticated, suddenly it did. He hopes the same thing might happen with his own work and that of his students, “but it might take another couple decades”.

Interview with AI researcher Joshua Tenenbaum in Somers [2017]

Technological change has a direction as well as a rate: different designs for a technology are possible, and some of them may be more societally desirable than others [Aghion et al., 2009]. Fortuitous events, shortsightedness and lack of coordination can however create situations where an inferior design becomes dominant and hard to switch away from even after its limitations become apparent [David, 1985, Arthur, 1994]. Consider for example the automobile, where the combustion engine surpassed environmentally friendlier alternatives based on steam or electricity, or the case of the nuclear reactor, where early demand from the military locked this technology into a light water design that was less suitable for civilian applications [Cowan, 1990]. Faced with uncertainty about the benefits and risks of alternative technologies and the desire to avoid premature lock-in to an inferior design, it might be reasonable to put in place policies to preserve diversity in the technology landscape [Aghion et al., 2009]. In military procurement, funders have long adopted portfolio strategies to explore multiple technologies in parallel [Johnson, 2012]. Contemporary approaches to mission-oriented innovation policy call for bottom-up exploration of technological opportunities that avoid picking a single solution for tackling complex societal challenges [Mazzucato, 2018].

Are similar initiatives needed to sustain technological diversity in Artificial Intelligence (AI) research?
AI systems based on deep learning, a machine learning technique that infers patterns from large unstructured datasets, have been deployed successfully in many digital and media products and services [LeCun et al., 2015]. AI’s technical and commercial successes have attracted large R&D investments from corporations and venture capitalists and the attention of economists and policymakers, who have respectively heralded AI as the latest example of a General Purpose Technology (GPT) with revolutionary potential [Cockburn et al., 2018] and launched national strategies and plans to bolster indigenous AI industries and mitigate the disruption caused by labour-displacing AI systems [Paunov et al., 2019].

Others have highlighted the limitations of this AI trajectory. Some researchers note that AI systems based on deep learning are brittle and liable to under-perform or even fail in unexpected and catastrophic ways when they are exposed to new situations or gamed by malicious actors, potentially making them unsuitable for high-stakes domains such as transport or health [Marcus, 2018, Russell, 2019, D’Amour et al., 2020]. Economists worry that businesses may deploy ‘mediocre’ AI systems that fail to offset their labour displacement effects with productivity gains [Acemoglu and Restrepo, 2019]. Activists, journalists and critical scholars of technology have found that some AI systems generate discriminatory and unfair outcomes [Lum and Isaac, 2016] and/or enable mass surveillance by governments and commercial actors [Zuboff, 2019].
There is increasing evidence that data-hungry, computationally intensive deep learning systems may be environmentally unsustainable [Thompson et al., 2020].

The terms of the debate echo a three-step process often described in the directed technical change literature: a powerful yet in some ways flawed technological design gains momentum among researchers, entrepreneurs, investors and policymakers (step 1), reducing the diversity of ideas that are explored and creating the risk of lock-in (step 2) that may be regretted in the future, when the full costs of the dominant design become manifest (step 3). In this paper we study the first two steps of this process in the context of AI research: we analyse the evolution of its thematic and organisational composition and how this is reflected in the diversity of ideas that are pursued by AI researchers, paying particular attention to the activities of private sector organisations that have become increasingly influential in the AI research landscape [Hain et al., 2020, Ahmed and Wahed, 2020, Hagendorff and Meding, 2020] and which, according to the literature, may play an important role in narrowing its diversity by concentrating on those technology designs with the shortest-term commercial potential regardless of their longer term impacts and externalities [Bryan and Lemus, 2017].

To do this, we analyse the corpus of AI research in arXiv, a pre-prints repository widely used by the AI research and development community. We estimate the thematic composition of this corpus with a topic model and use this information to calculate its thematic diversity according to various metrics.
We study the evolution of these metrics and compare the diversity of AI research involving private firms with the rest of the corpus.

Our results suggest that thematic diversity in AI research has stagnated and even declined in recent years, and that private sector organisations (in particular large technology companies such as Google and Microsoft) are more narrowly focused on frontier, computationally demanding deep learning technologies than organisations in academia and the public sector. Our results suggest that there may be a policy rationale to preserve diversity in AI research. The metrics that we develop could help inform this agenda.

After highlighting our contributions in the rest of this section, in Section 2 we review the literature on directed technological change and its links with technological diversity.

Throughout the paper we will use the term thematic diversity to refer to the topics covered in our corpus - they mostly comprise research and technological subjects, so our usage of the term is closely related to notions of research and technological diversity.
Our analysis contributes to a growing body of AI mapping research that is starting to consider the constituents of AI’s technological trajectory in terms of the technologies and methods that are developed, the actors involved in research, its application sectors and connections with other disciplines [Klinger et al., 2018, Cockburn et al., 2018, Frank et al., 2019, Stathoulopoulos and Mateos-Garcia, 2019]. We note in particular a recent stream of research considering the participation of private companies in AI research and the role they may play in the ‘de-democratisation’ of the knowledge being produced and in creating conflicts of interest for researchers [Ahmed and Wahed, 2020, Hagendorff and Meding, 2020]. We advance this literature by generating, to the best of our knowledge, the first set of aggregate measures of thematic diversity in a large corpus of AI research, and studying its links with increasing participation by private sector firms (Freire et al. [2020] also measure AI diversity, but focusing on a small number of submissions to AI conferences and ignoring thematic aspects of diversity). This grounds our analysis in the directed technological change literature and increases its policy relevance.

Our results are consistent with theories of directed technological change that study the processes and factors through which technology designs attain dominance [Arthur, 1994, Bryan and Lemus, 2017]. This literature has for the most part relied on qualitative data, historical case studies and simulations, to which we contribute with our quantitative analysis. We also contribute to the AI safety literature, which has highlighted the safety risks created by research and innovation races where institutions or countries seek to develop powerful AI systems ahead of their competitors [Armstrong et al., 2016].
This literature has generally focused on the longer-term consequences of these races, such as the risk of developing AI systems that are misaligned with human values and preferences. Here we provide evidence of how racing behaviours taking place now may be narrowing AI research in a way that could lead to the development of inferior and unsafe AI systems. In doing this, we respond to recent calls for studies that bridge the gap between short term and long term perspectives in AI governance research [Dafoe, 2017].

Methodologically, we build on a growing literature that applies Natural Language Processing (NLP) methods to the analysis of the composition, diversity and interdisciplinarity of research and technological fields [Suominen, 2017, Paez-Aviles et al., 2018]. Top-SBM, the algorithm we use in our analysis, presents important advantages over the Latent Dirichlet Allocation algorithm which is standard in this literature.

The novel dataset that we have created for this paper is available, together with all our code, here.
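The paper computes ‘various metrics of research diversity’ from the topic model’s output; the exact formulas are not reproduced in this excerpt, but two standard measures from this literature can be sketched from a document-topic weight matrix as follows (the matrix and figures below are illustrative, not from the paper’s data):

```python
import numpy as np

def topic_shares(doc_topic: np.ndarray) -> np.ndarray:
    """Aggregate a (documents x topics) weight matrix into a
    corpus-level topic distribution that sums to 1."""
    totals = doc_topic.sum(axis=0)
    return totals / totals.sum()

def shannon_entropy(p: np.ndarray) -> float:
    """Balance: higher when activity is spread evenly across topics."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def gini_simpson(p: np.ndarray) -> float:
    """Probability that two randomly drawn papers fall in different topics."""
    return float(1.0 - (p ** 2).sum())

# Toy example: 4 papers, 3 topics (rows are topic mixtures from a model)
doc_topic = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
])
p = topic_shares(doc_topic)
print(shannon_entropy(p), gini_simpson(p))
```

Tracking such measures year by year, and separately for papers with and without private sector affiliations, is one way of operationalising the comparisons described in the text.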
Following Arthur, we define a technology as ‘a means to fulfil a human purpose’ and ‘an assemblage of practices and components’ [Arthur, 2009]. The notion of directed technological change reflects the fact that there are generally many different components and assemblages of components (which we will refer to as ‘technological designs’) that could be feasibly deployed to build a technology that fulfils a human purpose: for example, an automobile that fulfils the human purpose of individual transportation can be based on a combustion engine design, a steam-powered design or an electricity-powered design. As we will see below, an AI system that imitates human intellectual faculties could be based on a collection of conditional statements that govern its behaviour (symbolic design), or on an algorithm that has ‘learned’ which actions are more conducive to a successful outcome after being trained on a labelled dataset (supervised machine learning design), exploring a synthetic environment (reinforcement learning design) or observing the behaviour of a human agent (inverse reinforcement learning design).

One important implication of directed technological change is that the evolution of technologies is rarely if ever a deterministic process: given a set of technological designs T = {t_1, t_2, ..., t_n} that could fulfil a human purpose, and a set of contexts C = {c_1, c_2, ..., c_m} where those designs could be deployed, there is no design t_j ∈ T such that its performance P in context c_k satisfies P_{c_k}(t_j) > P_{c_k}(t_i) ∀ t_i ∈ T, c_k ∈ C. If such a design existed, then it would always be adopted regardless of the context and it would be unnecessary to consider its alternatives. The design would be equivalent to the technology, and its direction would be singular.
If to the contrary the relative performance of a technological design depends on its context, this means that multiple technological designs are in principle possible, and the direction of a technology matters.

Technology designs evolve following technological trajectories shaped by incremental and radical improvements in their constituent components, changes in the supply of complementary inputs such as natural resources, skills and other technologies, as well as economic, social and cultural factors [Dosi, 1982].

Arthur provides a third definition (‘the entire collection of engineering practices and devices available to a culture’) but this refers to technology as a field rather than the specific technologies that constitute it.
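The no-dominant-design condition can also be stated in display form (with the small added refinement, implicit in the argument, that the comparison runs over rival designs t_i ≠ t_j):

```latex
% No dominant design: there is no design t_j that outperforms every
% rival t_i in every deployment context c_k.
\nexists\, t_j \in T \;\text{such that}\;
P_{c_k}(t_j) > P_{c_k}(t_i)
\quad \forall\, t_i \in T \setminus \{t_j\},\; \forall\, c_k \in C
```

When this condition holds, which design prevails in practice depends on the mix of contexts in which the technology is deployed, which is what makes direction a meaningful property of technological change.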
Our discussion so far suggests that technological development will not necessarily evolve in the most societally desirable direction. To the contrary, random events and misalignment between private and public interest (especially in the presence of shortsightedness and racing) coupled with increasing returns to scale and network effects mean that this outcome may well be the exception rather than the rule.

This creates a dual role for technology policy: to prevent lock-in to inferior technological designs and, if possible, steer technological development towards superior technological designs [Aghion et al., 2009]. Unfortunately, there are significant informational challenges standing in the way of such policy decisions, especially in the early stages of the development of a technology when there is a high degree of uncertainty about its benefits, risks and distributional aspects [Rotolo et al., 2015]. Given this, one promising strategy would be to preserve diversity in the technology landscape, avoiding ‘excess momentum’ that could result in the premature lock-in of a particular design until there is a fuller understanding of its implications by all stakeholders. Maintaining second-tier technologies in order to reduce vulnerability to ‘surprises’ [Acemoglu, 2011], increasing the diversity of researcher and innovator populations and providing incentives for investment in long-term research lines counter-balancing myopic innovation races can also help maintain technological diversity in the face of homogenising tendencies [Bryan and Lemus, 2017].

There are other rationales for preserving technological diversity beyond this portfolio management logic.
First, some technologies may need to be deployed in a wide range of circumstances (the set C could be large and quite heterogeneous in its interactions with elements of the set of technology designs T), making it necessary to maintain multiple versions of the technology so as to avoid excluding specific user groups and communities from its benefits (it is worth noting that large markets that aggregate the demand of such user groups and communities may provide sufficient incentives for the preservation of diversity, in a manifestation of Adam Smith’s dictum that the division of labour - itself a manifestation of economic diversity - is limited by the extent of the market). Second, and related to this, technological diversity could have a political/ethical component: if specific technology designs are aligned with particular political philosophies or values [Gabriel, 2020], then this could create a rationale for preserving technological diversity in the interest of political pluralism (i.e. to avoid a situation where technology supply skews the political process in favour of those values that are most aligned with available technologies) and inter-generational solidarity (to avoid a situation where future generations are locked-in to technologies misaligned with future values that we cannot currently envisage). Current agendas of ‘technological sovereignty’ aimed at empowering local communities to develop digital technologies suited to their needs and values instead of relying on commercial products and services illustrate this idea [Lynch, 2020].

Third and last, there is an innovation rationale for technological diversity: more diversity increases the range of technological components that can be drawn on for combinatorial innovations that bolster existing technological designs or provide the foundation for new ones [Arthur, 2009, Yaqub, 2018].
Having reviewed the literature on directed technological change and its links with technological diversity, we turn our attention to our empirical setting, the field of AI, with a particular focus on events and trends in the field that mirror processes described in the literature (see Table 1 for a summary of those parallels).
Eras of AI research:
Russell and Norvig [2009] define AI as the scientific discipline and technological practice that specialises in ‘the designing and building of intelligent agents that receive percepts from the environment and take actions that affect that environment’. This broad definition encompasses efforts to build general-purpose ‘machines that think’ as well as AI systems that perform narrow classification, prediction or optimisation tasks (until now, most progress has been in the area of narrow AI). Advances in AI could transform how organisations process and use information, enable the automation of many cognitive tasks and augment the skills and intelligence of human workers [Markoff, 2016]. This explains strong commercial and government interest in AI since its beginning as a scientific discipline in the aftermath of World War II [Wooldridge, 2020].

AI researchers and technologists have explored many designs to build AI systems. In the 1950s they used symbolic methods to implement logical behaviours in computers, and in the 1980s they created Expert Systems embedding the knowledge and heuristics of human experts. In both cases, initial interest and investment in AI were followed by disappointing results, leading to so-called AI Winters where funding dried up.

Table 1: Dynamics in the directed technological change literature and related trends in contemporary AI research

Directed technology change dynamic | AI trend
Many alternative technology designs are possible | Various technology designs explored in different eras of AI research
Technology lock-in not fully based on merit | Complementor lotteries and bandwagons
Institutional isomorphism | Increasing overlaps between academic research and industry
Short-termism | Commercial and geo-political AI races
Concerns about premature lock-in to flawed technology | Lead-user bias in the AI systems being developed and calls for exploration of alternatives

Machine learning strategies proved more scalable. Instead of looking for pre-defined patterns in the data, machine learning algorithms are trained to induce patterns from labelled datasets. The 2010s in particular saw important advances in deep learning [LeCun et al., 2015], a machine learning technique loosely inspired by the operation of the human brain where artificial neurons learn abstract patterns from large, unstructured datasets using backpropagation, an algorithm that adjusts the strength of the links between neurons in the network to minimise prediction error [Chauvin and Rumelhart, 1995].

In 2012, AlexNet, a deep learning system, won the ImageNet image classification competition, kick-starting the most recent ‘AI boom’ [Krizhevsky et al., 2012]. Since then, deep learning-based AI systems have experienced rapid improvements in performance in a variety of tasks including computer vision and image and video generation, speech recognition and synthesis, translation, question answering, robotics and game playing, and have been successfully deployed in mainstream products and services such as search engines, social networking sites, translation systems, digital personal assistants and self-driving vehicles [Agrawal et al., 2018]. Technical advances have been accompanied by growing investment and policy initiatives to support AI. According to the AI Index, a project to measure various dimensions of the AI ecosystem, global private investment in AI in 2019 amounted to $

Complementor lotteries:

The history of deep learning highlights the importance of network effects and serendipity in AI’s technological trajectory.
Many of the key ideas underlying this AI design had been introduced in the 1950s and 1980s, but it was not until the 2010s, with the increasing availability of suitable hardware, software and data, that these methods could be implemented at scale [Wooldridge, 2020]. One hardware innovation in particular - Graphics Processing Units (GPUs) developed for video-game applications - became an important enabler for deep learning techniques that benefit from parallelisation of tasks. Hooker [2020] argues that the lag between the development of key research ideas about artificial neural networks and their application in deep learning is an example of ‘hardware lotteries’ in AI research: the adoption of an idea depends not only on its merits but also on the availability of suitable complements: hardware and software to implement the idea and, in the case of machine learning methods, large datasets for training.
Institutional isomorphism:
The advent of deep learning has seen a shift in the locus of research activity from academia to the private sector. In 2016, AlphaGo, an AI system developed by Alphabet’s AI subsidiary DeepMind, defeated Go champion Lee Sedol in a highly publicised five-match contest [Silver et al., 2016]. Microsoft AI researchers were responsible for the first speech recognition system to achieve parity with humans in 2017. Word embeddings, a technique that uses the byproducts of a deep learning model to represent text data in a multidimensional semantic space, were developed by researchers at Google [Mikolov et al., 2013]. In addition to generating new research results, private companies have created the most popular open source frameworks for the implementation of deep learning techniques - TensorFlow (developed by Google) and PyTorch (developed by Facebook).

Academic and industrial AI research have become increasingly intertwined in recent years: private labs recruit large numbers of researchers and graduates from leading academic institutions, and often collaborate with academic researchers who in many cases have dual academic-industrial affiliations [Gofman, 2019, Hain et al., 2020, Hagendorff and Meding, 2020]. Collaborating with industry is one of the main channels through which academic researchers are able to access the data and infrastructure required for state-of-the-art deep learning research [Ahmed and Wahed, 2020]. Private sector companies also participate actively in key conferences such as NeurIPS (Neural Information Processing Systems) or ICML (International Conference on Machine Learning). In 2019, Google and Alphabet’s subsidiary DeepMind had the largest number of accepted papers at the NeurIPS conference.

It should be noted that this system combined deep reinforcement learning with Monte Carlo tree search, an older heuristic search technique.
One of the motives for the creation of OpenAI was a desire to prevent a dominance of private interests over AI research, echoing concerns about misalignment between private interests and public value in the directed technology literature. Over time, OpenAI’s modus operandi has arguably converged with those of private sector labs. In 2019, it received a $ This illustrates the increasing importance of large-scale infrastructure and process innovation in modern AI R&D, and a convergence in the values, processes and strategies of the organisations participating in it, consistent with the idea of institutional isomorphism in AI research [Caplan and danah boyd, 2018].

https://medium.com/@dcharrezt/neurips-2019-stats-c91346d31c8f
AI R&D races:
AI development and deployment are frequently described as ‘winner-takes-all’ processes: private sector companies that lead on AI will be able to dominate their markets and expand into new ones, and the countries that control the direction of AI development will be able to assert their political systems and values [Lee, 2018]. This narrative, influenced by the notion of an ‘intelligence explosion’ where AI systems that reach a certain level of performance become able to self-improve recursively, is leading to races in AI development and deployment [Armstrong et al., 2016]. Some researchers have raised concerns that this could result in a ‘race to the bottom’ in AI safety, and that the pressure to achieve state-of-the-art results is creating ‘troubling trends’ in AI scholarship such as metric-gaming, overfitting of models to benchmark datasets and lack of replicability in research outputs [Lipton and Steinhardt, 2019].
A premature narrowing of AI research?
There is increasing awareness of the limitations of AI systems based on deep learning, and concerns about premature lock-in to this design. Like other machine learning systems, deep learning systems are brittle and liable to under-perform and/or fail when they confront situations outside their training set [D’Amour et al., 2020, Marcus, 2018]. Deep learning systems optimise metrics of predictive performance even when this creates undesirable outcomes such as discriminatory decisions reflecting biases in the training data, or behaviours that go against the intentions of their users [Russell, 2019]. They require large datasets and computational infrastructures to train that create substantial environmental impacts [Thompson et al., 2020].

It could be argued that some of these limitations are less consequential for AI systems. These systems may also generate future societal and environmental externalities that are likely to be under-valued by private sector firms. This could ultimately reduce the economic and social benefits of AI systems and concentrate them in a smaller number of organisations and communities.

In response to all this, some researchers have called for the creation of a new AI trajectory based on hybrid systems that combine deep learning techniques with methods from other AI traditions, such as symbolic logic or causal inference, that have been sidelined during the deep learning boom [Marcus, 2018, Pearl, 2018]. Others have expressed concerns that the development of hardware solutions optimised for commercially successful deep learning techniques may hinder the exploration of alternative, less developed ideas [Barham and Isard, 2019, Hooker, 2020].

https://openai.com/about/
https://thenextweb.com/neural/2020/09/03/openai-reveals-the-pricing-plans-for-its-api-and-it-aint-cheap/
This illustrates how technological diversity could provide a pool of ideas and techniques that can be redeployed in combinatorial innovations to overcome some of the limitations of a dominant design, and also the dangers of a tendency towards thematic homogenisation that may reduce this diversity and hinder a renewal in AI research trajectories.
The discussion above motivates our analysis of the evolution of AI’s technological diversity and its links with private sector participation in AI research. More specifically, we set out to address the following research questions:

1. How have the levels of activity and topical composition of AI research in arXiv evolved?

2. How has the level of participation in AI research by private sector organisations evolved?

Having said this, misaligned and gamed AI systems in internet platforms have been blamed for political polarisation and manipulation, and for the spread of misinformation that contributes to societal conflict and increases health risks.
3. Is the thematic diversity of AI research increasing or declining?

4. How does the thematic diversity of private sector organisations compare with that of organisations in academia and the public sector?

5. In which AI research topics do private sector organisations tend to specialise?

We would expect the answers to questions 1 and 2 to reflect our summary of the history of AI research in Section 2, with fast growth in the levels of AI research since the early 2010s, a stronger focus on topics related to deep learning and increasing participation by private sector companies. In answering these questions we seek to confirm general perceptions about the recent evolution of AI research with a novel and timely dataset. This will also help validate our data, our method to identify AI research and the topic modelling algorithm we use to estimate AI’s thematic composition.

The expected answer to question 3 is ambiguous. On the one hand, rapid growth in AI research may have increased diversification in its methods and application domains and attracted new entrants into the field, bolstering its thematic diversity. On the other hand, the widespread adoption of a narrow set of deep learning methods may have reduced thematic diversity in AI research.

With questions 4 and 5, we would expect private firms to have narrower research agendas (lower levels of thematic diversity in their research portfolios), and to be more focused on state-of-the-art deep learning technologies with high performance and strong commercial applications. We would expect them to be less focused on techniques developed in previous eras of AI research such as symbolic methods and statistical (non deep learning related) machine learning.
We would also expect prestigious academic institutions that have high levels of collaboration with the private sector and are particularly likely to be active in publishing races to have thematically narrower research agendas than other academic institutions.

The analysis that follows does not consider all the directed technological change dynamics that seem to be present in AI research, such as dwindling replicability of AI research results, labour flows between academia and industry, and the impact and commercial value of AI technologies and their limitations when applied in real-world contexts compared to other AI techniques. All of these questions deserve a fuller treatment in future research, to which we will come back in Section 5.

In order to address our research questions, we create a novel dataset that combines information from arXiv, Microsoft Academic Graph (MAG) and the Global Research Identifier Database (GRID).

See Balland et al. [2020] for evidence that larger environments - in this case cities - are able to host a greater range of complex economic and technological activities.

Table 2: Summary of key variables in the data

Level | Source | Variable | Definition
article | arXiv | title | Article title
article | arXiv | created | Date when the article was created
article | arXiv | categories | arXiv categories (scientific and technical sub-disciplines) that the articles have been labelled with (can be more than one per article)
article | arXiv | abstract | Article abstract
institution | MAG | institution | Institutional affiliation of article authors (set)
institution | GRID | institution type | Can be Company, Education, Facility, Government, Nonprofit, Healthcare, Other

Table 2 presents the key variables in our data.

arXiv is an online pre-print repository widely used by researchers in Science, Technology, Engineering and Mathematics (STEM) subjects to share their work before publication. In recent years arXiv has become an important outlet for the dissemination of AI research results close to real time, motivating our decision to use it in this analysis.

arXiv pre-prints are not subject to peer review, potentially raising concerns about their quality. It is however worth noting that there are some minimum quality thresholds to publish on the platform: submissions are reviewed for relevance and new authors are validated by others who already participate in the platform. As we show later, leading businesses and academic institutions across the world use arXiv to share their AI research, suggesting that it is relevant for our analysis. Previous work has shown that the majority of papers submitted to the prestigious AI conference NeurIPS are also posted on arXiv, and found a strong correlation between the geography of deep learning research in arXiv and peer-reviewed AI publications as well as data-oriented technology startups [Klinger et al., 2018]. This suggests that variation in AI research in arXiv is associated with AI research in other publication channels as well as with other technological and entrepreneurial activities related to AI.

One potential advantage of using arXiv over peer-reviewed publications is that it may capture research activity closer to real time. For each article, we obtain its id, title, date when it was created, its categories (the arXiv categories that the authors label an article with when they submit it, reflecting the scientific discipline or sub-discipline that it belongs to) and its abstract.
One important limitation of the raw arXiv data is that it does not contain information about the institutional affiliation of an article's authors, which we need in order to identify the companies in the data. We obtain this information from Microsoft Academic Graph (MAG), a large scientometric database collected and enriched by Microsoft Cognitive Services [Wang et al., 2020]. We query MAG with the titles of the arXiv articles (see Klinger et al. [2018] for additional details about our approach) and extract the institutional affiliation of each article author as of the time of publication.
We fuzzy match the institute names extracted from Microsoft Academic Graph with the Global Research Identifier Database (GRID), a public database with detailed metadata about research institutions globally, in order to identify their type (Company, Education etc.). To do this, we use the same algorithm as Klinger et al. [2018], which combines multiple fuzzy matching methods to identify GRID institutes that have the same names as institutions in MAG. This yields just over 1 million articles with institutional information and 2.45 million article-institute pairs. There is a large number of articles with missing institutions in the early years of arXiv's operation (where, as we will show, AI activity was very limited) and slight growth in articles without matched institutes in recent years. Visual checks of the articles lacking institutional information suggest that this is because the information is not available from MAG or because the research involves less well-known institutes. The data has also recently become available as a bulk download from the machine learning competition site Kaggle ( ).

One issue with our approach is that it generates duplicate matches for multinational institutions with a presence in multiple locations (for example, Google, a single institute in MAG, is matched with multiple GRID institutes including Google (United States), Google (United Kingdom) etc.). To address this we split name strings on parentheses, retain the organisation name and remove duplicate article-organisation observations. This means that in the analysis that follows we do not consider how many times a single institution participates in an article, but simply that it does.

After inspecting the data manually we have identified a small number of instances of misclassified or missing data: DeepMind's papers are classified under 'Google' by MAG, and OpenAI is not included in GRID. To address this, we scrape the research sections of the websites of both organisations, extract the arXiv ids of their papers and reclassify the papers in our dataset. In the rest of our analysis we assume that DeepMind papers do not involve Google researchers although we know this is not always the case.
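The deduplication step can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the DataFrame and its column names are hypothetical stand-ins for the matched article-institute pairs.

```python
import pandas as pd

# Hypothetical article-institute pairs after MAG-GRID matching; the schema
# is illustrative, not the paper's actual one.
pairs = pd.DataFrame({
    "article_id": ["a1", "a1", "a2"],
    "grid_name": ["Google (United States)", "Google (United Kingdom)", "MIT"],
})

# Split on the parenthesis, keep the organisation name, and drop duplicates
# so each organisation counts at most once per article.
pairs["org_name"] = pairs["grid_name"].str.split("(").str[0].str.strip()
deduped = pairs.drop_duplicates(subset=["article_id", "org_name"])
```

After this step, the two country-specific Google entries for article `a1` collapse into a single Google-article observation.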
Having described our data sources, we focus on the methods that we use to identify AI articles in our corpus, the topic modelling algorithm that we use to estimate the topical composition of this corpus based on its abstracts, and the diversity metrics that we estimate with this information, as well as the sentence embedding approach we use to map the semantic similarity of organisations participating in AI research.
Recent AI mapping studies have used various methods to identify AI research, including keyword searches and topic modelling of abstracts to identify relevant terms [Cockburn et al., 2018, Klinger et al., 2018, Stathoulopoulos and Mateos-Garcia, 2019, Frank et al., 2019, Bianchini et al., 2020]. Here, we are interested in identifying articles using methods from previous eras in the history of AI that we are less familiar with, and which may have a weaker presence in the corpus, making them harder to detect using, for example, topic models trained on all of arXiv, or on all activity in a single arXiv category. We have identified several arXiv categories that are relevant for us - we refer to them as the 'core AI categories'. They are:

1. cs.AI: Computer Science (Artificial Intelligence)
2. cs.NE: Computer Science (Neural and Evolutionary Computing)
3. cs.LG: Computer Science (Machine Learning)
4. stat.ML: Statistics (Machine Learning)

We would expect the exclusion of these institutions from the data to bias our metrics of diversity downwards.

In total, there are just under 89,000 unique articles in these categories in the data (summing across categories yields 139,000 articles, reflecting significant overlaps between them, as Figure 1 shows). One risk of removing articles outside of these categories is that this could lead us to exclude important applications of AI in other fields of STEM and computing (as the General Purpose Technology - GPT - literature has shown, GPTs often see important advances in application sectors that end up being 'rolled back' into the main trajectory of the technology [Bresnahan and Trajtenberg, 1995]). Previous research about deep learning has, for example, shown that arXiv categories such as cs.CV (Computer Vision) and cs.CL (Computation and Language) have been important sites for AI development and deployment [Klinger et al., 2018].
The challenge is how to systematically incorporate such applications of AI outside the core AI categories into our corpus. In order to do this, we consider core AI categories to be research areas that specialise in the development of AI techniques that are then applied in other areas. We would expect terms related to those AI techniques to be over-represented (salient) in core AI categories, and present in other articles adopting those AI techniques. We have developed an algorithm that identifies and expands those terms and then identifies other articles where those terms appear often. More specifically, we take the following steps:

1. Preprocess text in the corpus of arXiv abstracts: this includes lower-casing; removing symbols, numbers and commonly used stop-words; tokenising the abstracts; and combining commonly co-occurring tokens into bi-grams (e.g. 'machine learning') and tri-grams ('deep neural network').

2. Define a salient vocabulary in each AI category: we identify salient terms S_i in the sub-corpora C_i that belong to each of the AI categories i ∈ {cs.AI, cs.NE, cs.LG, stat.ML}. Given the vocabulary V_i in an AI category, for each n-gram t ∈ V_i whose frequency freq_i(t) exceeds a minimum threshold, we compute the salience ratio freq_i(t)/freq(t), where freq(t) is the n-gram's frequency in the broader corpus the category belongs to (e.g. n-gram frequencies in cs categories are normalised by n-gram frequencies in cs, and stat.ML is normalised by n-gram frequencies in statistics). We select the top 20 tokens according to this measure of salience.

3. Expand the salient vocabulary for an AI category: we expand the list of salient terms S_i in each category with a list of similar terms V_i based on a Word2Vec semantic model trained on the whole corpus [Mikolov et al., 2013]. This allows us to identify terms that appear in a similar context to salient terms in the AI category. We select the top 30 terms by similarity to those in S_i with a similarity score above 0.5.

4. Identify articles with high frequencies of terms from the expanded salient vocabulary: we count the occurrences of X_i = S_i ∪ V_i in C_i and in O_i = C \ C_i (all articles outside the category). Any article j ∈ O_i with freq_j(X_i) > K_i, where K_i is a critical value based on the mean frequency of X_i in C_i, the mean frequency of X_i in O_i, the standard deviation σ of X_i in O_i and a scaling factor F_i, is labelled as belonging to E_i, an expanded AI corpus from source AI category i (here, we assume that AI articles outside of an AI core category will have a number of terms related to AI above the average for non-AI core categories, but lower than the average for AI core categories).

We manually inspect the results with different parameters, paying special attention to F_i, the scaling factor that we use to calculate what frequency of AI-category-related terms in articles outside of a corpus indicates that they may belong to an AI category, and choose a set that reduces the number of false positives (non-AI articles classified as AI) that might bias upwards our measures of diversity.

Tables 3 and 4 show, for each core AI category, the scaling factor we use to set our critical threshold (higher values mean that we require a higher frequency of salient words in an abstract before including it in our expanded corpus), the frequency of salient terms inside the corpus C_i and outside it O_i, and the expanded salient vocabulary. They show, for example, that the salient vocabularies of cs.AI and cs.NE often include quite generic terms (agent, belief, search) which could produce false positives in our search - this is why we set higher scaling factors for these two categories. We note the strong presence of deep learning related terms in the cs.NE, cs.LG and stat.ML categories and the strong overlap between the vocabularies of cs.LG and stat.ML, reflecting the fact that articles are often labelled with both categories. Meanwhile, cs.AI includes terms more often associated with symbolic approaches to AI that dominated previous eras of AI research.
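The salience and expansion steps above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function names are ours, the minimum-frequency filter is omitted, the exact form of the critical value K_i is not spelled out in the text (we assume mean plus F_i times the standard deviation), and the Word2Vec expansion step is only noted in a comment.

```python
from collections import Counter

def salient_terms(category_docs, corpus_docs, top_n=20):
    """Step 2 (sketch): rank terms by how over-represented they are in a
    category's sub-corpus relative to the broader corpus, i.e. the ratio
    freq_i(t) / freq(t), and keep the top_n terms."""
    cat = Counter(t for doc in category_docs for t in doc)
    base = Counter(t for doc in corpus_docs for t in doc)
    cat_total, base_total = sum(cat.values()), sum(base.values())
    salience = {t: (cat[t] / cat_total) / (base[t] / base_total) for t in cat}
    # Step 3 would expand these terms with their Word2Vec nearest
    # neighbours (top 30 with similarity above 0.5), e.g. using gensim.
    return sorted(salience, key=salience.get, reverse=True)[:top_n]

def expand_corpus(outside_docs, expanded_vocab, mean_out, sigma_out, scaling):
    """Step 4 (sketch): flag articles outside the category whose count of
    expanded-vocabulary terms exceeds a critical value K_i. The precise
    formula for K_i is an assumption: mean + F_i * sigma."""
    k = mean_out + scaling * sigma_out
    return [doc for doc in outside_docs
            if sum(1 for t in doc if t in expanded_vocab) > k]
```

With a higher scaling factor, fewer outside articles clear the threshold, which is how the pipeline trades recall for a lower false-positive rate.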
As an example, with lower scaling factors the expansion of cs.AI created false positives with economics research that also uses the language of actors, utilities and preferences.

AI category i | F_i | mean f_i(C_i) | mean f_i(O_i) | Expanded salient vocabulary X_i
cs.AI | 2 | 1.4 | 0.1 | humans, strategic, beliefs, reasoning, reinforcement learning, causal inference, actions, agents, rules, experts, deep reinforcement learning, online learning, practitioners, artificial intelligence, multi agent, reward, bayesian, engagement, learners, intention, semantics, meta learning, policies, decision, knowledge base, agents, learning algorithms, programming languages, recommendations, agent, decisions, knowledge, learning, cognition, reinforcement, belief, priority, incentives, learner, active learning, decision making, answer, expert, rewards, exploration, causal, decision tree, planning, policy, perception
cs.NE | 2 | 3 | 0.4 | learning, network architectures, task, functions, recurrent neural network, supervised learning, unsupervised learning, generative adversarial networks gans, deep reinforcement learning, deep neural network, semi supervised learning, algorithm, neural network, online learning, convolutional neural network cnn, tasks, convolutional neural networks, deep convolutional neural networks, optimizations, machine learning algorithms, classification, multi task learning, bayesian optimization, classification tasks, deep learning, input, deep neural networks, solutions, training, convolutional neural networks cnns, search, evolutionary, architecture, deep convolutional neural network, reinforcement learning, learning algorithms, network architecture, artificial neural networks, recurrent neural networks, deep neural networks dnns, optimization, networks, machine learning techniques, meta learning, convolutional neural network, genetic, algorithms, problems, neural networks, dictionary learning

Table 3:
Expansion statistics and vocabulary for cs.AI (Artificial Intelligence) and cs.NE (Neural and Evolutionary Computing)

AI category i | F_i | mean f_i(C_i) | mean f_i(O_i) | Expanded salient vocabulary X_i
cs.LG | 1 | 1.16 | 0.03 | learning, interpretability, regret, recurrent neural network, supervised learning, unsupervised learning, active learning, generative adversarial networks gans, deep reinforcement learning, deep neural network, semi supervised learning, sgd, neural network, online learning, convolutional neural network cnn, learning based, machine learning, convolutional neural networks, deep convolutional neural networks, machine learning algorithms, dnn, multi task learning, deep learning, bayesian optimization, transfer learning, classification tasks, adversarial examples, interpretable, deep neural networks, reward, convolutional neural networks cnns, dnns, federated learning, deep convolutional neural network, feature selection, reinforcement learning, learning algorithms, artificial neural networks, deep neural networks dnns, recurrent neural networks, machine learning techniques, meta learning, convolutional neural network, sample complexity, gradient descent, classifiers, neural networks, adversarial training, classification task, dictionary learning
stat.ML | 1 | 1.4 | 0.1 | semi supervised learning, architectures, reinforcement learning, learning based, learns, architecture, convolutional neural network, deep reinforcement learning, network architecture, trained, reward, convolutional neural network cnn, training, learned, neural networks, embeddings, convolutional neural networks, deep learning, deep convolutional neural network, learners, neural network, tasks, meta learning, deep neural network, labels, gradients, transfer learning, learning algorithms, artificial neural networks, convolutional neural networks cnns, network trained, learning, deep neural networks dnns, unsupervised learning, multi task, adversarial training, autoencoder, adversarial, train, deep convolutional neural networks, deep neural networks, deep, supervised learning, recurrent neural network, dictionary learning, multi task learning, learnt, gradient descent, generative adversarial networks gans, recurrent neural networks

Table 4:
Expansion statistics and vocabulary for cs.LG (Machine Learning) and stat.ML (Machine Learning)

In Figure 1 we show the number of AI articles in different arXiv categories in the top panel, and the share of category occurrences for a focal category in the bottom panel, reflecting category overlaps (we focus on the top 20 categories by number of occurrences in the AI corpus). It shows that cs.LG and stat.ML comprise the biggest number of AI articles, and that there are strong overlaps between them (stat.ML accounts for 40% of other category occurrences in cs.LG articles). They are followed by cs.AI, and by cs.CV (Computer Vision) and cs.CL (Computation and Language), two application domains that have proven fertile ground for the deployment of deep learning techniques.

Figure 2 presents the number of monthly articles labelled with an AI core category (blue lines) and expanded from an AI core category (orange lines). It shows strong growth in the levels of activity in machine learning categories (cs.LG and stat.ML), reflecting the adoption of machine learning and deep learning in AI research. Growth in cs.AI has been much slower, probably explained by the fact that this category contains AI research based on symbolic methods that are less widely used today. In the case of cs.NE we see much faster growth in the number of articles identified using the 'expanded category', consistent with the idea that techniques frequently mentioned in neural and evolutionary computing, such as those related to artificial neural networks and deep learning, are often mentioned in other articles that are not labelled with the cs.NE category, including in particular categories such as cs.CV and cs.LG. These findings qualitatively support the idea that machine learning and deep learning techniques have become dominant in AI research, an idea that we explore further in Subsection 4.2.
We topic model the thematic composition of AI research in order to calculate various metrics of diversity that we describe in further detail below. More specifically, we use topSBM, a hierarchical topic modelling algorithm that uses a network science approach to estimate topics in the data [Gerlach et al., 2018]. This involves transforming the pre-processed corpus of n article abstracts into a network where different words are connected through their co-occurrence in abstracts. This network is decomposed into communities using the stochastic block model (SBM), a generative model for random graphs [Abbe et al., 2015]. This results in a collection of k topics T = {t_1, ..., t_k} where:

1. Each topic i has a word mix W_i = {w_{i,1}, w_{i,2}, ..., w_{i,s}} representing the probability that a word belongs to it.

This is also consistent with the idea that practical applications of deep learning have advanced faster than the theoretical understanding of the methods underpinning them. One advantage of this modelling strategy that we do not exploit in this paper is that it is also possible to use word co-occurrences to build a network of documents that can be clustered into communities.

Figure 1: The top panel shows the number of AI articles per arXiv category (with double counting across bars) and the bottom panel shows the share of occurrences of other categories (vertical axis) in the articles from a category (horizontal axis).

Figure 2:
Number of monthly articles labelled in an AI 'core category' and in an expanded version of a category. All series are smoothed using rolling averages.

2. Each article l has a topic mix P_l = {p_{l,1}, ..., p_{l,k}} representing the probability that a topic is present in the article.

In the rest of the paper we think of those topic mixes as proxies for the thematic composition of a paper and, when aggregated in ways that we describe below, as measures of the thematic composition of particular sub-corpora in the corpus, such as all AI research in a year, or all AI research in private sector companies or particular organisations.

TopSBM has some important advantages over other topic modelling algorithms applied in the literature such as Latent Dirichlet Allocation [Blei et al., 2003]. These include the fact that it generates a hierarchy of topics that can be explored at various levels of resolution, it makes weaker assumptions about the data-generating processes, and it automatically identifies the optimal number of topics in a corpus, reducing the need for manual tuning using hard-to-interpret tests and heuristics.

When we fit the topic model on our corpus of 115,000 AI articles it yields 750 topics at the highest level of resolution, from which we remove 193 topics that appear in more than 10% of the corpus and generally contain generic and uninformative collections of terms frequently used in academic research, such as 'using demonstrates previously reported', 'recent previous variety ones demonstrating' or 'found initial investigated analyzed investigation', as well as hard-to-interpret topics comprising two words or less.

Metrics of diversity

In general, diversity is used to refer to the heterogeneity of the elements of a set in relation to some class that takes different values, such as, for example, the species in an ecosystem, the industries in an economy, or the ethnicity of a population [Page, 2010]. This heterogeneity can be measured along different dimensions:

1. Variety, capturing the number of classes that are present in the set.

2. Balance, capturing whether the set is dominated by elements from a few classes or the proportions of different classes in the data are evenly distributed.

3. Disparity, capturing the degree of difference between the classes that are present in the set.

In general, we expect a set with more classes, with a population more widely distributed between classes, and with classes that are more different from each other to be more diverse. In Section 4.3 we take the corpus of AI research in arXiv as our population, and research ideas and techniques belonging to different topics as the classes. We use these to calculate three metrics of diversity that put different emphases on the dimensions above. They include:

1. A metric of balance that considers the concentration of AI research on different topics
2. Weitzman's metric of ecosystem diversity, based on the overall distance (disparity) between topics in the corpus [Weitzman, 1992]
3. The Rao-Stirling measure of diversity, which takes into account the balance and disparity of topics in the corpus [Stirling, 2007]

By using these three metrics in parallel, and parametrising them in alternative ways, we aim to ensure that our findings are robust to various definitions and operationalisations of diversity, and also to increase their interpretability by considering them from different perspectives. Tables 5 and 6 respectively define our metrics and the parameters we have used to operationalise them.

We note some important differences between our metrics that will bear on their interpretation. The balance and Rao-Stirling metrics are similar in that both of them consider the distribution of topics in the population, in the case of Rao-Stirling weighted by the pairwise distances in the topic distance matrix, where topics that co-occur in articles more often appear closer to each other.
This means that if there are many topics in the corpus but they account for a small share of the total, this will reduce the corpus' diversity. Weitzman diversity, by contrast, does not consider the distribution of topics in the corpus but only their distance (again based on co-occurrences). This means that it would be possible to have an extremely concentrated topic distribution with very high diversity if there is a long tail of minority topics distant from the dominant ones. In a way, one could think of the Weitzman metric of diversity as an indicator of the diversity of classes hosted in a population (perhaps even its potential to generate future diversity) rather than its actual manifestation in terms of the importance of those classes. In his economic analysis of ecological diversity (with an application to a bird species, cranes), Weitzman complemented an initial analysis of diversity using this metric with an analysis of the economic value of preserving diversity that took into account the size of different classes - we do not take that second step here [Weitzman, 1993].

Metric | Definition | Operationalisation
Balance | Distribution of classes | Hirschmann-Herfindahl (HH) index and Shannon entropy based on the share of all topics or articles accounted for by a topic
Weitzman | Sum of distances of the dendrogram representing a hierarchical clustering algorithm trained on the data | Distance measures based on topic co-occurrence in articles
Rao-Stirling | Product of shares of classes by their pairwise distances | Shares based on topic presence in corpus or article, distances based on topic co-occurrence in articles

Table 5: Definition and operationalisation of metrics of diversity

Metric | Parameter Set 1 | Parameter Set 2 | Parameter Set 3
Balance | HH index of topic distribution over the population of topics present in the corpus | HH index of topic distribution over articles assigned to their top topic | Shannon entropy of topic distribution over the population of topics present in the corpus
Weitzman | Cosine distance between topics | Chebyshev distance between topics | Jaccard distance between topics (binarising on topic presence)
Rao-Stirling index | Rao-Stirling index of topic distribution over the population of topics present in the corpus with a threshold above 0.1 and cosine topic distance | Rao-Stirling index of topic distribution over the population of topics present in the corpus with a threshold above 0.1 and correlation topic distance | Rao-Stirling index of topic distribution over articles assigned to their top topic and cosine topic distance

Table 6: Parameter set details by diversity metric
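The balance and Rao-Stirling metrics can be sketched directly from a vector of topic shares and a pairwise topic distance matrix. This is a minimal illustration with function names of our own; the Weitzman metric, which requires a hierarchical clustering dendrogram, is omitted.

```python
import math

def hh_index(shares):
    """Hirschmann-Herfindahl concentration of topic shares; lower values
    indicate a more balanced (less concentrated) corpus."""
    return sum(s * s for s in shares)

def shannon_entropy(shares):
    """Entropy-based balance: higher values indicate a more even spread
    of activity across topics."""
    return -sum(s * math.log(s) for s in shares if s > 0)

def rao_stirling(shares, distance):
    """Sum of p_i * p_j * d(i, j) over topic pairs; `distance` is a
    symmetric matrix of pairwise topic distances (e.g. cosine distances
    derived from topic co-occurrence in articles)."""
    n = len(shares)
    return sum(shares[i] * shares[j] * distance[i][j]
               for i in range(n) for j in range(i + 1, n))
```

For two equally sized topics at distance 1, the HH index is 0.5, the entropy is log 2, and the Rao-Stirling value is 0.25.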
In the final part of our analysis we will consider the position of organisations participating in AI research in a semantic space that we map using BERT (Bidirectional Encoder Representations from Transformers), a deep learning technique that learns vector representations of text based on their context [Devlin et al., 2019]. We train this model on the 1.8 million abstracts in the arXiv corpus, which yields a 674-dimensional representation of each article. We calculate the average of these vectors for the articles involving organisations of interest and visualise their positions using t-SNE (t-distributed Stochastic Neighbour Embedding), a dimensionality reduction technique that projects high-dimensional data into a 2-d or 3-d space [Maaten and Hinton, 2008].
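The organisation-positioning step can be sketched as follows, with hypothetical inputs (4-dimensional toy vectors standing in for the 674-dimensional article representations); the subsequent 2-d projection with t-SNE (e.g. scikit-learn's TSNE) is only noted in a comment.

```python
import numpy as np

# Hypothetical inputs: article_vecs maps article ids to embedding vectors,
# org_articles maps organisations to the ids of their articles.
article_vecs = {
    "a1": np.array([1.0, 0.0, 0.0, 0.0]),
    "a2": np.array([0.0, 1.0, 0.0, 0.0]),
    "a3": np.array([0.0, 0.0, 1.0, 1.0]),
}
org_articles = {"OrgA": ["a1", "a2"], "OrgB": ["a3"]}

# Average the article vectors for each organisation; these mean vectors
# would then be projected to 2-d for visualisation, e.g. with
# sklearn.manifold.TSNE.
org_vecs = {
    org: np.mean([article_vecs[a] for a in ids], axis=0)
    for org, ids in org_articles.items()
}
```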
We begin our report of findings by considering the evolution of AI research in arXiv. Is the widespread perception of an 'AI boom' reflected in our data?

Figure 3 shows the monthly number of papers published in AI research and in the rest of the arXiv corpus (top), and the share of the arXiv corpus accounted for by AI papers (bottom). By contrast to non-AI categories, where the number of articles has grown at a steady pace since the 1990s, growth in AI research has been more recent, and has taken place very rapidly. Remarkably, in the first half of 2020 AI articles comprised 20% of all publications uploaded to arXiv. 94% of all the AI papers in our corpus were published after 2012, and 60% after 2018.

Figure 3:
The top panel shows the total number of monthly articles posted in arXiv in AI and all other categories. The bottom panel shows AI articles as a share of the arXiv corpus. Both series are smoothed using rolling averages.

Has the rapid growth of activity in arXiv been accompanied by shifts in its topical composition? To answer this question, we have assigned each topic in our data to the arXiv category where it has the highest salience. We calculate this analogously to a location or revealed comparative advantage quotient:

Q_{i,c} = s_{i,c} / s_i    (1)

where s_{i,c} is the share of topic i in the category c and s_i is the share of the topic in the corpus; Q_{i,c} > 1 indicates that a topic is over-represented in a category. For example, topics salient in stat.ML refer to statistical techniques such as Gaussian processes and Bayesian inference, while cs.CR (Cryptography and Security) contains topics about software vulnerabilities, privacy, adversarial attacks against deep learning systems and authentication. cs.CY addresses a range of societally oriented topics related to health (and more specifically pandemics, including AI research on Covid-19), education and ethical issues. These results suggest that our topic model captures relevant information about the thematic composition of different AI sub-fields.

In Figure 4 we visualise the evolution of the relative importance of topics associated with different arXiv categories as S_{c,y}, the share of all topical activity in the corpus in a given year t. We calculate this share, for category c, as:

S_{c,y} = Σ_{i∈c, l∈P, y=t} B_{i,l,t} / Σ_{i∈C, l∈P, y=t} B_{i,l,t}    (2)

where B_{i,l} = 1 if p_{i,l} > k. k is our threshold for accepting that a topic is present in a paper, which we set to 0.1.

Figure 4:
Evolution of the topical composition of AI research: each topic has been labelled with the arXiv category where it has the strongest representation. The population each year is the sum over categories of all topic occurrences over 0.05 in a paper.

Consistent with the narrative of a 'deep learning' boom, it shows rapid growth in the importance of topics salient in cs.CV (Computer Vision) and cs.CL (Computation and Language), two domains with an abundance of unstructured data where deep learning has contributed to important advances. Since the mid 2010s, cs.CR (Cryptography and Security) has also gained prominence - this is linked to growing interest in the risk of adversarial attacks, as well as privacy and cyber-security. Other topical areas that have seen growth include cs.CY (Computers and Society), with research about AI applications in health, its educational aspects and ethical implications, cs.IR (Information Retrieval), linked to the development of search engines and recommendation systems, an important AI application area for technology companies, and cs.RO (Robotics). At the same time, we see a decline in the relative importance of topics related to cs.AI, which tend to be more focused on symbolic techniques, and a slight decline (especially since the mid 2010s) of stat.ML (statistical machine learning) topics involving various machine learning techniques outside of deep learning. Interestingly, cs.NE has seen a decline since the 2010s despite the increasing popularity of 'neural computing' techniques.
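Equation 1 and the topic-to-category assignment can be sketched as follows; the function names and the dictionary layout are illustrative, not the paper's code.

```python
def specialisation_quotient(topic_share_in_category, topic_share_in_corpus):
    """Q_{i,c} from Equation 1: values above 1 mean topic i is
    over-represented in category c."""
    return topic_share_in_category / topic_share_in_corpus

def assign_topics(category_shares, corpus_shares):
    """Assign each topic to the arXiv category where its quotient is
    highest. `category_shares` is a hypothetical {topic: {category: share}}
    mapping; `corpus_shares` maps each topic to its corpus-wide share."""
    return {
        topic: max(by_cat, key=lambda c: by_cat[c] / corpus_shares[topic])
        for topic, by_cat in category_shares.items()
    }
```

For instance, a topic with a 0.4 share in stat.ML against a 0.2 corpus share has a quotient of 2 there and would be assigned to stat.ML.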
One explanation for this result is that many deep learning-related techniques have been developed in application domains such as computer vision and computer language, reflecting the idea that advances in GPT application sectors can be as important as those that take place in the GPT development sector, as well as the fact that practical applications of deep learning have progressed faster than theoretical analyses.

Figure 4 also suggests qualitatively that the thematic diversity of AI research has increased over time, with less concentration of topical activity in a small number of categories. The figure tells us little, however, about the thematic composition of AI at the level of individual topics, or about the extent to which the topics being pursued have high disparity, that is, are significantly different from each other. We will come back to these questions in Section 4.3. Before doing this, we will consider the levels of private sector participation in AI research.
As we discussed in Section 4.2, private sector companies - and in particular 'tech' companies - have played an important role in recent AI breakthroughs. To what extent is this phenomenon visible in our data?

Figure 5 compares the levels of participation of various organisation types in AI research with the situation in other research areas. More specifically, this is calculated as the total number of articles involving an organisation type normalised by the number of articles involving all organisations. It shows, perhaps unsurprisingly, that academic institutions (Education) are the most active in the corpus. We also find strong evidence for the idea that the private sector is relatively active in AI research: private companies account for around 10% of all research participations, a share ten times higher than their participation in arXiv outside of AI. By contrast, Government, Nonprofit and Facility organisations are less active in AI research compared to their shares of participation in the wider corpus.
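The participation measure can be sketched as a simple normalised count over article-organisation pairs; the records below are invented for illustration.

```python
from collections import Counter

# Hypothetical article-organisation participations, each labelled with the
# GRID organisation type of the participant.
participations = ["Education", "Education", "Company", "Education", "Government"]

# Share of research participations by organisation type: the number of
# article-organisation pairs of each type over all pairs.
counts = Counter(participations)
total = sum(counts.values())
shares = {org_type: n / total for org_type, n in counts.items()}
```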
Figure 5:
Level of participation of different organisation types in AI research compared to the wider arXiv corpus. Participation is measured as the share of organisation type participations (number of articles) normalised by all research participations.

Category | Examples
cs.AI | artificial intelligence artificial computer intelligence life...; machine machines pattern recognition steady turing machine...; describe elements description forms element...; proof proofs theorems automated reasoning conjectures...
cs.NE | capacity coefficients coefficient analytically permutation...; development engineering prototype creation organization...; describing chapter ais dca som...; building paper presents modelling built integration...
cs.LG | emph algorithmic line satisfy arising...; arms arm thompson sampling bandit bandits...; loss losses cross entropy cross entropy loss sharpness...; memory stored store storing memories...
stat.ML | gaussian process gaussian processes gps inducing gaussian process regres...; independent mean dependent fraction moment...; signal signals signal processing snr complex valued...; bayesian likelihood posterior bayesian inference posterior distribution...
cs.CV | style art sketch styles sketches...; source domain adaptation target domain cross domain unsupervised domain...; tracking tracker speaker verification front end speaker recognition...; spectral spectra laplacian band bands...
cs.CL | rare drug medicine biomedical drugs...; natural language meaning symbolic descriptions compositional...; sequence sequences hmm hidden markov variable length...; vector vectors proximity inner product dot product...
cs.CR | security iot secure protection vulnerabilities...; privacy private differential privacy differentially private privacy prese...; adversarial adversarial examples adversarial training adversarial attacks...; authentication fingerprints biometric fingerprint pain...
cs.IR | feature selection classification regression elm linear discriminant analy...; items recommendation recommendations item recommender systems...; emotion emotions emotional stress emotion recognition...; retrieval hashing image retrieval triplet triplet loss...
cs.CY | infection pandemic epidemic infected virus...; program programs programming induction syntax...; students course education educational university...; law society legal ethical stakeholders...
Table 7:
Random salient topics in selected arXiv categories

Figure 6 shows the evolution in the shares of research participations by different organisation types, excluding academic institutions. The right panel shows that private sector participation in AI research is a recent phenomenon and that it has almost doubled in the last decade. Government institutions used to play a more important role in AI research before 2010, although there were very low volumes of AI research at that point. Healthcare and nonprofit participation in AI research has remained stable in recent years, and below the levels we see outside AI research (which we present in the left panel).
Figure 6:
Evolution of participation of different organisation types outside of AI research (left) and in AI research (right). The scale in the Y axis is different between both panels. Series are smoothed using a 10-month rolling average.

Figure 7 shows levels of AI activity involving the 15 companies most active in the corpus. Our results confirm the idea that technology and internet companies are actively involved in AI research: the top five companies in our list are US technology companies - Google, Microsoft, IBM, DeepMind and Facebook. While Google, Microsoft and IBM have been active in AI research since the early 2000s, DeepMind and Facebook only started gaining prominence in the mid-2010s. We note the presence of several Chinese internet companies (Tencent, Alibaba and Baidu) in our data, as well as two European manufacturing businesses (Bosch and Siemens) and semiconductor manufacturers Intel and Nvidia.

When we focus on the share of AI research involving these companies in the bottom panel of Figure 7, we find, perhaps surprisingly, that their share of AI activity has declined since the mid-2000s, when almost 60% of AI papers involved them, with particularly strong levels of participation by Microsoft, IBM, Google and Intel. It is worth noting that the overall levels of AI research were very low at that point (see top panel). If we consider more recent years, when the levels of AI research in arXiv start increasing, we find an initial increase in the share of AI research involving these companies, reaching around 25% of all AI research between 2015 and 2019. Since 2019 this share has been declining, although this is driven by rapid growth in the volume of AI research in wider arXiv rather than a decline in AI publication levels in these companies. Despite this, in the first months of 2020, almost two in ten AI research papers in arXiv involved one of these private sector companies.

Figure 7:

The top panel presents the evolution in the number of AI papers involving the top 15 companies by total levels of activity in the AI arXiv corpus. The bottom panel presents the evolution of the share of AI papers involving those same companies. Both series are smoothed using an 8-month rolling average.
In the rest of the paper we focus on the evolution of thematic diversity in AI research, and on comparing the thematic diversity of AI research involving private sector companies with that of research in other organisational categories.

Figure 8 presents the evolution of thematic diversity in AI research based on our three diversity metrics and three parameter sets. In rough terms, all our operationalisations of diversity present a similar picture of the evolution of thematic diversity of AI research, with an initial increase in diversity as the volume of AI research increased, followed by stabilisation / stagnation and perhaps even a slight decline in recent years.
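As a rough illustration of how the three metrics behave, the sketch below implements common forms of balance (Shannon evenness), Rao-Stirling diversity and the Weitzman recursion over a toy topic distribution. The function names and toy inputs are ours, and the paper's specific parameter sets (topic thresholds, distance measures) are not reproduced here.

```python
import numpy as np
from functools import lru_cache

def balance(p):
    """Shannon evenness of a topic-share distribution (one common
    operationalisation of balance; the paper's exact choice may differ)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(len(p)))

def rao_stirling(p, d):
    """Rao-Stirling diversity: sum over topic pairs of p_i * p_j * d_ij."""
    p, d = np.asarray(p, float), np.asarray(d, float)
    return float(0.5 * (np.outer(p, p) * d).sum())  # d_ii = 0, so no self-pairs

def weitzman(d):
    """Weitzman diversity of a topic set given a distance matrix, using the
    exact recursion W(S) = max_i [W(S \\ i) + d(i, S \\ i)]. Exponential in
    the number of topics, so feasible only for small illustrative sets."""
    d = np.asarray(d, float)

    @lru_cache(maxsize=None)
    def W(idx):
        if len(idx) <= 1:
            return 0.0
        best = 0.0
        for i in idx:
            rest = tuple(j for j in idx if j != i)
            best = max(best, W(rest) + min(d[i, j] for j in rest))
        return best

    return W(tuple(range(len(d))))

# Example: four equally weighted topics, all at unit distance from each other
p = np.full(4, 0.25)
d = 1.0 - np.eye(4)
print(balance(p))         # 1.0 (a perfectly even distribution)
print(rao_stirling(p, d)) # 0.375
print(weitzman(d))        # 3.0
```

Note how the three metrics respond to different things: balance only sees the share vector, Rao-Stirling weights shares by pairwise distances, and Weitzman depends on distances alone, which is why their time series in Figure 8 need not move together.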
Figure 8:
Evolution of thematic diversity in AI research. Each panel presents the evolution of diversity according to one of our metrics (balance, Weitzman and Rao-Stirling) and a different parameter set. We have calculated the z-score inside each of our time series to make them comparable across parameter sets yielding different absolute values.

The balance metric (which focuses on the distribution of activity over topics) and the Rao-Stirling metric (which weights balance by the distance between topics) present very similar trends, with a strong increase of diversity after 2012 followed by stabilisation from 2017 onwards. The Weitzman metric, which considers the total distance between topics in the corpus but not their distribution, is smoother, and its temporal sequence is different, with earlier growth and stabilisation of diversity or (in the case of parameter set 1, where we calculate topic distance using the Chebyshev metric) a recent increase of diversity followed by a drop in 2020. (This decline may be linked to the fact that our 2020 corpus does not include a full year of activity, reducing the number of topics present in the corpus and therefore the sum of distances between them, which is used to calculate the Weitzman metric.) The slight increases in the series suggest an increasing propensity for important topics to co-occur together in a way that could reduce their distance, the disparity in the corpus and its thematic diversity.

We dig deeper into the evolution of the distance between AI topics by comparing topic co-occurrence networks between 2013-2016 and 2019-2020. In these networks, each node is a topic and the edges between them are weighted by the number of times the topics co-occur in articles. We display the resulting networks in Figure 10, where, in order to simplify the visualisation, we filter the co-occurrence network with a maximum spanning tree algorithm that preserves those edges with the largest weights that return a maximally connected network.
The size of a node represents the topic's number of occurrences in the period, and its colour the arXiv category where the topic is most salient (calculated using the approach that we outlined in sub-section 4.2). Topics that are not salient in any of the key categories in the legend are displayed without a colour.

The networks illustrate the important thematic transition undertaken by AI research in recent years. The 2013-2016 network displayed in the top panel is dominated by topics related to cs.AI and stat.ML. The 2019-2020 network in the bottom panel is more densely connected, and deep learning-related categories such as cs.CV, cs.CL and cs.CR appear more prominently, in several cases forming communities of frequently co-occurring topics.

The statistics of network connectivity and distance presented in Table 8 are consistent with the idea that the distance between topics in the topic co-occurrence network has decreased over time, in a way that could explain the reduction in disparity and the stabilisation or decline of diversity visible in Figure 8: the older network has more disconnected components and, in its largest connected component, a longer diameter (maximum distance between topics) and a larger average path length (average number of steps that have to be taken to reach all other topics in the network from a given topic).

Eigenvector centrality is defined as the centrality of a node (its number of connections) weighted by the eigenvector centrality of the nodes it is connected with. We consider that a topic occurs in an article if the probability estimated by our topic model is higher than 0.1.

Figure 9: The top panel presents the share of topic occurrences accounted for by the most important topics at different levels of the distribution. The bottom panel presents the mean eigenvector centrality in the topic co-occurrence network for topics at different levels of the corpus occurrence distribution.
The results presented above suggest that thematic diversity in AI research has stagnated in recent years. One potential mechanism for this kind of decline, put forward in the directed technological change literature (and consistent with some of the recent trends in AI research outlined in subsection 4.2), is the presence of organisations with strong incentives to focus on those technologies that perform better presently, at the expense of alternatives that would preserve technological diversity. We would expect those incentives to be stronger in the private sector than in academia and the public sector. Here we explore this question empirically.

We begin by comparing thematic diversity in the corpus of AI research involving the private sector and the rest of the corpus, after which we perform a multivariate analysis of drivers of thematic diversity at the organisation level and map AI research organisations in a semantic space whose structure could shed some light on the state and recent evolution of thematic diversity in AI research. We conclude by considering the AI research topics that private sector companies tend to specialise in.

Figure 10:

Topic co-occurrence graphs for 2013-2016 (top panel) and 2019-2020 (bottom panel). Each graph is the maximum spanning tree of the network of topic co-occurrences in articles published in the period, above a minimum threshold of 0.1. The size of a node represents the number of times its topic occurs in the corpus, and its colour the arXiv category where it is most salient.

Network      Number of components  Average Path Length  Diameter
2013-2016    13                    5.823                14
2019-2020    10                    5.157                12

Table 8:

Network statistics for the topic co-occurrence networks, including the number of (disconnected) components in the network, the average path length between topics in the largest component of the network, and the diameter (maximum distance between nodes in the largest connected component).
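The network statistics in Table 8 and the maximum spanning tree filter can be reproduced with standard graph tooling. The sketch below uses networkx (an assumption on our part; the released code may use different tooling) over a toy set of co-occurrence counts of our own invention:

```python
import networkx as nx

# Toy topic co-occurrence counts: (topic_a, topic_b, number of articles
# in which both topics occur above the 0.1 probability threshold)
cooccurrences = [("A", "B", 10), ("B", "C", 7), ("A", "C", 2),
                 ("C", "D", 5), ("D", "E", 1)]

G = nx.Graph()
G.add_weighted_edges_from(cooccurrences)

# Maximum spanning tree: keep the heaviest edges that still connect the graph,
# as in the Figure 10 visualisations
mst = nx.maximum_spanning_tree(G, weight="weight")

# Connectivity / distance statistics as in Table 8, on the largest component
components = nx.number_connected_components(G)
largest = G.subgraph(max(nx.connected_components(G), key=len))
diameter = nx.diameter(largest)
avg_path = nx.average_shortest_path_length(largest)
print(components, diameter, avg_path)  # 1 component, diameter 3, mean path 1.7
```

On real data each yearly corpus would yield its own graph, and the statistics would be compared across periods as in Table 8.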
Figure 11 compares thematic diversity in the corpus of AI research involving private sector companies and the rest of the corpus. We seek to account for potential confounders between these variables by only including in the comparison articles published after 2015 and involving at least one US institution. We find that the levels of thematic diversity in research involving private sector organisations are lower than the levels of thematic diversity in the rest of the corpus for all metrics and parameter sets. This is consistent with the idea that private sector research has a narrower thematic focus than research in other domains (and most specifically academia, where the majority of AI research still takes place).

Another potential confounder in the relation between thematic diversity and organisation type is the size of the corpora that we are comparing: if thematic diversity tends to be lower in smaller corpora, and the corpus of private sector AI research activity is smaller than the academic corpus (as mentioned before, the top companies in AI research accounted for around 18% of all papers in 2020), then this could explain the differences highlighted above. To account for this, we extract random samples of the same size (1000 articles) from each of our sub-corpora (research involving private sector and non-private sector organisations) and use them to calculate thematic diversity with our three metrics and parameter sets. Figure 12 compares the mean scores for our metrics and parameter sets with 30 random sample draws per metric / parameter set and sub-corpus. We still find that thematic diversity in private sector AI research is consistently lower for all metrics and parameter sets.

Figure 11:
Thematic diversity in AI research after 2015 involving / excluding private sector companies, according to different metrics of diversity and parameter sets. The scale in the Y axis is different across charts.
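The size-matched sampling check described above can be sketched as follows. The toy corpora, the simple evenness metric and all names are illustrative stand-ins for the paper's pipeline, which applies its full diversity metrics to 1000-article samples with 30 draws:

```python
import numpy as np
from collections import Counter

def topic_balance(articles):
    """Shannon evenness of topic counts across a list of articles
    (a stand-in for the paper's full diversity metrics)."""
    counts = np.array(list(Counter(t for a in articles for t in a).values()), float)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(len(p))) if len(p) > 1 else 0.0

def size_matched_means(corpus_a, corpus_b, metric, n=1000, runs=30, seed=0):
    """Compare a diversity metric across two corpora of different sizes by
    drawing equal-size random samples, as in the robustness check above."""
    rng = np.random.default_rng(seed)

    def mean_metric(corpus):
        scores = []
        for _ in range(runs):
            idx = rng.choice(len(corpus), size=min(n, len(corpus)), replace=False)
            scores.append(metric([corpus[i] for i in idx]))
        return float(np.mean(scores))

    return mean_metric(corpus_a), mean_metric(corpus_b)

# Toy corpora: "academic" articles spread evenly over ten topics,
# "private sector" articles concentrated on three
academic = [[i % 10] for i in range(2000)]
private = [[0]] * 1500 + [[1]] * 400 + [[2]] * 100
mean_academic, mean_private = size_matched_means(academic, private, topic_balance)
```

Because both sub-corpora are sampled at the same size, any remaining gap in the metric cannot be attributed to corpus size alone, which is the point of the comparison in Figure 12.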
Having compared the thematic diversity of AI research involving and excluding private sector organisations at the field level, here we focus on the drivers of thematic diversity at the organisational level within a linear regression framework. This allows us to start building a more finely grained understanding of the micro-dynamics of field-level diversity, and to account for heterogeneity between different organisation types.

Our model setup is thus:

d_{i,m,p,y} = α + β1 is_comp_i + β2 log(article_n_{i,y}) + β3 y + ε_i    (3)

where d_{i,m,p,y} is thematic diversity in organisation i according to metric m and parameter set p in year y, is_comp is a dummy capturing whether i is a company or not, and log(article_n_{i,y}) is the (logged) number of articles published by the organisation in y. We include year fixed effects and, in one of the specifications of the model, organisation fixed effects to capture unobservable sources of heterogeneity between organisations.

We are especially interested in the coefficient β1, the estimate of the link between an organisation's type and its thematic diversity after we control for other important factors, in particular its level of research activity, which could impact independently on its thematic diversity.

We focus our analysis on 188 organisations with at least 75 AI publications in the data, and on the last three years (2018, 2019 and 2020).

Before presenting our results, we should highlight that the analysis we present here is not aimed at estimating causal effects - it does not even include a clear treatment as such - but at generating additional evidence about the link between thematic diversity and organisation types.

Figure 12: Thematic diversity in AI research after 2015 involving / excluding private sector companies, according to different metrics of diversity and parameter sets, calculated in samples of 1000 articles drawn from each sub-corpus, with 30 runs per comparison. The scale in the Y axis is different across charts.
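A simplified version of the model in Equation 3 (year fixed effects only, without organisation fixed effects) can be estimated by ordinary least squares. The sketch below simulates a toy organisation-year panel - all values are invented for illustration - and recovers a negative company coefficient of the kind reported in Table 9:

```python
import numpy as np

# Toy organisation-year panel (hypothetical data; columns mirror Eq. 3)
rng = np.random.default_rng(42)
n = 300
org = rng.integers(0, 30, n)
year = rng.integers(2018, 2021, n)
is_comp = (org < 10).astype(float)            # first 10 orgs are "companies"
log_n_articles = np.log(rng.integers(75, 500, n))
# Simulated diversity: scales with corpus size, lower for companies
d = 1.0 * log_n_articles - 0.3 * is_comp + rng.normal(0, 0.1, n)

# Design matrix: intercept, company dummy, log(articles), year fixed effects
X = np.column_stack([np.ones(n), is_comp, log_n_articles,
                     (year == 2019).astype(float), (year == 2020).astype(float)])
beta, *_ = np.linalg.lstsq(X, d, rcond=None)
print(beta[1], beta[2])  # company coefficient near the simulated -0.3,
                         # size coefficient near the simulated 1.0
```

The organisation fixed-effects specification would add one dummy column per organisation to X (absorbing the company dummy's level into within-organisation variation), which is why the paper interprets those coefficients organisation by organisation in Figure 13.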
Section 5 considers next steps for research aimed at identifying causal drivers of diversity in AI research.

Table 9 presents our regression results. The coefficient β1 is negative for the balance and Rao-Stirling metrics of diversity. Interestingly, the effect gains strength and significance when we introduce organisation fixed effects, suggesting some heterogeneity inside organisation types. We explore the interpretation of this result in further detail below by looking at the fixed effects of individual organisations in the data. The association between company status and Weitzman diversity is also generally negative but weaker, and in one case (parameter set 1) it is positive. This points at important differences between the interpretation of the Weitzman indicator and the other metrics of diversity highlighted in Section 3.

In all cases β2, the coefficient for the link between number of papers and the diversity metrics, is positive, consistent with the idea that broader corpora and larger organisations are, other things equal, able to maintain more thematically diverse research profiles.

The coefficient β3 for the link between time and thematic diversity is generally positive, suggesting an increase in thematic diversity over time at the organisational level. We note that this could still lead to a stagnation or decline in thematic diversity in the aggregate if there is homogeneity in the topics being pursued by different organisations and/or the composition of the field changes in a way that gives more importance to the activities of less thematically diverse organisations.

Figure 13 presents the coefficients for the organisation fixed effects on the Rao-Stirling diversity metric with parameter set 2 in the left panel, and the number of papers involving the organisation in the right panel.
Our fixed effects capture an organisation's diversity compared to the rest of the population after adjusting for its logged publication levels, publication year and its organisation type (whether it is a company or not). The group of most thematically narrow organisations in the data comprises some of the largest and most prestigious US institutions in AI research (MIT, Carnegie Mellon, University of California Berkeley and Stanford) as well as Google and Microsoft. We note with interest the low levels of thematic diversity in the research profile of OpenAI, a previously mentioned not-for-profit which has focused its research activities on the development of large-scale language models requiring large datasets and computational infrastructures.

Although we advise caution in the interpretation of these results (the wide confidence intervals mean that many of them are not statistically significant), these estimates suggest that even though, on average, educational institutions tend to be more thematically diverse than private companies, there are some important differences inside both groups: some of the most prestigious and active US universities seem to be thematically narrower. (As shown throughout, the Rao-Stirling diversity results are very similar for different parameter sets, and similar to the balance metrics - we choose parameter set 2 because it has a better goodness of fit than the alternative specifications.)
Balance
N                        564      564      564      564      564      564
Fixed Effects            No       Yes      No       Yes      No       Yes

Weitzman                 1        1        2        2        3        3
Company index            0.04*    0.09*    -0.01*   0.04*    -0.1***  -0.23***
                         (0.69)   (0.89)   (-0.15)  (0.43)   (-4.41)  (-2.84)
Number of papers (log)   1.22***  0.86***  1.24***  0.96***  1.26***  1.08***
                         (39.27)  (18.89)  (44.25)  (24.75)  (67.08)  (30.12)
Year                     -0.01*   0.01*    -0.01*   0.01*    0.01*    0.02**
                         (-0.57)  (0.89)   (-0.6)   (0.74)   (0.96)   (2.27)
N                        564      564      564      564      564      564
Fixed Effects            No       Yes      No       Yes      No       Yes

Rao-Stirling             0        0        1        1        2        2
Company index            -0.09*   -0.26*   -0.02*   -0.12*   -0.31*** -0.72**
                         (-1.28)  (-0.69)  (-0.32)  (-0.28)  (-3.66)  (-2.22)
Number of papers (log)   0.85***  1.12***  0.72***  0.99***  0.99***  1.37***
                         (12.75)  (8.89)   (9.56)   (6.71)   (13.26)  (9.3)
Year                     0.02*    0.0*     0.01*    -0.0*    0.04*    0.02*
                         (0.39)   (0.08)   (0.19)   (-0.1)   (1.02)   (0.62)
N                        564      564      564      564      564      564
Fixed Effects            No       Yes      No       Yes      No       Yes
Table 9:
Regression results for various diversity metrics (Balance in top table, Weitzman in middle table, Rao-Stirling in bottom table) and parameter sets (see columns). t-values in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.

Our analysis above suggests that some private sector companies tend to be thematically narrower than other institutions in the corpus. But what topics do they focus on specifically?

To answer this question, we have compared the share of AI articles involving private sector institutions that contain a given topic with the equivalent share among AI articles that do not involve private sector institutions. This is a proxy for the levels of relative specialisation (or over-representation) of private sector AI research in particular topics. We present the results in Figure 14, and some notable examples in Table 10.

We find that private sector companies tend to be more specialised in arXiv categories such as cs.CL (Computation and Language), cs.DC (Distributed and Parallel Computing), cs.IR (Information Retrieval) and cs.CV (Computer Vision). They are less focused on cs.AI (Artificial Intelligence, concentrating on symbolic techniques), cs.NE (Neural and Evolutionary Computing),
stat.ML (Statistical Machine Learning) and cs.CY (Computers and Society). This is consistent with the idea that private sector companies focus on applications of AI enabled by deep learning, and on research to advance the computational infrastructure required by large-scale, safe deep learning research. They are less focused on AI techniques outside of deep learning, and on wider AI applications and implications.

We illustrate these differences by considering some notable topics. As Table 10 shows, private sector AI research specialises in topics related to recommendation systems and advertising (keys 2 and 3), and in research in areas that complement deep learning, such as graphical processing unit optimisation (key 1) or the analysis of adversarial examples that could threaten the performance of deep learning systems. In less applied categories such as cs.AI or cs.NE, private sector companies focus on techniques allied to the deep learning design, such as reinforcement learning or recurrent neural networks. We also note that companies are over-represented in a topic that mentions code and GitHub (a widely used code sharing repository), suggesting that private sector researchers are more likely to release open source code for others to adopt and build on, encouraging the dissemination of the techniques that they develop.

When we consider examples of the topics that academic researchers tend to focus on, we find that they are relatively specialised in AI applications in health, including analyses of MRI scans (key 4), healthcare systems (key 12) and infections and pandemics (key 14), as well as research that considers the ethical and legal implications of AI.

Figure 13: The right panel shows fixed effects and confidence intervals for the coefficients of top organisations in the Rao-Stirling specification with parameter set 2. The left panel shows the number of AI publications by organisations in the period.
The colour of the bar shows the organisation type according to GRID.

Key  Topic label                                               arXiv category  Specialised
1    optimizations gpu gpus tensorflow cpu                     cs.DC           Company
...  ...                                                       ...             ...
10   recurrent lstm rnn recurrent neural networks rnns         cs.NE           Company
11   swarm pso abc particle swarm optimization metaheuristic   cs.NE           Academia
12   clinical patients medical patient healthcare              cs.CY           Academia
13   law society legal ethical stakeholders                    cs.CY           Academia
14   infection pandemic epidemic infected virus                cs.CY           Academia
Table 10:
Legend for topics in Figure 14. The key is the key for the topic in that figure, the topic label its name, the arXiv category is the category where the topic is most salient, and specialised shows whether companies or academic institutions are most specialised in the topic.

Figure 14:
Each scatter presents, for each arXiv category, the share of AI research involving private sector organisations (orange points) and public sector organisations (blue points) with a topic. Topics are assigned to arXiv categories following the approach presented in subsection 4.2, and sorted by their overall importance in the category. The categories are sorted from left to right and top to bottom based on the mean difference between the shares of activity in private sector AI research and public sector AI research for all topics in the category. We have highlighted in red some notable topics (see Table 10 for additional topic information).

4.4.4 Semantic map of AI research organisations
We conclude our findings with a semantic map of organisations active in AI research. In order to produce these maps, we characterise the research profiles of these organisations through the sentence embeddings (vector representations) of the papers that they have participated in, and project these profiles on a two-dimensional space that we can visualise using tSNE, a dimensionality reduction technique.

We present the results in Figure 15, where we show three plots that progressively 'zoom out' from the most active organisations (based on their AI publications in 2020) to encompass a wider range of organisations. We annotate the first plot with dashed circles to highlight three qualitative clusters of institutions with semantically similar research profiles.

1. The red cluster includes large technology companies such as Google, Microsoft, Facebook, Amazon and DeepMind.

2. The green cluster includes various elite US research institutions such as MIT, Stanford University and Carnegie Mellon, as well as prestigious universities outside of the US such as the University of Toronto and Oxford University.

3. The blue cluster includes various Chinese universities such as Tsinghua University, Zhejiang University and Peking University, together with the Chinese Academy of Sciences and Chinese technology companies such as Tencent and Alibaba.

These results support the idea that our map captures meaningful patterns in the data, and suggest that geographical and political differences between AI research ecosystems (e.g. China and the US) could enhance thematic diversity. The plots also show that larger organisations (those that were already present in the initial 'zoomed in' plot) tend to cluster semantically as we broaden our focus, with smaller institutions more widely scattered across the map.
This is consistent with our previous observation that smaller research institutions - which are perhaps less involved in collaborations with private sector companies and in the publication races that induce institutional isomorphism in the field - could help preserve thematic diversity in AI research. At the same time, we note that some private companies such as DeepMind, Amazon or Facebook are more distant from the core of the field, suggesting some differences between their research profiles and the mainstream.
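The projection step behind these maps can be sketched as follows. The organisation profiles here are random stand-ins for the averaged sentence embeddings used in the paper, and we assume scikit-learn's TSNE implementation (the released code may differ):

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy organisation profiles: one row per organisation, built as random
# stand-ins for the mean sentence embedding of each organisation's papers
rng = np.random.default_rng(0)
profiles = np.vstack([
    rng.normal(0, 1, (20, 50)),   # one cluster of semantically similar profiles
    rng.normal(5, 1, (20, 50)),   # a second, semantically distant cluster
])

# Project the 50-dimensional profiles onto two dimensions for plotting
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(profiles)
print(coords.shape)  # (40, 2)
```

In the real pipeline each row would be an organisation's embedding average, and the two output coordinates would be plotted with node size proportional to publication counts, as in Figure 15.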
Figure 15: tSNE visualisations of AI research organisations based on their research profiles. Each plot visualises an expanding subset of the organisations active in AI research based on their number of articles in 2019 and 2020 (top left shows the top 100 organisations, top right the top 500 and bottom left the top 1000). The size of the nodes represents the number of articles published by the institutions in that period, and their colour their organisation type based on the GRID classification. We annotate three notable clusters of organisations with dashed circles in the first plot: US technology companies (in green), elite universities (in red) and Chinese universities and technology companies (in blue).
5 Conclusion
We have studied the thematic diversity of AI research in the arXiv pre-print corpus using a variety of metrics, parameter sets and approaches. We do this motivated by the literature on directed technological change, which has identified a set of processes that may lead to a loss of diversity in a technology landscape and to its dominance by technologies that are (or are eventually found to be) inferior. Recent trends in AI research, which we also document, suggest the presence of such processes: powerful deep learning systems build momentum despite some misgivings about their limitations, the field becomes increasingly dominated by private sector companies focused on technologies that complement their assets and capabilities, and processes of institutional isomorphism lead to a convergence in the behaviours and profiles of organisations in the field.

Our findings are broadly consistent with the idea of a narrowing of AI research: after an initial period where thematic diversity in AI research increased as deep learning techniques emerged and started to be deployed in a variety of settings, thematic diversity has stabilised and perhaps even started to decline in recent years. This process involves a greater concentration of research activity in popular research techniques (topics) and a decline in the disparity of these techniques.

Our comparison of the thematic diversity of private sector organisations with other participants in AI research suggests that the former are more narrowly focused on a specialist set of techniques related to the digital economy, such as computer language, computer vision and information retrieval (including search engines, recommendation systems and ad-targeting). These companies also specialise in research topics about computational infrastructures to scale up deep learning systems and cutting-edge techniques such as reinforcement learning.
Private sector companies pay less attention to health applications of AI and to ethical and legal considerations.

We also find some evidence of institutional isomorphism in the field: elite universities in the US have comparatively narrow research profiles, and are clustered closely in our semantic map of AI research. Smaller academic institutions, including many outside of the US, are more widely scattered across this semantic map, suggesting that they might be helping to preserve thematic diversity in AI research.
Our analysis does not consider the complex mechanisms underpinning the co-evolution of organisations and technologies in AI research: for example, we tacitly assume that the prominence of private sector companies in AI research is shaping the trajectory of the field. This assumption needs to be tested empirically, with a strong focus on identifying the mechanisms through which this 'shaping' takes place. For example:

1. Have key contributions by researchers in the private sector provided building blocks for subsequent work by others in the field? Did they foreclose alternative AI trajectories?

2. To what extent are labour flows and collaborations between academia and industry driving the processes of institutional isomorphism and narrowing of technological trajectories that we allude to in this paper?

3. How are competitive dynamics in academia - including races to publish and present research in high profile conferences - independently contributing to a narrowing of AI research?

4. How has the recent influx of public funding into AI research, in some cases with the explicit goal of 'winning the AI race', impacted the dynamics that we have described here?

5. What are the links between the lack of socio-demographic diversity in the AI research workforce that has been evidenced elsewhere and a thematic narrowing (potentially leading to discriminatory and unfair outcomes) of AI research?

Micro-analyses of the behaviours of individual researchers, organisations and communities and their interconnections could shed light on these questions, with important implications for research funders who may want to preserve thematic diversity in AI research.

This begs another important question: throughout our analysis we have assumed that there is an intrinsic value in technological / thematic diversity.
This is motivated by the history of AI, where previous efforts to preserve technological diversity provided the foundation for subsequent advances (including the back-propagation algorithm which, as our epigraph notes, for decades was 'cool math that didn't accomplish anything' but eventually provided a key building block for the deep learning revolution), and by concerns about important limitations in the deep learning trajectory that dominates the AI landscape today. At the same time, much thematic diversity in AI research may be dysfunctional, reflecting for example scholarly inertia, or a lack of skills or infrastructure preventing academic researchers from adopting state-of-the-art techniques. It could also be the case that some of our metrics of diversity, by estimating this variable in relative rather than absolute terms, underestimate the range of ideas and methods being explored by a growing community of AI researchers.

Future research should set out to quantify the value of thematic diversity in the terms that we have discussed in the paper: how does the work of organisations with more diverse research profiles contribute to advances in the field, for example by providing components for combinatorial AI innovations? Recent years have seen a growing number of examples where deep learning methods are combined with methods from other AI traditions, such as random forests, Bayesian inference and causal inference [Miller et al., 2017, Kendall and Gal, 2017]. To what extent are these novel combinations helping overcome some of the limitations of the dominant AI design, and enabling the development of AI applications in novel domains that are poorly served by state-of-the-art commercial technologies?
This focus on evidencing the benefits of technological diversity could be complemented with analyses of the downsides of homogeneity, in terms of diminishing returns in the deep learning trajectory and deployment failures that could have been avoided with a more diverse mix of techniques. A natural expansion of our analysis here would be to imitate Weitzman's strategy for the economic valuation of ecological diversity with an analysis that considers the costs and benefits of preserving diversity in AI research, taking into account current levels of activity in different topics and the 'minimum viable threshold' below which a research topic becomes unsustainable [Weitzman, 1993].

Finally, our analysis is based on a single data source about research pre-prints that we analyse using experimental semantic methods. Going forward, it will be important to expand this analysis using other data sources capturing AI development and deployment, including peer-reviewed publications, patents, open source software development, and business development and diffusion activities. In addition to triangulating our findings, such extensions would help to identify bottlenecks in the deployment of AI systems and opportunities to overcome them with other methods from thematically diverse AI corpora. Further analyses of thematic diversity in AI using alternative metrics, operationalisations and measurement approaches, such as the citation patterns analysed along the lines of Frank et al. [2019], would help validate the results we have presented here. To enable such analyses, expansions and validations, we have released all the code that we have used in our analysis as well as the underlying data.

This paper has presented a theoretical, qualitative and quantitative rationale for policies to preserve thematic diversity in AI research.
Although we currently lack sufficient evidence about mechanisms to confidently recommend specific interventions to that end, we note some available options highlighted elsewhere in the literature, in some cases already starting to be implemented in practice, such as:

• Increasing diversity in the backgrounds and disciplines of the individuals and groups involved in AI research
• Reducing the brain drain of researchers from academia to industry
• Ensuring that academic researchers have sufficient computational resources to undertake their work independently from the private sector
• Modifying incentive structures in academia to reduce the intensity of research races
• Developing funding models and grant assessment strategies that help diversify the set of ideas that are supported
• Using horizon scanning methods to develop funding agendas that take into account counterfactual trajectories in AI research
• Funding mission-oriented work to deploy AI systems in domains that may require new techniques and combinations of techniques
• Developing new benchmarks of AI system performance that capture the strengths of alternative designs

(The code can be accessed here. A list of links to relevant datasets in Figshare can be accessed here.)

We conclude by highlighting three important challenges standing in the way of these policies.

First, there is an incentive problem: policymakers face a similar dilemma to AI researchers in that they have incentives to support the AI technologies with the greatest present potential, because these are more likely to produce economic impacts and help develop competitive AI industries. The notion of a global AI race provides an additional geopolitical push to focus on today’s state-of-the-art technologies at the expense of investments in technological diversity that may only yield benefits in the future, or whose benefits may be captured by other organisations or countries.
Such externalities could be internalised if policymakers in different countries coordinated their activities, but the perception of a global AI race is likely to hinder such coordination.

Second, there is an information problem: even if they overcame the incentive problem, policymakers would need to develop strategies to identify research lines that could help preserve thematic diversity in AI. Here, they face an information asymmetry with researchers and businesses, who hold private information about the quality of their research (including its uniqueness and how it could help to diversify AI research) and are thus able to behave opportunistically to secure funding. The fact that strategies to preserve thematic diversity will tend to focus on ideas and communities outside the research mainstream, which are harder to assess, reduces their prospects of success and increases the likelihood of wasting resources and distorting research efforts in unproductive directions.

Third, there is a scale problem: most public research budgets are small compared to the R&D budgets of the private sector, and in particular of large technology companies. Targeted efforts to steer the trajectory of AI research and increase its diversity are unlikely to have significant impacts unless they are complemented by allied policies to steer technology development and adoption in the private sector. These could include regulatory interventions that penalise algorithmic failures, the environmental costs of AI systems and heavy usage of personal data, giving the private sector incentives to develop and adopt alternative techniques.
Another option would be to increase the supply and diversity of AI talent, bolstering the ‘public interest AI sphere’ and increasing the demographic diversity of the AI workforce in industry, thereby contributing indirectly to a diversification of the technological trajectories that are explored.

Recent trends in AI research, including the evidence that we have provided, suggest that research funders and other stakeholders in the AI policy ecosystem need to look for ways to overcome these challenges and mitigate the risk of a decline in AI’s technological diversity: preserving spaces for public interest AI R&D, and patiently investing in runner-up AI techniques that may under-perform today but could play an important role in future AI breakthroughs. We hope that the methods and metrics that we have developed in this paper can play a part in informing this policy effort.
References
Emmanuel Abbe, Afonso S Bandeira, and Georgina Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2015.
Daron Acemoglu. Diversity and technological progress. In The Rate and Direction of Inventive Activity Revisited, pages 319–356. University of Chicago Press, 2011.
Daron Acemoglu and Pascual Restrepo. The wrong kind of AI? Artificial intelligence and the future of labour demand. Cambridge Journal of Regions, Economy and Society, 13(1):25–35, 2019. doi: 10.1093/cjres/rsz022.
Philippe Aghion, Paul A David, and Dominique Foray. Science, technology and innovation for economic growth: linking policy research and practice in ‘STIG systems’. Research Policy, 38(4):681–693, 2009.
Ajay Agrawal, Joshua Gans, and Avi Goldfarb. Prediction Machines: The Simple Economics of Artificial Intelligence. Harvard Business Press, 2018.
Nur Ahmed and Muntasir Wahed. The de-democratization of AI: Deep learning and the compute divide in artificial intelligence research. 2020.
Stuart Armstrong, Nick Bostrom, and Carl Shulman. Racing to the precipice: a model of artificial intelligence development. AI & Society, 31(2):201–206, 2016.
W Brian Arthur. Increasing Returns and Path Dependence in the Economy. University of Michigan Press, 1994.
W Brian Arthur. The Nature of Technology: What It Is and How It Evolves. Simon and Schuster, 2009.
Pierre-Alexandre Balland, Cristian Jara-Figueroa, Sergio G Petralia, Mathieu PA Steijn, David L Rigby, and César A Hidalgo. Complex economic activities concentrate in large cities. Nature Human Behaviour, 4(3):248–254, 2020.
Paul Barham and Michael Isard. Machine learning systems are stuck in a rut. In Proceedings of the Workshop on Hot Topics in Operating Systems (HotOS ’19), pages 177–183. Association for Computing Machinery, 2019. doi: 10.1145/3317550.3321441.
Stefano Bianchini, Moritz Müller, and Pierre Pelletier. Deep learning in science. 2020.
David M Blei, Andrew Y Ng, and Michael I Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022, 2003.
Timothy F Bresnahan and Manuel Trajtenberg. General purpose technologies ‘engines of growth’? Journal of Econometrics, 65(1):83–108, 1995.
Kevin Bryan, Jorge Lemus, and Guillermo Marshall. Innovation during a crisis: Evidence from Covid-19. Available at SSRN 3587973, 2020.
Kevin A Bryan and Jorge Lemus. The direction of innovation. Journal of Economic Theory, 172:247–272, 2017.
Robyn Caplan and danah boyd. Isomorphism through algorithms: Institutional dependencies in the case of Facebook. Big Data & Society, 5(1), 2018. doi: 10.1177/2053951718757253.
Yves Chauvin and David E Rumelhart. Backpropagation: Theory, Architectures, and Applications. Psychology Press, 1995.
Iain M Cockburn, Rebecca Henderson, and Scott Stern. The impact of artificial intelligence on innovation. Technical report, National Bureau of Economic Research, 2018.
Robin Cowan. Nuclear power reactors: A study in technological lock-in. The Journal of Economic History, 50(3):541–567, 1990. doi: 10.1017/S0022050700037153.
Allan Dafoe. AI Governance: A Research Agenda. Technical report, Future of Humanity Institute, 2017.
Alexander D’Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, Rajiv Raman, Kim Ramasamy, Rory Sayres, Jessica Schrouff, Martin Seneviratne, Shannon Sequeira, Harini Suresh, Victor Veitch, Max Vladymyrov, Xuezhi Wang, Kellie Webster, Steve Yadlowsky, Taedong Yun, Xiaohua Zhai, and D. Sculley. Underspecification presents challenges for credibility in modern machine learning. 2020.
Paul A David. Clio and the economics of QWERTY. The American Economic Review, 75(2):332–337, 1985.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, 2019. doi: 10.18653/v1/N19-1423.
Paul J DiMaggio and Walter W Powell. The iron cage revisited: Institutional isomorphism and collective rationality in organizational fields. American Sociological Review, pages 147–160, 1983.
Giovanni Dosi. Technological paradigms and technological trajectories: a suggested interpretation of the determinants and directions of technical change. Research Policy, 11(3):147–162, 1982.
Morgan R Frank, Dashun Wang, Manuel Cebrian, and Iyad Rahwan. The evolution of citation graphs in artificial intelligence research. Nature Machine Intelligence, 1(2):79–85, 2019.
Ana Freire, Lorenzo Porcaro, and Emilia Gómez. Measuring diversity of artificial intelligence conferences. 2020.
Iason Gabriel. Artificial intelligence, values and alignment. arXiv preprint arXiv:2001.09768, 2020.
Martin Gerlach, Tiago P Peixoto, and Eduardo G Altmann. A network approach to topic models. Science Advances, 4(7):eaaq1360, 2018.
Thilo Hagendorff and Kristof Meding. Ethical considerations and statistical analysis of industry involvement in machine learning research. 2020.
Daniel Hain, Roman Jurowetzki, Juan Mateos-Garcia, and Kostas Stathoulopoulos. The privatisation of AI research. 2020.
Karen Hao. The messy, secretive reality behind OpenAI’s bid to save the world. MIT Technology Review, 2020.
Sara Hooker. The hardware lottery. 2020.
Hugo Hopenhayn and Francesco Squintani. The direction of innovation. 2016.
S Johnson. The Secret of Apollo: Systems Management in American and European Space Programs. 2012.
Mary Kaldor. The Baroque Arsenal. Andre Deutsch, 1982.
Peter Karnøe and Raghu Garud. Path creation: Co-creation of heterogeneous resources in the emergence of the Danish wind turbine cluster. European Planning Studies, 20(5):733–752, 2012.
Alex Kendall and Yarin Gal. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems 30, pages 5574–5584. Curran Associates, Inc., 2017.
Joel Klinger, Juan C Mateos-Garcia, and Konstantinos Stathoulopoulos. Deep learning, deep change? Mapping the development of the artificial intelligence general purpose technology. 2018.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012.
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
Kai-Fu Lee. AI Superpowers: China, Silicon Valley, and the New World Order. Houghton Mifflin Harcourt, 2018.
Zachary C. Lipton and Jacob Steinhardt. Troubling trends in machine learning scholarship. Queue, 17(1):45–77, 2019. doi: 10.1145/3317287.3328534.
Kristian Lum and William Isaac. To predict and serve? Significance, 13(5):14–19, 2016.
Casey R. Lynch. Contesting digital futures: Urban politics, alternative economies, and the movement for technological sovereignty in Barcelona. Antipode, 52(3):660–680, 2020. doi: 10.1111/anti.12522.
Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.
Gary Marcus. Deep learning: A critical appraisal. arXiv preprint arXiv:1801.00631, 2018.
John Markoff. Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots. HarperCollins Publishers, 2016.
Juan C Mateos-Garcia. The complex economics of artificial intelligence. Available at SSRN 3294552, 2018.
Mariana Mazzucato. Mission-oriented innovation policies: challenges and opportunities. Industrial and Corporate Change, 27(5):803–815, 2018.
Michael Gofman and Zhao Jin. Artificial intelligence, human capital, and innovation. SSRN, 2019. doi: 10.2139/ssrn.3449440.
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
Kevin Miller, Chris Hettinger, Jeffrey Humpherys, Tyler J. Jarvis, and David Kartchner. Forward thinking: Building deep random forests. arXiv preprint arXiv:1705.07366, 2017.
Richard R Nelson. An Evolutionary Theory of Economic Change. Harvard University Press, 2009.
Cristina Paez-Aviles, Frank J. Van Rijnsoever, Esteve Juanola-Feliu, and Josep Samitier. Multi-disciplinarity breeds diversity: the influence of innovation project characteristics on diversity creation in nanotechnology. The Journal of Technology Transfer, 43(2):458–481, 2018. doi: 10.1007/s10961-016-9553-9.
Scott E Page. Diversity and Complexity. Princeton University Press, 2010.
Caroline Paunov, Sandra Planes-Satorra, and Greta Ravelli. Review of national policy initiatives in support of digital and AI-driven innovation. (79), 2019. doi: 10.1787/15491174-en.
Judea Pearl. Theoretical impediments to machine learning with seven sparks from the causal revolution. arXiv preprint arXiv:1801.04016, 2018.
Daniele Rotolo, Diana Hicks, and Ben R. Martin. What is an emerging technology? Research Policy, 44(10):1827–1843, 2015. doi: 10.1016/j.respol.2015.06.006.
Stuart Russell. Human Compatible: Artificial Intelligence and the Problem of Control. Penguin, 2019.
Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall Press, 3rd edition, 2009.
David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, and Demis Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, 2016. doi: 10.1038/nature16961.
James Somers. Is AI riding a one-trick pony? MIT Technology Review, September 2017.
Konstantinos Stathoulopoulos and Juan C Mateos-Garcia. Gender diversity in AI research. Available at SSRN 3428240, 2019.
Andy Stirling. A general framework for analysing diversity in science, technology and society. Journal of the Royal Society Interface, 4(15):707–719, 2007.
Fernando F Suárez and James M Utterback. Dominant designs and the survival of firms. Strategic Management Journal, 16(6):415–430, 1995.
A. Suominen. Topic modelling approach to knowledge depth and breadth: Analyzing trajectories of technological knowledge. In 2017 IEEE Technology Engineering Management Conference (TEMSCON), pages 55–60, 2017.
Neil C Thompson, Kristjan Greenewald, Keeheon Lee, and Gabriel F Manso. The computational limits of deep learning. arXiv preprint arXiv:2007.05558, 2020.
Kuansan Wang, Zhihong Shen, Chiyuan Huang, Chieh-Han Wu, Yuxiao Dong, and Anshul Kanakia. Microsoft Academic Graph: When experts are not enough. Quantitative Science Studies, 1(1):396–413, 2020.
Martin L Weitzman. On diversity. The Quarterly Journal of Economics, 107(2):363–405, 1992.
Martin L Weitzman. What to preserve? An application of diversity theory to crane conservation. The Quarterly Journal of Economics, 108(1):157–183, 1993.
Langdon Winner. Do artifacts have politics? Daedalus, pages 121–136, 1980.
Michael Wooldridge. The Road to Conscious Machines. Pelican, 2020.
Ohid Yaqub. Serendipity: Towards a taxonomy and a theory. Research Policy, 47(1):169–179, 2018. doi: 10.1016/j.respol.2017.10.007.