Interdisciplinary research and technological impact
Qing Ke ∗ Northeastern University, Boston, MA 02115, USA
Interdisciplinary research has been considered as a solution to today's complex societal challenges. While its relationship with scientific impact has been extensively studied, the technological impact of interdisciplinary research remains unexplored. Here, we examine how interdisciplinarity is associated with technological impact at the paper level. We measure the degree of interdisciplinarity of a paper using three popular indicators, namely variety, balance, and disparity, and track how it gets cited by patented technologies over time. Drawing on a large sample of biomedical papers published over 18 years, we find that papers that cite more fields (variety) and whose citations are distributed more evenly over those fields (balance) are more likely to receive patent citations, but both effects can be offset if papers draw upon more distant fields (disparity). These associations are consistent across different citation-window lengths. Additional analysis that focuses on the subset of papers with at least one patent citation reveals that the intensity of their technological impact, as measured by the number of patent citations, increases with balance and disparity. Our work may have policy implications for interdisciplinary research and for scientific and technological impact.
Keywords: Interdisciplinarity; technological impact; patent-to-paper citation; non-patent reference
I. INTRODUCTION
National Academy of Sciences et al. [1] defined interdisciplinary research (IDR) as research that “integrates... from two or more disciplines or bodies of specialized knowledge to advance fundamental understanding or to solve problems whose solutions are beyond the scope of a single discipline or area of research practice.” IDR has received intense attention from diverse stakeholders: science policymakers have been constantly discussing IDR [2, 3]; funding bodies have been actively promoting interdisciplinary working [4]; and institutes have established interdisciplinary centers [5]. These efforts to support IDR may be partly due to the promise that it is beneficial in many respects. For example, Ledford [6] argued that solving today's complex societal challenges, ranging from climate change to sustainability, requires knowledge that transcends traditional disciplinary boundaries. IDR has also been shown to be a factor for creativity [7, 8].

One line of inquiry into the empirical benefits of IDR examines its relationship with scientific impact. Those studies have introduced different IDR indicators, using information about cited disciplines. Early versions are one-dimensional. Rinia et al. [9] defined interdisciplinary papers as those published in journals whose disciplines are different from the main focal program of interest (physics) and found no evidence of bibliometric or peer-review bias against IDR. Similarly, Larivière and Gingras [10] defined the IDR of a paper as the fraction of its cited references that were published in journals of other disciplines. The diversity of cited disciplines is another class of indicators. Steele and Stier [11] quantified IDR using Brillouin's diversity index and found a positive association with citation rate. Other works used diversity measures to quantify the interdisciplinarity of journals [12] and authors [13].
Recent literature has highlighted that IDR is not only about how diverse cited disciplines are but also about how they are related to each other. This has spurred proposals of multidimensional indicators emphasizing different aspects of IDR. Wang et al. [14] measured IDR through three dimensions: variety, balance, and disparity. Variety counts the number of cited disciplines, balance quantifies the diversity of these disciplines, and disparity measures their relatedness. Using the three dimensions of IDR, Wang et al. [14] established their distinct effects on scientific impact; that is, variety and disparity are negatively linked to short-term citations but positively to long-term citations, and balance is negatively associated with long-term citations but lacks a significant effect on short-term citations. Yegros-Yegros et al. [15] presented a similar analysis. The literature has also proposed integrated measures of the three indicators, such as the Rao-Stirling (RS) index and its variations [16]. Porter and Rafols [17] applied the RS index to papers published in 1975–2005 and observed a modest increase in interdisciplinarity. Cassi et al. [18] applied the index to institutions. A recent large-scale study analyzed 19 million articles from 1900 to 2017 and observed increasing interdisciplinarity across disciplines [19].

Here we expand empirical characterizations of the benefits of IDR from the predominantly studied scientific impact to technological impact. We present the first, to the best of our knowledge, bibliometric study that explores the relationship between the extent of IDR of papers and their technological impact, by tracing the citations they receive from patents. Patents have been extensively used to represent technological development [20], and compared to patent-to-patent citations, patent-to-paper citations better capture knowledge flows [21]. We first operationalize the technological impact of a paper as whether it gets cited by patents.
We find that the effect of IDR on technological impact depends on the dimension considered. In particular, variety has a positive, but small, effect on the likelihood of getting patent citations. Balance has a positive, sizable effect. Disparity, on the other hand, has a negative effect. These associations are consistent across different citation-window lengths. In our further analysis that focuses on papers with patent citations, we find that both balance and disparity are positively associated with the number of patent citations. These findings may have important policy implications for IDR and impact.

II. DATA AND METHODS

A. Sample selection
As our interest resides in the biomedicine area, we use MEDLINE, a widely used database for the biomedical research literature, as our primary source for publication data. We select documents published between 1980 and 1997. The choice of this period is constrained by the period (1976–2012) during which non-patent references of patents are matched to MEDLINE. As a result, these documents have had a long period of time to accumulate patent citations. We obtain additional bibliographical information on those documents from the Web of Science (WoS) database. The papers included in our final corpus satisfy the following three conditions. First, they are designated as research articles, based on the “publication type” tag in MEDLINE and the “document type” tag in WoS. Second, their research fields, as specified by the WoS Subject Category (SC), are not in the social sciences and humanities, as papers from those fields may be less likely to get patent citations. Third, they cite at least two SCs, so that we can compute IDR measures. Our final corpus has 2 870 266 unique papers. Since a paper can be assigned to multiple SCs, we treat each paper-field pair as an individual observation, resulting in a total of 4 308 264 observations. Table I provides the number and percentage of papers for the 30 most represented fields, which in total account for 74.7% of all observations.
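The three inclusion conditions and the expansion into paper-field observations can be sketched as follows. This is a minimal illustration, not the procedure actually used: the `Paper` record, the `build_observations` function, and the contents of `EXCLUDED_FIELDS` are hypothetical, and the handling of papers with a mix of excluded and retained SCs (drop the excluded SCs, keep the rest) is an assumption that the text does not spell out.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical placeholder: the actual list of social-science and humanities
# Subject Categories is not given in the text.
EXCLUDED_FIELDS = {"Sociology", "History"}


@dataclass
class Paper:
    paper_id: str
    year: int
    is_research_article: bool   # per the MEDLINE and WoS type tags
    fields: List[str]           # WoS Subject Categories assigned to the paper
    cited_fields: List[str]     # Subject Categories of its cited references


def build_observations(papers: List[Paper]) -> List[Tuple[str, str]]:
    """Return one (paper_id, field) observation per retained paper-field pair."""
    observations = []
    for paper in papers:
        if not 1980 <= paper.year <= 1997:
            continue                      # outside the study period
        if not paper.is_research_article:
            continue                      # condition 1: research articles only
        kept = [f for f in paper.fields if f not in EXCLUDED_FIELDS]
        if not kept:
            continue                      # condition 2 (assumed handling)
        if len(set(paper.cited_fields)) < 2:
            continue                      # condition 3: at least two cited SCs
        observations.extend((paper.paper_id, f) for f in kept)
    return observations
```

A paper assigned to two retained SCs therefore contributes two observations, which is why the number of observations exceeds the number of unique papers.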
B. Dependent variables
To study the technological impact of papers, we link them to the patented technology space and investigate whether they are cited as “prior art” in front-page non-patent references (NPRs). In our previous work [22], we matched NPRs of USPTO patents granted between 1976 and 2012 to MEDLINE papers. Based on the set of patents citing the focal paper, we consider two categories of dependent variables. The first consists of binary variables indicating whether a paper has been cited by patents granted within 5, 10, and 15 years after its publication, respectively denoted as citedbypat5, citedbypat10, and citedbypat15. The second is the number of citing patents within 5, 10, and 15 years, respectively denoted as patc5, patc10, and patc15. The main reason for looking at different citation-window lengths is that the accumulation of patent citations is time-dependent, as can be seen from the rightmost three columns in Table I, where we present the percentage of papers that obtain patent citations within 5, 10, and 15 years after publication. For Biochemistry & Molecular Biology papers, for instance, only 5.6% are cited by patents within 5 years, which increases drastically to 18.3% within 15 years. Fig. 1A plots the distributions of the number of patent citations, also suggesting its dependence on time.
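As an illustration of how the six dependent variables relate to one another, the sketch below derives them from a paper's publication year and the grant years of its citing patents. The function name `patent_citation_outcomes` is hypothetical, and measuring each window as an inclusive difference in calendar years is an assumption; the text only says "within 5, 10, and 15 years after its publication".

```python
from typing import Dict, Iterable


def patent_citation_outcomes(
    pub_year: int,
    patent_grant_years: Iterable[int],
    windows: Iterable[int] = (5, 10, 15),
) -> Dict[str, int]:
    """Compute patc{w} (counts) and citedbypat{w} (0/1 flags) for each window w."""
    grant_years = list(patent_grant_years)
    outcomes = {}
    for w in windows:
        # Assumption: a patent falls in the window if it is granted no earlier
        # than the publication year and at most w calendar years after it.
        count = sum(1 for g in grant_years if 0 <= g - pub_year <= w)
        outcomes[f"patc{w}"] = count
        outcomes[f"citedbypat{w}"] = int(count > 0)
    return outcomes


# A paper published in 1990 and cited by patents granted in 1993, 1998, 2004, 2010.
example = patent_citation_outcomes(1990, [1993, 1998, 2004, 2010])
```

By construction the counts are non-decreasing in the window length, which is the time dependence visible in the rightmost columns of Table I.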
C. Independent variables
For the focal paper, we first calculate the fraction of each cited SC, denoted as $p_i$ for SC $i$. Following the existing literature, we then construct three indicators capturing the multifaceted nature of IDR:

1. Variety: the number of cited SCs, denoted as $n$;

2. Balance: the Shannon entropy diversity index normalized by the logarithm of the number of SCs, formally:

$$\text{balance} = \frac{\text{entropy}}{\ln n} = -\frac{1}{\ln n} \sum_{i} p_i \ln p_i ; \qquad (1)$$

3. Disparity: the average dissimilarity between two SCs:

$$\text{disparity} = \frac{2 \cdot \sum_{i < j} d_{ij}}{n (n - 1)} ,$$

where $d_{ij}$ denotes the dissimilarity between SCs $i$ and $j$.

In light of previous literature, we consider several control variables. The first one is the number of scientific citations, that is, the number of scientific articles that cite the focal paper. We include this variable because previous studies have found that it correlates with both patent citations [22] and interdisciplinarity [14, 15]. As with patent citations, we count article citations accrued within 5, 10, and 15 years. The second control variable is the Impact Factor (IF) of the journal where the focal paper was published, as publishing in high-IF journals may increase visibility and readership, which may help expedite the knowledge flow to the technology domain. Other control variables include whether the paper involves international collaboration, the number of authors, and the number of MeSH terms. Figs. 1F–I plot the distributions of these control variables.

TABLE I. Number and percentage of papers by field, as well as percentage of papers that get cited by patents within 5, 10, and 15 years after publication.

                                                                  % cited by patents after
Field                                          Papers (%)          5 y.   10 y.   15 y.
Biochemistry & Molecular Biology               414143 (9.61)       5.58   13.67   18.29
Pharmacology & Pharmacy                        217327 (5.04)       2.57    6.20    8.80
Neurosciences                                  194357 (4.51)       1.60    4.06    5.86
Immunology                                     175536 (4.07)       4.69   11.98   16.23
Surgery                                        163319 (3.79)       1.56    3.52    4.80
Cell Biology                                   162977 (3.78)       5.32   13.10   17.27
Oncology                                       142655 (3.31)       3.33    8.42   11.63
Medicine, General & Internal                   136307 (3.16)       1.22    2.77    3.88
Biophysics                                     114760 (2.66)       3.57    8.83   12.17
Endocrinology & Metabolism                     106730 (2.48)       2.29    5.89    8.18
Physiology                                     101036 (2.35)       1.03    2.91    4.32
Cardiac & Cardiovascular Systems                97622 (2.27)       2.32    5.01    6.85
Genetics & Heredity                             91057 (2.11)       4.83   11.26   14.54
Microbiology                                    89456 (2.08)       4.69   11.93   16.29
Clinical Neurology                              88152 (2.05)       1.23    3.04    4.56
Radiology, Nuclear Medicine & Medical Imaging   82784 (1.92)       2.27    4.67    6.16
Medicine, Research & Experimental               80412 (1.87)       4.47   10.24   13.44
Public, Environmental & Occupational Health     72211 (1.68)       0.34    0.99    1.47
Pathology                                       71754 (1.67)       1.13    3.33    4.88
Pediatrics                                      66757 (1.55)       0.51    1.49    2.14
Multidisciplinary Sciences                      64969 (1.51)      14.42   28.90   34.72
Toxicology                                      63721 (1.48)       0.81    2.27    3.45
Hematology                                      62184 (1.44)       3.59    8.95   12.36
Obstetrics & Gynecology                         56787 (1.32)       1.07    2.80    4.05
Psychiatry                                      54754 (1.27)       0.87    1.98    2.94
Peripheral Vascular Disease                     53990 (1.25)       2.89    6.78    9.51
Gastroenterology & Hepatology                   50449 (1.17)       1.47    3.90    5.52
Veterinary Sciences                             49991 (1.16)       0.84    2.67    3.96
Infectious Diseases                             46929 (1.09)       3.60    9.43   12.83
Urology & Nephrology                            45531 (1.06)       1.50    3.83    5.46

Furthermore, we consider publication-year and field fixed effects and create dummy variables for each year and each SC. Thus the estimations capture within-year and within-field differences, meaning that the effects of IDR on technological impact are compared for papers in the same year and the same field.
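The three IDR indicators defined in Section II C can be computed from the list of SCs cited by a paper, as in the minimal sketch below. The function name `idr_indicators` is hypothetical, and the pairwise dissimilarity is passed in as a caller-supplied function because its exact definition is not reproduced here; disparity is taken as the mean dissimilarity over all unordered pairs of distinct cited SCs, which is the same quantity as $2 \sum_{i < j} d_{ij} / (n(n-1))$.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Callable, Sequence, Tuple


def idr_indicators(
    cited_scs: Sequence[str],
    dissimilarity: Callable[[str, str], float],
) -> Tuple[int, float, float]:
    """Return (variety, balance, disparity) for one paper's cited SCs."""
    counts = Counter(cited_scs)
    n = len(counts)
    if n < 2:
        # Balance and disparity are undefined for fewer than two cited SCs,
        # matching the corpus condition that papers cite at least two SCs.
        raise ValueError("need at least two distinct cited SCs")
    total = sum(counts.values())
    shares = [c / total for c in counts.values()]        # p_i for each SC i

    variety = n
    entropy = -sum(p * math.log(p) for p in shares)
    balance = entropy / math.log(n)                       # normalized to at most 1

    pairs = list(combinations(sorted(counts), 2))         # unordered pairs, i < j
    disparity = sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)
    return variety, balance, disparity


# Toy example with a made-up dissimilarity table (illustrative values only).
toy_d = {("A", "B"): 0.2, ("A", "C"): 0.4, ("B", "C"): 0.9}
variety, balance, disparity = idr_indicators(
    ["A", "A", "B", "C"], lambda a, b: toy_d[(a, b)]
)
```

Passing the dissimilarity in as a function keeps the sketch agnostic about how SC-to-SC distances are estimated; any symmetric measure on pairs of SCs can be plugged in.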
Year fixed effects are included to control for features, such as the number of citing patents, that are fixed within a year but change over time. Field fixed effects are included because there is an apparent field-dependent tendency of getting cited by patents, as demonstrated in Table I. About 35% of papers in the Multidisciplinary Sciences category have patent citations within 15 years. At the other extreme, less than 5% of papers in several clinical medicine fields, such as General & Internal Medicine and Surgery, get cited by patents. In between is Cell Biology, where 17% of papers achieve technological impact.

Table II reports the summary statistics of all the introduced variables.

FIG. 1. Distribution of variables. (Panels A–J; recoverable axis labels: patent citations + 1 and scientific citations + 1 against CCDF, density, and number of papers.)

TABLE II. Summary statistics of variables.

Variable        Mean        Std. Dev.   Min     Max      N
citedbypat5     0.03        0.171       0       1        4308264
citedbypat10    0.073       0.261       0       1        4308264
citedbypat15    0.1         0.299       0       1        4308264
patc5           0.054       0.448       0       66       4308264
patc10          0.216       1.524       0       361      4308264
patc15          0.391       2.911       0       1572     4308264
RS              0.277       0.123       0.001   0.779    4308264
variety         6.396       2.853       1       30.667   4308264
balance         0.794       0.125       0.073   1        4308264
disparity       0.434       0.14        0.01    0.998    4308264
artc5           13.356      28.025      0       7240     4308264
artc10          22.857      57.302      0       30327    4308264
artc15          29.111      81.476      0       49315    4308264
jif             2.202       2.531       0.001   39.104   4216119
authorintl      0.093       0.291       0       1        3298071
numauthor       4.023       2.595       1       546      4308264
nummesh         12.197      3.986       2       49       4308264
year            1989.645    5.193       1980    1997     4308264

III. RESULTS

A. Likelihood of technological impact

We employ logistic regression to model the effects of the IDR of a paper on its likelihood of getting cited by patents. Table A.1 presents the modeling results for the 5-year citation window, where the dependent variable is whether a paper has been cited by patents granted within 5 years after its publication. Model 1 is the baseline model, where we only consider control variables. Model 2 includes the RS index and indicates its positive, statistically significant relationship with the likelihood of having technological impact. After controlling for confounders, a one-unit increase of RS is linked to a 95% ($e^{0.668} - 1$) increase in the odds of gaining technological impact. This result is in contrast with Yegros-Yegros et al. [15], which found that RS has no significant relationship with scientific impact.

Models 3–5 focus on each of the three dimensions of IDR separately. We find that the three dimensions have statistically significant, yet distinct, associations with the likelihood of being cited by patents. In particular, both variety and balance have positive effects, whereas disparity has a negative effect. Model 3 suggests that the size of the positive effect of variety is small; citing one more field translates to a 3.2% increase in the odds. Model 4, on the other hand, indicates that the positive association between balance and the likelihood of technological impact is pronounced; a one-unit increase of balance is associated with a 191% increase in the odds.
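The percentage changes in odds quoted for Models 2–4 follow from exponentiating the corresponding logit coefficients reported in Table A.1. The short sketch below reproduces that arithmetic; the helper name `pct_change_in_odds` is ours and is introduced only for illustration.

```python
import math


def pct_change_in_odds(coef: float) -> float:
    """Percentage change in the odds for a one-unit increase in a logit covariate."""
    return (math.exp(coef) - 1.0) * 100.0


# Coefficients as reported in Table A.1 (5-year citation window).
table_a1 = {
    "RS (Model 2)": 0.668,
    "variety (Model 3)": 0.0312,
    "balance (Model 4)": 1.067,
}

for name, coef in table_a1.items():
    print(f"{name}: odds ratio {math.exp(coef):.3f}, "
          f"change in odds {pct_change_in_odds(coef):+.1f}%")
```

A negative coefficient gives an odds ratio below one, so the same function returns a negative percentage change for disparity.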
Model 5 shows that for a one-unit increase in disparity, the odds of receiving patent citations are expected to be multiplied by a factor of 0.94, holding all control variables constant.

Model 6 examines the three aspects of IDR together, confirming that their associations with technological impact persist after controlling for each other. The effect size of variety remains similar to that in Model 3, a 2.9% increase in the odds. The effect size of balance decreases to 161%, still a sizable effect. A one-unit increase of disparity is associated with the odds being multiplied by a factor of 0.76, a larger effect than that in Model 5.

In summary, we find that the number of fields a paper cites and the evenness of the distribution over those cited fields have positive effects on the probability of being cited by patents, but both effects can be offset if the paper draws upon distant fields.

FIG. 2. Odds ratios of variety, balance, and disparity in the logit models where dependent variables are whether a paper has patent citations in 5, 10, and 15 years.

The results so far look at whether a paper has patent citations within 5 years of publication. We further examine whether the effects of the three dimensions of IDR on technological impact depend on the length of the citation window. We rerun the logit models for the 10- and 15-year windows, and the results are reported in Tables A.2 and A.3. For ease of comparison, we show in Fig. 2 the odds ratios of the three variables in the full models. We find that these associations are qualitatively similar. Variety has a positive but small effect in all three cases. Balance has a positive, sizable effect on attaining technological impact. The size decreases as we increase window length.
Disparity has a consistent negative effect, and its size increases with window length.

Tables A.1–A.3 also reveal that (1) scientific impact is positively correlated with technological impact; (2) papers published in high-IF journals are more likely to get patent citations; and (3) the number of authors is positively linked to the likelihood of patent citations, consistent with its positive effect on scientific impact.

B. Intensity of technological impact

We have looked at whether papers are cited by patents. We now focus on the number of patent citations and examine how it is affected by the three aspects of IDR. We restrict this analysis to the subsets of papers in our corpus that have gained technological impact. We use negative binomial regression, since the number of patent citations is over-dispersed (Table II) and heterogeneously distributed (Fig. 1), with one paper cited by 1 572 patents. Fig. 3 presents the modeling results, which indicate that the associations between IDR and the number of patent citations are in general consistent with the results presented in the previous section. Variety has a significant, though small, negative effect on the number of patent citations (Table A.4). Balance is positively linked to patent citations, with a one-unit increase translating to a significant 12.5% increase in the number of patent citations. Interestingly, disparity also has a positive linkage to patent citations, which contrasts with the previous case where we examined the likelihood of patent citations.

FIG. 3. Negative binomial regression of number of patent citations.

C. Robustness tests

We perform one additional test to examine the robustness of our results.
We have presented modeling results without including the indicator of whether a paper involves international collaboration as a control variable. This is because our corpus covers papers published over a long period of time (18 years), and for a significant portion of them (23%; Table II) we lack enough affiliation information to calculate this variable. In Tables A.5–A.7 and A.8, we present the modeling results considering international collaboration. We make two observations. First, the associations between the three IDR indicators and technological impact remain robust. Second, interestingly, international collaboration is negatively correlated with both the likelihood of getting patent citations and the number of patent citations, regardless of the citation-window length. This means that for comparable papers in the same field and year, internationally collaborative papers are less likely to get patent citations than papers involving only domestic collaboration. This association goes in the opposite direction from its positive linkage with scientific impact.

IV. DISCUSSION

The main purpose of this work was to present a bibliometric study on the relationship between IDR and technological impact. IDR is commonly referred to as research that integrates knowledge from different disciplines and has been operationalized using different indicators. As pointed out in a recent paper by Wang and Wiborg Schneider [23], there may be no “best” indicator for IDR. As our goal here is to explore its association with technological impact, we used three popular indicators, namely variety, balance, and disparity, to quantify the extent of IDR of a paper based on its cited disciplines.

Our technological impact indicators were captured by citations received from patents. We introduced two groups of indicators: (1) whether a paper has been cited by patents, and (2) the number of patent citations.
Using regression techniques, we found that variety and balance have positive effects on the likelihood of being cited by patents, and disparity has a negative effect. These linkages persist regardless of the citation-window length. These results indicate that papers that cite more disciplines are more likely to obtain patent citations than comparable papers in the same field and published in the same year, as are papers whose citations are more evenly distributed across disciplines and papers that cite disciplines that are more similar to each other. Our further analysis focusing on papers that obtained patent citations shows that both balance and disparity have positive correlations with the number of patent citations.

Our work contributes to the literature on IDR and science policy. First, our work enriches the multifaceted notion of impact, expanding from the overwhelmingly studied scientific impact to the technological impact dimension. While extant studies have focused on IDR and scientific impact, the relationship between IDR and technological impact has been unexplored.

Second, the effects of different aspects of IDR on technological impact revealed by our analysis are distinct from those reported in previous studies that examined the relationship between IDR and scientific impact. This may call for more sophisticated policies. On the one hand, our results resonate with those previous works that found variety to be positively associated with scientific impact. From this perspective, policymakers may encourage cross-disciplinary research. On the other hand, the positive relationship between balance and technological impact contrasts with its negative effect on scientific impact. The negative linkage between balance and scientific impact indicates that one effective way to yield scientific impact is to root research in one discipline and, in the meantime, source from diverse other disciplines.
Such a strategy, however, may be less effective at generating technological impact, which requires drawing knowledge from different disciplines evenly. In addition, the negative association between disparity and the likelihood of technological impact may point to policy encouragement of IDR across closely related disciplines. This effect comes into play because integrating disciplines can generate research that is useful to technologies.

[1] National Academy of Sciences, National Academy of Engineering, and Institute of Medicine, Facilitating Interdisciplinary Research (The National Academies Press, Washington, DC, 2005).
[2] N. Metzger and R. N. Zare, Science , 642 (1999).
[3] C. S. Wagner, J. D. Roessner, K. Bobb, J. T. Klein, K. W. Boyack, J. Keyton, I. Rafols, and K. Börner, Journal of Informetrics , 14 (2011).
[4] P. Lowe and J. Phillipson, Journal of Agricultural Economics , 165 (2006).
[5] T. R. Cech and G. M. Rubin, Nature Structural & Molecular Biology , 1166 (2004).
[6] H. Ledford, Nature , 308 (2015).
[7] T. Heinze, P. Shapira, J. D. Rogers, and J. M. Senker, Research Policy , 610 (2009), Special Issue: Emerging Challenges for Science, Technology and Innovation Policy Research: A Reflexive Overview.
[8] J. A. Jacobs and S. Frickel, Annual Review of Sociology , 43 (2009).
[9] E. J. Rinia, T. N. van Leeuwen, H. G. van Vuren, and A. F. J. van Raan, Research Policy , 357 (2001).
[10] V. Larivière and Y. Gingras, Journal of the American Society for Information Science and Technology , 126 (2010).
[11] T. W. Steele and J. C. Stier, Journal of the American Society for Information Science , 476 (2000).
[12] F. Silva, F. Rodrigues, O. Oliveira, and L. da F. Costa, Journal of Informetrics , 469 (2013).
[13] N. Carayol and T. U. N. Thi, Research Evaluation , 70 (2005).
[14] J. Wang, B. Thijs, and W. Glänzel, PLOS ONE , e0127298 (2015).
[15] A. Yegros-Yegros, I. Rafols, and P. D'Este, PLOS ONE , e0135095 (2015).
[16] L. Leydesdorff, C. S. Wagner, and L. Bornmann, Journal of Informetrics , 255 (2019).
[17] A. L. Porter and I. Rafols, Scientometrics , 719 (2009).
[18] L. Cassi, R. Champeimont, W. Mescheba, and E. de Turckheim, PLOS ONE , e0170296 (2017).
[19] A. J. Gates, Q. Ke, O. Varol, and A.-L. Barabási, Nature , 32 (2019).
[20] L. Fleming and O. Sorenson, Strategic Management Journal , 909 (2004).
[21] M. Roach and W. M. Cohen, Management Science , 504 (2013).
[22] Q. Ke, Journal of Informetrics , 706 (2018).
[23] Q. Wang and J. Wiborg Schneider, Quantitative Science Studies , 239 (2020).

TABLE A.1. Logistic regression modeling of whether a paper has patent citations in 5 years.

                 (1)          (2)          (3)          (4)          (5)          (6)
artc5 (ln)       0.832***     ***          ***          ***          ***          ***
                 (0.00338)    (0.00338)    (0.00340)    (0.00338)    (0.00338)    (0.00342)
jif              0.0162***    ***          ***          ***          ***          ***
                 (0.000833)   (0.000843)   (0.000834)   (0.000832)   (0.000843)   (0.000843)
nummesh          0.00914***   ***          ***          ***          ***          ***
                 (0.000725)   (0.000729)   (0.000729)   (0.000726)   (0.000729)   (0.000738)
numauthor (ln)   0.272***     ***          ***          ***          ***          ***
                 (0.00592)    (0.00594)    (0.00593)    (0.00592)    (0.00593)    (0.00595)
RS                            0.668***
                              (0.0286)
variety                                    0.0312***                              ***
                                           (0.00113)                              (0.00122)
balance                                                 1.067***                  ***
                                                        (0.0283)                  (0.0290)
disparity                                                            -0.0563*     -0.276***
                                                                     (0.0252)     (0.0270)
Constant         -8.168***    -8.373***    -8.271***    -9.058***    -8.141***    -8.932***
                 (0.0427)     (0.0436)     (0.0429)     (0.0489)     (0.0443)     (0.0506)
Field fe         ✓            ✓            ✓            ✓            ✓            ✓
Year fe          ✓            ✓            ✓            ✓            ✓            ✓
Observations     4216119      4216119      4216119      4216119      4216119      4216119
Pseudo R²
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.2. Logistic regression modeling of whether a paper has patent citations in 10 years.

                 (1)          (2)          (3)          (4)          (5)          (6)
artc10 (ln)      0.785***     ***          ***          ***          ***          ***
                 (0.00223)    (0.00223)    (0.00225)    (0.00223)    (0.00223)    (0.00225)
jif              0.0319***    ***          ***          ***          ***          ***
                 (0.000697)   (0.000712)   (0.000697)   (0.000699)   (0.000706)   (0.000706)
nummesh          0.0133***    ***          ***          ***          ***          ***
                 (0.000509)   (0.000511)   (0.000512)   (0.000509)   (0.000511)   (0.000519)
numauthor (ln)   0.257***     ***          ***          ***          ***          ***
                 (0.00405)    (0.00406)    (0.00406)    (0.00405)    (0.00406)    (0.00407)
RS                            0.207***
                              (0.0194)
variety                                    0.0359***                              ***
                                           (0.000768)                             (0.000827)
balance                                                 0.923***                  ***
                                                        (0.0188)                  (0.0193)
disparity                                                            -0.528***    -0.845***
                                                                     (0.0171)     (0.0183)
Constant         -7.483***    -7.544***    -7.593***    -8.258***    -7.244***    -7.874***
                 (0.0275)     (0.0281)     (0.0276)     (0.0318)     (0.0285)     (0.0328)
Field fe         ✓            ✓            ✓            ✓            ✓            ✓
Year fe          ✓            ✓            ✓            ✓            ✓            ✓
Observations     4216119      4216119      4216119      4216119      4216119      4216119
Pseudo R²
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.3. Logistic regression modeling of whether a paper has patent citations in 15 years.

                 (1)          (2)          (3)          (4)          (5)          (6)
artc15 (ln)      0.754***     ***          ***          ***          ***          ***
                 (0.00191)    (0.00191)    (0.00193)    (0.00192)    (0.00192)    (0.00194)
jif              0.0413***    ***          ***          ***          ***          ***
                 (0.000679)   (0.000697)   (0.000679)   (0.000683)   (0.000688)   (0.000688)
nummesh          0.0140***    ***          ***          ***          ***          ***
                 (0.000454)   (0.000456)   (0.000456)   (0.000454)   (0.000456)   (0.000463)
numauthor (ln)   0.247***     ***          ***          ***          ***          ***
                 (0.00356)    (0.00357)    (0.00357)    (0.00357)    (0.00357)    (0.00358)
RS                            0.0976***
                              (0.0170)
variety                                    0.0376***                              ***
                                           (0.000675)                             (0.000725)
balance                                                 0.833***                  ***
                                                        (0.0163)                  (0.0168)
disparity                                                            -0.629***    -0.965***
                                                                     (0.0151)     (0.0161)
Constant         -7.044***    -7.071***    -7.156***    -7.745***    -6.768***    -7.317***
                 (0.0230)     (0.0235)     (0.0231)     (0.0269)     (0.0239)     (0.0278)
Field fe         ✓            ✓            ✓            ✓            ✓            ✓
Year fe          ✓            ✓            ✓            ✓            ✓            ✓
Observations     4216119      4216119      4216119      4216119      4216119      4216119
Pseudo R²
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.4. Negative binomial regression modeling of number of patent citations.

                 patc5         patc10        patc15
variety          -0.0118***    -0.0134***    -0.0122***
                 (0.000974)    (0.000697)    (0.000638)
balance          0.258***      ***           ***
                 (0.0235)      (0.0165)      (0.0150)
disparity        0.267***      ***           ***
                 (0.0214)      (0.0150)      (0.0137)
jif              0.00241***    ***           ***
                 (0.000531)    (0.000418)    (0.000413)
nummesh          -0.00277***   -0.00268***   -0.00352***
                 (0.000555)    (0.000406)    (0.000379)
numauthor (ln)   0.0625***     ***           ***
                 (0.00455)     (0.00323)     (0.00296)
artc5 (ln)       0.164***
                 (0.00250)
artc10 (ln)                    0.252***
                               (0.00166)
artc15 (ln)                                  0.274***
                                             (0.00145)
Constant         -0.595***     -0.876***     -0.785***
                 (0.0428)      (0.0297)      (0.0261)
lnalpha          -2.133***     -0.752***     -0.384***
                 (0.0155)      (0.00387)     (0.00275)
Field fe         ✓             ✓             ✓
Year fe          ✓             ✓             ✓
Observations     128180        310290        420653
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.5. Logistic regression modeling of whether a paper has patent citations in 5 years. All models include international collaboration as a control variable.

                 (1)          (2)          (3)          (4)          (5)          (6)
artc5 (ln)       0.832***     ***          ***          ***          ***          ***
                 (0.00378)    (0.00378)    (0.00380)    (0.00378)    (0.00378)    (0.00382)
jif              0.0179***    ***          ***          ***          ***          ***
                 (0.000946)   (0.000957)   (0.000947)   (0.000945)   (0.000957)   (0.000957)
nummesh          0.0104***    ***          ***          ***          ***          ***
                 (0.000791)   (0.000796)   (0.000795)   (0.000792)   (0.000796)   (0.000806)
numauthor (ln)   0.271***     ***          ***          ***          ***          ***
                 (0.00663)    (0.00665)    (0.00664)    (0.00663)    (0.00664)    (0.00666)
authorintl       -0.156***    -0.156***    -0.158***    -0.155***    -0.156***    -0.156***
                 (0.00990)    (0.00990)    (0.00989)    (0.00990)    (0.00990)    (0.00990)
RS                            0.534***
                              (0.0319)
variety                                    0.0307***                              ***
                                           (0.00124)                              (0.00135)
balance                                                 0.971***                  ***
                                                        (0.0315)                  (0.0323)
disparity                                                            -0.152***    -0.391***
                                                                     (0.0280)     (0.0302)
Constant         -8.188***    -8.354***    -8.295***    -8.995***    -8.115***    -8.814***
                 (0.0502)     (0.0513)     (0.0505)     (0.0568)     (0.0519)     (0.0586)
Field fe         ✓            ✓            ✓            ✓            ✓            ✓
Year fe          ✓            ✓            ✓            ✓            ✓            ✓
Observations     3228379      3228379      3228379      3228379      3228379      3228379
Pseudo R²
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.6. Logistic regression modeling of whether a paper has patent citations in 10 years. All models include international collaboration as a control variable.

                 (1)          (2)          (3)          (4)          (5)          (6)
artc10 (ln)      0.786***     ***          ***          ***          ***          ***
                 (0.00251)    (0.00251)    (0.00253)    (0.00251)    (0.00251)    (0.00254)
jif              0.0345***    ***          ***          ***          ***          ***
                 (0.000805)   (0.000823)   (0.000806)   (0.000809)   (0.000814)   (0.000814)
nummesh          0.0139***    ***          ***          ***          ***          ***
                 (0.000559)   (0.000561)   (0.000562)   (0.000559)   (0.000562)   (0.000569)
numauthor (ln)   0.247***     ***          ***          ***          ***          ***
                 (0.00452)    (0.00453)    (0.00453)    (0.00452)    (0.00453)    (0.00454)
authorintl       -0.0875***   -0.0874***   -0.0893***   -0.0860***   -0.0874***   -0.0884***
                 (0.00689)    (0.00689)    (0.00689)    (0.00690)    (0.00689)    (0.00690)
RS                            0.102***
                              (0.0217)
variety                                    0.0367***                              ***
                                           (0.000849)                             (0.000918)
balance                                                 0.866***                  ***
                                                        (0.0211)                  (0.0217)
disparity                                                            -0.606***    -0.951***
                                                                     (0.0191)     (0.0205)
Constant         -7.466***    -7.497***    -7.586***    -8.190***    -7.191***    -7.756***
                 (0.0318)     (0.0325)     (0.0320)     (0.0365)     (0.0329)     (0.0376)
Field fe         ✓            ✓            ✓            ✓            ✓            ✓
Year fe          ✓            ✓            ✓            ✓            ✓            ✓
Observations     3228379      3228379      3228379      3228379      3228379      3228379
Pseudo R²
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.7. Logistic regression modeling of whether a paper has patent citations in 15 years. All models include international collaboration as a control variable.

                 (1)          (2)          (3)          (4)          (5)          (6)
artc15 (ln)      0.753***     ***          ***          ***          ***          ***
                 (0.00216)    (0.00216)    (0.00218)    (0.00217)    (0.00217)    (0.00219)
jif              0.0449***    ***          ***          ***          ***          ***
                 (0.000789)   (0.000808)   (0.000788)   (0.000793)   (0.000797)   (0.000796)
nummesh          0.0147***    ***          ***          ***          ***          ***
                 (0.000500)   (0.000502)   (0.000503)   (0.000500)   (0.000503)   (0.000510)
numauthor (ln)   0.238***     ***          ***          ***          ***          ***
                 (0.00397)    (0.00398)    (0.00398)    (0.00397)    (0.00398)    (0.00399)
authorintl       -0.0630***   -0.0630***   -0.0651***   -0.0616***   -0.0630***   -0.0647***
                 (0.00617)    (0.00617)    (0.00618)    (0.00618)    (0.00617)    (0.00618)
RS                            -0.00142
                              (0.0191)
variety                                    0.0385***                              ***
                                           (0.000749)                             (0.000806)
balance                                                 0.787***                  ***
                                                        (0.0183)                  (0.0189)
disparity                                                            -0.708***    -1.070***
                                                                     (0.0169)     (0.0181)
Constant         -7.022***    -7.022***    -7.144***    -7.682***    -6.710***    -7.206***
                 (0.0265)     (0.0271)     (0.0266)     (0.0307)     (0.0275)     (0.0317)
Field fe         ✓            ✓            ✓            ✓            ✓            ✓
Year fe          ✓            ✓            ✓            ✓            ✓            ✓
Observations     3228379      3228379      3228379      3228379      3228379      3228379
Pseudo R²
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .

TABLE A.8. Negative binomial regression modeling of number of patent citations. All models include international collaboration as a control variable.

                 patc5         patc10        patc15
variety          -0.0116***    -0.0131***    -0.0116***
                 (0.00108)     (0.000770)    (0.000706)
balance          0.245***      ***           ***
                 (0.0261)      (0.0184)      (0.0168)
disparity        0.275***      ***           ***
                 (0.0238)      (0.0167)      (0.0153)
jif              0.00241***    ***           ***
                 (0.000595)    (0.000472)    (0.000468)
nummesh          -0.00243***   -0.00250***   -0.00326***
                 (0.000605)    (0.000444)    (0.000414)
numauthor (ln)   0.0627***     ***           ***
                 (0.00511)     (0.00361)     (0.00330)
authorintl       -0.0741***    -0.0981***    -0.0994***
                 (0.00753)     (0.00541)     (0.00504)
artc5 (ln)       0.169***
                 (0.00279)
artc10 (ln)                    0.255***
                               (0.00185)
artc15 (ln)                                  0.275***
                                             (0.00163)
Constant         -0.598***     -0.878***     -0.769***
                 (0.0498)      (0.0343)      (0.0300)
lnalpha          -2.106***     -0.749***     -0.387***
                 (0.0168)      (0.00426)     (0.00304)
Field fe         ✓             ✓             ✓
Year fe          ✓             ✓             ✓
Observations     105734        254302        343065
BIC
Standard errors in parentheses. * p < . , ** p < . , *** p < .