Author Mentions in Science News Reveal Wide-Spread Ethnic Bias
Hao Peng, Misha Teplitskiy, David Jurgens∗
School of Information, University of Michigan, 105 S State St, Ann Arbor, MI 48109, USA
∗To whom correspondence should be addressed; E-mail: [email protected].
Abstract
Media outlets play a key role in spreading scientific knowledge to the general public and raising the profile of researchers among their peers. Yet, given time and space constraints, not all scholars can receive equal media attention, and journalists’ choices of whom to mention are poorly understood. In this study, we use a comprehensive dataset of 232,524 news stories from 288 U.S.-based outlets covering 100,208 research papers across all sciences to investigate the rates at which scientists of different ethnicities are mentioned by name. We find strong evidence of ethnic biases in author mentions, even after controlling for a wide range of possible confounds. Specifically, authors with non-British-origin names are significantly less likely to be mentioned or quoted than comparable British-origin named authors, even within the stories of a particular news outlet covering a particular scientific venue on a particular research topic. Instead, minority scholars are more likely to have their names substituted with their role at their institution. This ethnic bias is consistent across all types of media outlets, with even larger disparities in General-Interest outlets that tend to publish longer stories and have dedicated editorial teams for accurately reporting science. Our findings reveal that the perceived ethnicity can substantially shape scientists’ media attention, and, by our estimation, this bias has affected thousands of scholars unfairly.
Scientific breakthroughs often attract media attention, which serves as a key mechanism for public dissemination of new knowledge (Scheufele, 2013; Brossard and Scheufele, 2013). Science reporting not only distills research insights but also puts a face on who was responsible for the research. The media coverage can then feed back into researchers’ careers (Cronin and Sugimoto, 2014). Besides well-established gender and ethnic disparities in conventional scientific outcomes including funding allocation (Ley and Hamilton, 2008; Ginther et al., 2011; Oliveira et al., 2019; Hoppe et al., 2019), hiring decisions (Xie et al., 2003; Turner et al., 2008; Moss-Racusin et al., 2012; Way et al., 2016), publishing (Ding et al., 2006; West et al., 2013), citations (Larivière et al., 2013; Huang et al., 2020), and monetary or non-monetary rewards (Holden, 2001; Shen, 2013; Xie, 2014), emerging evidence has pointed to demographic disparities in general media coverage (Behm-Morawitz and Ortiz, 2013; Jia et al., 2016; Merullo et al., 2019; Smith, 1997; Devitt, 2002), raising the possibility that some scientists are not receiving their due attribution (Jia et al., 2015; Amberg and Saunders, 2018).
Going unnamed as an author in science reporting not only removes the reputational benefits associated with the report, signalling a person is not worthy of public mention, but also potentially shifts the public’s perception of who is a scientist (Miller et al., 2018). Under-representing certain demographic groups can perpetuate the stereotype that scientists are white males (Turner et al., 2008; Banchefsky et al., 2016), which in turn weakens the pipeline of recruiting and training diverse students into new scientists, exacerbating the current representation issues (Cole, 1979; Reuben et al., 2014; Hill et al., 2018).
Academic careers are characterized by cumulative advantage, where successes compound, amplifying each other and becoming easier to sustain (Merton, 1968). As a result, the inhibitory biases against minority groups have a cumulative penalty that reduces representation and visibility, and can result in a loss of symbolic capital for advancing one’s career (Leahey, 2007).
Although the institutional and cultural barriers faced by minority scholars during the early stages of research (e.g., gathering resources) and middle stages (e.g., publishing) are well known, a sizable gap remains in our understanding of the later stages, when research is disseminated to the public. While it is possible that, once published in the academic literature and covered by the news media, similar contributions receive similar attention regardless of the authors’ perceived identities, a number of mechanisms may produce divergence between contribution and attention in science coverage.
Here, we present the first large-scale and science-wide effort to measure demographic biases in science news through a computational analysis of 232,524 news stories mentioning 100,208 published scholarly works (Section S1). Specifically, we investigate whether the first author of a scientific paper is mentioned by name in news stories that reference their paper. In multi-author papers, first authors are commonly junior scholars who are directly responsible for the work and stand to gain the most in recognition from being mentioned.
We use mixed-effects regression models to examine and quantify demographic differences in author mentions, while controlling for a broad range of plausible confounding factors. The complexity of our models and the scale of the data enable unusually strict controls, such as measuring differential mentions within a particular news outlet covering a particular academic journal on a particular research topic. These controls help ensure that we are comparing media mentions of researchers doing comparable work.
Furthermore, the richness of the data enables us to delve into the mechanisms causing the disparities, and to refer to them using the stronger language of “bias.” Ethnic and gender biases in mentions may be plausibly caused by a number of mechanisms, involving different actors. First, journalists may not be the relevant actors at all. Some news coverage originates from press releases created by in-house public relations staff at universities to disseminate their researchers’ work. News outlets often reprint these press releases in part or in full, and any biases therein may thus be passed on to the outlets’ audiences. We test this hypothesis by comparing mentions in journalist-written pieces versus press releases, and by whether journalists differentially mention additional information about particular researchers, such as their institutions.
Second, biases may be driven by pragmatic difficulties of interviewing researchers in distant time zones and possibly with limited English proficiency. Journalists (and/or their editors) may use researchers’ names and institutions to “statistically discriminate” and infer from them scheduling or other difficulties. We test this hypothesis by focusing on a subset of the data where journalists and researchers are located in relatively close geographic proximity (within the U.S.), and by comparing simple mentions of names vs. direct quotes.
Lastly, journalists may have personal animus towards particular ethnic or gender groups or expectations of animus from their audience members to whom they cater. We use “animus” to refer to direct negative attitudes towards particular demographic groups and/or incorrect or unfounded negative inferences about their English proficiency and other factors that can affect article quality. We test for the possible role of audience by comparing mentions across outlet (and presumably audience) types, and statistically control for English proficiency using ease-of-reading measures on the abstracts of the research papers.
Results
Who Gets Named?
We find strong ethnic bias in mentioning first authors by name in science news reporting scientific papers. This bias is robust to the inclusion of increasingly stringent controls (Model 5 in Table S5). Specifically, compared to British-origin named authors, all minority-ethnicity authors are significantly less likely to receive name attributions in science reporting. Indeed, this bias appears to increase with English-centric assessments of cultural distance, with other European ethnicities penalized the least while Asian and African authors are penalized the most.
Surprisingly, we find no gender bias in author mentions. However, when random effects for news outlets and publication venues are not considered, the first author gender variable appears to have a significant effect. As gender representation varies widely across academic disciplines (Xie et al., 2003; Handelsman et al., 2005), this result suggests that gender differences in mention rates are likely to be explained by relative attention rates to publication venues in different fields. This phenomenon is reminiscent of the Simpson’s paradox observed for gender bias in graduate school admissions (Wagner, 1982), which, when academic department was controlled for, revealed no gender bias.
To quantify the exact effect of having a name with a perceived demographic on the probability of being mentioned by name in media coverage, we calculated the average marginal effects for the first author ethnicity and gender variables, respectively, using our finest model.
[Figure 1 plot. Categories: African, Indian, Middle Eastern, Chinese, non-Chinese East Asian, Eastern European, Scandinavian & Germanic, Romance Language, Female. Horizontal axis: Prob. of being mentioned compared to Male/British-origin named authors.]
Figure 1: The marginal effects for first authors’ gender and ethnicity, averaged over all 285,708 observations in the dataset. A negative average marginal effect indicates a decrease in mention probability compared to authors with Male (for gender) or British-origin (for ethnicity) names. The colors are proportional to the absolute probability changes. Female is colored blue to reflect its difference from ethnicity identities. The error bars indicate 95% bootstrapped confidence intervals.
As shown in Fig. 1, the estimated probability of being mentioned decreases by an absolute 1.0–6.4 percentage points for authors with minority-ethnicity names, compared to their British-origin named counterparts. As the average mention rate is only 36.6% (Section S1), these absolute drops represent significant disparities: the 6.3 and 6.4 percentage point marginal decreases for Chinese and African authors correspond to a relative decrease of about 17.5% in media representation. This result reveals that the mainstream U.S. media outlets have profound bias against authors from all minority ethnicities in mentioning them by name in science news: Given the current disparities, we estimate that more than four thousand minority scholars have gone unmentioned in our data alone.
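To make the notion of an average marginal effect concrete, the following is a minimal sketch and not the authors' implementation. The paper's estimates come from mixed-effects models; here we swap in a plain logistic regression fitted by Newton-Raphson on synthetic data, and we assume a percentile bootstrap as one common way to obtain a 95% interval. All names (`minority_name`, `control`) and the coefficient values in `true_beta` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Fit a plain logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def average_marginal_effect(X, beta, col):
    """Mean change in predicted probability when binary column `col` is switched from 0 to 1."""
    X_on, X_off = X.copy(), X.copy()
    X_on[:, col] = 1.0
    X_off[:, col] = 0.0
    return float(np.mean(sigmoid(X_on @ beta) - sigmoid(X_off @ beta)))

# Synthetic data (hypothetical; not the paper's dataset).
rng = np.random.default_rng(0)
n = 5000
minority_name = rng.integers(0, 2, size=n).astype(float)  # 1 = minority-ethnicity name
control = rng.normal(size=n)                               # stand-in for any control variable
X = np.column_stack([np.ones(n), minority_name, control])
true_beta = np.array([-0.5, -0.3, 0.8])                    # assumed values for illustration
y = (rng.random(n) < sigmoid(X @ true_beta)).astype(float)

beta_hat = fit_logistic(X, y)
ame = average_marginal_effect(X, beta_hat, col=1)

# Percentile bootstrap: resample rows, refit, recompute the effect.
boot = []
for _ in range(200):
    idx = rng.integers(0, n, size=n)
    beta_b = fit_logistic(X[idx], y[idx])
    boot.append(average_marginal_effect(X[idx], beta_b, col=1))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"AME = {ame:.3f}, 95% bootstrap CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

A negative value here has the same reading as in Fig. 1: holding the other covariates at their observed values, switching the name indicator on lowers the predicted mention probability by that many probability units on average.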
Does Location Matter?
In reporting on research, journalists often directly seek out the authors by phone or email to contextualize and explain their results. If an author is at a non-U.S. institution, a journalist from a U.S.-based outlet could be less likely to reach out due to perceived challenges in time-zone differences or lower expectations of fluency, potentially resulting in a lower rate of being mentioned or quoted. Since non-U.S. institutions typically have more Asian and African authors due to their locations, this mechanism could potentially explain the disparity in being mentioned.
To examine the effect of geographical factors, we measured the bias separately for (i) the subset of our data where the first author is from a U.S.-based institution, and (ii) the subset where the first author is from a non-U.S. institution. Compared to U.S.-based authors, international scientists have far lower rates of being mentioned, with coefficients that are 2-4 times more negative for each ethnicity than those of their domestic counterparts (Table S6). This considerable gap reveals that geographic location is one major issue influencing mention biases in science news. However, international location alone does not explain all disparities in who is mentioned: The average marginal effects shown in Fig. 2 indicate that mention biases of similar magnitude still exist among U.S.-based authors. This comparative result indicates that other factors besides location play a substantial role in which authors are named.
How Authors Are Referred To?
[Figure 2 plot. Panels: Mention, Quote, Inst. Substitution. Horizontal axis: Probability of being credited compared to Male/British-origin named authors.]
Figure 2: U.S.-based authors with minority-ethnicity names are less likely to be mentioned by name (left) or quoted (middle), and are more likely to be substituted by their institution (right). The average marginal effects are estimated based on 169,984 observations where the first author is from U.S.-based institutions. A negative (positive) marginal effect indicates a decrease (increase) in probability compared to authors with Male (for gender) or British-origin (for ethnicity) names. The colors are proportional to the absolute probability changes. Female is colored blue to reflect its difference from ethnicity identities. The error bars indicate 95% bootstrapped confidence intervals.
Journalists have multiple options in how they incorporate the scientists performing the research. They may go beyond simply naming the scientist and incorporate quotes from them about the research; alternatively, they may have the scientist play a minimal agentive role by using references like “researchers from University.” These discourse mechanisms serve to further integrate or distance the scientist from their role in the described research, giving them a name and a voice or removing their individuality.
Our prior result demonstrates that, even within the U.S., African and Asian authors experience substantial under-reporting in being named. As U.S.-based authors may still differ in their perceived fluency in oral English, and as journalists may simply be less willing to contact authors of certain ethnicities even if they speak fluent English, we hypothesize that authors from privileged demographics will be more likely to receive a quote, whereas those from disadvantaged demographics will be more likely to be indirectly mentioned as a role associated with their institutions, rather than explicitly named.
To test these hypotheses, we further identified (i) authors who are named as part of quotations (a subset of name mentions), and (ii) authors who go unnamed while their institution is named instead (Section S1). Since fluency is correlated with location, we focused on the U.S. subset and applied the same mixed-effects regression framework to model two dependent variables: (1) whether the first author is quoted, and (2) whether the first author is indirectly mentioned by their institution instead of being named or quoted.
The average marginal effects in Fig. 2 reveal that U.S.-based African and Asian authors are less likely to be quoted, and instead are more likely to be substituted by their role within their institutions (see Fig. S3 for results based on our full data). The significant differences in being quoted in the U.S. subset indicate that perceived English fluency may play a major role in name mentions. However, language proficiency is not the only driving mechanism, as a strong bias appears for authors with Indian names, despite English being an official language in India. This, along with the “positive” effect in being substituted by institutions when the name is not mentioned for Asian and African authors, suggests that journalist animus also plays a role in author mentions. This is the case especially given that journalists can always contact authors perceived to be less fluent via email to get a quote as a way to bypass potential challenges in oral communications, and that overall journalists are dealing with authors of research papers written in English, which would potentially signal some English proficiency for all authors.
Note that the result on institution substitution also demonstrates that the mention bias does not result from a potential mechanism in which Asian and African authors work on research that is more likely to be used in news stories where there is no need for agency at all (e.g., survey-like stories summarizing lots of recent results that briefly mention research papers on their topic without any form of attribution).
Does It Matter Who Is Reporting?
Understanding whether this ethnic bias is related to journalists’ own demographics is another crucial step towards uncovering its mechanisms, as they are the actors who are directly responsible for writing the stories. First, journalists may differ in their overall tendencies to mention first authors when covering science. Second, there might exist interaction effects between authors and journalists. One intuitive hypothesis, which we call “cultural hierarchy,” is that all journalists, regardless of their gender and ethnicity, prefer to mention Male and British-origin named scholars over minority others. At the same time, journalists may also prefer to mention authors from demographic categories that match their own, which we call “cultural homophily” (McPherson et al., 2001).
Our model controls for journalists’ demographics and their interactions with those of first authors (Section S1). Due to insufficient instances of identified journalists (Table S3), we report the result based on our finest model trained with the full data. No meaningful ethnic preferences are seen for author-journalist interactions to suggest either the cultural hierarchy or the cultural homophily hypothesis. However, when dropping controls for outlets (Table S5, Models 3-4), journalists’ ethnicities become significant, suggesting that journalists’ behavior might be explained by variations at the outlet level, i.e., certain news outlets mention authors more or less often and certain groups of journalists are under- or over-represented in those outlets.
Differences Across Outlet Types
Outlets vary in the depth and breadth of their reporting, e.g., Science & Technology outlets write about 650 words per story on average, while General News outlets write about 850 words (Section S1; Fig. S2). These differences suggest potentially important variability in the nature of journalists’ day-to-day work and backgrounds. To explore the discrepancy of bias across different types of outlets in author mentions, we fitted the specification of Model 5 separately for three outlet types in our data and quantified the average marginal effects.
[Figure 3 plot. Panels: Press Releases, Sci. & Tech., General News. Horizontal axis: Probability of being mentioned compared to Male/British-origin named authors.]
Figure 3: The relative decrease in the probability of being mentioned for first authors of minority gender and ethnicity reveals a consistent behavioral bias across three types of outlet, yet with starkly different magnitudes of effect. Note that the average mention rates in Press Releases, Science & Technology, and General News outlets are 44.9%, 51.8%, and 22.1%, respectively. The colors are proportional to the absolute probability changes. Error bars represent 95% bootstrapped confidence intervals.
Surprisingly, the ethnic bias remains consistent across all outlet types, as shown in Fig. 3, with authors having non-British-origin names being mentioned less frequently across all three outlet types. Larger disparities are found for ethnic categories that are more distant from British-origin (e.g., Asian and African). However, outlet types vary substantially in the magnitude of their bias: Science & Technology outlets and General News outlets are, on average, three times more biased against non-British-origin named scholars than outlets in Press Releases (6% vs. 2% marginal decrease).
The bias in stories from Press Releases outlets is particularly notable, as stories in these outlets typically reuse content from university press releases, suggesting that universities’ press offices themselves, while less biased than other outlet types, still prefer to mention scholars with British-origin names. This result is surprising because local press offices are expected to have greater direct familiarity with their researchers, reducing the misuse of stereotypes, and to be more responsible for representing minority researchers equitably.
The largest disparities are seen in General News outlets, e.g. The New York Times and The Washington Post, where again African and Chinese scholars have nearly a 10% absolute drop in representation.
General News outlets mention first authors with a 22.1% chance on average (Table S4), so this drop in author coverage nearly halves the perceived role of a large community of scientists. As General News outlets have well-trained editorial staff and science journalists dedicated to accurately reporting science and tend to publish longer stories that have room to mention and engage with authors, this result is alarming. Historically, these ethnic minorities have been underrepresented, stereotyped, or even completely avoided in U.S. media (Behm-Morawitz and Ortiz, 2013), a pattern that has continued in objective science reporting across all outlet types. The mechanisms behind variations by outlet type deserve further investigation.
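The step from an absolute drop to a relative one is simple division by the baseline mention rate. The short sketch below reproduces that arithmetic with the figures quoted in the text: a 6.4 point drop against the 36.6% overall rate, and a drop of nearly 10 points against the 22.1% General News rate. The function name `relative_drop` is our own, and 0.10 is used as an upper approximation of "nearly 10%".

```python
def relative_drop(absolute_drop, baseline_rate):
    """Relative decrease implied by an absolute drop from a baseline probability."""
    if not 0.0 < baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be in (0, 1]")
    return absolute_drop / baseline_rate

# Figures quoted in the text; 0.10 approximates "nearly a 10% absolute drop".
overall = relative_drop(0.064, 0.366)
general_news = relative_drop(0.10, 0.221)

print(f"overall: {overall:.1%}, General News: {general_news:.1%}")
```

Under these inputs the overall relative decrease is about 17.5%, while the General News figure is about 45%, which is the sense in which the drop "nearly halves" representation there.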
Is the Situation Getting More Equitable?
The longitudinally rich nature of our dataset allows us to examine how author mentions in science news have changed over the last decade. Mention rates are on average decreasing over time, as shown by the coefficient for the mention year scalar variable in Model 5 (Table S5). To examine the time trends across demographic categories, separate models (Model 5) were trained to quantify the marginal change per year increase for each gender and ethnicity in our data. Note that demographic attributes not under study are still included in each model, e.g., when examining the temporal changes in mention rates for male and female authors, ethnicity is still included as a factor, and vice versa.
[Figure 4 plot. Horizontal axis: Increase in probability of being mentioned with one year change.]
Figure 4: Average marginal effects on mention probability for a one-unit increase in mention year for authors in each gender (blue) and ethnicity (red) group, revealing that the benefits of privileged demographics (Male, British-origin) are decreasing over time. However, only small improvements are seen for Chinese and Indian first authors. African is not shown due to insufficient data for fitting a Model 5. Error bars show 95% bootstrapped confidence intervals.
As shown in Fig. 4, the mention year has a negative association with author mentions for Male and most ethnicity groups, indicating that most authors are less likely to be mentioned in later years. When compared with the average marginal effects of minority ethnicities on the likelihood of being mentioned (Fig. 1), the larger decreases for ethnic groups such as British-origin and Scandinavian & Germanic indicate that their overall advantages are shrinking. Indeed, Chinese and Indian authors, two of the most disadvantaged groups in this study, have mention rates that are increasing over time, although more data is needed for precise estimation. However, their estimated rates of increase are relatively small, suggesting that ethnic biases for these authors are unlikely to disappear soon without purposeful behavior change. Based on the absolute mention rate disparities between minority and British-origin named authors shown in Fig. 1, and assuming a constant change rate per year for each ethnicity shown in Fig. 4, we estimate that only authors with Romance Language, Chinese, or Indian names will reach parity with their British-origin named colleagues within 5-12 years in their rates of being mentioned; all other ethnicities see their overall mention rates drop similarly to that for British-origin names, indicating the current gap will persist.
Discussion
Our analyses reveal that the attention researchers get in news coverage is strongly associated with their ethnicities. The associations are robust to a variety of plausible confounds, and even appear when controlling for the (1) particular news outlet, (2) particular scientific venue, and (3) particular research topic. Although we cannot claim the reported associations as causal, this unusually strong observational evidence is a “smoking gun” of bias in coverage and deserves attention.
Ethnicity and Gender
Authors with non-British-origin names are mentioned substantially less when their research is discussed. The disparity appears for all non-British-origin names. However, it is especially large for Asian and African names, less pronounced for Indian, Middle Eastern, and Romance Language names, and even less pronounced for Scandinavian & Germanic and Eastern European names. The pattern is suggestive of stronger biases against non-Western ethnicities, but more evidence is needed to explain it. As science becomes more global and is increasingly driven by non-Western ethnicities, the way English-language media responds to non-British-named scholars will only grow in importance.
In contrast to ethnicity, we do not find bias in mentions of female scholars, once research fields are controlled for. One possible reason is that fields vary in their overall level of coverage and in their gender representation (Handelsman et al., 2005). Looking within fields may thus mask or sidestep gender bias that is manifested between them.
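The way a pooled gender gap can vanish once field is controlled for can be illustrated with a small numerical sketch. The counts below are hypothetical and chosen only for clarity; they are not taken from the paper's data. Each field mentions men and women at the same rate, but the field with more coverage has relatively more male first authors, so pooling across fields produces a gap.

```python
# Hypothetical (mentions, stories) counts per field and gender; not from the paper.
counts = {
    "field_A": {"male": (300, 500), "female": (60, 100)},   # 60% mention rate for both
    "field_B": {"male": (20, 100), "female": (100, 500)},   # 20% mention rate for both
}

def rate(mentions, stories):
    return mentions / stories

def pooled_rate(gender):
    """Mention rate for one gender after pooling all fields together."""
    mentions = sum(counts[field][gender][0] for field in counts)
    stories = sum(counts[field][gender][1] for field in counts)
    return mentions / stories

# Gap within each field (male minus female) versus the gap after pooling.
within_gaps = {
    field: rate(*counts[field]["male"]) - rate(*counts[field]["female"])
    for field in counts
}
pooled_gap = pooled_rate("male") - pooled_rate("female")

print(within_gaps)
print(f"pooled gap: {pooled_gap:.3f}")
```

In this toy setting the within-field gaps are zero while the pooled gap is about 27 percentage points, so the pooled difference is driven entirely by which fields men and women publish in and how often those fields are covered.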
Ruling in and out different mechanisms
Our analyses above point to a multi-causal generation of ethnic biases, in which both pragmatic difficulties of interviewing distant researchers and journalists’ personal biases play key roles. In support of the pragmatic difficulties mechanism, we find that biases are substantially smaller when both the journalists and researchers are U.S.-based. Additionally, the largest biases appear in direct quotations, which may be more difficult to acquire from researchers in different time zones and who are likely to have non-British-origin names. In these cases, journalists appear to “substitute” the researcher’s institution for a direct quote.
Nevertheless, biases remain even among geographically proximate actors, and journalists’ choices are key. Supportive evidence comes from outlet types: when journalists’ role in the news articles is minimal, as when the outlet simply republishes a university press release, the biases are also minimal (although the disparities for many groups are still statistically distinguishable from 0); when the news stories were written by journalists themselves, the biases are the largest. The data does not allow us to rule out that journalists’ choices reflect personal animus-based biases or the expected biases of their audiences. For example, the biases remain even when controlling for readability of the research abstract, a potential signal of English proficiency that might influence journalists’ decisions (Table S5). Furthermore, the fact that Science & Technology and General News outlets have biases of similar magnitude yet likely differ in their audiences suggests again the important role played by journalists’ personal biases.
Lastly, we cannot rule out that the biases stem from the academic literature itself, and in particular which author is designated as “corresponding” (our data did not include this designation). Further disentangling these mechanisms is an important avenue for future work.
Limitations
Although the scale and the breadth of our dataset enable the use of unusually fine-grained controls, the analysis is not without limitations. First, the observational nature of the data precludes strong causal statements. Second, some plausible explanatory covariates are unavailable for inclusion, such as which author is designated as corresponding or the number of citations a paper received at the time of being mentioned. However, we anticipate the effect of such covariates to be small given current controls. Fig. S1 shows that the majority of papers were mentioned within one year after publication, which limits the citations a paper can accrue in such a short academic time period. Third, the Ethnea classifier is unable to identify African American scholars by name due to its definition of ethnicity at the country level. A manual analysis shows that authors with stereotypical African American names are classified as English (British-origin) if they have common English surnames. However, as a robustness test, we repeated our experiments using an additional ethnicity classification based on coarser-grained U.S. census data (Fig. S3), which is able to identify such authors as Black; the result therein does not show any significant under-representation of Black scholars. Note that African-named authors (based on Ethnea) are not necessarily classified as Black based on the Census data (Tables S7-S8). Finally, we note that our data contains too few examples of some ethnicities (e.g., Polynesian and Caribbean) to accurately estimate biases; such ethnicities are regrettably omitted, though we recognize that these groups likely experience bias from their minority status as well.
Conclusions and Implications
Our work shows that science journalism is rife with biases in who receives favorable coverage, with certain ethnic groups receiving many more name mentions and quotations than their peers conducting comparable research. These ethnic biases likely have direct negative consequences for the careers of unmentioned scientists, and skew the public perception of who a scientist is, a key factor in recruiting and training new scientists.
Our findings have two important implications for science policy and science journalism. First, simply identifying large-scale ethnic disparities in science news, of which journalists may themselves have been unaware, can be an agent of change. Second, decision-makers at U.S. research institutions may take ethnic disparities of media attention into account when making hiring or promotion decisions. More importantly, addressing this problem requires more research to investigate the mechanisms leading to it, which we hope this paper helps stimulate.
References
Anurag Ambekar, Charles Ward, Jahangir Mohammed, Swapna Male, and Steven Skiena. Name-ethnicity classification from open sources. In KDD, pages 49–58, 2009.
Amanda Amberg and Darren N Saunders. Cancer in the news: Bias and quality in media reporting of cancer research. bioRxiv, page 388488, 2018.
Pierre Azoulay, Toby Stuart, and Yanbo Wang. Matthew: Effect or fable? Management Science, 60(1):92–109, 2013.
Sarah Banchefsky, Jacob Westfall, Bernadette Park, and Charles M Judd. But you don’t look like a scientist!: Women scientists with feminine appearance are deemed less likely to be scientists. Sex Roles, 2016.
Elizabeth Behm-Morawitz and Michelle Ortiz. Race, ethnicity, and the media. Oxford Handbook of Media Psychology, pages 252–266, 2013.
Deborah Blum et al. A field guide for science writers. Oxford University Press, 2006.
Dominique Brossard and Dietram A Scheufele. Science, new media, and the public. Science, 339(6115), 2013.
Clifford C Clogg, Eva Petkova, and Adamantios Haritou. Statistical methods for comparing regression coefficients between models. American Journal of Sociology, 100(5):1261–1293, 1995.
Jonathan R Cole. Fair science: Women in the scientific community. Free Press, 1979.
Blaise Cronin and Cassidy R Sugimoto. Beyond bibliometrics: Harnessing multidimensional indicators of scholarly impact. MIT Press, 2014.
James Devitt. Framing gender on the campaign trail: Female gubernatorial candidates and the press. Journalism & Mass Communication Quarterly, 79(2):445–463, 2002.
Waverly W Ding, Fiona Murray, and Toby E Stuart. Gender differences in patenting in the academic life sciences. Science, 313(5787):665–667, 2006.
Donna K Ginther, Walter T Schaffer, Joshua Schnell, Beth Masimore, Faye Liu, Laurel L Haak, and Raynard Kington. Race, ethnicity, and NIH research awards. Science, 333(6045), 2011.
Mott Greene. The demise of the lone author. Nature, 450(7173):1165, 2007.
Roger Guimera, Brian Uzzi, Jarrett Spiro, and Luis A Nunes Amaral. Team assembly mechanisms determine collaboration network structure and team performance. Science, 308(5722):697–702, 2005.
Jo Handelsman, Nancy Cantor, Molly Carnes, Denice Denton, Eve Fine, Barbara Grosz, Virginia Hinshaw, Cora Marrett, Sue Rosser, Donna Shalala, et al. More women in science. Science, 309(5738):1190–1191, 2005.
Erin Hengel. Publishing while female. Are women held to higher standards? Evidence from peer review. Cambridge Working Papers in Economics 1753, 2017.
Patricia Wonch Hill, Julia McQuillan, Amy N Spiegel, and Judy Diamond. Discovery orientation, cognitive schemas, and disparities in science identity in early adolescence. Sociological Perspectives, 2018.
Constance Holden. General contentment masks gender gap in first AAAS salary and job survey. Science, 294(5541):396–411, 2001.
Travis A Hoppe, Aviva Litovitz, Kristine A Willis, Rebecca A Meseroll, Matthew J Perkins, B Ian Hutchins, Alison F Davis, Michael S Lauer, Hannah A Valantine, James M Anderson, et al. Topic choice contributes to the lower rate of NIH awards to African-American/Black scientists. Science Advances, 5(10):eaaw7238, 2019.
Junming Huang, Alexander J Gates, Roberta Sinatra, and Albert-László Barabási. Historical comparison of gender inequality in scientific careers across countries and disciplines. PNAS, 117(9):4609–4616, 2020.
Sen Jia, Thomas Lansdall-Welfare, and Nello Cristianini. Measuring gender bias in news images. In Proceedings of the 24th International Conference on World Wide Web, pages 893–898. ACM, 2015.
Sen Jia, Thomas Lansdall-Welfare, Saatviga Sudhahar, Cynthia Carter, and Nello Cristianini. Women are seen more than heard in online newspapers. PLOS ONE, 11(2):e0148434, 2016.
Simon M Laham, Peter Koval, and Adam L Alter. The name-pronunciation effect: Why people like Mr. Smith more than Mr. Colquhoun. Journal of Experimental Social Psychology, 48(3):752–756, 2012.
Vincent Larivière, Chaoqun Ni, Yves Gingras, Blaise Cronin, and Cassidy R Sugimoto. Bibliometrics: Global gender disparities in science. Nature News, 504(7479):211, 2013.
Erin Leahey. Not by productivity alone: How visibility and specialization contribute to academic earnings. American Sociological Review, 72(4):533–561, 2007.
Timothy J Ley and Barton H Hamilton. The gender gap in NIH grant applications.
Science , 322(5907), 2008.Miller McPherson, Lynn Smith-Lovin, and James M Cook. Birds of a feather: Homophily insocial networks.
Annual Review of Sociology , 27(1):415–444, 2001.Robert K Merton. The matthew effect in science: The reward and communication systems ofscience are considered.
Science , 159(3810):56–63, 1968.Jack Merullo, Luke Yeh, Abram Handler, II Grissom, Brendan O’Connor, Mohit Iyyer, et al.Investigating sports commentator bias within a large corpus of american football broadcasts.In
EMNLP , 2019.David I Miller, Kyle M Nolla, Alice H Eagly, and David H Uttal. The development of children’sgender-science stereotypes: a meta-analysis of 5 decades of us draw-a-scientist studies.
ChildDevelopment , 2018.Staˇsa Milojevi´c. Principles of scientific research team formation and evolution.
PNAS , 2014.Corinne A Moss-Racusin, John F Dovidio, Victoria L Brescoll, Mark J Graham, and Jo Han-delsman. Science facultys subtle gender biases favor male students.
PNAS , 109(41):16474–16479, 2012. 17iego FM Oliveira, Yifang Ma, Teresa K Woodruff, and Brian Uzzi. Comparison of nationalinstitutes of health grant amounts to first-time male and female principal investigators.
JAMA ,321(9):898–900, 2019.Ernesto Reuben, Paola Sapienza, and Luigi Zingales. How stereotypes impair women’s careersin science.
PNAS
Proceedings of the 61st Annual Meeting of the AmericanSociety for Information Science , volume 35, pages 279–289, 1998.Dietram A Scheufele. Communicating science in social settings.
PNAS , 110:14040–14047,2013.Helen Shen. Inequality quantified: Mind the gender gap.
Nature News , 495(7439):22, 2013.Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-june Paul Hsu, and KuansanWang. An overview of microsoft academic service (mas) and applications. In
WWW , 2015.Kevin B Smith. When all’s fair: Signs of parity in media coverage of female candidates.
Polit-ical Communication , 14(1):71–82, 1997.Hyunjin Song and Norbert Schwarz. If it’s difficult to pronounce, it must be risky: Fluency,familiarity, and risk perception.
Psychological Science , 20(2):135–138, 2009.Gaurav Sood and Suriyan Laohaprapanon. Predicting race and ethnicity from the sequence ofcharacters in a name. arXiv:1805.02109 , 2018.S Shyam Sundar. Effect of source attribution on perception of online news stories.
Journalism& Mass Communication Quarterly , 75(1):55–68, 1998.Andrew Tomkins, Min Zhang, and William D Heavlin. Reviewer bias in single-versus double-blind peer review.
PNAS , 114(48):12708–12713, 2017.18ucktada Treeratpituk and C Lee Giles. Name-ethnicity classification and ethnicity-sensitivename matching. In
Twenty-Sixth AAAI Conference on Artificial Intelligence , 2012.Caroline Sotello Viernes Turner, Juan Carlos Gonz´alez, and J Luke Wood. Faculty of color inacademe: What 20 years of literature tells us.
Journal of Diversity in Higher Education , 1(3):139, 2008.Clifford H Wagner. Simpson’s paradox in real life.
The American Statistician , 36(1):46–48,1982.Kuansan Wang, Zhihong Shen, Chi-Yuan Huang, Chieh-Han Wu, Darrin Eide, Yuxiao Dong,Junjie Qian, Anshul Kanakia, Alvin Chen, and Richard Rogahn. A review of microsoftacademic services for science of science studies.
Frontiers in Big Data , 2:45, 2019.Samuel F Way, Daniel B Larremore, and Aaron Clauset. Gender, productivity, and prestige incomputer science faculty hiring networks. In
WWW , pages 1169–1179, 2016.Jevin D West, Jennifer Jacquet, Molly M King, Shelley J Correll, and Carl T Bergstrom. Therole of gender in scholarly authorship.
PLOS ONE , 8(7):e66212, 2013.Yu Xie. undemocracy: inequalities in science.
Science, 344(6186):809–810, 2014.
Yu Xie and Kimberlee A Shauman. Women in science: Career processes and outcomes. Harvard University Press, 2003.
Supplemental Material
S1 Materials and Methods
To test for and quantify gender and ethnic bias across media outlets, we constructed a massive dataset by combining news media reports with metadata for the scientific papers they cover, and then inferring demographics of the papers’ authors.
We focused on mentions of the first authors for two reasons: (i) the first author position is more likely to be occupied by early career researchers, and as a result, media coverage may be more consequential for their careers; (ii) science journalism guidelines highlight the first author as the one who has likely contributed most to the work (Blum and et al., 2006) and therefore is a natural person to mention. Papers in a few research fields that commonly use alphabetical author ordering are also included since journalists may be unfamiliar with this norm.
S1.1 News Stories Mentioning Research Papers
The dataset of news stories mentioning scientific papers was collected from Altmetric.com (accessed on Oct 8, 2019), which tracks a variety of sources for mentions of research papers, including coverage from over 2,000 news outlets around the world. To control for differences in the frequency of scientific reporting and potential confounds from variations in journalistic practices across different countries, the list of news outlets was curated to 423 U.S.-based news media outlets, with each having at least 1,000 mentions in the Altmetric database. Location data for each outlet is provided by Altmetric. This exclusion criterion ensures that the dataset has sufficient volume to estimate outlet-level biases, while still retaining sufficient diversity in outlet types, stories, and the scientific articles they cover. This initial dataset consists of 2.4M mentions of 521K papers by 1.7M news articles before 2019-10-06. Each mention in the Altmetric data has associated metadata that allows us to retrieve the original citing news story as well as the DOI for the paper itself.
Because of access and permission limitations when retrieving news stories, 135 outlets were excluded for insufficient volume (27 outlets denied our access entirely; 65 outlets had fewer than 100 URLs crawled; 43 outlets had at least 100 URLs crawled, but only with non-news content such as subscription ads). For the remaining 288 outlets, 48.6% of the stories were successfully retrieved. The stories were then cleaned to remove all HTML tags and unrelated content such as advertisements. Stories with fewer than 100 words were removed (0.7%), as a manual inspection showed that the vast majority of these do not contain the complete content of the story. This process results in 568,785 downloaded stories mentioning 290,469 papers from the 288 outlets.
In order to control for the effects of journalists’ ethnicity and gender (cf. Section S1.8), we used the newspaper Python package (https://github.com/codelucas/newspaper) to extract the journalists’ names from the retrieved HTML news content. Since not all stories in each outlet contain the journalist information and the newspaper package does not work perfectly for every story that has journalist information, we focused on the top 100 outlets (ranked by the story count). With manual inspection, we verified that this package can consistently and reliably identify journalist names for 41 of the top 100 outlets. We excluded extracted names with words signaling institutions and organizations (such as “University”, “Hospital”, “World”, “Arxiv”, “Team”, “Staff”, and “Editors”). We also cleaned names by removing prefix words, such as “PhD.”, “M.D.”, and “Dr.”. We eventually obtained the journalist names in 100,163 news stories for 41 outlets (17.5%).
S1.3 Retrieving Paper Metadata
The Altmetric database does not contain author information and therefore an additional dataset is needed to identify the authors for mentioned papers. We used the Microsoft Academic Graph (MAG) snapshot data (accessed on June 01, 2019) to retrieve information for each paper based on its DOI (Sinha et al., 2015). Not all papers with a DOI in the Altmetric database are indexed in the MAG. We were ultimately able to retrieve 269,509 papers from MAG based on DOIs (matching based on lower-cased strings). MAG also provides rich metadata for papers, including author names, author rank, author affiliation rank, publication year, publication venue, the paper abstract, and paper topical keywords. As all of this information will be used in our regression models (cf. Section S1.8), we excluded papers with missing metadata and two papers that list organizations as first authors, leaving us with 100,208 papers.
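The DOI join described above can be illustrated with a minimal sketch. The record layout (a list of dictionaries with a `doi` field) and the function name `match_by_doi` are hypothetical and are used only for illustration; the only detail taken from the text is that matching is done on lower-cased DOI strings.

```python
def match_by_doi(altmetric_dois, mag_records):
    """Return MAG records whose DOI equals an Altmetric DOI, ignoring case.

    `mag_records` is assumed (hypothetically) to be a list of dicts
    that carry a 'doi' key; records without a DOI are skipped.
    """
    # Index MAG records by their lower-cased DOI.
    mag_index = {rec["doi"].lower(): rec for rec in mag_records if rec.get("doi")}
    matched = {}
    for doi in altmetric_dois:
        key = doi.strip().lower()
        if key in mag_index:
            matched[key] = mag_index[key]
    return matched
```

Papers whose DOI has no counterpart in the index simply drop out, which mirrors the statement that not all Altmetric papers are indexed in the MAG.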
S1.4 Inferring Author and Journalist Gender and Ethnicity
We used Ethnea to infer the gender and ethnicity for authors. The library makes its prediction based on the nearest-neighbor matches on authors’ first and last names using a ground-truth database of scholars’ country of origin, which offers superior performance over alternative approaches (Ambekar et al., 2009; Treeratpituk and Giles, 2012).
Author names in the MAG have varying amounts of completeness. While most have the first name and surname, special care is taken for three cases: (1) If the name has a single word (e.g., Curie), the ethnicity and the gender are both set to Unknown, as Ethnea requires at least an initial. Single-word name cases occurred for seven authors total. (2) If the name has an initial and surname (e.g., M. Curie), we directly feed it into the API, which provides an ethnicity inference but returns Unknown for gender due to the inherent ambiguity. (3) If the name has three or more words, we take the first word as the given name and the last word as the surname. However, if the first word is an initial and the second word is not an initial, we take the second word as the given name (e.g., M. Salomea Curie would be Salomea Curie) to improve prediction accuracy and retrieve a gender inference.
While Ethnea is trained with scholar names, we also applied it to predict the gender and ethnicity for journalists (cf. Section S3 for robustness check).
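The three name-handling cases above can be sketched as a small preprocessing step. This is an illustration under stated assumptions, not the authors' code: the function names `is_initial` and `split_author_name` are hypothetical, an "initial" is assumed to be a single letter with an optional trailing period, and the call to Ethnea itself is omitted.

```python
import re


def is_initial(token):
    """True for a single letter with an optional period, e.g. 'M' or 'M.' (assumed definition)."""
    return re.fullmatch(r"[^\W\d_]\.?", token) is not None


def split_author_name(full_name):
    """Return (given, surname) to pass to the classifier, or None for single-word names."""
    words = full_name.split()
    if len(words) < 2:
        # Case 1: a single word; ethnicity and gender are set to Unknown upstream.
        return None
    given, surname = words[0], words[-1]
    # Case 3 exception: a leading initial followed by a full word,
    # e.g. 'M. Salomea Curie' -> ('Salomea', 'Curie').
    if len(words) >= 3 and is_initial(words[0]) and not is_initial(words[1]):
        given = words[1]
    # Case 2 ('M. Curie') falls through with the initial as the given name.
    return given, surname
```

For example, `split_author_name("M. Curie")` keeps the initial, so a downstream classifier could still infer ethnicity while leaving gender as Unknown.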
Ethnea assigns fine-grained ethnic categories based on nationality. Here, we adopt their term ethnicity, recognizing that while ethnicity and nationality are closely related, the two are not synonymous (discussed in the main text). To test for macro-level trends around larger ethnic categories and to ensure sufficient samples to estimate the effects, we group the 24 observed ethnicities into 9 higher-level categories based on linguistic families and cultural distance (Table S1).
Broad Ethnic Category      Individual Ethnicity
African                    African
non-Chinese East Asian     Indonesian, Japanese, Korean, Mongolian, Thai, Vietnamese
Chinese                    Chinese
Eastern European           Baltic, Greek, Hungarian, Slav
British-origin             English
Indian                     Indian
Middle Eastern             Arab, Israeli, Turkish
Scandinavian & Germanic    Dutch, German, Nordic
Romance Language           French, Hispanic, Italian, Romanian
Unknown                    Note: names are unrecognized by Ethnea.
Table S1: 24 individual ethnicities are grouped into the 9 broad ethnic categories.
Note that due to sample size and our hypotheses, African, Chinese, Indian, and English (renamed as “British-origin”) are kept as separate high-level categories. Caribbean and Polynesian are excluded due to less than mentions in total. Examples of names classified into each ethnicity are provided in Table S9. Ethnea returns binary gender categories: Female and Male, though we recognize that researchers may identify with genders outside of these two categories. For both gender and ethnicity separately, some names are classified as “Unknown” if no discernable signal is found for the respective attribute by Ethnea.
S1.5 Final Dataset and Statistics
The final dataset consists of 232,524 news stories referencing 100,208 research papers. As some stories mentioned more than one paper and some papers were mentioned in more than one story, we have 285,708 total observations to test whether a paper’s first author is mentioned in a story.
Figs. S1a-b show the distribution of papers and news stories over time and attention per paper. News story data is left censored and primarily includes stories written after 2010. Censoring can be explained by the fact that Altmetric.com was only launched in 2012, limiting the collection of earlier news. As shown in Fig. S1c, news stories can mention papers that were published several decades before, highlighting the potential lasting value of scientific work. However, the majority of papers are mentioned within the same year or just a few years after publication. Table S2 shows the mention counts for authors in each broad ethnicity group, and Table S3 shows the mention counts by journalist ethnicity.
[Figure S1 panels. a: count by year, for scientific papers and news stories. b: number of scientific papers by number of news mentions per paper. c: number of mention pairs by gap in years.]
Figure S1: a, The number of news stories and research papers in our mention data over time. b, The distribution of the number of news mentions per paper. c, The distribution of the year gap between paper publication date and news story mention date for all 285,708 story-paper mention pairs in the final dataset.
Authors’ Broad Ethnic Category   Papers     Mentions   Mentions per Paper
British-origin                   41,446     121,891    2.94
Scandinavian & Germanic          14,982     41,982     2.80
Romance Language                 14,982     41,156     2.75
Chinese                          9,262      25,968     2.80
Middle Eastern                   5,291      15,267     2.89
Eastern European                 4,313      12,222     2.83
Indian                           4,327      12,576     2.91
non-Chinese East Asian           4,408      11,254     2.55
African                          682        1,902      2.79
Unknown Ethnicity                515        1,490      2.89
Total                            100,208    285,708    2.85
Table S2: The number of mentioned papers (unique ones), the total number of story-paper mention pairs, and the average number of mentions per paper for authors in each of the 9 high-level ethnicity groups.
S1.6 News Outlets Categorization
To estimate differences across outlets, we grouped 288 news outlets into three categories according to their news report publishing mechanisms. The three categories are: (1) Press Releases, (2) Science & Technology, and (3) General News. The categorization is based on manual inspections of three random stories for each outlet (Appendix Table S10 shows the full list).
The Press Releases category is unique since many outlets in this category commonly, if not exclusively, republish university press releases as stories, making them reasonable proxies for estimating bias from a university’s own press office. The Science & Technology category consists of magazines that primarily focus on reporting science, such as “MIT Technology Review” and “Scientific American.” These outlets typically construct a large scientific narrative referencing several papers in their stories. The General News category includes mainstream news media such as “The New York Times” and “CNN.com” that publish stories in a wide variety of topics. They also have well-trained editorial staff and science journalists who are focused on accurately reporting science.
Table S4 shows the paper-story mention pairs for three types of outlets. The average number of words per story for each outlet type is shown in Fig. S2.
Journalists’ Broad Ethnic Category   Mentions
British-origin                       37,046
Scandinavian & Germanic              5,182
Romance Language                     7,329
Chinese                              1,251
Middle Eastern                       1,788
Eastern European                     1,679
Indian                               1,213
non-Chinese East Asian               451
African                              321
Unknown Ethnicity                    229,448
Total                                285,708
Table S3: The number of story-paper mention pairs by journalists in each of the 9 high-level ethnicity groups.
Outlet Type            Outlets   Example Outlet          Mentions   First Author Named
Press Releases         18        EurekAlert!             81,486     44.9%
Science & Technology   79        MIT Technology Rev.     69,966     51.8%
General News           171       The New York Times      125,241    22.1%
Table S4: The number of outlets for three outlet types, their number of story-paper mentions, and the percentage of mentions that have named the first authors. The full list of 288 outlets is available in Appendix Table S10.
[Figure S2 plot. x-axis: outlet type (Press Releases, Sci. & Tech., General News); y-axis: avg. story length (words).]
Figure S2: The average story length for three types of outlets. Error bars show 95% confidence intervals.
S1.7 Check Author Attributions in Science News
S1.7.1 Author Name Mentions
We normalized both the news content and the author names to ensure that this computational approach works for names with diacritics. For each story-paper mention pair, each author’s last name is searched for using a regular expression with word boundaries around the name, requiring that the name’s initial letter be capitalized. While this process may introduce false positives for authors with common words as last names (e.g., “White”), such cases are rare because (i) few authors in our dataset have common English words as their last names, and (ii) these words rarely appear at the beginning of a sentence in the story, where they would be capitalized. However, a particular exception is the two common Chinese last names “He” and “She,” which can appear as third-person pronouns at the start of sentences. We thus imposed additional constraints for these two names such that they must be immediately preceded by one of the following titles to be considered a name mention: “Professor”, “Prof.”, “Doctor”, “Dr.”, “Mr.”, “Miss”, “Ms.”, “Mrs.”. Ultimately, first authors were found in 104,569 of the 285,708 story-paper mention pairs (36.6%).
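The matching procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names are hypothetical, and stripping diacritics through Unicode NFKD decomposition is an assumed stand-in for whatever normalization was actually used.

```python
import re
import unicodedata

# Titles that must immediately precede the pronoun-like surnames "He" and "She".
TITLES = ["Professor", "Prof.", "Doctor", "Dr.", "Mr.", "Miss", "Ms.", "Mrs."]
PRONOUN_LIKE = {"He", "She"}


def strip_diacritics(text):
    """Drop combining marks after NFKD decomposition (assumed normalization)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_last_name_mentioned(last_name, story_text):
    """Case-sensitive, word-bounded search for a capitalized last name."""
    name = strip_diacritics(last_name)
    name = name[:1].upper() + name[1:]  # require a capitalized initial letter
    text = strip_diacritics(story_text)
    if name in PRONOUN_LIKE:
        titles = "|".join(re.escape(t) for t in TITLES)
        pattern = r"\b(?:" + titles + r")\s+" + re.escape(name) + r"\b"
    else:
        pattern = r"\b" + re.escape(name) + r"\b"
    return re.search(pattern, text) is not None
```

Because the search is case-sensitive, a surname such as “White” is not matched by the lowercase word “white”, which is the behavior the capitalization requirement is meant to provide.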
S1.7.2 Author-Quote Detection
Authors can be mentioned by name in different forms, including quotation (e.g., “‘We are getting close to the truth.’ said Dr. Xu”), paraphrasing (e.g., “Timnit says she is confident, however, that the process will soon be perfected.”), and simple passing (e.g., “A recent research conducted by Dr. Jha found that drinking coffee has no harmful effects on mental health.”).
We used a rule-based matching method to detect explicit quotes for each story-paper pair. We first parsed our news corpus using spacy (https://spacy.io/). From the 50 most frequently used verbs in our news corpus, we identified 18 verbs that were commonly used to integrate quoted materials in news stories: “describe”, “explain”, “say”, “tell”, “note”, “add”, “acknowledge”, “offer”, “point”, “caution”, “advise”, “emphasize”, “see”, “suggest”, “comment”, “continue”, “confirm”, “accord”. A sentence is determined to contain a quote from the first author if the following two conditions are met: (i) both the quotation mark and the author’s last name appear in the sentence, and (ii) any of the 18 quote-signaling verbs (or their verb tenses) appear within five tokens before or after the author’s last name. A manual inspection of 100 extracted quotes revealed no false quote attributions. This conservative method likely underestimates the quote rate, as it may not be able to detect every quote due to unusual writing styles or article formatting. The advantage of English-named scholars in being quoted (Fig. 2 in the main text) may therefore be even larger.
S1.7.3 Institution Mentions
We checked institution mentions based on exact string matching with the reported institution name for the first author in the MAG, i.e., for each story-paper pair, we examined whether the first author’s full institution name appeared in the news story. Similar to quote detection, this method may not be able to identify every instance of institution mentions due to noise in the MAG or the story using slightly different nomenclature such as an institution’s abbreviation. However, since a full list of alternate names for each institution is not available to us, we used this conservative method. For this reason, the trend of minority scholars being substituted by their institutions (Fig. 2 in the main text) is likely an underestimate.
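To make the two-condition quote rule of Section S1.7.2 concrete, the following is a rough sketch that uses only the standard library. It departs from the described pipeline in ways that should be read as assumptions: spaCy parsing is replaced by a simple regular-expression tokenizer that ignores punctuation, and lemmatization is approximated by generating regular inflections plus a short, hand-picked list of irregular forms. All function and variable names are hypothetical.

```python
import re

# The 18 quote-signaling verbs listed in the text.
QUOTE_VERBS = ["describe", "explain", "say", "tell", "note", "add", "acknowledge",
               "offer", "point", "caution", "advise", "emphasize", "see", "suggest",
               "comment", "continue", "confirm", "accord"]
# Hand-picked irregular forms (an assumption standing in for a lemmatizer).
IRREGULAR_FORMS = {"said", "told", "saw", "seen"}


def verb_forms(verb):
    """Generate regular inflections of a verb (crude approximation)."""
    if verb.endswith("ee"):
        return {verb, verb + "s", verb + "ing"}
    if verb.endswith("e"):
        return {verb, verb + "s", verb + "d", verb[:-1] + "ing"}
    return {verb, verb + "s", verb + "ed", verb + "ing"}


VERB_FORMS = set(IRREGULAR_FORMS)
for _verb in QUOTE_VERBS:
    VERB_FORMS |= verb_forms(_verb)

QUOTE_MARKS = '"\u201c\u201d'  # straight and curly double quotation marks


def sentence_quotes_author(sentence, last_name, window=5):
    """Condition (i): quotation mark and last name present;
    condition (ii): a quote verb within `window` word tokens of the name."""
    if not any(mark in sentence for mark in QUOTE_MARKS):
        return False
    tokens = re.findall(r"[\w-]+", sentence)
    for i, token in enumerate(tokens):
        if token != last_name:
            continue
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if any(t.lower() in VERB_FORMS for t in nearby):
            return True
    return False
```

Since punctuation is not counted as a token here, the five-token window is slightly wider in practice than it would be with a tokenizer that emits punctuation tokens.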
S1.8 Regression Models
We adopted a logistic regression framework to examine the demographic bias in author mentions in science reporting. Many factors are known to influence name mentions that could confound the analysis of ethnicity and gender, such as author reputation, institutional prestige and location, publication topics and venues, or outlets and journalist demographics.
Here, we provide details of these factors and present a series of five regression models that build upon one another by adding more rigorous control variables at each step. In our regression framework, each story-paper mention pair is an observation, with the dependent variable indicating whether the first author of the paper is mentioned or not in the story. We designed a mixed-effects model with five groups of variables: (1) first author demographics (gender and ethnicity); (2) paper author controls, including prestige factors, last name factors, and other authors; (3) paper and story content, including temporal factors, paper readability, story length, number of papers mentioned per story, and journalist demographics; (4) fixed-effects for paper domains and topics; (5) random effects for outlets, publication venues, and popular last authors. The increasing level of model complexity allows us to test the robustness of the effects of ethnicity and gender, and also to examine potential factors at play in science coverage. Table S5 shows the step-wise regression results.
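As a reading aid, the full specification with all five variable groups can be written compactly. The notation below is our own and does not appear in the original text: $y_{sp}$ indicates whether the first author of paper $p$ is named in story $s$, the $\mathbf{x}$ vectors collect the fixed-effect groups (1) to (4), and $u$, $v$, $w$ are random intercepts for group (5), assumed here, as is conventional, to be zero-mean Gaussian.

```latex
\begin{align*}
\operatorname{logit}\,\Pr(y_{sp} = 1)
  &= \beta_0
   + \mathbf{x}^{\text{demo}}_{p}\,\boldsymbol{\beta}_1
   + \mathbf{x}^{\text{author}}_{p}\,\boldsymbol{\beta}_2
   + \mathbf{x}^{\text{content}}_{sp}\,\boldsymbol{\beta}_3
   + \mathbf{x}^{\text{topic}}_{p}\,\boldsymbol{\beta}_4 \\
  &\quad + u_{\text{outlet}(s)} + v_{\text{venue}(p)} + w_{\text{last}(p)},
\qquad
u \sim \mathcal{N}(0, \sigma_u^2),\;
v \sim \mathcal{N}(0, \sigma_v^2),\;
w \sim \mathcal{N}(0, \sigma_w^2).
\end{align*}
```

Under this notation, Model 1 keeps only the $\boldsymbol{\beta}_1$ term, and Models 2 through 5 add the remaining terms one group at a time.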
Model 1: Naive Bias
The first model directly encodes our two variables of focus, gender and ethnicity, as the sole categorical factors of the regression model. Here and throughout the study, we treat the reference coding for ethnicity as British-origin and for gender as Male. While overly simplistic in its modeling assumptions, Model 1 nevertheless tests for systematic differences in whether authors of a particular demographic are mentioned less frequently and serves as a baseline for layering on controls to explain such bias.
Model 2: Paper Author Controls
Many author-level attributes other than demographics could influence journalists’ perceptions of authors and their coverage of them. Model 2 introduces 20 additional factors to control for features of the paper’s authors.
Prestige Factors.
The reputation of the first author may also influence the chance of being named. High-status actors and institutions tend to receive preferential treatment within science (Merton, 1968; Azoulay et al., 2013; Tomkins et al., 2017), and we hypothesize that these prestige-based disparities may carry over to media coverage as well. To account for prestige effects, we include the author rank and institution rank provided by the MAG (Wang et al., 2019). This ranking estimates the relative importance of authors and institutions using paper-level features derived from a heterogeneous citation network; while similar to h-index, the method has been shown to produce more fine-grained and robust measurements of impact and prestige. Institution and author ranks are not necessarily directly related, as institutions may be home to authors of varying ranks (e.g., early- or late-career faculty) and the same author may appear with different affiliations on separate papers due to a career move. Note that for rank values, negative-valued coefficients in the regression models would indicate that higher-ranked individuals and those from higher-ranked institutions are more likely to be mentioned.
We also add a variable indicating the location of the first author’s institution with three categories: (1) domestic, (2) international, (3) unknown. This variable controls for geographical factors that may influence journalists’ willingness to contact the author by phone or video chat service and therefore influence whether they mention the author. We infer the country of origin for institutions based on their latitude and longitude provided in the MAG.
Last Name Factors.
People are known to have a preference for both familiar and more easily pronounceable names (Song and Schwarz, 2009; Laham et al., 2012), and this preference could potentially bias which author a journalist mentions. Therefore, we introduce two factors as proxies: (1) the number of characters in the last name as a proxy for pronounceability, and (2) the log-normalized count of the last name per 100K Americans from the 2018 census data. As journalists are drawn from U.S.-based news sources, the latter reflects potential familiarity.
Other Authors.
Scientific knowledge is increasingly discovered by large teams, as tackling complex problems often requires collaboration between experts with diverse sets of specialization (Guimera et al., 2005; Greene, 2007; Milojević, 2014). On these multi-author projects, the last author is typically the senior author responsible for directing the project, a convention that science journalism guidelines recognize when determining whom to interview (Blum and et al., 2006). The last author could be more likely to be mentioned in press coverage, which could potentially reduce the chance for the first author. Therefore, we control for whether the last author is mentioned in the news article using a binary factor. As the demographics of the last author may influence whom a journalist decides to mention, we control for the ethnicity and gender of the last author, using British-origin and Male as the reference category respectively. Note that some papers are monographs with no last author. To control for these cases, we include a binary factor Solo which is set to 1 for monographs, at which point all factors related to the last author (gender, ethnicity, and is-mentioned) are set to 0.
When journalists examine a paper’s author list, the team size may influence their understanding of the distribution of credit among authors, potentially reducing the chance of any author being mentioned for papers with many authors. We thus include a factor for the number of authors.
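The coding of the last-author controls, including the Solo indicator, can be sketched as follows. This is an illustrative encoding under assumptions: the feature names, the dictionary-based author records, and the function name are hypothetical, while the reference categories (British-origin, Male) and the rule that all last-author factors are zero for solo-authored papers follow the description above.

```python
# Non-reference levels; British-origin and Male are the reference categories.
ETHNICITIES = ["African", "Chinese", "Eastern European", "Indian", "Middle Eastern",
               "non-Chinese East Asian", "Romance Language",
               "Scandinavian & Germanic", "Unknown"]
GENDERS = ["Female", "Unknown"]


def last_author_features(authors, last_author_mentioned):
    """Encode last-author controls for one story-paper pair.

    `authors` is a byline-ordered list of dicts with 'ethnicity' and 'gender'
    keys (a hypothetical record layout).
    """
    features = {"solo": 0, "num_authors": len(authors), "last_mentioned": 0}
    features.update({"last_eth[" + e + "]": 0 for e in ETHNICITIES})
    features.update({"last_gender[" + g + "]": 0 for g in GENDERS})
    if len(authors) == 1:
        # Solo-authored paper: flag it and leave every last-author factor at 0.
        features["solo"] = 1
        return features
    last = authors[-1]
    features["last_mentioned"] = int(last_author_mentioned)
    if last["ethnicity"] != "British-origin":
        features["last_eth[" + last["ethnicity"] + "]"] = 1
    if last["gender"] != "Male":
        features["last_gender[" + last["gender"] + "]"] = 1
    return features
```

With this coding, an all-zero last-author block means either a British-origin male last author or, when `solo` is 1, no last author at all, so the Solo coefficient absorbs the difference between the two cases.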
Model 3: Paper and Story Content
Besides author-level attributes, the content of the paper and story, and journalist demographicsalso can play a role in affecting author mentions. We thus control for the following factors inModel 3.
Year of News Story (Mention Year).
Bias in science coverage may have temporal variations due to unpredictable factors that are directly or indirectly related to research. For instance, the available funding resources can affect the number of research outputs in a year, which would in turn influence the amount of time and space journalists devote to scientists in news articles. We thus control for the year of the news story, i.e., the mention year of the paper. We treat it as a scalar variable (zero-centered).
Year Gap between Story and Paper.
News stories often reference older scientific papers in the narrative, as shown in Fig. S1c. For older papers, at the time of a recent story publication, the original authors may be unable to be reached or the story may be framed differently from recent science that is considered “fresh.” Indeed, citing timely scientific evidence in a news report can increase credibility perceptions of a story (Sundar, 1998; Rieh and Belkin, 1998). Therefore we include a factor that quantifies the year difference between the mention year and the publication year of the mentioned paper.
Number of papers mentioned in a story.
A story can mention several papers to help frame and construct its scientific narrative, and potentially increase its news credibility perception. However, referencing more papers in a story may reduce the amount of space and attention journalists allocate to each paper, and therefore may decrease the chance of its authors being mentioned. We thus control for the number of mentioned papers in a story.
News Story Length.
Longer articles provide more space for depicting stories about the science being covered. We thus control for the length of each story, measured as the total number of words.
Paper Readability.
Given the tight timelines under which journalists work, quickly identifying and understanding insights is likely critical to what is said about a paper. A paper’s readability may thus influence whether a journalist feels the need to reach out to the author, with more readable papers requiring less contact. Readability, in turn, may also be tied to author demographics such as gender (Hengel, 2017), making it important to take readability into account. Due to licensing restrictions, the full text of the majority of papers is not freely available; therefore we compute readability over the paper abstract using three factors: (1) the Flesch-Kincaid readability score, which estimates the grade level needed to understand the passage; (2) the number of sentences per paragraph, which is a proxy for information content and density; and (3) the type-token ratio, which is a measure of lexical variety. Another reason we focus particularly on the abstract is that journalists may not read the entire paper but very likely read the abstract.
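The three readability factors can be approximated with the standard library alone. The sketch below is illustrative rather than the authors' implementation: the syllable counter is a crude vowel-group heuristic, sentence splitting is a simple punctuation rule, the abstract is treated as a single paragraph, and the function names are hypothetical. The grade-level formula itself is the standard Flesch-Kincaid one, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59.

```python
import re


def count_syllables(word):
    """Crude heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def abstract_readability(abstract):
    """Return Flesch-Kincaid grade, sentence count, and type-token ratio."""
    sentences = [s for s in re.split(r"[.!?]+\s*", abstract) if s.strip()]
    words = re.findall(r"[A-Za-z']+", abstract)
    if not sentences or not words:
        raise ValueError("abstract must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    fk_grade = (0.39 * len(words) / len(sentences)
                + 11.8 * syllables / len(words) - 15.59)
    type_token_ratio = len({w.lower() for w in words}) / len(words)
    return {"fk_grade": fk_grade,
            "num_sentences": len(sentences),
            "type_token_ratio": type_token_ratio}
```

A dedicated readability library would give more accurate syllable counts; the heuristic here is only meant to show how the three factors relate to the raw text.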
Journalist Demographics.
It is ultimately the journalist’s decision to mention authors when writing science reports. Motivated by the commonly observed homophily principle in social networks (McPherson et al., 2001), we hypothesize that the mentioning behavior in science reporting is associated with homophilous effects by ethnicity and gender. To model such effects, we include the journalists’ demographics and their interactions with first authors’ gender and ethnicity.
Due to insufficient instances of journalists identified in news stories (cf. Section S1.2; Table S3), we further coarsen the 9 broad ethnicity categories into 4 groups: (1) Asian (Chinese, Indian, and non-Chinese East Asian), (2) British-origin, (3) European (Eastern European, Romance Language, and Scandinavian & Germanic), and (4) Other Unknown (Middle Eastern, African, and Unknown).
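The coarsening of journalist ethnicity can be expressed as a simple lookup. The mapping below restates the four groups listed above; the names `COARSE_GROUP` and `coarsen` are hypothetical.

```python
# Broad category -> coarse group used for journalist demographics.
COARSE_GROUP = {
    "Chinese": "Asian",
    "Indian": "Asian",
    "non-Chinese East Asian": "Asian",
    "British-origin": "British-origin",
    "Eastern European": "European",
    "Romance Language": "European",
    "Scandinavian & Germanic": "European",
    "Middle Eastern": "Other Unknown",
    "African": "Other Unknown",
    "Unknown": "Other Unknown",
}


def coarsen(broad_category):
    """Map one of the broad categories (or Unknown) to its coarse group."""
    return COARSE_GROUP[broad_category]
```

An interaction term can then be formed from the pair of the first author's broad category and `coarsen(journalist_category)`.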
Model 4: Paper Domains and Topics
Some scientific domains and topics may be inherently more news-worthy than others. Furthermore, journalists’ academic backgrounds may be unequally distributed across scientific fields, resulting in different propensities to reach out to authors. Therefore, in Model 4, we include factors to capture the domain of a paper using metadata from the MAG, which includes a large volume of keywords (665K) at different levels of specificity. A paper can have multiple keywords, with each having an associated confidence score between 0 and 1. To capture high-level topical and methodological differences, we restrict our focus to the most-common 533 keywords that occur in at least 500 papers in our dataset. Each keyword is used as an independent variable in the regression, whose value is the keyword’s confidence score for the paper.
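The keyword features can be sketched as follows. The input layout (a mapping from paper identifier to keyword confidence scores) and the function name are hypothetical, and setting absent keywords to 0.0 is our assumption; the frequency threshold of 500 papers comes from the text.

```python
from collections import Counter


def build_keyword_features(paper_keywords, min_papers=500):
    """Build one confidence-score column per sufficiently common keyword.

    `paper_keywords` maps paper id -> {keyword: confidence in [0, 1]}
    (a hypothetical layout). Returns the sorted vocabulary and, for each
    paper, a list of scores aligned with that vocabulary.
    """
    # Count in how many papers each keyword occurs.
    counts = Counter(kw for kws in paper_keywords.values() for kw in kws)
    vocab = sorted(kw for kw, n in counts.items() if n >= min_papers)
    # Keywords a paper does not carry get 0.0 (an assumption).
    features = {pid: [kws.get(kw, 0.0) for kw in vocab]
                for pid, kws in paper_keywords.items()}
    return vocab, features
```

On a toy input, lowering `min_papers` (for example to 2) makes the thresholding easy to inspect.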
Model 5: Outlets, Venues, and Famous Research Labs
News outlets and publication venues both reflect extra sources of variability in the regression models. Individual news outlets may follow different standards of practice in how they describe science, creating a separate source of variability in who is mentioned. Publication venues each come with different levels of impact and topical focus that potentially affect the depth of journalistic focus on papers published in them. Additionally, famous research labs managed by senior researchers may be more likely to receive media attention and name attribution as a benefit of their visibility gained by previous research outputs. Such popularity can be approximated by famous last authors based on their number of mentioned papers in our data. To accurately model these sources of variation, we treat outlets, venues, and top 100 last authors as random effects in regression Model 5. This mixed-effect regression model implicitly captures a robust set of factors involved in science reporting such as the tendency of specific journals to be mentioned more frequently (e.g., Nature, Science, or JAMA), the focus of news outlets on specific topics covered by different journals, and the attention benefits for authors working with famous research labs.
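For concreteness, one way to write the Model 5 specification is given below. The logistic link is an assumption (this section does not state the link function), and all symbols are introduced here for illustration: $y_i$ indicates whether the first author is mentioned in story-paper pair $i$, $\mathbf{x}_i$ collects the fixed-effect covariates, and $o(i)$, $j(i)$, $a(i)$ index the outlet, the venue, and the top-100 last author of that pair.

```latex
\operatorname{logit}\,\Pr(y_i = 1)
  = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}
  + u_{o(i)} + v_{j(i)} + w_{a(i)},
\qquad
u_o \sim \mathcal{N}(0, \sigma_u^2),\quad
v_j \sim \mathcal{N}(0, \sigma_v^2),\quad
w_a \sim \mathcal{N}(0, \sigma_w^2)
```

Under this reading, the random intercepts $u$, $v$, and $w$ absorb outlet-level, venue-level, and lab-level differences in baseline mention rates, so the ethnicity coefficients in $\boldsymbol{\beta}$ are estimated from variation within those groups.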
S2 Regression Results
S2.1 Coefficients for Five Models in Author Mentions
The coefficients for five regression models are shown in Table S5. For space, all variables in Model 5, including the paper keywords and author-journalist interaction terms, are shown in Appendix Table S11.
S2.2 Influence of Control Variables
Although our focus is on ethnicity and gender, we find that many controls are also strongly associated with author mention rates. Examining the influence of these factors can lead to a better understanding of the mechanisms at play in science reporting. Below we interpret their effects based on Model 5 (Table S5) along three themes: (1) prestige-related inequality, (2) impact of co-authorship, and (3) story content effects.
Scholars who have a high professional rank or are affiliated with prestigious institutions receive outsized attention in science news. This result suggests that the benefits of status, the so-called “Matthew Effect” (Merton, 1968), persist even after publication.
Although having more authors has a weak negative effect on the first author being mentioned, if the last author is mentioned, the first author is substantially more likely to be mentioned as well, suggesting that many stories tend to engage with only a few authors per referenced paper. Surprisingly, the demographics of last authors also play a weak role in first author mentions, with slightly negative effects for last authors with Eastern European, Middle Eastern, and Chinese names.
Solo-authored papers have been decreasing over time and are associated with lower impact on average (Greene, 2007; Milojević, 2014). However, our results highlight an underappreciated benefit: conditional on a paper being referenced in the news, a solo author is significantly more likely to be mentioned than the first author of a multi-author paper. Although seemingly counter to previous studies, this result has a natural explanation: there is only one author to mention if need be.
The coefficients for story features point to the multifaceted nature of science reporting. Although the volume of science reporting is increasing over time (Fig. S1a), journalists tend to mention authors less frequently in later years. At the same time, while older papers are still discussed in the media (Fig. S1c), journalists are less likely to mention the authors of these studies. When more papers are referenced in a story, their first authors are less likely to be mentioned. We hypothesize that such stories often cite multiple scientific papers to construct a larger narrative, so those papers are only mentioned in passing.
S2.3 U.S. vs. non-U.S. Institutions in Author Mentions
When fitting a model for the U.S. subset (or non-U.S. subset), we omitted the location variable introduced in Section S1.8 (Model 2). The coefficients for gender and ethnicity in the two models are shown in Table S6, which reveal that scholars from non-U.S. institutions are much less likely to be mentioned by U.S. media than their counterparts from U.S.-based institutions, with four categories reaching statistical significance: Romance Language, Scandinavian & Germanic, Chinese, and Middle Eastern.
S2.4 Who is Quoted or Institutionally Substituted?
The three subplots in Fig. S3 show the average marginal effects for minority gender and ethnicity authors in being mentioned by name, quoted, or substituted by institution when the author name is not mentioned, respectively. Note that each model is fitted with our full data.
[Figure S3. Panels: Mention, Quote, Inst. Substitution. Rows: African, Indian, Middle Eastern, Chinese, non-Chinese East Asian, Eastern European, Scandinavian & Germanic, Romance Language, Female. Horizontal axis: probability of being credited compared to Male/British-origin named authors.]
Figure S3: Authors with minority-ethnicity names are less likely to be mentioned by name (left) or quoted (middle), and are more likely to be substituted by their institution (right). The average marginal effects are estimated based on 285,708 observations in our data. A negative (positive) marginal effect indicates a decrease (increase) in probability compared to authors with Male (for gender) or British-origin (for ethnicity) names. The colors are proportional to the absolute probability changes. Female is colored blue to reflect its difference from ethnicity identities. The error bars indicate 95% bootstrapped confidence intervals.
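To make the notion of an average marginal effect concrete, the sketch below computes it for a binary indicator in a logistic model with given coefficients. It is a simplified illustration under stated assumptions: the logistic form, the function names, and the data layout are hypothetical, random effects are ignored, and the bootstrapped confidence intervals (which would require repeating the estimate on resampled data) are not shown.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def average_marginal_effect(rows, coefs, intercept, var):
    """Average marginal effect of a 0/1 indicator `var`.

    rows: list of dicts mapping feature name -> value (one dict per observation).
    coefs: dict mapping feature name -> fitted coefficient (hypothetical values).
    For each observation, the predicted probability with the indicator set to 1
    is compared with the probability with it set to 0, holding the other
    features at their observed values; the differences are then averaged.
    """
    diffs = []
    for row in rows:
        base = intercept + sum(coefs[k] * v for k, v in row.items() if k != var)
        diffs.append(sigmoid(base + coefs[var]) - sigmoid(base))
    return sum(diffs) / len(diffs)
```

A negative return value corresponds to the negative bars in the figure: a lower probability of being credited than the reference category.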
S3 Additional Ethnicity Coding
While Ethnea provides a large set of nationality-based ethnicity codings specifically tailored to scientists, the library could potentially introduce artifacts in its labeling. As a robustness check, we re-coded the ethnicities of all authors and journalists using two separate sources to test whether the observed bias persists. Specifically, we used the ethnicolr library (https://pypi.org/project/ethnicolr/) to code ethnicity using data derived from either (i) the nationalities listed in Wikipedia infoboxes, to infer nationality-based ethnicity, or (ii) self-reported ethnicity data associated with last names from the 2010 U.S. census. While these two new sources of data use different definitions and granularities of ethnicity from Ethnea, they nonetheless provide approximately similar categories that enable us to validate our results.
Ethnicity based on Wikipedia Data.
We used the Wikipedia infobox data to code author and journalist ethnicity based on the first name and the last name (Ambekar et al., 2009; Sood and Laohaprapanon, 2018). To make the results comparable to those based on Ethnea (Section S1.4), we placed the 13 individual ethnicities defined in the Wikipedia data into 8 broad categories:
• (1) African (Africans),
• (2) British-origin (British),
• (3) East Asian (EastAsian, Japanese),
• (4) Eastern European (EastEuropean),
• (5) Indian (IndianSubContinent),
• (6) Middle Eastern (Muslim, Jewish),
• (7) Romance Language (French, Hispanic, Italian),
• (8) Scandinavian & Germanic (Germanic, Nordic).
Note that Chinese ethnicity (defined in Ethnea) is by default incorporated into the EastAsian ethnicity in the Wikipedia data. We further placed the 8 categories into 4 groups for journalist ethnicity due to insufficient data size: (1) Asian (East Asian, Indian), (2) British-origin, (3) European (Eastern European, Romance Language, Scandinavian & Germanic), (4) Other Unknown (African, Middle Eastern, Unknown). We fitted the specification of Model 5 using this coding scheme (British-origin and Male are still used as the reference categories).
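The two-level coding described above amounts to a pair of lookup tables, sketched below. The dictionary and function names are hypothetical, and the keys use the short labels as written in the text; the label strings actually returned by ethnicolr may be formatted differently, so treat the keys as an assumption.

```python
# 13 Wikipedia-based ethnicities -> 8 broad categories (as listed in the text).
WIKI_TO_BROAD = {
    "Africans": "African",
    "British": "British-origin",
    "EastAsian": "East Asian",
    "Japanese": "East Asian",
    "EastEuropean": "Eastern European",
    "IndianSubContinent": "Indian",
    "Muslim": "Middle Eastern",
    "Jewish": "Middle Eastern",
    "French": "Romance Language",
    "Hispanic": "Romance Language",
    "Italian": "Romance Language",
    "Germanic": "Scandinavian & Germanic",
    "Nordic": "Scandinavian & Germanic",
}

# 8 broad categories -> 4 coarser groups used for journalists.
BROAD_TO_JOURNALIST_GROUP = {
    "East Asian": "Asian",
    "Indian": "Asian",
    "British-origin": "British-origin",
    "Eastern European": "European",
    "Romance Language": "European",
    "Scandinavian & Germanic": "European",
    "African": "Other Unknown",
    "Middle Eastern": "Other Unknown",
}

def journalist_group(wiki_label):
    """Map a Wikipedia-based label to its journalist group.

    Labels outside the table are treated as Unknown, which falls in 'Other Unknown'.
    """
    broad = WIKI_TO_BROAD.get(wiki_label, "Unknown")
    return BROAD_TO_JOURNALIST_GROUP.get(broad, "Other Unknown")
```

For example, `journalist_group("Japanese")` resolves through East Asian to the Asian group, while any unrecognized label lands in Other Unknown.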
Race in U.S. Census Data.
Similarly, we coded the race of authors and journalists using the 2010 U.S. Census data based on the last name (Ambekar et al., 2009; Sood and Laohaprapanon, 2018). The four race categories, (1) Asian (api; note that api denotes Asian and Pacific Islander), (2) Black (black), (3) Hispanic (hispanic), and (4) White (white), are directly used to fit the specification of Model 5, with White and Male used as the reference categories.
Fig. S4 shows the average marginal effects in mention rates for scholars of minority ethnicity (or race) compared to British-origin (or White) named authors. As neither tool infers gender, we report the result for gender here using Ethnea’s labels. As with Ethnea, we find strong anti-Asian biases in author mentions in science news, highlighting the robustness of our findings in the main text.
Figure S4: The average marginal effects in mention probability for first authors’ demographic variables, using (Left) Wikipedia data for coding ethnicity or (Right) U.S. Census data for coding race based on author (or journalist) names. Note that the gender is still inferred using Ethnea.
Variable | Model 1 | Model 2 | Model 3 | Model 4 | Model 5
FIRST AUTHOR DEMOG.
African | −…*** | −…*** | −…* | −… | −…*
non-Chinese East Asian | −…*** | −…*** | −…*** | −…*** | −…***
Chinese | …*** | −…*** | −…*** | −…*** | −…***
Eastern European | −…* | −…** | −… | −… | −…*
Indian | −…*** | −…*** | −…*** | −…*** | −…***
Middle Eastern | −…*** | −…*** | −…*** | −…*** | −…***
Scandinavian & Germanic | −…*** | −…*** | −…*** | −…** | −…***
Romance Language | −…*** | −…*** | −…*** | −…*** | −…***
Unknown Ethnicity | −…*** | −…*** | −… | −… | −…
Female | −…*** | −…*** | 0.051 | …* | …
Unknown Gender | −…*** | −…*** | … | −… | −…
Author rank | | −…*** | −…*** | −…*** | −…***
Affiliation rank | | −…*** | −…* | −…*** | −…***
Affiliation international (location) | | −…*** | −…*** | −…*** | −…***
Affiliation unknown (location) | | 0.179 | 0.192 | …* | …
Last name length | | −…*** | −…*** | −…*** | −…***
Last name frequency | | 0.002 | 0.001 | 0.001 | …
Is the paper solo authored? | | …*** | …*** | …*** | …***
LAST AUTHOR DEMOG.
African | | 0.024 | 0.016 | 0.051 | …
non-Chinese East Asian | | …** | …** | …*** | …
Chinese | | −…** | −…*** | −… | −…**
Eastern European | | −…*** | −…*** | −…*** | −…***
Indian | | −…** | −…*** | −…* | −…
Middle Eastern | | −…*** | −…*** | −…*** | −…***
Scandinavian & Germanic | | …*** | 0.020 | …* | …
Romance Language | | …* | 0.011 | … | −…
Unknown Ethnicity | | −…*** | −…*** | −…*** | −…***
Female | | …* | …*** | …*** | …**
Unknown Gender | | …*** | …*** | …* | −…
Is last author mentioned? | | …*** | …*** | …*** | …***
Number of authors in the paper | | −…*** | −…*** | −…*** | −…***
JRN. DEMOG.
Asian | | | −… | −…*** | −…
European | | | …** | …*** | −…
Other Unknown Ethnicity | | | …*** | …*** | −…
Female | | | −…*** | −…*** | −…
Unknown Gender | | | 0.018 | … | −…
AUT.-JRN.
Scandinavian & Germanic: Asian | | | 0.274 | …* | …*
Chinese: European | | | 0.123 | 0.115 | …*
Romance Language: European | | | 0.078 | 0.077 | …*
Chinese: Other Unknown | | | …** | …*** | …***
Scandinavian & Germanic: Other Unknown | | | 0.053 | 0.044 | …**
Year of news story (mention year) | | | …* | −… | −…***
Year gap between story and paper | | | −…*** | −…*** | −…***
Num. of papers mentioned in a story | | | −…*** | −…*** | −…***
News story length | | | −…*** | −…*** | …***
Flesch-Kincaid score | | | −…*** | −…*** | −…***
Sentences per paragraph | | | 0.001 | 0.003 | …**
Type-Token Ratio | | | −…*** | 0.005 | …*
Intercept | −…*** | …*** | …*** | …*** | …***
Fixed effects for paper keywords | No | No | No | Yes | Yes
Random effects for outlets and venues | No | No | No | No | Yes
Random effects for top 100 last authors | No | No | No | No | Yes
Akaike Information Criteria (AIC) | 374,752.6 | 334,648.9 | 315,167.3 | 307,805.3 | 230,167.5
[In the table above, … marks coefficient digits that are missing from the source text; signs and significance stars are reproduced as they appear there.]
Table S5: Coefficients of five regression models of increasing complexity in predicting whether the first author is mentioned, using 285,708 observations. For author-journalist interactions (AUT.-JRN.), only significant terms are shown. All variables in Model 5, including the 533 keywords, are provided in Appendix Table S11. *** p < …, ** p < …, * p < ….
[Table S6: coefficients for gender and ethnicity in the models fitted on the U.S.-based and non-U.S. subsets. Columns: Gender/Ethnicity, U.S.-based, non-U.S., p-value. Apart from the Female row label, the row labels and coefficient values are missing from the source text.]
First Author Name | Ethnea | U.S. Census | Wikipedia
Alana Lelo | African | White | Romance Language
Samuel Lawn | African | White | British-origin
Saka S Ajibola | African | Black | East Asian
Mosi Adesina Ifatunji | African | Black | African
Sebastian Giwa | African | White | African
Olabisi Oduwole | African | White | African
Chidi N. Obasi | African | White | African
Habauka M. Kwaambwa | African | Asian | African
Esther E Omaiye | African | White | African
Aurel T. Tankeu | African | White | British-origin
Table S7: A random sample of 10 African authors predicted by Ethnea (out of 613 in total in our data) and their ethnicity or race categories based on the U.S. census or the Wikipedia data.
First Author Name | U.S. Census | Ethnea | Wikipedia
E. Robinson | Black | British-origin | British-origin
Momar Ndao | Black | Romance Language | African
Angela F Harris | Black | British-origin | British-origin
Daddy Mata-Mbemba | Black | Romance Language | African
A Bolu Ajiboye | Black | African | African
Lasana T. Harris | Black | British-origin | British-origin
John M. Harris | Black | British-origin | British-origin
Edwin S Robinson | Black | British-origin | British-origin
Eric A. Coleman | Black | British-origin | British-origin
Mp Coleman | Black | British-origin | British-origin
Table S8: A random sample of 10 Black authors predicted based on the U.S. census data (out of 560 in total in our data) and their ethnicity categories based on Ethnea or the Wikipedia data.
Tables
Table S9: A random sample of 10 names for each of the 24 individual ethnicities and the “Unknown” category. All 6 MONGOLIAN names in our data are shown here.
Ethnicity | Name Example | Gender
AFRICAN | Dora Wynchank | F
 | Benjamin D. Charlton | M
 | J. Nwando Olayiwola | unknown
 | Ayodeji Olayemi | M
 | Elizabeth Gathoni Kibaru | F
 | Christopher Changwe Nshimbi | M
 | Naganna Chetty | unknown
 | Benjamin Y. Ofori | M
 | Khadijah Essackjee | F
 | Jeanine L. Marnewick | F
 | Habtamu Fekadu Gemede | M
ARAB | Zaid M. Abdelsattar | M
 | Alireza Dirafzoon | M
 | Ahmad Nasiri | M
 | Saleh Aldasouqi | M
 | Ibrahim A. Arif | M
 | Sameer Ahmed | M
 | A Elgalib | unknown
 | Taha Adnan Jan | M
 | Mohsen Taghizadeh | M
 | Behnam Nabet | M
BALTIC | Skirmantas Kriaucionis | M
 | Airidas Korolkovas | M
 | Egle Cekanaviciute | F
 | Arunas L. Radzvilavicius | M
 | Ieva Tolmane | F
 | Alberts B | M
 | Gediminas Gaigalas | M
 | Armandas Balcytis | unknown
 | Ruta Ganceviciene | F
 | Andrius Paukonis | M
CHINESE | Chin Hong Tan | unknown
 | Li Yuan | unknown
 | Yalin Li | unknown
 | …ian Adiconis | unknown
 | Philip Sung-En Wang | M
 | Xiaohui Ni | unknown
 | Minghua Li | unknown
 | Fang Fang Zhang | F
 | Li-Qiang Qin | M
 | Jian Tan | unknown
DUTCH | Pieter A. Cohen | M
 | I. Vandersmissen | unknown
 | Marleen Temmerman | F
 | Gerard ’t Hooft | M
 | A. Yool | unknown
 | G. A W Rook | unknown
 | Fatima Foflonker | F
 | Mirjam Lukasse | F
 | Sander Kooijman | M
 | Izaak D. Neveln | M
ENGLISH | Isabel Hilton | F
 | Gavin J. D. Smith | M
 | Katherine A. Morse | F
 | Andrew S. Bowman | M
 | T. M. L. Wigley | unknown
 | Francis Markham | M
 | Neil T. Roach | M
 | Brooke Catherine Aldrich | F
 | Vaughn I. Rickert | M
 | Kellie Morrissey | F
FRENCH | Lucas V. Joel | M
 | Daniel Clery | M
 | Pierre Jacquemot | M
 | Scott Le Vine | M
 | Nathalie Dereuddre-Bosquet | F
 | Stphane Colliac | unknown
 | Adelaide Haas | F
 | Julie M. D. Paye | F
 | Justine Lebeau | F
 | Arnaud Chiolero | M
GERMAN | Laure Schnabel | F
 | Jeff M. Kretschmar | M
 | E. Homeyer | unknown
 | Maren N. Vitousek | F
 | …. Wild | unknown
 | Hany K. M. Dweck | M
 | E. M. Fischer | unknown
 | Paul Marek | M
 | Hans-Jrg Rheinberger | M
 | Daniel James Cziczo | M
GREEK | Mary J. Scourboutakos | F
 | Anita P Courcoulas | F
 | Elgidius B. Ichumbaki | unknown
 | Stavros G. Drakos | M
 | Nikolaos Konstantinides | M
 | Constantine Sedikides | M
 | Maria A. Spyrou | F
 | Panos Athanasopoulos | M
 | Aristeidis Theotokis | M
 | Amy H. Mezulis | F
HISPANIC | Mirela Donato Gianeti | F
 | Julio Cesar de Souza | M
 | Paulina Gomez-Rubio | F
 | Jos A. Pons | M
 | Arnau Domenech | M
 | Nicole Martinez-Martin | F
 | Mauricio Arcos-Burgos | M
 | Raquel Muoz-Miralles | F
 | Annmarie Cano | F
 | Merika Treants Koday | F
HUNGARIAN | Andrea Tabi | F
 | Rbert Erdlyi | M
 | Gabor G. Kovacs | M
 | Xenia Gonda | F
 | Erzsbet Bukodi | unknown
 | Julianna M. Nemeth | F
 | Ian K. Toth | M
 | Zoltan Arany | M
 | Cory A. Toth | M
 | Ashley N. Bucsek | unknown
INDIAN | Sachin M. Shinde | M
 | Govindsamy Vediyappan | M
 | Ashish K. Jha | M
 | Tamir Chandra | M
 | Hariharan K. Iyer | M
 | …hanpreet Singh | unknown
 | Ravi Chinta | M
 | Madhukar Pai | M
 | Lalitha Nayak | F
 | Ravi Dhingra | M
INDONESIAN | Dewi Candraningrum | unknown
 | Richard Tjahjono | M
 | T. A. Hartanto | unknown
 | Johny Setiawan | M
 | Truly Santika | unknown
 | Chairul A. Nidom | unknown
 | Christine Tedijanto | F
 | Alberto Purwada | M
 | Ardian S. Wibowo | M
 | Anna I Corwin | F
ISRAELI | Ron Lifshitz | M
 | Martin H. Teicher | M
 | Ruth H Zadik | F
 | Gil Yosipovitch | M
 | Mor N. Lurie-Weinberger | unknown
 | J. Tarchitzky | unknown
 | Ilana N. Ackerman | F
 | B. Trakhtenbrot | unknown
 | Yoram Barak | M
 | Mendel Friedman | M
ITALIAN | Tiziana Moriconi | F
 | Marco Gobbi | M
 | Marco De Cecco | M
 | F. Govoni | unknown
 | Theodore L. Caputi | M
 | Mark A Bellis | M
 | Fernando Migliaccio | M
 | Julien Granata | M
 | Jennifer M. Poti | F
 | Brendan Curti | M
JAPANESE | Takuji Yoshimura | M
 | Maki Inoue-Choi | F
 | Masaaki Sadakiyo | M
 | Moeko Noguchi-Shinohara | F
 | Naoto Muraoka | M
 | Shigeki Kawai | M
 | …oji Mikami | M
 | Masayoshi Tokita | M
 | Naohiko Kuno | M
 | Saba W. Masho | F
KOREAN | Jih-Un Kim | M
 | Hanseon Cho | unknown
 | Hyung-Soo Kim | M
 | Yun-Hee Youm | F
 | Yoon-Mi Lee | unknown
 | Soo Bin Park | F
 | Yungi Kim | unknown
 | Woo Jae Myung | unknown
 | Kunwoo Lee | unknown
 | Sandra Soo-Jin Lee | F
MONGOLIAN | C. Jamsranjav | unknown
 | Jigjidsurengiin Batbaatar | unknown
 | Khishigjav Tsogtbaatar | unknown
 | Migeddorj Batchimeg | unknown
 | Tsolmon Baatarzorig | unknown
NORDIC | Steven G. Rogelberg | M
 | Kirsten K. Hanson | F
 | Jan L. Lyche | M
 | Morten Hesse | M
 | Karolina A. Aberg | F
 | Britt Reuter Morthorst | F
 | Kirsten F. Thompson | F
 | Shelly J. Lundberg | F
 | G Marckmann | unknown
 | David Hgg | M
ROMANIAN | Afrodita Marcu | F
 | Iulia T. Simion | F
 | Liviu Giosan | M
 | Alina Sorescu | F
 | Liviu Giosan | M
 | Mircea Ivan | M
 | Dana Dabelea | F
 | Constantin Rezlescu | M
 | Christine A. Conelea | F
 | R. A. Popescu | unknown
SLAV | Nomi Koczka | F
 | Mikhail G Kolonin | M
 | Richard Karban | M
 | Branislav Dragovi | M
 | H Illnerov | unknown
 | Marte Bjrk | F
 | Jacek Niesterowicz | M
 | Justin R. Grubich | M
 | Mikhail Salama Hend | M
 | Snejana Grozeva | F
THAI | Piyamas Kanokwongnuwut | unknown
 | Clifton Makate | M
 | Noppol Kobmoo | unknown
 | Kabkaew L. Sukontason | unknown
 | Aroonsiri Sangarlangkarn | unknown
 | Yossawan Boriboonthana | unknown
 | Ekalak Sitthipornvorakul | unknown
 | Tony Rianprakaisang | M
 | Apiradee Honglawan | F
 | Wonngarm Kittanamongkolchai | unknown
TURKISH | Iris Z. Uras | F
 | Metin Gurcan | unknown
 | Mustafa Sahmaran | M
 | Pinar Akman | F
 | Joshua Aslan | M
 | Selin Kesebir | F
 | Tan Yigitcanlar | unknown
 | Thembela Kepe | unknown
 | Ulrich Rosar | M
 | Selvi C. Ersoy | F
VIETNAMESE | Huong T. T. Ha | unknown
 | Vu Van Dung | M
 | H ChuongKim | unknown
 | Daniel W. Giang | M
 | Nhung Thi Nguyen | unknown
 | V. Phan | unknown
 | Oanh Kieu Nguyen | F
 | Phuc T. Ha | M
 | Bich Tran | unknown
 | Oanh Kieu Nguyen | F
Unknown | Gene Y. Fridman | M
 | Judith Glck | F
 | Noor Edi Widya Sukoco | unknown
 | …harlene Laino | F
 | Benot Brard | unknown
 | David Znd | M
 | Katarzyna Adamala | F
 | K.A. Godfrin | unknown
 | Shadd Maruna | M
 | Mariette DiChristina | F
Table S10: The 288 U.S.-based outlets are grouped into 3 categories based on their topics of reports. Note that the other 135 U.S.-based outlets, which are not shown in this table, are excluded from our analyses due to technical limitations in accessing sufficient volumes of their content (e.g., view-limited paywalls or anti-crawling mechanisms).
Outlet | Type
OnMedica | Sci. & Tech.
Huffington Post | General News
KiiiTV 3 | General News
Carbon Brief | Sci. & Tech.
PR Newswire | Press Releases
Nutra Ingredients USA | Sci. & Tech.
The Bellingham Herald | General News
CNN News | General News
Health Medicinet | Press Releases
Herald Sun | General News
EurekAlert! | Press Releases
AJMC | Press Releases
The University Herald | General News
Lincoln Journal Star | General News
Cardiovascular Business | Sci. & Tech.
MinnPost | General News
CNET | Sci. & Tech.
Infection Control Today | Sci. & Tech.
Science 2.0 | Sci. & Tech.
Lexington Herald Leader | General News
Statesman.com | General News
Nanowerk | Press Releases
The San Diego Union-Tribune | General News
The Daily Beast | General News
Lab Manager | Press Releases
SDPB Radio | General News
New Hampshire Public Radio | General News
Health Day | Press Releases
Rocket News | General News
KPBS | General News
Technology.org | Press Releases
UPI.com | General News
WUWM | General News
Central Coast Public Radio | General News
The Hill | General News
The Epoch Times | General News
Biospace | Sci. & Tech.
Minyanville: Finance | General News
Nature World News | Sci. & Tech.
New York Post | General News
Action News Now | General News
WUNC | General News
Futurity | Press Releases
Reason | General News
azfamily.com | General News
Idaho Statements | General News
Google News | General News
Tri States Public Radio | General News
American Physical Society - Physics | Press Releases
KTEP El Paso | General News
LiveScience | Sci. & Tech.
KUNC | General News
The Daily Meal | Sci. & Tech.
AOL | General News
Women’s Health | Sci. & Tech.
Prevention | Sci. & Tech.
ECN | Sci. & Tech.
Iowa Public Radio | General News
Becker’s Hospital Review | Sci. & Tech.
7th Space Family Portal | Press Releases
Springfield News Sun | General News
Environmental News Network | Press Releases
Sky Nightly | Sci. & Tech.
Quartz | Sci. & Tech.
Benzinga | General News
Headlines & Global News | General News
The Denver Post | General News
Science Daily | Press Releases
The Advocate | General News
ABC News | General News
Newswise | Press Releases
hellogiggles.com | General News
WLRN | General News
EarthSky | Sci. & Tech.
Becker’s Spine Review | Sci. & Tech.
MIT News | Press Releases
MarketWatch | General News
Arstechnica | Sci. & Tech.
Journalist’s Resource | Sci. & Tech.
Northern Public Radio | General News
Everyday Health | Sci. & Tech.
Star Tribune | General News
TCTMD | Sci. & Tech.
The Verge | General News
She Knows | General News
SeedQuest | Sci. & Tech.
Tech Times | Sci. & Tech.
Witchita’s Public Radio | General News
Oncology Nurse Advisor | Sci. & Tech.
Delmarva Public Radio | General News
Medical Daily | Sci. & Tech.
Homeland Security News Wire | General News
Discover Magazine | Sci. & Tech.
Washington Post | General News
MSN | General News
Hawaii News Now | General News
The Daily Caller | General News
News Tribune | General News
The Fresno Bee | General News
King 5 | General News
Star-Telegram | General News
CNBC | General News
Salon | General News
WJCT | General News
WVPE | General News
KTEN | General News
Wired.com | General News
Daily Kos | General News
USA Today | General News
Men’s Health | Sci. & Tech.
Boise State Public Radio | General News
Voice of America | General News
PR Web | Press Releases
Georgia Public Radio | General News
FiveThirtyEight | General News
Public Radio International | General News
Harvard Business Review | General News
Inverse | General News
Doctors Lounge | Sci. & Tech.
North East Public Radio | General News
The Charlotte Observer | General News
National Geographic | Sci. & Tech.
Pharmacy Times | Sci. & Tech.
Popular Science | Sci. & Tech.
ABC Action News WFTS Tampa Bay | General News
News Channel | General News
The University of New Orleans Public Radio | General News
Mic | General News
Health Canal | Sci. & Tech.
KOSU | General News
Raleigh News and Observer | General News
The Atlantic | General News
newsmax.com | General News
Yahoo! Finance USA | General News
Government Executive | General News
International Business Times | General News
Emaxhealth.com | Press Releases
Newsweek | General News
FOX News | General News
The New York Observer | General News
Sign of the Times | General News
The Inquisitr | General News
ABC News 15 Arizona | General News
Parent Herald | General News
The ASCO Post | Sci. & Tech.
Clinical Advisor | Sci. & Tech.
Slate Magazine | General News
NPR | General News
Health | Sci. & Tech.
Dayton Daily News | General News
Guardian Liberty Voice | General News
Belleville News-Democrat | General News
Yahoo! News | General News
WCBE | General News
Buzzfeed | General News
Sci-News | Sci. & Tech.
The Seattle Times | General News
Philly.com | General News
Renal & Urology News | Sci. & Tech.
Arizona Public Radio | General News
Interlochen Public Radio | General News
12 News KBMT | General News
New York Magazine | General News
Medium US | General News
KPCC : Southern California Public Radio | General News
2 Minute Medicine | Sci. & Tech.
Pediatric News | Sci. & Tech.
redOrbit | Sci. & Tech.
Insurance News Net | General News
Drug Discovery and Development | Sci. & Tech.
USNews.com | General News
Yahoo! | General News
The Body | Sci. & Tech.
GEN | Sci. & Tech.
Pacific Standard | General News
Northwest Indiana Times | General News
Psychology Today | Sci. & Tech.
Oregon Public Broadcasting | General News
Mother Nature Network | Sci. & Tech.
Pressfrom | General News
Physician’s Weekly | Sci. & Tech.
Pettinga: Stock Market | General News
Winona Daily News | General News
Runner’s World | Sci. & Tech.
Bio-Medicine.org | Press Releases
Alternet | General News
Mother Jones | General News
The Wichita Eagle | General News
Cornell Chronicle | Press Releases
Politico Magazine | General News
Equities.com | General News
WBUR | General News
ABC 7 WKBW Buffalo | General News
Billings Gazette | General News
My Science | Sci. & Tech.
The Week | General News
BioTech Gate | Sci. & Tech.
Kansas City Star | General News
The Deseret News | General News
PBS | General News
Space.com | Sci. & Tech.
Astrobiology Magazine | Sci. & Tech.
Outside | General News
Value Walk | General News
WYPR | General News
Bustle | General News
Science World Report | Sci. & Tech.
Inside Science | Sci. & Tech.
Science Alert | Sci. & Tech.
Breitbart News Network | General News
St. Louis Post-Dispatch | General News
HowStuffWorks | General News
Wyoming Public Radio | General News
UBM Medica | Sci. & Tech.
Fight Aging! | Sci. & Tech.
MIT Technology Review | Sci. & Tech.
WVXU | General News
The Ecologist | Sci. & Tech.
Alaska Despatch News | General News
Health Imaging | Sci. & Tech.
Kansas City University Radio | General News
Christian Science Monitor | General News
Medicinenet | Sci. & Tech.
WTOP | General News
Business Insider | General News
Real Clear Science | Sci. & Tech.
Counsel & Heal | Sci. & Tech.
The Raw Story | General News
Medcity News | Sci. & Tech.
Drugs.com | Sci. & Tech.
Relief Web | Press Releases
SPIE Newsroom | Sci. & Tech.
New York Daily News | General News
Newser | General News
The Sacramento Bee | General News
Vice | General News
R&D | Sci. & Tech.
KCENG12 | Sci. & Tech.
Inc. | General News
Science/AAAS | Sci. & Tech.
The Atlanta Journal Constitution | General News
Brookings | General News
Common Dreams | General News
Physician’s Briefing | Press Releases
KERA News | General News
Space Daily | Sci. & Tech.
Tech Xplore | Sci. & Tech.
US News Health | Sci. & Tech.
KUOW | General News
WRKF | General News
TIME Magazine | General News
Smithsonian Magazine | Sci. & Tech.
Herald Tribune | General News
Lifehacker | General News
Fast Company | General News
Kansas Public Radio | General News
Omaha Public Radio | General News
New York Times | General News
Technology Networks | Sci. & Tech.
Elite Daily | General News
Centre for Disease Research and Policy | Sci. & Tech.
Business Wire | General News
KUNM | General News
CBS News | General News
Scientific American | Sci. & Tech.
NBC News | General News
Sun Herald | General News
KRWG TV/FM | General News
TODAY | General News
Radio Acadie | General News
The Columbian | General News
Houston Chronicle | General News
WABE | General News
The Modesto Bee | General News
American Council on Science and Health | Sci. & Tech.
WKAR | General News
Psych Central | Sci. & Tech.
WebMD News | Sci. & Tech.
Green Car Congress | Sci. & Tech.
ABC News WMUR 9 | General News
Healthline | Sci. & Tech.
Mongabay | Sci. & Tech.
Vox.com | General News
WPTV 5 West Palm Beach | General News
Popular Mechanics | Sci. & Tech.
PM 360 | Sci. & Tech.
SFGate | General News
Seed Daily | Sci. & Tech.
Table S11: The coefficients of all independent variables (including 533 keywords) in Model 5 in predicting whether the first author is mentioned by name in a news story referencing their research papers. Random effects for 100 top last authors, 288 outlets, and 8,268 publication venues are also included in the model. Note that “FA” denotes the first author and “J” denotes the journalist.
Dependent variable: First author mentioned
[Table S11 body: most rows are missing from the source text. Surviving entries are listed below; … marks missing values.]
FA Indian:J OtherUnknown | 0.093 (…)
Social relation | 0.469 (0.113, 0.824) p = 0.010
Allele | 0.071 (…)
Focus group | 0.716 (0.140, 1.291) p = 0.015
Regimen | 0.461 (…)
Rows with labels only: FA African, Molecular biology, Body mass index, Chromatin, Classical mechanics, Immunology, Heart disease, Weight gain, Antioxidant, Gestational age.
Log likelihood | −114,476.700
Akaike Inf. Crit. | 230,167.500
Bayesian Inf. Crit. | 236,579.100
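The fit statistics at the end of Table S11 can be cross-checked against each other. The sketch below assumes that the value −114,476.700 is the model log-likelihood and that the number of observations is the 285,708 reported for the regression tables; the variable names are illustrative. It uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L.

```python
import math

log_likelihood = -114476.7   # assumed to be the log-likelihood of Model 5
aic = 230167.5               # Akaike Inf. Crit. as reported
bic = 236579.1               # Bayesian Inf. Crit. as reported
n_obs = 285708               # observations reported for the regression models

# AIC = 2k - 2 ln L, so the implied number of estimated parameters is:
k = (aic + 2 * log_likelihood) / 2

# BIC = k ln n - 2 ln L, recomputed from the rounded parameter count:
bic_implied = round(k) * math.log(n_obs) - 2 * log_likelihood
```

The implied parameter count comes out at roughly 607, and the recomputed BIC agrees with the reported value to within rounding, which supports reading the three numbers as coming from a single model fitted on 285,708 observations.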