Heavy-tailed distribution of the number of publications within scientific journals
Robin Delabays a,b and Melvyn Tyloo a
a School of Engineering, University of Applied Sciences of Western Switzerland (HES-SO), CH-1951 Sion, Switzerland; b Center for Control, Dynamical Systems, and Computation, University of California at Santa Barbara (UCSB), Santa Barbara, 93106-9560 California, USA
(Dated: November 11, 2020)

The community of scientists is characterized by their need to publish in peer-reviewed journals, in an attempt to avoid the "perish" side of the famous maxim. Accordingly, almost all researchers have authored some scientific articles. Scholarly publications offer at least two benefits for the study of the scientific community as a social group. First, they attest to some form of relation between scientists (collaborations, mentoring, heritage, ...), useful to determine and analyze social subgroups. Second, most of them are recorded in large data bases, easily accessible and including a lot of pertinent information, easing the quantitative and qualitative study of the scientific community. Understanding the underlying dynamics driving the creation of knowledge in general, and of scientific publication in particular, in addition to its interest from the social science point of view, can contribute to maintaining a high level of research, by identifying good and bad practices in science. In this manuscript, we attempt to advance this understanding by a statistical analysis of publications within peer-reviewed journals. Namely, we show that the distribution of the number of articles published by an author in a given journal is heavy-tailed, but has a lighter tail than a power law. Moreover, we observe some anomalies in the data that pinpoint underlying dynamics of the scholarly publication process.
INTRODUCTION
One of the core mechanisms in the practice of science is the self-examination of a field of research. The validation of a scientific result is always collective, in the sense that it has been scrutinized, criticized, and (hopefully) validated by a sufficient number of peers. Furthermore, any scientific result is always subject to new evaluation and might eventually be replaced by a more accurate work. At the level of a community, scientists are thus used to criticizing the work of colleagues and to having their work criticized by them. It is then not surprising that some scientists started to study (and thus somehow criticize) the scientific community itself [1]. The study of the scientific community, sometimes referred to as Science of Science [2, 3], is a key step to unravel the underlying behaviors of its members and draw some lessons about it. In the last decades, such an investigation has been significantly eased by the emergence of large data bases of scientific publications (Web of Science, PubMed, arXiv, ...). It allowed, for instance, the construction of the time-evolving collaboration network of scientists [4].

Such an approach and the associated tools have the potential to help maintain the quality of research, and thus a good use of public funding. Indeed, in the current context of an increasing number of scientific publications [5, 6], in parallel to the ubiquitous presence of predatory journals [7, 8], distinguishing bad practices from honest work in scientific publishing becomes more and more challenging. Understanding the underlying dynamics of scientific publication will be instrumental in this task.

A scientist's work is commonly evaluated by two different, but related, quantities, namely their number of publications and the number of citations thereof. These quantities are summarized in the criticized, but widespread, h-index [9, 10]. Naturally, a vast majority of investigations about the scientific publication process focus on the citation side. They mostly aim at describing how the citation network impacts the number of citations a given paper is (and therefore its authors are) likely to receive. In particular, evidence suggests that citations follow a rich-get-richer or preferential attachment process, where the more citations a scientist has, the more likely they are to get new citations [11], leading to a power law distribution of citations [12, 13] or other heavy-tailed distributions [14]. Indeed, preferential attachment has been proven to lead to heavy-tailed distributions [15], with some refinements to account for the lifetime of a publication [16].

Compared to the number of citations that an article or a scientist gets, the number of articles published by a scientist has been much less investigated, even though publishing papers is a sine qua non to get cited. In this manuscript, we focus on the distribution of articles published by a scientist within a given peer-reviewed journal. As interestingly pointed out by Sekara et al. [17], publishing in a peer-reviewed journal (especially a high-impact one) is more likely if one author of the manuscript has already published in the same journal. Such a process can be viewed as some sort of preferential attachment, and an expected outcome of such an observation is a high representation of a few authors in a given journal [15]. Furthermore, a scientist whose field of research is well-aligned with a journal's topic is likely to publish a large proportion of their work in this journal, leading again to a high representation of a few specialized authors in a given journal.

We support these expectations, showing that in a selection of fourteen journals (listed in Table I) the distribution of the number of articles published by an author within a journal has a heavy tail. It appears, however, that this distribution has a lighter tail than a power law. We argue that this distribution can be explained by a preferential attachment process, which is backed up by evidence. On top of that, in some of the selected journals, we observe some interesting anomalies for which we give an explanation.
RESULTS
For each journal in Table I, we consider the list of all authors who published in it and the number of articles published by each of them up to 2017. From this, one can plot the empirical distribution of the number of articles published by an author in a given journal (Fig. 1). To these data, we fit three heavy-tailed distributions, namely a power law (Eq. 2), a power law with cutoff (Eq. 3), and a Yule-Simon distribution (Eq. 4), using a Maximum Likelihood Estimator (see Methods). We then assess the goodness-of-fit following [18], which is encoded in a p-value (see Methods). The results of each fit and goodness-of-fit test are presented in Table II, and the resulting distributions together with the data are shown in Figs. 1, 3, and 5. Clearly, the power law distribution is a poor fit for all data, its p-value being zero for all journals. This can be seen in Figs. 1, 3, and 5, where for most of the journals, the tail of the data set is lighter than the tail of its power law fit (black dashed lines). For three journals (namely SCI, PLC, and CHA), the p-value of the power law with cutoff is larger than 5% and it seems to be a rather good fit, and for two others (NEM and SIA), the Yule-Simon distribution cannot be excluded.
General explanation
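The preferential attachment mechanism proposed as a general explanation can be illustrated with a toy simulation. The sketch below is not the analysis performed in this manuscript: it assumes single-author articles, a hypothetical probability `p_new` that an article is signed by a newcomer, and otherwise picks the signing author with probability proportional to their number of articles already published in the journal.

```python
import random
from collections import Counter


def simulate_journal(n_articles, p_new=0.3, seed=0):
    """Toy preferential attachment: return {author_id: number of articles}.

    Assumptions (hypothetical, for illustration only): every article has a
    single author, and a newcomer signs each article with probability p_new.
    """
    rng = random.Random(seed)
    signatures = [0]  # one entry per published article: the id of its author
    n_authors = 1
    for _ in range(n_articles - 1):
        if rng.random() < p_new:
            author = n_authors  # a new author enters the journal
            n_authors += 1
        else:
            # Drawing a past article uniformly selects an author with
            # probability proportional to their current number of articles.
            author = rng.choice(signatures)
        signatures.append(author)
    return Counter(signatures)


articles_per_author = simulate_journal(5000)
# Number of authors having published exactly n articles.
authors_per_count = Counter(articles_per_author.values())
print(sorted(authors_per_count.items())[:5], max(articles_per_author.values()))
```

Plotting `authors_per_count` on logarithmic scales gives a histogram of the same kind as those in Fig. 1; bounding the number of years during which an author can be drawn would be one way to mimic finite careers.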
We propose the following explanation for this heavy-tailedness. Many social processes are ruled by the so-called preferential attachment [19]. Scientific collaborations [20] and citations [12] are apparently no exception to the rule. Namely, the probability that an author will create a new scientific collaboration at time t is proportional to the number of scientific collaborations they have. It is reasonable to assume that the evolution of the number of articles published by an author in a given journal is described by a similar preferential attachment process. In other words, the probability that a new article published in a given journal is signed by an author would be proportional to the number of articles already published by this author in the very same journal.

Heuristically, our argument is that if an author has published a lot of articles in a journal, it means (i) that they write a lot of papers, and (ii) that their research topic is well-aligned with the topics covered by the journal (for specialized journals), or that the scientific impact of this author's research matches the standards of the journal (for interdisciplinary journals). Assumptions (i) and (ii) together imply that this author is likely to publish again in this journal.

This intuition can be made more rigorous. For three journals (SCI, LAN, and PRL), we refined the data to account for the time evolution of the number of articles published by each author. It turns out that, on average, the number of articles published during a year by an author having already published k articles is close to being proportional to k (details are found in the Methods section). According to [15], if it was exactly proportional to k, the final distribution would be a power law. The fact that the relation is not exactly proportional, but close to it, probably explains the lighter-than-power-law tails observed in Figs. 1, 3, and 5.
Observations
Aside from these general considerations, we note three interesting observations in the data. First, for journals with a large number of authors and published articles, the tail of the histogram drops dramatically. Second, some authors publish more articles than the fitted power law would predict. And third, some very large groups of authors can be identified even in long-term aggregated data.
Decay in long-life journals.
We observe in Figs. 1, 3, and 5 that for old journals where a lot of articles are published, the tail of the histogram has a rather fast decay after a heavy-tailed regime (this is particularly striking in PRL and PRD, Fig. 3). We explain this by the fact that the number of publications of a given author depends on two parameters, namely their publication rate and the length of their career. Both these quantities are bounded in practice, so even if it is possible to publish a very large number of articles in a given journal, there is a practical limit to this number. We hypothesize that the decay in the histograms of long-living journals comes from the finiteness of publication rates and career lengths.
Key players.
The general distribution of the number of papers per author is quite clear in our analysis: it seems to lie somewhere between an exponential distribution and a power law. The power law having the heaviest tail of the four distributions considered (exponential, power law, power law with cutoff, and Yule-Simon), we use it to estimate an upper bound on the number of articles published by an author for each journal, shown as the vertical dashed lines in Figs. 1, 3, and 5 (more details in the Methods section). In some journals (see, e.g., PNA, CHA, SIA, and AMA in Fig. 1, and NEM and ACS in the Appendix, Fig. 5), it appears that some authors, which we refer to as key players, publish significantly more articles in a journal than what the power law would predict.

Note that we checked that these key players are not artifacts due to multiple authors having the same name, which would count as the same person. In all cases presented here, there is a unique person appearing in the
authors' list of a very large number of papers.

In order to make the data more comparable, we restrict our investigation to the early years between 1900 (earliest possible in WoS) and the year in parentheses in the second column of Table I, for the first nine journals in the table. This yields a number of authors comparable to the following three journals in the table (the reduced number of authors is given in parentheses in the third column of Table I). The resulting distributions are depicted in Fig. 2 and in the Appendix, Fig. 6, and the fitted parameters are detailed in Table III. It appears from Figs. 2 and 6 that for such a reduced number of authors, the overshoot of some authors is more systematic, suggesting that in the early years of scientific journals, there are usually a few very prolific authors publishing in them at a rather high rate.

Considering the results of the fitting in Table III, we observe better agreement than for the full data sets. This probably indicates that the sample size is not large enough to accurately fit heavy-tailed distributions, which need large samples. The fact that NAT and PNA are well-fitted by two distributions also indicates that the reduced data sets are not large enough to be conclusive.

Table I.
Label  Journal name (red. year)                      Number of authors (reduced)
NAT    Nature* (1950)                                63'791 (3'374)
PNA    Proc. Natl. Acad. Sci. USA** (1950)           55'849 (2'495)
SCI    Science* (1940)                               48'928 (4'788)
LAN    The Lancet* (1910)                            33'416 (3'015)
NEM    New England Journal of Medicine* (1950)       27'078 (3'842)
PLC    Plant Cell (2000)                             20'649 (4'712)
ACS    J. of the American Chemical Society* (1930)   82'223 (5'301)
TAC    IEEE Trans. on Automatic Control (2000)       8'911 (3'603)
ENE    Energy (2005)                                 28'920 (4'491)
CHA    Chaos                                         7'409
SIA    SIAM Journal on Applied Mathematics           6'106
AMA    Annals of Mathematics                         3'679
PRD    Physical Review D                             64'922
PRL    Physical Review Letters*

Figure 1. Histograms of the number of authors a_J with respect to the number of articles published, for the six journals indicated in the insets. The grey dotted line is an exponential fit of the data, emphasizing that the distribution is heavy-tailed. We also show the best fit (MLE) for a power law distribution (dashed black), power law with cutoff (dash-dotted black), and Yule-Simon distribution (dotted black). The vertical dashed line indicates the theoretical maximal number of publications if the distribution was the fitted power law (see Eq. 8). The same plots for the other journals are available in Fig. 3 and in the Appendix, Fig. 5.

Table II. Fitted parameters and p-value [%] of the goodness-of-fit for power law (PL: α), power law with cutoff (PLwC: β, γ), and Yule-Simon (Y-S: ρ) distributions, one row per journal. [Numerical entries not recoverable from the source.] No set of data is well-fitted by a power law distribution. However, the power law with cutoff seems to be a good fit for three journals (SCI, PLC, CHA), and the Yule-Simon distribution seems to correctly fit the distribution of NEM and SIA. For the other journals, none of the distributions seem to fit the data appropriately.

Peaks in PRL and PRD.
In Fig. 3, we observe two peaks in the empirical distributions of PRL (around 66 and 96) and PRD (around 77 and 104). Crossing the lists of authors for each number of articles between 63 and 102 for PRL (resp. 72 and 111 for PRD), we get the right panel of Fig. 3. The fact that the authors composing a peak in PRL are also the ones composing one of the peaks in PRD suggests that these authors are all part of a large group publishing together.

A quick search indicates that the peaks correspond to the research groups of the ATLAS and CMS experiments at CERN. These two experiments are so big and gather so many authors that they can be seen even in the data used in our analysis, aggregated throughout the whole history of PRL (since 1958) and PRD (since 1970).
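The crossing of author lists described for PRL and PRD amounts to a joint histogram over the authors present in both journals. A minimal sketch follows, assuming per-journal dictionaries that map an author identifier to an article count; the function name, the identifiers, and the toy values are hypothetical and are not WoS data.

```python
from collections import Counter


def joint_histogram(counts_x, counts_y, range_x, range_y):
    """Count authors per pair (articles in journal X, articles in journal Y).

    Only authors present in both journals, with counts inside the inclusive
    ranges range_x and range_y, are kept.
    """
    histogram = Counter()
    for author in counts_x.keys() & counts_y.keys():
        n_x, n_y = counts_x[author], counts_y[author]
        if range_x[0] <= n_x <= range_x[1] and range_y[0] <= n_y <= range_y[1]:
            histogram[(n_x, n_y)] += 1
    return histogram


# Hypothetical toy data: author identifier -> number of articles.
prl = {"a": 96, "b": 96, "c": 66, "d": 10, "e": 70}
prd = {"a": 77, "b": 77, "c": 104, "d": 80, "f": 77}

# Ranges taken from the text: 63 to 102 for PRL, 72 to 111 for PRD.
peaks = joint_histogram(prl, prd, (63, 102), (72, 111))
print(peaks.most_common())
```

A concentration of authors on a few (PRL, PRD) pairs, as in the right panel of Fig. 3, is what signals a large group publishing together.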
Table III. Fitted parameters and p-value [%] of the goodness-of-fit for power law (PL: α), power law with cutoff (PLwC: β, γ), and Yule-Simon (Y-S: ρ) distributions, for the reduced data sets, one row per journal. [Numerical entries not recoverable from the source.] We see that the only data that are well-approximated by the power law are for NAT when reduced to the first 3374 entries of WoS. The power law with cutoff, however, seems to be a good fit for the reduced data of six journals (NAT, PNA, SCI, LAN, TAC, and ENE). ENE is particularly well-fitted by the power law with cutoff. Finally, the Yule-Simon distribution seems to correctly fit the distribution of PNA, PLC, and ACS. For the other journals, none of the distributions seem to fit the data appropriately. Remark that the reduced data of NAT and PNA are correctly fitted by two distributions, indicating that the amount of data is probably not sufficient for a good fit.
DISCUSSION
Our analysis reveals a series of interesting, even though not surprising, dynamics ruling the process of publication within scientific journals. The main observation is the heavy-tailed shape of the distribution of publications, which we explain by a preferential attachment process. We showed that the preferential attachment dynamics is heuristically meaningful, in the sense that if an author publishes a lot of papers and if their profile aligns with the journal's profile, they are likely to publish in this journal and at the same time they are likely to have already published in the same journal. Moreover, we also backed up the preferential attachment process by data-based evidence, where we show that the proportion of articles published in a journal by the authors with already k articles (in this journal) is approximately proportional to k. An exact proportionality would lead, according to Ref. [15], to a power law distribution. Of course, in the long run, scientists cannot publish an unbounded number of articles, due to the finiteness of their careers. This translates, in our analysis, into a drop in the tail of the distribution for older journals, which then do not follow a power law. Apparently, a power law with cutoff or a Yule-Simon distribution is better suited to describe the data.

On top of this general dynamics, our analysis displays some interesting anomalies that point towards specific underlying dynamics. First, the data show that in the early decades of existence of a journal, a small number of authors are extremely influential. This translates into some authors having many more publications than what a power law distribution would predict, given that the power law already has a heavier tail than our data. Such authors, which we refer to as key players, are likely to be some very influential scientists in the topic(s) covered by the journal.

Second, we realized that some huge scientific projects can impact the distribution of publications even on large-scale aggregated data. In our samples, this is seen for the journals Physical Review Letters (PRL) and Physical Review D (PRD), which publish the outcomes of the large experiments ATLAS and CMS at CERN, gathering thousands of scientists. Our approach was then able to pinpoint further dynamics taking place in present-day science.

As seen in Table II, the fitting of the data by a power law with cutoff or a Yule-Simon distribution is not perfect. More advanced fitting techniques might be able to identify a common distribution for all journals, provided that one exists. From a social science point of view, a more refined explanation of the approximate preferential attachment taking place in scientific publishing could unravel with more certainty the source of the distributions observed in this manuscript. This is left for future research.

Figure 2. Histograms of the number of authors a_J with respect to the number of articles published, for the first three journals of Table I, with data restricted to the years between 1900 (earliest possible in WoS) and the years indicated in the insets. The number of authors covered is given in parentheses in the third column of Table I. We show the best fit for a power law distribution (dashed black), power law with cutoff (dash-dotted black), and Yule-Simon distribution (dotted black). The vertical dashed line indicates the theoretical maximal number of published papers if the distribution was the fitted power law (see Eq. 8). We observe that some authors almost systematically exceed this number of published articles. The same plot for other journals is available in the Appendix, Fig. 6.

Figure 3. Analysis of PRL and PRD. Left and center: Same figures as in Fig. 1 for PRD and PRL respectively. The arrows indicate the increased number of authors corresponding to the ATLAS and CMS experiments at CERN. Right: Two-dimensional, color-coded histogram of the number of authors with respect to the number of articles published in PRL (horizontal axis) and PRD (vertical axis). The peak centered at (96,77) is the CMS experiment and the one at (66,104) is the ATLAS experiment, both at CERN.
MATERIALS AND METHODS
Data sets
We consider an arbitrary selection of 14 peer-reviewed journals (see Table I), whose data are available on the Web of Science data base (WoS). The selected journals vary in age (from a few decades to more than a century) but are not too young, in order to have sufficiently many publications available, and all of them are still publishing nowadays. We denote by $\mathcal{J} := \{\mathrm{NAT}, \mathrm{PNA}, \ldots, \mathrm{PRL}\}$ the set of journals considered (see Table I for the list of labels).

Within each journal $J \in \mathcal{J}$, we index authors by an integer and for each author $i = 1, \ldots, N_J$, we count the number $n_i^J$ of articles published by $i$ in $J$ up to year 2017, which gives the set of data $A_J = \{n_i^J\}$. We restrict our investigation to publications labelled as "Article" in the WoS data base, to focus on peer-reviewed articles and to discard editorial material, for instance. For some journals, the number of authors was too large to be downloaded from the WoS data base. As a consequence, the authors having published only one or two articles in these journals had to be removed from the data (e.g., NAT, PNA, or SCI, indicated by asterisks in Table I). Note also that we do not take into account articles published anonymously, which represent a large number of articles in medicine journals in particular.

From the data set $A_J$ we can compute the proportion of authors who published $n \in \mathbb{N}$ articles,
$$a_J(n) := \frac{|\{i : n_i^J = n\}|}{N_J}. \tag{1}$$
These values are represented in logarithmic scales in Figs. 1, 3, and 5, each panel corresponding to a different journal.
Distribution fitting
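The empirical distribution of Eq. 1 and a maximum-likelihood fit of the power law of Eq. 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for this study: the normalizing constant C is computed over a truncated support n = 1, ..., `n_cap`, the exponent is searched in a hypothetical interval by golden-section search (valid because the log-likelihood is concave in the exponent), and the sample `toy_counts` is invented.

```python
import math
from collections import Counter


def empirical_distribution(counts):
    """Eq. 1: proportion a_J(n) of authors with exactly n articles."""
    tally = Counter(counts)
    return {n: tally[n] / len(counts) for n in sorted(tally)}


def log_likelihood(alpha, counts, n_cap):
    """Log-likelihood of Eq. 2, with C normalizing over n = 1, ..., n_cap."""
    log_norm = math.log(sum(n ** -alpha for n in range(1, n_cap + 1)))
    return -alpha * sum(math.log(n) for n in counts) - len(counts) * log_norm


def fit_power_law(counts, n_cap=10_000, lo=1.01, hi=6.0, tol=1e-6):
    """Golden-section search for the alpha maximizing the log-likelihood."""
    shrink = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        left = hi - shrink * (hi - lo)
        right = lo + shrink * (hi - lo)
        if log_likelihood(left, counts, n_cap) > log_likelihood(right, counts, n_cap):
            hi = right  # the maximum lies in [lo, right]
        else:
            lo = left  # the maximum lies in [left, hi]
    return (lo + hi) / 2


# Hypothetical toy sample: number of articles per author in one journal.
toy_counts = ([1] * 700 + [2] * 150 + [3] * 60 + [4] * 30 + [5] * 20
              + [6] * 12 + [8] * 8 + [10] * 5 + [15] * 3 + [30] * 2)
a_J = empirical_distribution(toy_counts)
alpha_hat = fit_power_law(toy_counts)
print(a_J[1], alpha_hat)
```

The same structure would apply to the other candidate distributions by replacing the log-likelihood; a two-parameter family such as the power law with cutoff needs a two-dimensional optimizer instead of the one-dimensional search used here.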
For each empirical distribution in Figs. 1, 3, and 5, we fit an exponential distribution (grey dotted lines) to emphasize their heavy-tailed behavior. With this observation, it is tempting to fit a power law distribution (black dashed lines),
$$P_{\rm pl}(a_J = n) = C \cdot n^{-\alpha}, \tag{2}$$
with exponent $\alpha$ and $C \in \mathbb{R}$ normalizing the distribution. However, as pointed out in [18], fitting a heavy-tailed distribution is not trivial and should be done carefully, the risk being to derive spurious conclusions [21]. Following the recommendations in Ref. [18], we also try to fit other heavy-tailed distributions, such as the power law with cutoff (black dash-dotted lines),
$$P_{\rm plc}(a_J = n) = C \cdot n^{-\beta} e^{-\gamma n}, \tag{3}$$
with exponent $\beta$, cutoff rate $\gamma > 0$, and normalizing constant $C \in \mathbb{R}$, and the Yule-Simon distribution (black dotted lines),
$$P_{\rm ys}(a_J = n) = C \cdot (\rho - 1)\, \mathrm{B}(n, \rho), \tag{4}$$
with parameter $\rho$, where $C \in \mathbb{R}$ is the normalizing constant and $\mathrm{B}(x, y)$ is the Euler beta function. We perform the distribution fitting by optimizing the parameters $\alpha$, $\beta$, $\gamma$, and $\rho$ with a Maximum Likelihood Estimator [18]. Other distributions (such as log-normal, Lévy, Weibull) were tested and discarded because they were far from matching the data.
Goodness-of-fit
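The goodness-of-fit test of Eqs. 5 to 7 can be sketched as follows for the power law of Eq. 2. This is a simplified illustration, not the code used for this study: the support is truncated at a hypothetical `n_cap` (all counts are assumed not to exceed it), the exponent is fitted by a crude grid search, and the default number of synthetic sets `n_sets` is much smaller than the 5000 used in the manuscript.

```python
import bisect
import math
import random


def power_law_cdf(alpha, n_cap):
    """CDF of Eq. 2 on n = 1, ..., n_cap; entry k - 1 holds P(n <= k)."""
    weights = [n ** -alpha for n in range(1, n_cap + 1)]
    total, running, cdf = sum(weights), 0.0, []
    for w in weights:
        running += w / total
        cdf.append(running)
    return cdf


def fit_alpha(data, n_cap, grid):
    """Crude MLE: the alpha of the grid with the largest log-likelihood."""
    sum_log = sum(math.log(n) for n in data)

    def log_likelihood(alpha):
        norm = sum(n ** -alpha for n in range(1, n_cap + 1))
        return -alpha * sum_log - len(data) * math.log(norm)

    return max(grid, key=log_likelihood)


def ks_distance(data, model_cdf):
    """Eq. 7, with the empirical CDF of Eq. 5 evaluated at k = 1, ..., n_cap."""
    ordered = sorted(data)
    return max(
        abs(bisect.bisect_right(ordered, k) / len(ordered) - model_cdf[k - 1])
        for k in range(1, len(model_cdf) + 1)
    )


def sample(model_cdf, size, rng):
    """Inverse-transform sampling of `size` values from the model CDF."""
    top = len(model_cdf)
    return [min(bisect.bisect_left(model_cdf, rng.random()) + 1, top)
            for _ in range(size)]


def gof_p_value(data, n_cap=200, n_sets=100, seed=0):
    """Eq. 6: share of synthetic sets further from their own fit than the data."""
    rng = random.Random(seed)
    grid = [1.05 + 0.05 * j for j in range(80)]
    fitted_cdf = power_law_cdf(fit_alpha(data, n_cap, grid), n_cap)
    d_data = ks_distance(data, fitted_cdf)
    further = 0
    for _ in range(n_sets):
        synthetic = sample(fitted_cdf, len(data), rng)
        own_cdf = power_law_cdf(fit_alpha(synthetic, n_cap, grid), n_cap)
        if ks_distance(synthetic, own_cdf) > d_data:
            further += 1
    return further / n_sets
```

Each synthetic set is refitted before its distance is measured, so that the synthetic distances are computed in the same way as the distance of the real data to its own best fit.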
To evaluate the goodness of our fitting, we again follow the recommendations of [18]. We generate 5000 sets of synthetic data $\tilde{A}_i$, $i = 1, \ldots, 5000$, each of size $|\tilde{A}_i| = N_J$ and following the distribution whose goodness-of-fit is to be tested. For each of these data sets, we define its associated empirical cumulative distribution function (CDF),
$$S_i(k) := \frac{|\{x \in \tilde{A}_i : x \leq k\}|}{|\tilde{A}_i|}, \tag{5}$$
and denote by $S_J$ the empirical CDF of $A_J$. We denote by $P_i$ the CDF of the best fitted distribution associated to $\tilde{A}_i$ ($P_J$ for $A_J$). The p-value of the goodness-of-fit is then given by
$$p := \frac{|\{i : d_{\rm KS}(S_i, P_i) > d_{\rm KS}(S_J, P_J)\}|}{5000}, \tag{6}$$
where the Kolmogorov-Smirnov distance between two CDFs $Q_1$ and $Q_2$ is defined as the maximum difference between them, i.e.,
$$d_{\rm KS}(Q_1, Q_2) := \max_k |Q_1(k) - Q_2(k)|. \tag{7}$$
Namely, $p$ is the proportion of synthetic data sets that are further from the theoretical distribution (in the Kolmogorov-Smirnov sense) than the data set investigated. The fit is rejected if $p$ is below 5%, the threshold used in the Results section, and considered good otherwise [see [18] for more details].
Maximum number of articles
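The bound of Eq. 8 can be evaluated numerically as in the following sketch. It assumes that C normalizes Eq. 2 over all positive integers, approximated by a partial sum plus an integral estimate of the remaining tail, and it uses a hypothetical exponent of 2.5 together with the numbers of authors listed in Table I for CHA and AMA; it does not reproduce the fitted values of Table II.

```python
import math


def n_max(num_authors, alpha, n_terms=100_000):
    """Eq. 8: n_max = (N_J * C) ** (1 / alpha), with C normalizing Eq. 2."""
    norm = sum(n ** -alpha for n in range(1, n_terms + 1))
    norm += n_terms ** (1 - alpha) / (alpha - 1)  # integral estimate of the tail
    c = 1 / norm
    return (num_authors * c) ** (1 / alpha)


# Hypothetical exponent; author counts of CHA and AMA taken from Table I.
bound_cha = n_max(7409, 2.5)
bound_ama = n_max(3679, 2.5)
print(bound_cha, bound_ama)
```

For a fixed exponent the bound grows with the number of authors, which is why journals with more authors have their vertical dashed line further to the right.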
Based on Eq. 2, one can compute $x_n$, the number of authors with $n$ publications in $J$ if the distribution followed a power law. Setting this number to $x_n = 1$, the maximal number of articles is given by
$$x_n \approx N_J C\, n^{-\alpha} \implies n_{\max} \approx (N_J C)^{1/\alpha}. \tag{8}$$
This determines a theoretical upper bound on the number of articles published by an author for each journal, shown as the vertical dashed lines in Figs. 1, 3, and 5.
Number of articles published every year
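The ratio of articles published during a year to the number of authors having k earlier articles, and its linear correlation with k, can be computed as in the sketch below. The input format (one `(author, year)` pair per authored article), the function names, and the toy records are assumptions made for illustration. Authors who do not publish during the year are left out, as in the analysis of this subsection; restricting to authors with at least one earlier article is an additional assumption of the sketch.

```python
import math
from collections import Counter


def yearly_ratio(records, year):
    """Return {k: m_k(t) / N_k(t)} for t = year.

    records: iterable of (author, year) pairs, one pair per authored article.
    Only authors who publish during `year` and have k >= 1 earlier articles
    are counted.
    """
    records = list(records)
    before = Counter(author for author, y in records if y < year)
    during = Counter(author for author, y in records if y == year)
    n_k, m_k = Counter(), Counter()
    for author, published in during.items():
        k = before[author]
        if k >= 1:
            n_k[k] += 1
            m_k[k] += published
    return {k: m_k[k] / n_k[k] for k in sorted(n_k)}


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy records: (author, publication year).
toy_records = [
    ("a", 2000), ("a", 2001), ("a", 2001),
    ("b", 1999), ("b", 2000), ("b", 2001), ("b", 2001), ("b", 2001), ("b", 2001),
    ("c", 2000), ("c", 2001),
    ("d", 2000),  # inactive in 2001: left out
    ("e", 2001),  # no earlier article: left out
]
ratios = yearly_ratio(toy_records, 2001)
print(ratios)
```

With real data, `pearson(list(ratios), list(ratios.values()))` gives the kind of correlation coefficient reported for SCI, LAN, and PRL.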
For three journals (SCI, LAN, and PRL), we compare the number of authors having published $k$ articles at the beginning of year $t$ with the number of articles published by these authors during year $t$. We define:
• $N_k(t)$: the number of authors who have published $k$ articles on December 31st of year $t - 1$;
• $m_k(t)$: the number of articles published during year $t$ by all the authors with $k$ articles on December 31st of year $t - 1$.
Figure 4 shows $m_k(t)/N_k(t)$ with respect to $k$ for years $t \in \{\ldots\}$ for SCI, LAN, and PRL. Note that, for each year considered, we do not take into account authors who did not publish, because the majority of those are not active anymore (retired or dead). For each of the three journals, these values have a linear correlation coefficient larger than 0.7, supporting a fairly good linear dependence,
$$m_k(t) \sim k \cdot N_k(t). \tag{9}$$
The probability that a new paper is signed by an author with $k$ publications is then close to being proportional to $k$. According to [15], if it was exactly proportional, after a long enough time, the distribution of $N_k$ would follow a power law. The fact that Eq. 9 is not exact and that our samples are limited to a finite time horizon explains why we do not obtain exactly a power law. However, the good correlation between $m_k(t)/N_k(t)$ and $k$ tells us that the distribution should not be too far away from a power law, in agreement with our observation of Table II.
DATA AVAILABILITY
The data are available from WoS. The study used nospecial computer code.
ACKNOWLEDGMENTS
RD and MT were supported by the Swiss NationalScience Foundation under grant number 2000020 182050.RD was supported by the Swiss National Science Foun-dation under grant number P400P2 194359.
APPENDIX
We show here the figures not displayed in the Results section.

Figure 4. Average number of publications within year t for authors with k publications at the beginning of year t, with respect to k, for years t ∈ {...} and for the three journals SCI, LAN, and PRL. The Pearson correlation coefficients r_SCI, r_LAN, and r_PRL are all larger than 0.7.

[1] D. J. de Solla Price, Little Science, Big Science (Columbia University Press, 1963).
[2] A. Clauset, D. B. Larremore, and R. Sinatra, Science , 477 (2017).
[3] S. Fortunato, C. T. Bergstrom, K. Börner, J. A. Evans, D. Helbing, S. Milojević, A. M. Petersen, F. Radicchi, R. Sinatra, B. Uzzi, A. Vespignani, L. Waltman, D. Wang, and A.-L. Barabási, Science , eaao0185 (2018).
[4] M. E. J. Newman, Proc. Natl. Acad. Sci. USA , 404 (2001).
[5] D. J. de Solla Price, Science , 510 (1965).
[6] L. Bornmann and R. Mutz, J. Assoc. Inf. Sci. Tech. , 2215 (2015).
[7] J. Bohannon, Science , 60 (2013).
[8] P. Sorokowski, E. Kulczycki, A. Sorokowska, and K. Pisanski, Nature , 481 (2017).
[9] J. E. Hirsch, Proc. Natl. Acad. Sci. USA , 16569 (2005).
[10] G. Siudem, B. Żogała-Siudem, A. Cena, and M. Gagolewski, Proc. Natl. Acad. Sci. USA , 13896 (2020).
[11] D. de Solla Price, J. Am. Soc. Inf. Sci. , 292 (1976).
[12] Y.-H. Eom and S. Fortunato, PLoS ONE , e24926 (2011).
[13] L. Waltman, N. J. van Eck, and A. F. J. van Raan, J. Am. Soc. Inf. Sci. Tech. , 72 (2012).
[14] M. Thelwall, J. Informetr. , 336 (2016).
[15] P. L. Krapivsky, S. Redner, and F. Leyvraz, Phys. Rev. Lett. , 4629 (2000).
[16] P. Parolo, R. K. Pan, R. Ghosh, B. A. Huberman, K. Kaski, and S. Fortunato, J. Informetr. , 734 (2015).
[17] V. Sekara, P. Deville, S. E. Ahnert, A.-L. Barabási, R. Sinatra, and S. Lehmann, Proc. Natl. Acad. Sci. USA , 12603 (2018).
[18] A. Clauset, C. R. Shalizi, and M. E. J. Newman, SIAM Review , 661 (2009).
[19] H. Jeong, Z. Néda, and A. L. Barabási, Europhys. Lett. , 567 (2003).
[20] A. L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek, Physica A , 590 (2002).
[21] A. D. Broido and A. Clauset, Nature Comm. , 1 (2019).