Detecting citation cartels in journal networks

Abstract

The ever-increasing competitiveness in the academic publishing market incentivizes journal editors to pursue higher impact factors. This translates into journals becoming more selective, and, ultimately, into higher publication standards. However, the fixation on higher impact factors leads some journals to artificially boost impact factors through the coordinated effort of a "citation cartel" of journals. "Citation cartel" behavior has become increasingly common in recent years, with several instances of cartels being reported. Here, we propose an algorithm -- named CIDRE -- to detect anomalous groups of journals that exchange citations at excessively high rates when compared against a null model that accounts for scientific communities and journal size. CIDRE detects more than half of the journals suspended by Thomson Reuters due to cartel-like behavior in the year of suspension or in advance. Furthermore, CIDRE detects a large number of additional anomalous groups, which reveal a variety of mechanisms that may help to detect citation cartels at their onset. We describe a number of such examples in detail and discuss the implications of our findings with regard to the current academic climate.


Sadamori Kojaku (a), Giacomo Livan (b,c), and Naoki Masuda (d,e,*)

(a) Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, 47408, USA
(b) Department of Computer Science, University College London, London, WC1E 6EA, UK
(c) Systemic Risk Centre, London School of Economics and Political Science, London, WC2A 2AE, UK
(d) Department of Mathematics, University at Buffalo, State University of New York, Buffalo, New York, USA
(e) Computational and Data-Enabled Science and Engineering Program, University at Buffalo, State University of New York, Buffalo, New York, 14260-2900, USA
* [email protected]

The volume of published research is growing at exponential rates, creating a pressing need to devise fast and fair methods to evaluate research outputs. Measuring academic impact is a controversial and challenging task. Yet, the evaluation of research has increasingly been operationalized in terms of the citations received by research papers and citation-based bibliometric indicators, such as the h-index and the journal impact factor (JIF), which are widely used to evaluate individual researchers, academic institutions, and the research output of entire nations.

Editors and academic publishers are under increasing pressure to ensure that their journals achieve and sustain high values of JIF and other bibliometric indicators. Such indicators are widely recognized as proxies for a journal's quality and prestige, and have a considerable impact on the journal's readership numbers and subscription base. This fact, in turn, incentivizes editors to devise strategies aimed at increasing citation numbers. Such strategies may ultimately result in publications of a higher quality. However, there have been multiple reports of malicious practices merely aimed at boosting citation numbers.

Editors of some journals have generated citations for their journals by coercing the authors of submitted papers or by writing editorial reviews. Such self-citations are relatively easy to spot because they involve only one journal. Concerns have grown about less detectable forms of manipulation that involve the coordinated effort of a number of journals, a practice known as citation cartels. Such a practice, also referred to as citation stacking, consists of groups of journals exchanging citations at excessively high rates. For example, one instance of a citation cartel attracted attention in 2011. In this example, two papers published in different journals provided a number of citations to a single journal, increasing that journal's JIF by 25%. Since then, new instances of citation cartels have been reported every year.

Journal editors may set up citation cartels by informally agreeing with other journal editors and colleagues to coerce citations. Such citation cartels are easy to launch and hard to detect. Thomson Reuters (TR) has attempted to tackle this issue by deploying an algorithm that flags pairs of journals in which at least one of the two journals cites the other at an excessively high rate, supplemented by some additional criteria. As of 2019, TR has suspended from its annual journal ranking 46 pairs of journals, featuring 55 journals in total, due to excessive pairwise citations. Alternatively, a previous study proposed an algorithm to detect citation cartels as groups of densely interconnected nodes (i.e., communities) in journal citation networks. However, the approach based on network communities may suffer from false positives because communities are the norm rather than the exception in journal citation networks: journals tend to cite other journals in the same research field, which forms densely connected communities. To the best of our knowledge, no empirically validated tool has yet been proposed to identify groups of journals whose citation practices can be regarded, with statistical confidence, as cartel behavior.

We propose an algorithm to detect citation cartel candidates in journal citation networks, which we refer to as the CItation Donors and REcipients (CIDRE) algorithm. CIDRE distinguishes between healthy and malicious citation behavior by means of a null network model. The null model accounts for the citation rates that can be expected under healthy citation practices due to journals' (i) proximity (in terms of research areas) and (ii) size (in terms of citation volumes, both given and received). Then, CIDRE finds groups of journals with excessive within-group citations relative to the null model.

We apply the algorithm to a citation network of 48,821 journals across various disciplines constructed from Microsoft Academic Graph. CIDRE detects more than half of the instances suspended by TR in the year of suspension or earlier. Furthermore, CIDRE identifies a number of additional anomalous journal groups, including 7 groups in 2019 whose journals received more than 30% of their incoming citations from other members of the group. In the absence of a ground-truth validation, such as the one provided by comparisons against the list of journals banned by TR, we shall refrain from outright identifying these groups as citation cartel candidates. However, through extensive examples we will demonstrate that these groups are interpretable and composed of different patterns of anomalous citation behaviors, suggesting a breadth of mechanisms underlying the genesis of possible citation cartels.

Our results reveal a large number of journals that receive a disproportionate amount of their citations from a tiny group of publication venues, which account for a substantial fraction of these journals' impact factors (in excess of 50% in some cases). In our final remarks, we will discuss how these findings should encourage a critical approach to the use of bibliometric indicators. The Python code for CIDRE is available at .

Results

Data

We use a snapshot of Microsoft Academic Graph (MAG) released on January 30th, 2020 to construct citation networks of journals. The data set contains bibliographic information including citations among 231,926,308 papers published in 48,821 journals in various research fields. The bibliographic information includes the journal name, publication year, references, and author names. We construct a directed weighted network of journals for each year $t$ between 2000 and 2019, in which a node represents a journal and an edge indicates citations between journals. We define the weight $W_{ij}$ of the edge from journal $i$ to journal $j$ in year $t$ as the number of citations from papers published in $i$ to papers published in $j$ in the time window used for calculating the JIF, i.e., the last two years $t' \in [t-2, t-1]$. We use the term effective citation to refer to a citation reflected in the calculation of the JIF (i.e., a citation to a paper published in the last two years). Unless stated otherwise, the citations in the following text refer to effective citations.

The data set does not contain some papers retracted due to excessive citations to some journals. Because these retracted papers may belong to journals that are part of a citation cartel, we added back to our analysis the citations made by five papers described in a previous study.

Detecting citation cartels

A citation cartel is considered to be a group of journals that excessively cite papers published in other journals within the group. Specifically, we assume that a citation cartel is composed of donor journals and recipient journals. A donor journal provides excessive citations to papers published in recipient journals in the previous two years, i.e., the time window for the JIF. In cases where two journals exchange citations at excessively high rates, they simultaneously behave as both donors and recipients. Although donor journals have no apparent direct benefit in providing citations to recipient journals, we consider them members of a citation cartel because some previously identified instances contain journals giving excessive citations to particular journals, which often share the publishers or editors.

We identify excessive citations between journals using a null model for citation networks. Specifically, we use the degree-corrected stochastic block model (dcSBM) as the null model. The dcSBM generates randomized networks that preserve, on average, the number of citations between groups of journals (i.e., blocks) and the outgoing and incoming citations of each journal. We determine the blocks by fitting the dcSBM using a non-parametric Bayesian method. Community detection methods for networks, including the dcSBM, have been shown to provide reasonable partitionings of journal citation networks into research fields. Therefore, the networks generated by the dcSBM are considered to be random networks that roughly preserve the patterns of citations within and across research fields.

CIDRE removes from the given network all the edges that are statistically compatible with the null model and then computes a donor score and a recipient score for all journals based on the residual edges in the network (see the Materials and Methods section). In the following, we refer to the weights on such edges as excessive citations. Consider a journal group, denoted by $U$, that contains journal $i$. Journal $i$'s donor score, denoted by $x_d(i, U)$, is the fraction of excessive citations that journal $i$ provides to the other journals in $U$. Journal $i$'s recipient score, denoted by $x_r(i, U)$, is the fraction of excessive citations that $i$ receives from other journals in $U$. CIDRE considers a journal as a donor journal and a recipient journal if $x_d(i, U)$ and $x_r(i, U)$, respectively, are larger than a prescribed threshold $\theta = 0.15$ (see the Discussion section for the choice of the $\theta$ value).
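As an illustration of the two scores, the following minimal sketch computes them on a toy network. It is not the authors' implementation: the function name `donor_recipient_scores`, the dictionary-based data layout, and the toy numbers are assumptions made for this example, and the set of excessive edges is supplied by hand rather than derived from the null model.

```python
def donor_recipient_scores(W, excess, group):
    """Compute (donor score, recipient score) for each journal in `group`.

    W      : dict mapping (citing, cited) -> number of effective citations
    excess : set of (citing, cited) edges judged excessive (assumed given)
    group  : set of journals U
    """
    out_strength, in_strength = {}, {}
    for (i, j), w in W.items():
        out_strength[i] = out_strength.get(i, 0) + w
        in_strength[j] = in_strength.get(j, 0) + w

    scores = {}
    for i in group:
        # Excessive citations that i gives to / receives from other members of U.
        given = sum(W[(i, j)] for j in group if j != i and (i, j) in excess)
        received = sum(W[(j, i)] for j in group if j != i and (j, i) in excess)
        x_d = given / out_strength[i] if out_strength.get(i) else 0.0
        x_r = received / in_strength[i] if in_strength.get(i) else 0.0
        scores[i] = (x_d, x_r)
    return scores


# Toy network with three journals; only the A<->B edges are treated as excessive.
W = {
    ("A", "B"): 40, ("A", "C"): 60,
    ("B", "A"): 30, ("B", "C"): 70,
    ("C", "A"): 10, ("C", "B"): 90,
}
excess = {("A", "B"), ("B", "A")}
scores = donor_recipient_scores(W, excess, {"A", "B"})
```

In this toy case journal A gives 40 of its 100 outgoing citations to B and receives 30 of its 40 incoming citations from B, so both of its scores exceed the threshold of 0.15 and A would count as both a donor and a recipient.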

Figure 1. Statistics of citation cartels suggested by CIDRE. (a) Number of detected cartels by year. (b) Number of journals in a cartel by year. A diamond indicates an outlier that does not fall in the range $[q_{0.25} - 1.5\,\Delta_{\text{IQR}},\; q_{0.75} + 1.5\,\Delta_{\text{IQR}}]$, where $q_{0.25}$ and $q_{0.75}$ are the first and third quartiles, respectively, and $\Delta_{\text{IQR}} = q_{0.75} - q_{0.25}$.

To find candidates for cartels, CIDRE initializes $U$ to be the set of all nodes in the network. Then, CIDRE removes from $U$ the journals that are neither a donor nor a recipient and recomputes the donor and recipient scores for the journals remaining in $U$. CIDRE iterates the removal of nodes and recomputation of scores until no further journal is removed. We partition $U$ into disjoint groups $U_\ell$ ($\ell = 1, 2, \ldots$), where each $U_\ell$ is a maximal weakly connected component in the network consisting of the nodes belonging to $U$ and the residual edges. We regard each weakly connected component $U_\ell$ with more than $\theta_w = 50$ within-group citations as a citation cartel candidate.

Overlap with the journal groups suspended by TR

Since 2007, TR has suspended 227 journals due to excessive citations, of which 173 journals were suspended due to excessive self-citations, 55 journals due to excessive citations between two journals, and one journal due to both self-citations and pairwise citations. Although TR does not disclose its precise algorithm, it has released some criteria for suspensions. These criteria include the fraction of citations that the recipient journal receives from the donor journal, akin to the recipient score, together with the number of years since first publication and the ranking of the journals. TR identified 46 pairs of donor and recipient journals for excessive pairwise citations. Some journal pairs suspended by TR share a journal. We merge such overlapping journal pairs suspended in year $t$ into one group, denoted by $U^{\text{TR}}_\ell$, and consider that TR detects $U^{\text{TR}}_\ell$ in year $t-1$ ($\ell = 1, \ldots$). One of the groups suspended in 2011 contains the five retracted papers that we added back to the MAG data.

We measure the extent to which the groups detected by CIDRE match the groups suspended by TR due to pairwise citations. For each group of journals detected by CIDRE, we compute the overlap with a group suspended by TR as

$$O_{\ell,\ell'} = \frac{\left| U^{\text{TR}}_\ell \cap U^{\text{CI}}_{\ell'} \right|}{\left| U^{\text{CI}}_{\ell'} \right|},$$

where $U^{\text{CI}}_{\ell'}$ is the $\ell'$th group of journals detected by CIDRE. We consider that CIDRE detects $U^{\text{TR}}_\ell$ if and only if there exists $\ell'$ such that $O_{\ell,\ell'} \geq$ [...] $\left| U^{\text{TR}}_\ell \cap U^{\text{CI}}_{\ell'} \right| \geq$ [...] $O_{\ell,\ell'} \geq$ [...]

Anomalous journal groups in 2010–2018

CIDRE detected 152 groups of journals between 2010 and 2018 that are not among the journals suspended by TR due to excessive pairwise citations. In a majority of these groups, a small number of papers or authors dominate citations within the group. Specifically, more than 30% of within-group citations between different journals are made by a single paper in 43 groups (28%; pattern (a)) and made to a single paper in 8 groups (5%; pattern (b)). Otherwise, more than 30% of within-group citations between different journals are made by one author in 32 groups (21%; pattern (c)) and made to papers written by one author in 6 groups (4%; pattern (d)). Note that if a group is of types (c) and (d) simultaneously, a single author excessively cites their own papers. We have classified this case as type (c). Finally, we did not find any concentration of citations due to individual papers or individual authors for the remaining 63 groups (41%; pattern (e)). Below we closely inspect the largest group for each pattern (a)–(e) (Figs. 3(a)–(e)), with the exception of pattern (e), for which we choose the second largest group because the largest group consists of many journals (i.e., 73 journals).

Figure 2. Years when the journal groups are detected (horizontal axis: year; vertical axis: ID of the group suspended by Thomson Reuters). The crosses and circles indicate the year when the groups are detected by TR and CIDRE, respectively. The color of the circles indicates the overlap $O_{\ell,\ell'}$, i.e., the probability that the journals in detected group $\ell'$ belong to group $\ell$ suspended by TR due to pairwise citations.

Group 1, which is a case of pattern (a) and was detected in 2018, consists of 17 journals on anthropology (Fig. 3(a)). Two review papers published in donor journals, American Anthropologist and Social Anthropology, provided 233 citations in total to the journals in group 1, of which 230 citations (99%) were made to papers published in the time window for the JIF. Removing the citations from the two review papers decreases the JIFs of the 4 recipient journals, Anthropological Quarterly, Cultural Anthropology, Focaal, and Journal of the Royal Anthropological Institute, by more than 26%.

Group 2, which is a case of pattern (b) and was detected in 2017, consists of 4 journals on crystallography (Fig. 3(b)). Most of the within-group citations were made to a single paper published in a recipient journal, Acta Crystallographica Section C. In fact, the paper received 594 citations from the two donor journals, IUCrData and Acta Crystallographica Section E, which account for 94% of the citations that Acta Crystallographica Section C received from these two donors. Removing the within-group citations to the single paper decreases the JIF of Acta Crystallographica Section C by 22%. The paper is titled "Crystal structure refinement with SHELXL" and describes software commonly used in crystallography. In their submission guidelines, the two donor journals required users of the software to cite the paper.

Group 3 (pattern (c)), detected in 2014, consists of 5 journals on engineering. Most of the within-group citations are attributed to self-citations across different journals by a single author (Fig. 3(c)). The author wrote approximately one-third of the papers (23 out of 63 papers) contributing to the within-group citations. These papers provided 313 citations to the author's papers published in the recipient journals in 2012 and 2013. The author was on the editorial board of International Journal of Intelligent Systems and Applications, which serves as both a donor and a recipient journal in this group.

As is the case for group 3, a single author made excessive self-citations in group 4, detected in 2010 (Fig. 3(d)). Group 4, which is a case of pattern (d), is composed of 4 journals on veterinary science. One author wrote 33 papers published in a donor journal, Journal of Veterinary Medicine, in 2010. These papers provided 74 citations to 45 papers written by the same author and published in a recipient journal, Journal of Animal Physiology and Animal Nutrition, in 2008 and 2009. The self-citations made by the author account for 35% of the citations that this recipient journal receives from this donor journal.

Group 5 (pattern (e)), detected in 2011, consists of two journals on laser science, in which the donor journal, Laser Physics, provided 1984 citations to the recipient journal, Laser Physics Letters (Fig. 3(e)). We did not find any concentration of citations; neither a single paper nor a single author provided or received more than 8% of the citations within the group. In 2011, the number of citations from the donor journal to the recipient journal more than doubled, from 987 citations in 2010 to 1984 citations in 2011. CIDRE identified the increase in citations as excessive and detected this group.
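The classification into patterns (a)–(e) described at the start of this section can be sketched as a simple decision rule. The sketch below is an assumption-laden illustration, not the procedure actually used in the study: the record layout (one dictionary per within-group citation between different journals), the function names, and in particular the precedence of pattern (a) over pattern (b) are hypothetical choices. The text only states that a group matching both (c) and (d) is classified as (c).

```python
from collections import Counter

THRESHOLD = 0.30  # share of within-group citations quoted in the text


def top_share(keys_per_citation, total):
    """Largest fraction of citations attributable to a single key."""
    counts = Counter()
    for keys in keys_per_citation:
        counts.update(set(keys))  # count each key at most once per citation
    return max(counts.values()) / total if counts else 0.0


def classify_group(citations, threshold=THRESHOLD):
    """Return 'a'..'e' for a list of within-group citation records.

    Each record is a dict with the (hypothetical) keys
    'citing_paper', 'cited_paper', 'citing_authors', 'cited_authors'.
    """
    total = len(citations)
    if total == 0:
        return "e"
    checks = [
        ("a", [[c["citing_paper"]] for c in citations]),   # made by one paper
        ("b", [[c["cited_paper"]] for c in citations]),    # made to one paper
        ("c", [c["citing_authors"] for c in citations]),   # made by one author
        ("d", [c["cited_authors"] for c in citations]),    # made to one author
    ]
    for label, keys in checks:
        if top_share(keys, total) > threshold:
            return label
    return "e"  # no concentration on a single paper or author
```

Because the checks run in a fixed order, a group in which one author both gives and receives more than 30% of the citations is labelled (c), which matches the convention stated in the text.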

Anomalous journal groups in 2019

CIDRE detected 7 anomalous groups for the network in 2019 (Figs. 3(f)–($\ell$)). Group 6 consists of 3 journals on surgery (Fig. 3(f)). As is the case for group 2, most of the within-group citations pointed to a single paper published in the sole recipient journal, International Journal of Surgery. The paper received 483 citations from the two donor journals, Annals of Medicine and Surgery and International Journal of Surgery Case Reports, which account for 82% of the citations (i.e., 592) that the recipient journal receives from the two donors. Removing these citations decreases the JIF of the recipient journal by 20%. The paper is titled "The SCARE 2018 statement: Updating consensus Surgical CAse REport (SCARE) guidelines" and is a guideline for surgical case reports. In their guidelines for authors, the donor journals request authors to cite the paper as a condition for submission. Furthermore, the author of the SCARE paper is the managing and executive editor of one of the donor journals and of the recipient journal. The three journals conducted a similar citation practice in the previous two years. In fact, CIDRE detected a group composed of one of the donor journals and the recipient journal in 2017, in addition to the present group in 2018. In 2017 and 2018, the two donor journals requested authors to cite the previous version of the SCARE guideline paper, written by the same author and published in the recipient journal in 2016. There were 559 and 554 citations from the donor journals to that paper in 2017 and 2018, respectively. The new guideline paper entered the time window for the JIF when the old guideline paper exited it.

Figure 3. Citation groups detected by CIDRE. The circles indicate journals. The color of edges indicates the source of citations. The width of edges is proportional to the number of citations made between two journals. Self-citation edges are omitted. The colored pie within each circle indicates the share of citations from the citing journal from the viewpoint of the cited journal. The white pie within each circle indicates the share of citations from outside the group. The journals on the left and right arcs indicate the donors and recipients, respectively. The journals that are simultaneously donor and recipient are located where the arcs overlap. The journals outside the cartel are agglomerated into a grey-colored circle at the bottom of each panel.

Group 7 consists of 5 donor journals and 1 recipient journal (Fig. 3(g)). A donor journal, Finance Research Letters, provided 267 citations to the recipient journal, Economics Letters. Each of the other donor journals provided fewer than 21 citations to Economics Letters. The genesis of group 7 may be a consequence of changes in the publication policy of Finance Research Letters. In fact, Finance Research Letters increased its number of publications more than sevenfold in a year, from 63 papers in 2018 to 458 papers in 2019. The number of citations from Finance Research Letters to Economics Letters also increased more than sevenfold in a year, from 27 citations in 2018 to 267 citations in 2019.

Group 8 is composed of two journals, where the donor journal, Journal of Low Frequency Noise, Vibration and Active Control, provided 160 citations to the recipient journal, Thermal Science (Fig. 3(h)). A single author received 74 of the 160 citations (46%) from 34 papers published in the donor journal. Of these 34 papers, 29 are included in a special issue of which the author was the guest editor. The special issue consists of 74 papers.

Group 9 is composed of 7 journals on anthropology (Fig. 3(i)). A single review paper published in a donor journal, Social Anthropology, cited 95 papers published in the recipient journals, all of which were published in the time window for the JIF. If one removes the citations from that review paper, the JIF of each of the 5 recipient journals decreases by more than 18%. In the review paper, the author acknowledged the editors of two recipient journals, Social Analysis and Focaal, both owned by the publisher Berghahn Journals, for granting access.

Group 10 consists of 2 journals on crystallography (Fig. 3(j)). A single paper published in the donor journal, Crystallography Reviews, cited 124 papers published in the recipient journal, IUCrData, all of which were published in the time window for the JIF. If one removes these 124 citations, the JIF of the recipient journal decreases by 57%. One editor of the donor journal serves as the co-editor of the recipient journal.

Group 11 is composed of 2 journals on political science (Fig. 3(k)). The donor journal, Regulation and Governance, provided 95 citations to the recipient journal, Annals of American Academy of Political and Social Science. Removing the citations from the donor decreases the JIF of the recipient by 26%. Of the 95 citations from the donor to the recipient, 89 (93%) pointed to the papers included in a special issue of the recipient journal, "Regulatory Intermediaries in the Age of Governance". The special issue consists of 16 papers, which received on average fewer than 2 citations each from journals outside group 11 in 2019. The special issue was edited by 3 guest editors who are on the editorial board of the donor journal. The three editors wrote a paper in the special issue. The paper received 26 citations in 2019, of which 13 citations (50%) came from the donor journal. The paper was highlighted as the most cited paper in the last three years in the recipient journal in 2019.

Group 12 consists of 2 mathematical journals (Fig. 3($\ell$)). The donor journal, Journal of Mathematical Sciences and Cryptography, published 150 papers, of which 52 papers cited 36 papers published in the recipient journal, Journal of Information and Optimization Science, in the time window for the JIF. We did not find a single author or a single paper that exclusively cited or was cited within the group. The 52 papers published in the donor journal were written by 126 authors, of whom 107 authors (84%) had never cited the recipient journal before. The two journals have the same chief editor.

Discussion

In this paper, we put forward an algorithm, named CIDRE, to identify groups of journals that cite each other at excessively high rates. CIDRE detects a majority of the journal groups suspended by TR. Notably, in several cases, it does so years in advance. In addition, it detects a number of anomalous groups, whose members increased their JIFs by 17–130% via within-group citations. The inspection of such groups reveals a variety of mechanisms leading to such inflation. Specifically, more than half of the anomalous groups are due to one paper or one author that singlehandedly provides or receives many citations within the group.

The algorithm's practical value lies in the fact that it is deterministic and scalable to large networks, which makes it possible to apply it in an online fashion to incoming streams of new citation data. Furthermore, it can be applied to different types of networks. For instance, CIDRE could be applied to bipartite author–journal networks, where a directed edge indicates a publication by an author in the journal, in order to detect potential predatory practices, such as the publication of papers with little peer review. CIDRE could also be applied in different contexts, e.g., to detect the manipulation of ratings in e-commerce platforms and social media.

One should be careful when drawing conclusions from the application of CIDRE. Both the comparison against the ground-truth data provided by TR and the manual inspection of the groups detected by CIDRE support the view that the groups flagged by CIDRE warrant consideration as potential citation cartel candidates. That being said, we ought to acknowledge that some of these candidates may arise from unintended biases such as geographical proximity, reciprocity between peers, and editorial preferences, rather than from outright malicious citation practices. In this respect, CIDRE should not be considered a tool for automated decision-making or a substitute for expert judgment, but rather a support tool to extract interpretable information from the complexity of journal citation networks.

CIDRE has a parameter, the threshold $\theta$, that sets the minimum fraction of excessive citations that the donor/recipient journals provide/receive within their group. Changing the value of $\theta$ induces a hierarchical onion-like structure on the detected journal groups. The inner cores that survive with a larger $\theta$ value are considered to be tighter citation groups, which are more plausible citation cartel candidates. In this study, we set $\theta = 0.15$ to allow for a fair comparison with TR; all recipient journals suspended by TR received at least 15% of their incoming citations from donor journals. Then, we manually inspected each group detected by CIDRE to pinpoint individual papers, authors, editors, and specific journals associated with excessive citations. However, manual inspection is a costly task and hard to scale up when dealing with large numbers of groups. This problem will manifest itself when one analyzes citation cartels composed of authors because an author network can be much larger than a journal network. Therefore, in practice, it may be useful to prioritize journal groups that survive with higher thresholds. With CIDRE, one can easily rank journal groups according to this criterion because gradually increasing $\theta$ to reveal the onion-like structure is straightforward and not computationally too costly.

Regardless of the conclusions that one may draw on specific anomalies, our findings reveal the widespread presence of journals whose JIFs are substantially hoisted by the citations received from a small group of other journals. It would be hard not to relate this to the ever-increasing emphasis on citations and bibliometric indicators, and the pressure it puts on journal editors to boost growth in such numbers. We believe our findings to be a rather direct consequence of this environment, where actors are incentivized to act on the very same metrics according to which they are ranked, in a feedback loop that closely echoes Goodhart's Law: "when a measure becomes a target, it ceases to be a good measure". In this respect, we believe that our results should encourage a more critical and nuanced approach to the use and interpretation of citation-based bibliometric indicators.

Detection of citation cartels

We assume that a citation cartel is composed of journals that act as donors, recipients, or both. A donor journal gives excessive citations to the journals in the same cartel. A recipient journal receives excessive citations from the journals in the same cartel. Algorithm CIDRE finds groups of journals, $U$, composed of the donor and recipient journals, which are suspected citation cartels. We quantify the extent to which a journal $i$ acts as donor or recipient within group $U$ using the donor score $x_{\mathrm{d}}$ and the recipient score $x_{\mathrm{r}}$, respectively. They are defined by

$$x_{\mathrm{d}}(i, U) := \frac{1}{s_i^{\mathrm{out}}} \sum_{j \in U,\, j \neq i} W_{ij}\, h(i, j), \tag{1}$$

$$x_{\mathrm{r}}(i, U) := \frac{1}{s_i^{\mathrm{in}}} \sum_{j \in U,\, j \neq i} W_{ji}\, h(j, i), \tag{2}$$

where $s_i^{\mathrm{out}} := \sum_{j=1}^{N} W_{ij}$ and $s_i^{\mathrm{in}} := \sum_{j=1}^{N} W_{ji}$ are the out-strength and in-strength of journal $i$, respectively, and $N$ is the number of nodes. Function $h(i, j)$ is an indicator function, where we set $h(i, j) = 1$ if the citations from journal $i$ to journal $j$ are excessive relative to the null model; otherwise $h(i, j) = 0$. The donor and recipient scores range in $[0, 1]$. A large donor score for journal $i$, i.e., $x_{\mathrm{d}}(i, U)$, implies that $i$ cites papers in other journals in $U$ more often than expected for a null model, and similarly for the recipient score.

The citations from journal $i$ to journal $j$ are deemed to be excessive if and only if they satisfy the following two conditions. First, among the citations from $i$ to $j$ made to papers published in any previous year, more than half were made to papers published in the last two years (i.e., effective citations). Second, the number of citations, $W_{ij}$, is larger than that expected for a null model. Specifically, for each directed edge from node $i$ to node $j$, we compute the $p$-value as the probability $p_{ij}$ that the null model assigns a weight $w$ that is larger than or equal to the actual weight of edge $(i, j)$ in the given network, i.e., $W_{ij}$. One obtains

$$p_{ij} = 1 - \sum_{w=0}^{W_{ij}-1} P^{\mathrm{null}}_{ij}\left(w; \hat{\lambda}_{ij}\right), \tag{3}$$

where $\hat{\lambda}_{ij}$ is a parameter for the null model. We describe the null model in the next section.

We perform a statistical test for each edge at the significance level of $\alpha = 0.01$, with the Benjamini-Hochberg correction to suppress the false positives due to the multiple comparison problem. In other words, one regards the $m$ edges with the smallest $p$-values as significant (i.e., $h(i, j) = 1$) and the other edges as insignificant (i.e., $h(i, j) = 0$), where $m$ is given by the largest integer $\ell$ for which $p_{(\ell)} \leq \ell \alpha / M$, where $p_{(\ell)}$ is the $\ell$th smallest $p$-value and $M$ is the number of edges in the network.

After removing the insignificant edges, we seek groups of journals that have a donor or recipient score larger than a prescribed threshold $\theta$. To this end, we use the following algorithm, akin to the $k$-core decomposition algorithm. First, we prune the network by keeping only the edges with $h(i, j) = 1$. Second, we initialize $U = \{1, \ldots, N\}$, and compute the donor and recipient scores for each node. Third, we remove a node $i$ from $U$ if $x_{\mathrm{d}}(i, U) < \theta$ and $x_{\mathrm{r}}(i, U) < \theta$. Then, we recompute the donor and recipient scores for all neighbors of $i$. We repeat the third step until no node is removed. Fourth, we partition $U$ into disjoint groups $U_\ell$ ($\ell = 1, 2, \ldots$), where each $U_\ell$ is a maximal weakly connected component in the edge-pruned network composed of the nodes in $U$. We expect that citation cartels contain sufficiently many within-group citations. Therefore, we remove $U_\ell$ if the sum of the weight of edges within $U_\ell$ except self-loops is less than $\theta_w$. We set $\theta = 0.15$ and $\theta_w = 50$. We note that CIDRE is a special case of the generalized core decomposition algorithm with vertex property function $f(i, U) = \max\left(x_{\mathrm{d}}(i, U), x_{\mathrm{r}}(i, U)\right)$.

Null model

We employ the dcSBM as a null model. The dcSBM consists of blocks, where each block is a group of journals. The dcSBM places an edge from node $i$ to $j$ ($i, j = 1, 2, \ldots, N$) with a probability determined by the block memberships, out-strength $s_i^{\mathrm{out}}$ of node $i$ in the original network, and in-strength $s_j^{\mathrm{in}}$ of node $j$. The generated networks preserve the expectation of $s_i^{\mathrm{out}}$ and $s_i^{\mathrm{in}}$ for each node $i$, and the expected number of edges between and within the blocks of the given network.

With the dcSBM, one assumes that the weight of the edge from node $i$ to $j$ obeys a Poisson distribution given by

$$P^{\mathrm{null}}_{ij}(w; \lambda_{ij}) = \frac{\lambda_{ij}^{w} \exp(-\lambda_{ij})}{w!}, \tag{4}$$

where $P^{\mathrm{null}}_{ij}(w; \lambda_{ij})$ is the probability that the dcSBM assigns weight $w$ ($w = 0, 1, 2, \ldots$). Parameter $\lambda_{ij}$ is equal to the mean of the Poisson distribution, i.e., the expected number of citations for the null model. We set $\lambda_{ij}$ to the maximum likelihood estimator conditioned on the blocks, which is given by

$$\lambda_{ij} = \frac{s_i^{\mathrm{out}}\, s_j^{\mathrm{in}}\, \Lambda_{g_i, g_j}}{S^{\mathrm{out}}_{g_i}\, S^{\mathrm{in}}_{g_j}}, \tag{5}$$

where $g_i$ is the ID of the block to which node $i$ belongs, $\Lambda_{uv}$ is the number of directed edges from block $u$ to block $v$, $S^{\mathrm{out}}_u = \sum_{\ell=1}^{N} s_\ell^{\mathrm{out}}\, \delta(g_\ell, u)$ and $S^{\mathrm{in}}_u = \sum_{\ell=1}^{N} s_\ell^{\mathrm{in}}\, \delta(g_\ell, u)$ are the sum of out-strength and in-strength of nodes in block $u$, respectively, and $\delta(\cdot, \cdot)$ is the Kronecker delta.

One may be tempted to use the $\lambda_{ij}$ value given by Eq. (5) to compute the $p$-value using Eq. (3). However, if $\lambda_{ij}$ is smaller than one, even the edges with the smallest weight, $W_{ij} = 1$, can be significant, whereas we require $W_{ij}$ to be large for journal $i$ to be regarded as excessively citing journal $j$. Therefore, we use a clipped value, $\hat{\lambda}_{ij}$, to compute the $p$-value using Eq. (3), where

$$\hat{\lambda}_{ij} = \max(1, \lambda_{ij}). \tag{6}$$
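The steps in Eqs. (1)–(6) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`poisson_tail`, `clipped_means`, `benjamini_hochberg`, `cidre_groups`) and the input format (a dict mapping `(i, j)` to an integer citation count, plus a dict of block labels) are assumptions made for illustration. The sketch takes the block labels as given, omits the first condition on effective citations because it needs publication-year data, treats $\Lambda_{uv}$ as the total citation count from block $u$ to block $v$, and counts only significant edges toward $\theta_w$; each of these is an assumption.

```python
import math
from collections import defaultdict


def poisson_tail(w_obs, lam):
    """Eq. (3): P(w >= w_obs) for a Poisson null with mean lam (log-space pmf)."""
    cdf = sum(math.exp(w * math.log(lam) - lam - math.lgamma(w + 1))
              for w in range(w_obs))
    return min(1.0, max(0.0, 1.0 - cdf))


def clipped_means(W, block):
    """Eqs. (5)-(6): dcSBM mean for every edge, clipped from below at 1."""
    s_out, s_in = defaultdict(float), defaultdict(float)
    S_out, S_in, Lam = defaultdict(float), defaultdict(float), defaultdict(float)
    for (i, j), w in W.items():
        s_out[i] += w
        s_in[j] += w
        S_out[block[i]] += w
        S_in[block[j]] += w
        Lam[block[i], block[j]] += w  # assumption: Lambda_uv counts citations
    lam = {}
    for (i, j) in W:
        gi, gj = block[i], block[j]
        lam[i, j] = max(1.0, s_out[i] * s_in[j] * Lam[gi, gj]
                        / (S_out[gi] * S_in[gj]))
    return lam, s_out, s_in


def benjamini_hochberg(pvals, alpha):
    """Keys of the m smallest p-values, m = largest rank l with p_(l) <= l*alpha/M."""
    ranked = sorted(pvals, key=pvals.get)
    m = 0
    for rank, key in enumerate(ranked, start=1):
        if pvals[key] <= rank * alpha / len(ranked):
            m = rank
    return set(ranked[:m])


def cidre_groups(W, block, theta=0.15, theta_w=50, alpha=0.01):
    lam, s_out, s_in = clipped_means(W, block)
    pvals = {e: poisson_tail(int(W[e]), lam[e]) for e in W}
    # Significant edges, i.e. h(i, j) = 1; self-loops never enter the scores.
    sig = {(i, j) for (i, j) in benjamini_hochberg(pvals, alpha) if i != j}
    out_nb, in_nb = defaultdict(set), defaultdict(set)
    for i, j in sig:
        out_nb[i].add(j)
        in_nb[j].add(i)

    def score(i, U):  # max of the donor score, Eq. (1), and recipient score, Eq. (2)
        d = sum(W[i, j] for j in out_nb[i] if j in U) / s_out[i] if s_out[i] else 0.0
        r = sum(W[j, i] for j in in_nb[i] if j in U) / s_in[i] if s_in[i] else 0.0
        return max(d, r)

    # Peel nodes whose donor and recipient scores are both below theta.
    U = set(block)
    changed = True
    while changed:
        changed = False
        for i in list(U):
            if score(i, U) < theta:
                U.discard(i)
                changed = True

    # Weakly connected components of the pruned network restricted to U.
    groups, seen = [], set()
    for start in sorted(U):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(v for v in (out_nb[u] | in_nb[u]) if v in U)
        seen |= comp
        # Assumption: only significant within-group edges count toward theta_w.
        within = sum(W[e] for e in sig if e[0] in comp and e[1] in comp)
        if within >= theta_w:
            groups.append(comp)
    return groups


# Toy network: ten journals in one block citing each other 5 times,
# except journals 0 and 1, which exchange 200 citations each way.
nodes = range(10)
W = {(i, j): 5 for i in nodes for j in nodes if i != j}
W[0, 1] = W[1, 0] = 200
block = {i: 0 for i in nodes}
print(cidre_groups(W, block))  # [{0, 1}]
```

In the toy network only the two heavy edges survive the Benjamini-Hochberg step, so journals 0 and 1 each have donor and recipient scores of 200/240 and form the single detected group. The repeated full pass over $U$ is chosen for readability; because the scores can only decrease as nodes are removed, the surviving set does not depend on the removal order. Computing the $p$-value as one minus a summed CDF loses precision for extremely small $p$-values, which does not matter for thresholding at this scale but would for ranking them.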
We find the blocks by fitting the dcSBM to the journal citation networks. Specifically, we first construct an aggregated network, in which the weight of the edge from node $i$ to node $j$, denoted by $W_{ij}$, is given by the sum of the weight over the networks between 2000 and 2019, i.e., $W_{ij} = \sum_{t=2000}^{2019} W^{(t)}_{ij}$, where $W^{(t)}_{ij}$ is the weight of the edge from node $i$ to node $j$ in the network in year $t$. Then, we identify the blocks of the aggregated network using a non-parametric Bayesian method without hierarchical structure. Note that we use the aggregated network $\mathbf{W}$ to find the blocks of journals. Then, with the detected blocks, we compute $\lambda_{ij}$ given by Eq. (5) for each yearly network $\mathbf{W}^{(t)}$. This is because the number of citations monotonically increases over time. Therefore, recent yearly citation networks tend to have more excessive citations than older networks if one uses $\lambda_{ij}$ computed for the aggregated network.

Author contributions

N. M. conceived the research. S. K. performed the numerical analysis. S. K., G. L. and N. M. contributed to the development of the algorithm and wrote the manuscript.

Competing interests

The authors declare no conflict of interest.

Acknowledgements

GL acknowledges support from an EPSRC Early Career Fellowship (Grant No. EP/N006062/1). NM acknowledges support from AFOSR European Office (Grant No. FA9550-19-1-7024).

References

Bornmann, L. & Mutz, R. Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references. J. Assoc. Inf. Sci. Technol. , 2215–2222 (2015).

Barnes, C. The h-index debate: An introduction for librarians. J. Acad. Librariansh. , 487–494 (2017).

Garfield, E. & Welljams-Dorof, A. Citation data: Their use as quantitative indicators for science and technology evaluation and policy-making. Sci. Public Policy , 321–327, DOI: 10.1093/spp/19.5.321 (1992).

Adam, D. The counting house. Nature , 726–729, DOI: 10.1038/415726a (2002).

King, D. A. The scientific impact of nations. Nature , 311 (2004).

Bornmann, L. & Bauer, J. Which of the world’s institutions employ the most highly cited researchers? An analysis of the data from highlycited.com. J. Assoc. Inf. Sci. Technol. , 2146–2148, DOI: 10.1002/asi.23396 (2015).

Garfield, E. The history and meaning of the journal impact factor. JAMA , 90–93, DOI: 10.1001/jama.295.1.90 (2006). https://jamanetwork.com/journals/jama/articlepdf/202114/jco50055.pdf.

Saha, S., Saint, S. & Christakis, D. A. Impact factor: A valid measure of journal quality? J. Med. Libr. Assoc. , 42 (2003).

Wilhite, A. W. & Fong, E. A. Coercive citation in academic publishing. Science , 542–543, DOI: 10.1126/science.1212540 (2012).

Foo, J. Y. A. Impact of excessive journal self-citations: A case study on the Folia Phoniatrica et Logopaedica journal. Sci. Eng. Ethics , 65–73, DOI: 10.1007/s11948-009-9177-7 (2011).

Franck, G. Scientific communication–A vanity fair? Science , 53–55, DOI: 10.1126/science.286.5437.53 (1999). https://science.sciencemag.org/content/286/5437/53.

Fister, I. J., Fister, I. & Perc, M. Toward the discovery of citation cartels in citation networks. Front. Phys. , 49, DOI: 10.3389/fphy.2016.00049 (2016).

Davis, P. The emergence of a citation cartel. (2012). Available at https://scholarlykitchen.sspnet.org/2012/04/10/emergence-of-a-citation-cartel/. [Accessed 26th Jun. 2020].

Van Noorden, R. Brazilian citation scheme outed. Nature , 510–511, DOI: 10.1038/500510a (2013).

Davis, P. When a journal sinks, should the editors go down with the ship? (2014). Available at https://scholarlykitchen.sspnet.org/2014/10/06/when-a-journal-sinks-should-the-editors-go-down-with-the-ship. [Accessed 26th Jun. 2020].

Thomson Reuters. Title Suppressions. (2019). Available at http://help.prod-incites.com/incitesLiveJCR/JCRGroup/titleSuppressions. [Accessed 26th Jun. 2020].

Retraction Watch. Another editor resigns from journal hit by citation scandal. (2017). Available at http://retractionwatch.com/2017/04/07/another-editor-resigns-journal-hit-citation-scandal/. [Accessed 26th Jun. 2020].

Lancichinetti, A. & Fortunato, S. Consensus clustering in complex networks. Sci. Rep. , 336, DOI: 10.1038/srep00336 (2012). 1203.6093.

Hric, D., Kaski, K. & Kivelä, M. Stochastic block model reveals maps of citation patterns and their evolution in time. J. Informetrics , 757–783, DOI: 10.1016/j.joi.2018.05.004 (2018).

Rosvall, M. & Bergstrom, C. T. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. USA , 1118–1123, DOI: 10.1073/pnas.0706851105 (2008). 0707.0609.

Sinha, A. et al. An overview of Microsoft Academic Service (MAS) and applications. In Proceedings of the 24th International Conference on World Wide Web, WWW’ 15 Companion, 243–246, DOI: 10.1145/2740908.2742839 (Association for Computing Machinery, New York, NY, USA, 2015).

Kojaku, S. Python code for CIDRE (2020). Available at https://github.com/skojaku/journal-citation-cartels/. [Accessed 26th Jun. 2020].

Karrer, B. & Newman, M. E. J. Stochastic blockmodels and community structure in networks. Phys. Rev. E , 016107 (2011).

Peixoto, T. P. Nonparametric Bayesian inference of the microcanonical stochastic block model. Phys. Rev. E , 012317, DOI: 10.1103/PhysRevE.95.012317 (2017).

Thomson Reuters. Web of Science master journal list (2019). Available at https://mjl.clarivate.com/home/. [Accessed 26th Jun. 2020].

Beall, J. What I learned from predatory publishers. Biochem. Medica , 273–278 (2017).

Livan, G., Caccioli, F. & Aste, T. Excess reciprocity distorts reputation in online social networks. Sci. Rep. , 3551, DOI: 10.1038/s41598-017-03481-7 (2017).

Katz, J. S. Geographical proximity and scientific collaboration. Scientometrics , 31–43, DOI: 10.1007/BF02018100 (1994).

Pan, R. K., Kaski, K. & Fortunato, S. World citation and collaboration networks: uncovering the role of geography in science. Sci. Rep. , 902, DOI: 10.1038/srep00902 (2012).

Li, W., Aste, T., Caccioli, F. & Livan, G. Reciprocity and impact in academic careers. EPJ Data Sci. , 20, DOI: 10.1140/epjds/s13688-019-0199-3 (2019).

Yoon, A. H. Editorial bias in legal academia. J. Leg. Analysis , 309–338, DOI: 10.1093/jla/lat005 (2013). https://academic.oup.com/jla/article-pdf/5/2/309/2555830/lat005.pdf.

Perez, O., Bar-Ilan, J., Cohen, R. & Schreiber, N. The network of law reviews: Citation cartels, scientific communities, and journal rankings. Mod. Law Rev. , 240–268, DOI: 10.1111/1468-2230.12405 (2019).

Davis, P. Reverse engineering JCR’s self-citation and citation stacking threshold (2017). Available at https://scholarlykitchen.sspnet.org/2017/06/05/reverse-engineering-jcrs-self-citation-citation-stacking-thresholds/. [Accessed 26th Jun. 2020].

Manheim, D. & Garrabrant, S. Categorizing variants of Goodhart’s law. arXiv:1803.04585 (2018).

Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Royal Stat. Soc. Ser. B , 289–300 (1995).

Batagelj, V. & Zaveršnik, M. Fast algorithms for determining (generalized) core groups in social networks. Adv. Data Analysis Classif. , 129–145, DOI: 10.1007/s11634-010-0079-y (2011).