Quantifying Success in Science: An Overview
Xiaomei Bai, Hanxiao Pan, Jie Hou, Teng Guo, Ivan Lee, Feng Xia
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.DOI
XIAOMEI BAI, HANXIAO PAN, JIE HOU, TENG GUO, IVAN LEE, FENG XIA
Computing Center, Anshan Normal University, Anshan 114007, China
School of Software, Dalian University of Technology, Dalian 116620, China
STEM, University of South Australia, Adelaide SA 5001, Australia
School of Engineering, IT and Physical Sciences, Federation University Australia, Ballarat, VIC 3353, Australia
Corresponding author: Xiaomei Bai (e-mail: [email protected]). This work was partially supported by the Liaoning Provincial Key R&D Guidance Project (2018104021) and the Liaoning Provincial Natural Fund Guidance Plan (20180550011).
ABSTRACT
Quantifying success in science plays a key role in guiding funding allocation, recruitment decisions, and rewards. Recently, significant progress has been made towards quantifying success in science, yet the lack of a detailed analysis and summary remains a practical issue. The literature reports the factors influencing scholarly impact, as well as evaluation methods and indices aimed at overcoming this crucial weakness. We focus on categorizing and reviewing current developments in evaluation indices of scholarly impact, including paper impact, scholar impact, and journal impact. Besides, we summarize the issues of existing evaluation methods and indices, investigate open issues and challenges, and provide possible solutions, including the pattern of collaboration impact, unified evaluation standards, implicit success factor mining, dynamic academic network embedding, and scholarly impact inflation. This paper should help researchers obtain a broader understanding of quantifying success in science and identify potential research directions.
INDEX TERMS success in science, scholarly impact, evaluation indices.
I. INTRODUCTION
Success in science refers to scientists' achievements in their academic careers. Quantifying success in science has developed into a very important part of bibliometrics and scientometrics. An influential publication or scholar often gives followers much to build on in carrying out their own research. Therefore, the ability to retrieve bibliography is very important for researchers, including mining, managing, and examining scholarly big data to identify successful papers and scholars [1]-[5]. In addition, quantifying scholar impact has special significance for funding allocation and recruitment decisions, while quantifying the impact of papers and journals can help scientists know the frontier of scientific development. Therefore, quantifying success in science provides useful guidance to the scientific community, such as selecting candidates for universities, recommending scientists for promotion, and distributing research funds [6], [7].

Quantifying success in science mainly focuses on quantifying the current impact of academic entities, including papers, scholars, journals, scholarly teams, and institutions [8]-[11]. Because research on the impact of papers, authors, and journals is very rich, this paper mainly introduces quantifying success in science from these three aspects. Generally, the number of citations is used as an evaluation indicator, owing to its easy availability. Many factors influence a paper's success, such as the paper's visibility [12], [13] and the paper's age [14]. A common method to judge the success of a scholarly paper is to use evaluation indicators, which may take several important factors into account. Counting-based and network-based evaluation methods are frequently used to quantify success in science. The counting-based methods are the most direct form of evaluation, such as citations, the author's h-index [15], and the Journal Impact Factor (JIF) [16].
Different academic entities form different kinds of academic networks, such as the citation network, co-author network, and co-citation network [17]. Currently, HITS-type and PageRank-type algorithms can mine complex scholarly relationships from different scholarly networks and give reasonable evaluations. The features of scholarly networks are also critical for evaluating paper impact. Further, based on these features, many researchers have improved the PageRank [18] or HITS [19] algorithms to make them more suitable for measuring the impact of papers.

As with quantifying the impact of papers, scholar impact is also influenced by many factors. Many methods and indices to measure scholar impact have been proposed, such as the h-index [20], g-index [21], and hg-index [22]. These indices can be unfair to some young researchers, because the quality and quantity of a scholar's publications are associated with their academic age. Network-based methods can avoid this situation to a certain extent.

Evaluating journal impact is an important part of quantifying success in science. Many network-based evaluation methods and indices used to quantify the impact of papers and authors can also be used to evaluate journals [23]-[26]. These methods are based on PageRank or HITS, or consider the structural position of a journal in the journal citation network. In addition, Journal Citation Reports (JCR) is very popular for ranking journals.

Even though existing research provides tools to quantify success in science, it still has limitations. Every indicator of scientific impact has its shortcomings. In particular, one of the most challenging problems in quantifying scientific success stems from the heterogeneous attributes and the dynamic nature of big scholarly data. At present, in most methods for quantifying scientific success, implicit features and implicit relationships have attracted the attention of researchers [27].

This paper presents a review of recent developments in quantifying success in science, complementing relevant work in the past. Wildgaard et al. [28] present a review of author impact evaluation; one limitation of this review is that it does not consider paper and journal impact evaluation research. Bai et al. [9] offer a review of the literature on paper impact evaluation, covering key techniques and paper impact metrics.
The limitation of this work is that the authors did not consider author and journal impact evaluation; in addition, the factors influencing scholarly impact were not analyzed. Therefore, in this paper, the progress of impact evaluation for papers, authors, and journals is described in detail.

FIGURE 1 shows the framework of quantifying success in science. Quantifying success in science includes the following parts: data collection, data pre-processing, relationship analysis, evaluation methods, and evaluation indices. Several publicly accessible data sets are used to quantify success in science, including the American Physical Society (APS, http://publish.aps.org), the Digital Bibliography & Library Project (DBLP, https://dblp.uni-trier.de/), and the Microsoft Academic Graph (MAG, http://aka.ms/academicgraph). Data pre-processing in quantitative scientific success studies is very important because it relates to accuracy. Homogeneous and heterogeneous scholarly networks are used to study scholarly relationships such as citation relationships, co-author relationships, and paper-journal relationships. Spearman's rank correlation coefficient, Discounted Cumulative Gain, and RI can be used as evaluation metrics for quantifying success in science [29], [30]. Notably, heterogeneous scholarly network structures have increased the challenges in scholarly network analysis.

To retrieve papers on quantifying success in science, we entered search terms such as "success of science", "paper impact", "scholar impact", and "journal impact" in Google Scholar. We first searched for related papers recently published in top journals and top conferences, then looked at their references and at the papers citing them to obtain more related papers. Searching in this step-by-step manner, we filtered and classified papers from three aspects (paper impact, scholar impact, and journal impact) and retained the representative related papers.
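Spearman's rank correlation coefficient, one of the evaluation metrics mentioned above, measures how well a method's ranking agrees with a ground-truth ranking. The sketch below computes it from scratch with average ranks for ties; the citation counts and model scores are invented for illustration.

```python
def rankdata(values):
    # Rank from highest to lowest; tied values get the average rank.
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the two rank vectors.
    rx, ry = rankdata(x), rankdata(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

citations = [120, 45, 45, 10, 3]          # hypothetical ground-truth impact
model_scores = [0.9, 0.6, 0.5, 0.2, 0.1]  # hypothetical scores from a method
rho = spearman(citations, model_scores)
```

A rho close to 1 indicates that the evaluated method orders papers much like the ground truth does.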
Based on the above work, we marked the publication years of these papers, read through them by year, and analyzed and summarized the following aspects: the features that influence scholarly impact, and evaluation methods and indices. For example, for the features used in evaluating paper impact, we classify them by references, selected features, statistical feature, network feature, explicit feature, implicit feature, and evaluating paper impact. By analyzing and summarizing these evaluation methods, we identify open issues and challenges, and provide possible solutions.

The rest of this survey is organized as follows. In Section II, we discuss the evaluation of paper impact. In Section III, we introduce the evaluation of author impact. The evaluation of journal impact is discussed in Section IV. Open issues are discussed in Section V. Finally, we conclude this survey in Section VI.

II. EVALUATION OF PAPER IMPACT
In this section, we give a detailed introduction to the evaluation methods and indices of paper impact. Besides, we discuss the evolution of existing methods and indices, showing their advantages and shortcomings. We begin with the evaluation of paper impact because many assessment methods and indices for scholars and journals are based on the assessment of their papers; it is therefore of great significance whether the quality of papers can be quantified accurately. Although the value of a paper is mainly based on its content, the evaluation of content is easily influenced by subjective factors, and its efficiency cannot meet the demands of scholarly big data. This drives researchers to devise accurate and efficient automatic evaluation methods. One possible solution is to construct a multi-dimensional metric in which the importance of citations, the social relationships of authors, the relationship between the impact of early citers and scholarly paper impact, and citation inflation need to be explored.
A. FACTORS INFLUENCING THE IMPACT OF PAPERS
FIGURE 1: Framework of quantifying success in science. (The framework spans: data sets (MAG, APS, DBLP); data pre-processing (title, citations, author, institution, research field); relationship analysis (citation, co-author, and paper-journal relationships); factor selection (citations, time, number of papers, author prestige, JIF); evaluation methods (counting-based, network-based); and evaluation metrics (Spearman's rank correlation coefficient, DCG, RI).)

TABLE 1 shows an example of selected features for evaluating paper impact, including references, selected features, statistical feature, network feature, explicit feature, implicit feature, and evaluating paper impact.

The number of citations has been used as a metric to evaluate paper impact for a long time [31]. Since the number of citations is relatively easy to obtain, it is frequently manipulated through self-citation, mutual citation, and friends' citations. Some self-citation is legitimate, because a research subject can produce output in several stages and earlier results can be the foundation of later ones; but if a self-citation serves only to increase the citation count, it misleads scholarly evaluation and introduces unfairness into the evaluation system. For such inappropriate citations, previous researchers proposed methods that weaken the influence of self-citation by relying on the higher-order citation network [27].

Previous research shows that the impact of a paper decays over time, which confirms that the age of a paper is a factor influencing its impact. Generally, an old paper has more citations than a new one, but its work has already been covered by newer papers, so it tends to attract fewer citations in the future. Parolo et al. [14] showed that the decay of the attention paid to a paper is a universal phenomenon, with a decay rate close to a power law; in some cases papers are forgotten more quickly, and the faster attention decay fits an exponential curve. The time factor, the prestige of a paper, and the prestige of the author were used to evaluate scholarly paper impact [37]; based on these three factors, the authors evaluated paper impact by predicting the future citation counts of scholarly papers. Wang et al. [33] considered the aging factor to evaluate paper impact because it captures the fact that new ideas are integrated in subsequent work. Wang et al. [38] first developed three indices: the time-weighted citation count, the citation width, and the citation depth.
They then leveraged entropy to weight these indices to evaluate paper impact. Chan et al. [39] discussed how the impact of authors and affiliations can influence the impact of their papers. They argued that the reputation of authors and the impact of their affiliations can boost paper impact in the early stages after publication, but that this influence decays quickly in the following stages. Chen et al. [34] found scientific gems using Google's PageRank algorithm on the citation network. Zhang et al. [35] evaluated the impact of authors and papers based on heterogeneous author-citation academic networks.

In addition to the factors mentioned above, other factors have also been used to evaluate paper impact, such as individual, institutional, and international collaboration, reference impact, reference totals, keyword totals, and abstract readability [40]. Preferential attachment, fitness, and aging were used to quantify long-term scientific impact, and these three factors drive the citation history of a scholarly paper [33]. In this research, preferential attachment captures the fact that highly cited papers are more likely to be cited again than less-cited papers; fitness captures the inherent differences between scholarly papers; aging was introduced above. It can be traced back to the journal impact factor, which was once used as a criterion for assessing the impact of a paper [41]. Altmetrics evaluate scholarly impact based on activities on social media platforms, such as citations, blogs, tweets, download statistics, and attributions in research articles [36]. Altmetrics scores have been used to complement the evaluation of scholarly papers with new insights [42]. Since most factors that influence the impact of a paper are known, the evaluation methods and corresponding indices can be designed.
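The entropy weighting mentioned above can be sketched with the standard entropy-weight method: indices whose values vary more across papers carry more information and receive larger weights. The index values for the four papers below are invented; this is an illustrative sketch, not the exact procedure of [38].

```python
import math

def entropy_weights(matrix):
    # matrix[i][j]: value of index j for paper i (all values > 0).
    # An index with a more uneven distribution has lower entropy
    # and therefore receives a larger weight.
    n, m = len(matrix), len(matrix[0])
    raw = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        p = [v / total for v in col]
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        raw.append(1.0 - e)
    s = sum(raw)
    return [w / s for w in raw]

# Hypothetical values of three indices for four papers:
# time-weighted citation count, citation width, citation depth.
papers = [
    [12.0, 5.0, 2.0],
    [30.0, 9.0, 4.0],
    [ 7.0, 3.0, 1.0],
    [55.0, 6.0, 8.0],
]
w = entropy_weights(papers)
scores = [sum(wi * xi for wi, xi in zip(w, row)) for row in papers]
```

The final paper score is the weight-combined value of the three indices.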
B. COUNTING-BASED EVALUATION METHODS AND INDICES
TABLE 2 compares counting-based evaluation methods and indices along the following aspects: method and reference, selected factors, importance of each citation, advantage, and disadvantage.

Garfield et al. [47] first proposed using the number of citations to assess the impact of scholarly papers. Citations are the simplest and most direct counting-based index of paper impact. However, citations as an evaluation metric have some drawbacks; for example, the metric depends heavily on the time since publication: the longer a paper has been published, the more citations it accumulates. Considering this drawback, previous
TABLE 1: An example of selected features for evaluating paper impact. (Each entry lists: reference; selected features; feature type (statistical/network, explicit/implicit); evaluating paper impact.)
[14]: citation rate of a paper, time; statistical, explicit; the citation rate of a paper at a given time.
[27]: a relative citation weight; statistical, implicit; applying a relative citation weight to the higher-order quantum PageRank algorithm.
[30]: collaboration times, time span of collaboration, citing times, and time span of citing; statistical, explicit; weakening the Conflict of Interest (COI) relationship in the citation network.
[31]: number of citations; statistical, explicit; using the number of citations.
[32]: citations, authors, journals/conferences, and publication time information; network, explicit, implicit; integrating the selected features into PageRank and HITS algorithms.
[33]: preferential attachment, aging, fitness; statistical, implicit; identifying the three fundamental mechanisms to evaluate long-term impact.
[34]: importance of paper; network, explicit; applying the Google PageRank algorithm to obtain the relative importance of all publications.
[35]: citation relevance and author contribution; network, explicit; using the selected features to weight the citation network and authorship network to evaluate paper impact.
[36]: Altmetrics; statistical, explicit; monitoring citations, blogs, tweets, download statistics, and attributions in research articles.
[37]: prestige of a paper, prestige of author, time; statistical, network, implicit; using the citation network, the authorship network, and the publication time of the article for predicting future citations.
[38]: time-weighted citation count, citation width, citation depth; statistical, implicit; using entropy to weight the three indices.

research used the journal impact factor to quantify the impact of papers [41]. The reason is that, to a certain extent, journal impact can characterize paper impact. However, Seglen et al. [41] summarized the problems associated with the use of journal impact factors and found that the journal impact factor is not representative of individual papers.
It has been recognized that not all citations are of equal importance, and hence the importance of citations needs to be distinguished [45]. To distinguish the importance of citations, previous researchers have made many attempts. Wan et al. [44] divided the importance of a citation into five levels, called citation strength. In their research, the importance of a citation is determined by the following features: occurrence times, located section, time interval, the average length of citing sentences, the average density of citation occurrences, and self-citation. A SVR model is then used to calculate every citation's importance level, given some manually labeled data. The impact of a paper is calculated by summing up all the citation strengths. Their experimental results showed that ranking papers by citation strength fits the ground truth better. Zhu et al. [45] distinguished the importance of citations by identifying a set of four features that are useful for determining the impact of a scholarly paper: the citation location in the paper, the semantic similarity between the titles of the cited paper and the content of the citing paper, the cited frequency, and the number of citations in the literature.

Anfossi et al. [46] argued that it is more reasonable to rank papers by combining the information of several indicators than by using only one. In their paper, an evaluation tool was proposed, which used the paper's normalized distribution of
TABLE 2: An example of counting-based method comparison for evaluating paper impact. (Each entry lists: method and reference; selected factors; importance of each citation; advantage; disadvantage.)
citations [43]: citations; equal; easy to obtain; easy to manipulate, and strongly dependent on the paper's age.
impact factor [41]: number of papers, citations of papers, time; equal; easy to calculate; easy to manipulate, and hard to unify impact factors across different disciplines.
SVR model [44]: occurrence times, located section, time interval, self-citation; unequal; distinguishes the importance of citations; hard to calculate.
supervised machine learning model [45]: citation location, semantic similarities, cited frequency, number of citations; unequal; distinguishes the importance of citations; hard to calculate.
paper's normalized distribution of citations and JIF [46]: distribution of citations, JIF; equal; feasible at the scale of a national evaluation exercise; easy to manipulate.

citation and JIF, and located a paper in the (citation, JIF) space intuitively as a scatter plot. This space was then divided into regions by drawing thresholds as weighted combinations of the paper's citations and JIF, as shown in Function (1):

f_n(CIT, JIF) = Const_n + a_{1,n} · CIT + a_{2,n} · JIF + a_{3,n} · CIT · JIF + a_{4,n} · CIT^2 + a_{5,n} · JIF^2 + · · ·   (1)

where Const_n is a constant that controls the segmentation of the regions and CIT denotes the paper's citations. Different calibrations of the segmentation result in different classifications of articles. Before Anfossi's work, Ancaiani et al. [48] analyzed a large number of research outcomes submitted by Italian universities and other research bodies.

Nowadays, more and more research results and papers are spread on social media, which helps promote a scholar's impact. The number of downloads, shares, or comments on papers in online social networks has already become a group of metrics for evaluating research outputs, known as Altmetrics [36]. Social network-based Altmetrics are increasingly widely used as an emerging group of evaluation metrics for papers. Xia et al. [49] analyzed how Twitter and Facebook users affect the influence of papers published in Nature. They found that Twitter users more easily spread the impact of papers published in
Nature. Although Altmetrics can complement and improve the assessment of paper impact, they are not authoritative as an evaluation indicator, mainly because Altmetrics are as easily manipulated as citations. Methods for quantifying academic impact based on Altmetrics therefore need further exploration.
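Anfossi's region-based classification described earlier in this subsection can be sketched as a score of the polynomial form in Function (1) compared against region thresholds. The coefficients, threshold values, and region labels below are invented for illustration, not taken from [46].

```python
def region_score(cit, jif, coeffs, const):
    # Weighted combination of citation count and JIF terms, following
    # the polynomial form of Function (1); coefficients are illustrative.
    a1, a2, a3, a4, a5 = coeffs
    return (const + a1 * cit + a2 * jif
            + a3 * cit * jif + a4 * cit ** 2 + a5 * jif ** 2)

def classify(cit, jif, thresholds, coeffs=(1.0, 2.0, 0.0, 0.0, 0.0), const=0.0):
    # Assign a paper to the first region whose threshold its score exceeds;
    # thresholds are sorted from highest region to lowest.
    score = region_score(cit, jif, coeffs, const)
    for label, t in thresholds:
        if score >= t:
            return label
    return "D"  # lowest region if no threshold is met

thresholds = [("A", 40.0), ("B", 20.0), ("C", 5.0)]  # invented cut-offs
label = classify(cit=30, jif=8.0, thresholds=thresholds)
```

Moving the thresholds (the "calibration" in the text) changes which region a given (citation, JIF) point falls into.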
C. NETWORK-BASED EVALUATION METHODS AND INDICES
TABLE 3 compares network-based evaluation methods and indices along the following aspects: method and reference, scholarly network, homogeneous or heterogeneous network, algorithms, advantage, and disadvantage.

A classical network-based evaluation method is the PageRank algorithm [18]. Another famous algorithm for evaluating the importance of nodes is HITS. Both methods have been used to quantify the impact of papers: PageRank is applied to a homogeneous scholarly network, while HITS is applied to a heterogeneous scholarly network. FIGURE 2 shows several typical scholarly networks for paper impact evaluation, such as the citation network, co-author network, paper-author network, and paper-journal network. The four scholarly networks are generated from 10 randomly selected authors in the computer science area of the MAG dataset. Different node colors indicate different types of academic entities, and the lines between them indicate their scholarly relationships.

Chen et al. [34] applied the Google PageRank algorithm to all publications in the Physical Review family of journals from 1893 to 2003 to find exceptional papers. PageRank can find the linear relation among papers in the citation network. Recently, London et al. [55] proposed a local form of PageRank to evaluate paper impact using only a small set of nodes extracted from the whole citation network. A paper that has more citations, or that has been cited by an important paper, receives a higher score from the algorithm. However, the classical PageRank algorithm is not time-sensitive. This leads to the unreasonable result that an out-of-date paper may still receive a high impact score because of citations accumulated long ago, even though its contribution has already been superseded by many newer publications. To overcome this problem, Walker et al. [50] introduced CiteRank, which adds time-based weighting to PageRank to promote recent publications.
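The basic PageRank computation on a citation network can be sketched as a power iteration: each paper distributes its score evenly over the papers it cites, plus a uniform teleport term. The toy graph and damping factor below are illustrative.

```python
def pagerank(citations, d=0.85, iters=100):
    # citations[i] = list of papers that paper i cites.
    # Dangling papers (no references) spread their score uniformly,
    # which keeps the total score mass equal to 1.
    n = len(citations)
    score = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - d) / n] * n
        for i, refs in enumerate(citations):
            if refs:
                share = score[i] / len(refs)
                for j in refs:
                    new[j] += d * share
            else:
                for j in range(n):
                    new[j] += d * score[i] / n
        score = new
    return score

# Toy citation graph: papers 1, 2, and 3 all cite paper 0,
# so paper 0 should come out on top.
ranks = pagerank([[], [0], [0], [0]])
```

This also exposes the time-insensitivity criticized in the text: nothing in the iteration discounts old citations, which is exactly what CiteRank adds.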
TABLE 3: An example of network-based method comparison for evaluating paper impact. (Each entry lists: method and reference; scholarly network; homogeneous/heterogeneous; algorithms; advantage; disadvantage.)
PageRank [34]: citation network; homogeneous; PageRank; begins to use a structured approach to quantify paper impact; does not consider attenuation of paper impact over time.
CiteRank [50]: citation network; homogeneous; PageRank; promotes the impact of recent publications; does not consider the impact of author and journal.
nonlinearity PageRank [51]: citation network; homogeneous; PageRank; can control the paper's score accumulation; does not consider the impact of author and journal.
PageRank-type method [52]: co-author network, citation network, author-paper network, paper-text feature network, author-text feature network; homogeneous and heterogeneous; PageRank; can control the paper's score accumulation; does not consider the impact of author and journal.
HITS-type method [53]: citation network, co-author network; homogeneous and heterogeneous; HITS; can evaluate a paper and its author at the same time; does not consider the impact of journal.
Tri-Rank [54]: citation network, co-author network, venue citation network; homogeneous and heterogeneous; HITS; can rank authors, papers, and venues simultaneously in heterogeneous networks; does not consider attenuation of paper impact over time.
FutureRank [37]: citation network, paper-author network; homogeneous and heterogeneous; PageRank, HITS; combines information about citations, authors, and publication time to rank papers; does not consider the impact of journal.
CAJTRank [32]: citation network, paper-author network, paper-journal network; homogeneous and heterogeneous; PageRank, HITS; combines information about citations, authors, journal, and publication time to rank papers; citation weights are equal.
COI-Rank [30]: citation network, paper-author network, paper-journal network; homogeneous and heterogeneous; PageRank, HITS; can distinguish the importance of citations in a heterogeneous scholarly network; the COI relationship contains many factors and is not easy to mine.
higher-order weighted quantum PageRank [27]: citation network; homogeneous; Quantum PageRank; can reveal the actual impact of papers, including necessary self-citations; time-costly.
The function of this method is as follows:

T = I · ρ + (1 − α) W · ρ + (1 − α)^2 W^2 · ρ + · · ·   (2)

T is the vector of final scores of all papers. W is the transfer probability matrix, where W_ij = 1/k_j^out if j cites i and 0 otherwise, with k_j^out the out-degree of the j-th paper. ρ_i is the initial probability of selecting the i-th paper in the citation network, given as ρ_i = e^{−age_i/τ_dir}, where age_i denotes the number of years since the i-th paper was published.

Many efforts have been made to adapt PageRank to the characteristics of the academic network. Yao et al. [51] introduced nonlinearity into the PageRank algorithm by aggregating the scores from downstream neighboring nodes in a nonlinear way. The iteration function correspondingly changes into the following form:

s_i(t) = α + (1 − α) [ Σ_{j=1}^n (δ_{k_j^out, 0}/N) s_j(t−1)^{θ+1} + Σ_{j=1}^n A_ij (1 − δ_{k_j^out, 0}) (s_j(t−1)/k_j^out)^{θ+1} ]^{1/(θ+1)}   (3)

By tuning the value of θ, this method can control the paper's score accumulation and make it more sensitive to the citer's impact. This nonlinear method considers a citation from a high-impact paper to be more valuable than one from a low-impact paper.
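The CiteRank series in Function (2) can be sketched by summing successive applications of the transfer matrix to the age-discounted start vector ρ. The toy graph, ages, and parameter values below are invented for illustration.

```python
import math

def citerank(citations, ages, alpha=0.5, tau=2.6, iters=100):
    # citations[j] = list of papers cited by paper j, so a step of the
    # series moves score from each citing paper to the papers it cites
    # (W_ij = 1/k_j^out if j cites i).
    # rho_i = exp(-age_i / tau): random readers start from recent papers.
    n = len(citations)
    rho = [math.exp(-a / tau) for a in ages]
    # T = rho + (1 - alpha) W rho + (1 - alpha)^2 W^2 rho + ...
    t = list(rho)
    term = list(rho)
    for _ in range(iters):
        nxt = [0.0] * n
        for j, refs in enumerate(citations):
            if refs:
                share = term[j] / len(refs)
                for i in refs:
                    nxt[i] += (1.0 - alpha) * share
        term = nxt
        t = [a + b for a, b in zip(t, term)]
    return t

# Toy example: paper 0 is old but is cited by two recent papers (1 and 2),
# so the traffic from recent readers flows back to it.
scores = citerank([[], [0], [0]], ages=[10.0, 1.0, 0.5])
```

An old paper thus scores well only if recent papers keep citing it, which is the time sensitivity missing from plain PageRank.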
FIGURE 2: Several typical scholarly networks for paper impact evaluation.

Wang et al. [52] proposed a PageRank-type method that uses several scholarly networks to rank papers, including a time-aware co-author network (M_AA), a time-aware paper citation network (M_PP), an author-paper network (M_AP) indicating paper authorship, a paper-text feature network (M_PT) indicating the paper's textual features, and an author-text feature network (M_AT). The iteration equation is:

R^{t+1} = M R^t,   (4)

where R = [A_P^T, A_A^T, A_F^T]^T and

M = [ α_p M_PP Λ_I       β_p (1 − α_p) M_PA      (1 − β_p)(1 − α_p) M_PT ;
      β_a M_AP           α_a M_AA Λ_I            (1 − β_a)(1 − α_a) M_AT ;
      (1 − α_f) Λ_E M_PT α_f Λ_E M_TA            Λ ].

Λ_I and Λ_E are diagonal matrices with diagonal elements Λ_ii = 1 and Λ_ii = E_i, respectively, and Λ is a zero matrix. The vectors A_P, A_A, and A_F are the authority scores of papers, authors, and text features, respectively.

Jiang et al. [56] took the dynamic evolution of the citation network into account and put forward a method in the same spirit as PageRank. The method integrates three factors in scientific development: knowledge accumulation by individual papers, knowledge diffusion through citation behavior, and knowledge decay as time elapses. It then uses a random-walk process on the citation network to describe these three factors. The dynamically evolving process is simulated by partitioning all papers according to their publication time and adding them to the citation network incrementally in time sequence.

Another type of method is based on HITS [19]. Zhou et al. [53] applied the HITS algorithm to the paper citation network and the co-author network, which are connected by authorship. In both networks, node scores are first calculated by PageRank, and then HITS is performed on the bipartite graph to obtain the final scores of papers and authors. This method can therefore evaluate the impact of authors and their papers at the same time.
The iteration function is as follows:

a^{t+1} = (1 − λ)(Ã^T)^m a^t + λ DA^T (AD^T DA^T)^k d^t
d^{t+1} = (1 − λ)(D̃^T)^n d^t + λ AD^T (DA^T AD^T)^k a^t   (5)

where the matrices A and D are the transfer probability matrices of the co-author network and the citation network, respectively. Ã is the iteration matrix of the PageRank process on the co-author network, given by Ã = (1 − α) A + (α/n_A) I, where I is a matrix with all elements equal to 1; D̃ has the analogous meaning for the citation network. The vector a stores the scores of all authors and the vector d stores the scores of all papers. A similar method is the Tri-Rank algorithm proposed in [54] in 2014, which takes the paper's publication information into account and performs a HITS-type method on three linked networks, adding a venue citation network to the two networks used before.

In addition, some methods combine PageRank and HITS to evaluate the impact of papers. A typical one is FutureRank, proposed in [37]. Different from other methods, FutureRank ranks the impact of papers and authors by predicting their future PageRank scores. The PageRank algorithm is first used to rank papers via the citation network, and then the HITS algorithm is used to calculate the authority score of papers and the hub score of authors based on the hybrid network. After calculating the PageRank score of papers, the authority score of papers, and the hub score of authors, the final evaluation result is obtained by weighting these scores, as shown in Function (6):

S(P_i) = α · PageRank(P_i) + β · Authority(P_i) + γ · Hub(P_i) + (1 − α − β − γ) · 1/n   (6)

where n is the number of nodes in the network. Wang et al. [32] proposed a similar method that adds a journal/conference network to indicate where a paper was published. The evaluation method has the same form as FutureRank, but it can rank journals/conferences together.
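The final weighting step of FutureRank in Function (6) can be sketched directly: each paper's score is a convex combination of its PageRank score, its HITS authority score, a hub term, and a uniform remainder. The component scores and weights below are invented and assumed to be precomputed and normalized.

```python
def futurerank_score(pagerank, authority, hub, alpha=0.4, beta=0.4, gamma=0.1):
    # S(P_i) = alpha*PageRank(P_i) + beta*Authority(P_i) + gamma*Hub(P_i)
    #          + (1 - alpha - beta - gamma) * 1/n,
    # following the form of Function (6); weights are illustrative.
    n = len(pagerank)
    rest = (1.0 - alpha - beta - gamma) / n
    return [alpha * p + beta * a + gamma * h + rest
            for p, a, h in zip(pagerank, authority, hub)]

# Hypothetical normalized component scores for three papers.
s = futurerank_score([0.5, 0.3, 0.2],   # PageRank scores
                     [0.6, 0.3, 0.1],   # HITS authority scores
                     [0.4, 0.4, 0.2])   # hub-term scores
```

Because each component sums to one and the weights are convex, the combined scores also sum to one, so they remain comparable across papers.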
Using the HITS algorithm can also evaluate the quality of papers and authors. Based on this work, Bai et al. [30] ranked scholarly papers by investigating citation relationships to weaken Conflict of Interest relationships in the citation network; to a certain extent, this method weakens the impact of self-citation. Besides, Bai et al. [27] quantified the impact of scholarly papers based on higher-order weighted citations. In this research, a higher-order weighted quantum PageRank algorithm was developed to reflect multi-step citation behavior. One advantage of this method is that it can weaken the effect of manipulated citation activities.

III. EVALUATION OF SCHOLAR IMPACT
The evaluation of scholars is always related to their papers. Many methods evaluate a paper together with its authors, such as Co-Rank [53], Tri-Rank [54], FutureRank [37], and the s-index [57]. These network-based methods usually rank several academic entities together, because the information provided by a single network is often not enough for a reasonable evaluation. There are also counting-based evaluation methods, such as the famous h-index, for quantifying author impact. In this section, we compare different counting-based methods in terms of method and reference, selected factors, importance of each citation, advantage, and disadvantage. We also compare different network-based methods in terms of method and reference, scholarly network, homogeneous or heterogeneous network, algorithms, advantage, and disadvantage. Besides, we discuss the evolution of existing methods and indices and summarize their issues. One possible direction is to explore higher-order academic network analysis, author impact inflation, and the academic success gene.
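The h-index mentioned above, and the g-index introduced earlier in the survey, can be computed directly from a scholar's citation counts; the citation list below is invented for the example.

```python
def h_index(citations):
    # Largest h such that at least h papers have >= h citations each.
    cites = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(cites, start=1):
        if c >= i:
            h = i
    return h

def g_index(citations):
    # Largest g such that the top g papers together have >= g^2 citations.
    cites = sorted(citations, reverse=True)
    total, g = 0, 0
    for i, c in enumerate(cites, start=1):
        total += c
        if total >= i * i:
            g = i
    return g

papers = [25, 8, 5, 3, 3, 1]  # hypothetical citation counts of one author
h, g = h_index(papers), g_index(papers)
```

The example also shows the age bias discussed in the introduction: both indices can only grow as papers accumulate citations, so younger scholars are structurally disadvantaged.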
A. FACTORS INFLUENCING THE IMPACT OF SCHOLARS
The author impact evaluation has undergone a transition from unstructured measures to structured measures [29]. The factors used by researchers to assess author impact have ranged from simple statistical factors to structural factors, and from explicit factors to implicit factors. Currently, the commonly used factors influencing the impact of scholars can be divided into six categories: paper-related, author-related, venue-related, social-related, reference-related, and temporal-related factors. TABLE 4 shows an example of selected factors for evaluating author impact.

In the scientific community, scholars can continuously accumulate academic impact, but to some extent, the inherent impact of scholars determines their final research results. Since the papers published by scholars can represent their impact, paper-related factors are frequently used to measure the impact of scholars. These factors are selected primarily to consider the quality and quantity of papers. However, these factors can lead to bias. The academic output of scholars is generally related to their academic age: scholars with a longer academic age may have more output, so simply evaluating scholar impact in terms of output is strongly biased against newcomers. Such biases also exist when evaluating scholar impact across research fields. Scientists have made many attempts to eliminate the imbalance between disciplines in evaluating scholar impact. In addition, the allocation of contributions among the co-authors of a scholarly paper may also lead to bias in scholar impact evaluation. Shen et al. [68] developed a credit allocation algorithm to capture co-authors' contributions.

To a certain extent, author-related factors and venue-related factors can reflect a scholar's impact. Dong et al. [66] found that two factors, the impact of scholars and venues, played a key role in improving the h-index of lead authors. Deville et al.
[69] discussed the mobility patterns of scientists at an institutional level and success in their scientific careers. They found that when scholars switch from high-impact institutions to low-impact institutions, both research quality and output decline, suggesting that the academic environment has an impact on academic outcomes. Scholars also use online platforms (Google Scholar, Microsoft Academic Search) and social media to enhance their academic impact. Mas-Bleda et al. [70] found that although most highly cited scholars working in European institutions had institutional web pages, they rarely maintained them. Most of them used other social media instead, which has also accelerated the development of altmetrics.

In addition, reference-related factors and time-related factors have attracted scholars' attention. Dong et al. [66] researched scholar impact considering two reference-related factors: the ratio of max-h-index citations of references to the total number of references of the paper, and the average number of citations accumulated by the references of the paper. Zhang et al. [67] considered academic innovation and assessed scholar impact with a time-aware ranking algorithm, allocating more credit to newly published papers according to representative time functions. Based on the above factors, many evaluation indices have been proposed to quantify scholar impact. In the following two subsections we introduce the counting-based evaluation methods and indices, and the network-based evaluation methods and indices, respectively.
TABLE 4: An example of selected factors for evaluating author impact.
Factors | Factor category | Explicit factor | Implicit factor | References
number of citations | paper-related | yes | no | [58], [59]
number of publications | paper-related | yes | no | [59], [60]
paper scores | paper-related | yes | no | [61]
shared keywords between author and paper | paper-related | yes | no | [62]
PageRank | paper-related | no | yes | [63], [64]
paper authority vector | paper-related | no | yes | [65]
number of authors | author-related | yes | no | [59]
Maximum Entropy | author-related | yes | no | [60]
venue scores | venue-related | yes | no | [61]
journal impact factor | venue-related | yes | no | [60]
paper's references cited by the author before; ratio of the paper's references cited by the author before; paper's references in the author's previous publications | reference-related | yes | no | [62], [66]
times the author attended the paper's venue before; ratio of times the author attended the paper's venue before | reference-related | yes | no | [62]
time | time-related | yes | no | [67]
B. COUNTING-BASED EVALUATION METHODS AND INDICES
In 2005, Hirsch [20] proposed the famous h-index to evaluate scholar impact, which is the most widely used metric in the whole scientific community. A scholar's h-index of h means that he or she has at least h papers cited at least h times. The advantages of the h-index include that it is easy to compute and that its definition combines the quantity and quality of a scholar's outputs. But some scholars argue that the h-index has many shortcomings, such as the imbalance between different disciplines, the allocation of co-authors' impact, and the neglect of highly cited papers. To keep the impact of highly cited papers from being ignored, Egghe [21] proposed the g-index. If the citations of all papers published by an author are listed in descending order, the g-index is the largest number g such that the top g papers together received at least g^2 citations. Similar to the g-index, Jin et al. [71] proposed the R-index and AR-index to overcome the shortcomings of the h-index. The R-index is defined as

R-index = sqrt( Σ_{i=1}^{h} cit_i ),   (7)

where h is the author's h-index and cit_i is the number of citations of the i-th h-core paper (the h-core papers are those cited at least h times). The AR-index takes the age of publications into account and is calculated by

AR-index = sqrt( Σ_{i=1}^{h} cit_i / a_i ),   (8)

where a_i denotes the i-th paper's age. For the same purpose, Zhang [72] divided the author's citation function into three parts: the h-squared part representing the information of the h-index itself, the excess part representing the information of papers having more citations than the h-index, and the h-tail part representing the information of papers with fewer citations. Then, a triangle mapping technique was used to map these three parts to a regular triangle to make the analysis easier.
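For concreteness, the counting-based indices above can be computed directly from an author's list of citation counts. A minimal sketch (helper names are ours), following the definitions of the h-index and g-index and Eqs. (7)-(8):

```python
import math

def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    cs = sorted(citations, reverse=True)
    return sum(1 for i, c in enumerate(cs, start=1) if c >= i)

def g_index(citations):
    """Largest g such that the top g papers together have >= g^2 citations."""
    cs = sorted(citations, reverse=True)
    total, g = 0, 0
    for i, c in enumerate(cs, start=1):
        total += c
        if total >= i * i:
            g = i
    return g

def r_index(citations):
    """Eq. (7): square root of the total citations of the h-core papers."""
    h = h_index(citations)
    cs = sorted(citations, reverse=True)
    return math.sqrt(sum(cs[:h]))

def ar_index(citations, ages):
    """Eq. (8): like the R-index, but each h-core paper's citations are
    divided by its age; `ages` is aligned element-wise with `citations`."""
    ranked = sorted(zip(citations, ages), reverse=True)
    h = h_index(citations)
    return math.sqrt(sum(c / a for c, a in ranked[:h]))
```

For example, the citation record [10, 8, 5, 4, 3] gives h = 4 (four papers with at least four citations each) and g = 5 (all five papers together have 30 >= 25 citations), illustrating how the g-index rewards the highly cited top papers that the h-index truncates.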
An author's impact was mapped to these three parts correspondingly: the excess (e-index) representing research quality, the h-tail (t-index) representing research quantity, and the h-square (h-index) representing the average. This method used three independent parts to quantify an author's impact. In this work, authors are divided into two types. The first type have published several high-quality papers; these authors have a lower h-index but a higher e-index. The second type have published a large number of low-quality papers; these authors have relatively high h-index and t-index but a lower e-index. Dorogovtsev et al. [73] developed the o-index to highlight the impact of the most cited papers. An author's o-index is defined as o = sqrt(hm), where h is the author's h-index and m is the number of citations of his or her most cited paper(s).

Another disadvantage of the h-index is that it treats all authors of a paper equally. The authors of a multi-authored paper usually do not contribute equally to the work; therefore, the h-index leads to bias. Many studies have tried to solve this problem. Wang et al. [74] presented the A-index to quantify the relative contributions of co-authors. Based on the A-index, Stallings et al. [60] developed a collaboration index, the C-index, to quantify author impact. The C-index is defined by

C-index = Σ_{k=1}^{K} A_k,   (9)

where A_k is the author's A-index for the k-th paper. The P-index was proposed to quantify a researcher's impact by considering the quality of publications, and is given by

P-index = Σ_{k=1}^{K} A_k · JIF_k,   (10)

where JIF_k is the impact factor of the journal where the k-th paper was published. Besides, some researchers pointed out that authors with different citation patterns may get the same h-index. Farooq et al. [75] proposed the DS-index, which is an extension of the g-index and intends to provide a distinctive ranking for authors with similar citation patterns.
The DS-index is defined as

DS-index = Σ_{k=1}^{g} cit_k,   (11)

where g is the number of g-core papers and cit_k is the k-th g-core paper's citation count. Analogous to the h-core papers, the g-core papers are those used to calculate the author's g-index.

The indices introduced above are all extensions and improvements of the h-index. The h-index can partly reflect the publication behavior and citation distribution of an author. To quantify scholar impact more reasonably, Sinatra et al. [76] explored the citation distributions of physicists and found that the highest-impact work of a scholar is randomly distributed within his or her academic career. Based on this random-impact rule, they proposed a stochastic model in which a unique parameter Q is assigned to predict scholar impact. The Q-value of an author i is calculated by

Q_i = e^{⟨log c_{iα}⟩ − μ_p},   (12)

where Q_i is the Q value of author i, ⟨log c_{iα}⟩ is the average logarithmic citation count over all papers α published by author i, and μ_p is the average impact of luck in the success of papers.

Citation-based author impact evaluation methods show differences among disciplines. Waltman et al. [77] found that the fractional counting method can give a more suitable result for cross-field scholar evaluation. Radicchi et al. [78] proposed a universal variant of the h-index to solve this problem, named the h_f-index. In 2013, together with Radicchi, Kaur et al. [79] improved the h_f-index and proposed a new method to compare scientific impact across disciplinary boundaries. The new h_s-index introduced in their work is an h-index normalized by the average h-index of all authors in the same discipline. Lima et al.
[80] considered that a paper can belong to several research areas; an author's impact in an area was calculated from the papers published in that area, using the author's percentile rank. Finally, the impact of an author was quantified by summing up impact across all areas. Although this method can reduce the bias among different disciplines, authors who are active in a rapidly developing area can still get a higher score than others in basic disciplines.

C. NETWORK-BASED EVALUATION METHODS AND INDICES
Because counting-based evaluation methods are easily manipulated in evaluating scholar impact, scholars have explored structured methods to overcome these shortcomings. The network-based evaluation methods for scholars have evolved from homogeneous scholarly networks to heterogeneous scholarly networks [32], [37], [52]–[54], [57], [65], [81]. The scholarly networks are made up of academic entities, including scholars, papers, journals or conferences, and institutions. Ding et al. [82] used the PageRank algorithm to quantify author impact based on an author co-citation network. Yan et al. [83] developed P-Rank, which used three different networks, including a citation network, an authorship network, and a publish-relationship network, to evaluate the impact of authors, papers, and journals. A HITS-type method was first performed to update the scores of papers, authors, and journals in the authorship network and publish-relationship network. Then these scores were used as the nodes' initial values to run PageRank on the citation network to get the final scores of papers. Because HITS-type algorithms are well suited to heterogeneous academic networks, mining the academic relationships of heterogeneous networks in depth can make them work better. Amjad et al. [84] considered the topic distributions of scholarly entities generated by Latent Dirichlet Allocation (LDA) [85] and proposed a topic-based ranking method called Topic-based Heterogeneous Rank (TH Rank). Because of the network complexity and the cost of computing LDA, TH Rank is not an efficient algorithm. Li et al. [86] put forward a method named QRank to rank authors effectively and efficiently. Nykl et al.
[87] used the PageRank algorithm together with several individual evaluation indices, including h-index, publication count, citation count, and the author count of a publication, to rank scholars.

Although the existing network-based evaluation methods have achieved certain results, they still have the following problems: (1) most previous studies quantify author impact based on first-order academic networks; (2) citation inflation influences the real impact of an author; (3) the origin of the academic success gene is unknown. Therefore, higher-order academic network analysis, author impact inflation, and the academic success gene need to be explored.
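To make the heterogeneous, mutually reinforcing style of these methods concrete, the toy sketch below (data and names are illustrative; this is a HITS-type scheme loosely in the spirit of P-Rank, not the published algorithm) lets authors and journals inherit scores from their papers and papers be reinforced by their authors and venue:

```python
def reinforce(paper_authors, paper_journal, iters=50):
    """paper_authors[p] = set of authors of paper p;
    paper_journal[p] = journal of paper p.
    Returns normalized (paper, author, journal) score dicts."""
    papers = set(paper_authors)
    authors = {a for au in paper_authors.values() for a in au}
    journals = set(paper_journal.values())
    p = {x: 1.0 for x in papers}

    def norm(d):
        s = sum(d.values()) or 1.0
        return {k: v / s for k, v in d.items()}

    for _ in range(iters):
        # authors and journals inherit the scores of their papers
        a = norm({x: sum(p[q] for q, au in paper_authors.items() if x in au)
                  for x in authors})
        j = norm({x: sum(p[q] for q, jn in paper_journal.items() if jn == x)
                  for x in journals})
        # papers are reinforced by their authors and their venue
        p = norm({q: sum(a[x] for x in paper_authors[q]) + j[paper_journal[q]]
                  for q in papers})
    return p, a, j
```

In a small example where journal J1 publishes two of three papers, J1 ends up with the higher journal score, and the paper with two authors in the stronger venue ranks highest, illustrating how scores propagate among entity types.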
IV. EVALUATION OF JOURNAL IMPACT
A journal's impact derives from the papers it publishes, and authors are more willing to publish papers in journals with a high impact. The evaluation of journals is therefore associated with the evaluation of papers and authors. There are several famous publishing groups around the world, including Elsevier, Springer, Wiley, Wolters Kluwer, and Pearson. It is worth mentioning that the famous journals Lancet and Cell are published by Elsevier, and Nature is published by Macmillan. Since 1975, Journal Citation Reports (JCR) has provided the previous year's Impact Factor (IF) of journals, together with other evaluation indicators such as the journal's current rank, abbreviated journal title, International Standard Serial Number (ISSN), total cites, immediacy index, total articles, and cited half-life. Since then, JCR has been taken as an important data resource for quantifying journals. The JCR metrics have become the most popular indices for evaluating journals, and several other metrics have followed, such as the Eigenfactor Score (EF) and the yearly JCR from Thomson Reuters, and CiteScore from Elsevier. Nowadays, many evaluation methods and metrics beyond the JCR metrics have been proposed. In the following subsections, we discuss the evolution of the existing methods and indices and summarize the issues of these journal evaluation methods. One possible solution is to explore journal impact inflation and higher-order academic network analysis.

A. FACTORS INFLUENCING THE IMPACT OF JOURNALS
Some classical high-impact journals, such as Nature and Science, have maintained their status for many years. A journal's quality is decided by the quality of the papers published in it, and many metrics for evaluating journals are based on citations. The development of the Internet has promoted paper citation, as well as journal impact; therefore, open-access journals may have a higher impact than subscription-based ones.

Journal impact is strongly discipline-dependent; that is, different disciplines have different authoritative journals. Besides, a journal's type may influence its impact factor. Some journals prefer to publish review papers, while others publish long research papers and short papers. Generally, a review journal's impact factor is higher than those of other journals in the same discipline.
B. THE JOURNAL CITATION REPORTS
The Journal Citation Reports started in 1975. It now provides rankings of more than 10,000 high-quality journals every year and is released on the Web of Science (WoS). The evaluation of journal impact involves several commonly used metrics, such as the journal's total cites, journal impact factor, impact factor without journal self-citations, 5-year impact factor, immediacy index, cited half-life, citing half-life, Eigenfactor score, article influence score, and the number of citable items of the journal. This report is widely seen as the most authoritative assessment of journals.

The journal impact factor, which usually refers to the 2-year impact factor, was proposed by Garfield in 1955 [47]. The JIF of a journal in year n is defined as follows:

JIF_n = (C_{n−1} + C_{n−2}) / (P_{n−1} + P_{n−2}),   (13)

where P_{n−1} and P_{n−2} are the numbers of papers published in the journal in years n−1 and n−2, and C_{n−1} and C_{n−2} are the numbers of citations received in year n by the papers published in those two years. The computation of the 5-year journal impact factor is the same as that of the 2-year impact factor, considering the numbers of papers and citations of the journal in the most recent 5 years. The impact factor without journal self-citations eliminates the influence of the journal's self-citations, which gives a more objective evaluation of the journal's impact. The cited half-life is the number of years taken to reach half of the total citations of the journal, which indicates the persistence of a journal's impact. The citing half-life is defined as the number of years for the references to accumulate to half of the total, which indicates the novelty of the references.

Other metrics, such as the immediacy index, Eigenfactor score, and article influence score, are designed to cover shortcomings of the impact factor. The immediacy index is defined as the average citation count of papers published in the journal in the given year, which reflects the journal's impact in that year. The Eigenfactor score is calculated from the journal citation network without self-citations, using a PageRank-type method [88].
C. ANALYSIS AND IMPROVEMENT OF THE JCR
Although the JCR metrics are widely used, relying on a single metric to assess journals leads to bias. Much effort has been devoted to overcoming these shortcomings, and many other metrics have been proposed, such as the h-index for journals [15], the SCImago Journal Rank (SJR) [89], and the Source Normalized Impact per Paper (SNIP) [90]. In addition to using a single metric, it has been found that the ranking result can be improved by combining these common metrics in some way, such as computing their harmonic means [91] or using a neural network to find a non-linear representation [92]. Serenko et al. [93] found that scholars tend to prefer familiar journals and give them a higher evaluation, which suggests that introducing personal opinions into the evaluation of journals may be helpful. Tsai et al. [94] studied the correlation between subjective evaluation (scholars' personal opinions) and objective evaluation (journal rank by JIF and h-index) and used the Borda counting method to combine the two ranking results. Beets et al. [95] ranked accounting journals by referencing departmental journal lists, which are used to evaluate faculty publications in several famous business schools.

Many scholars are also concerned about the relationships among the different journal rankings produced by these metrics [96]–[101]. Setti [99] argued that it is impossible to capture the real impact of journals with any single indicator. Different evaluation methods quantify journals from different views, so which metrics are more useful always depends on the application scenario. Sometimes it is meaningful to rank journals only by the percentage of highly cited publications of a journal [102]. Besides, the evaluation of journals in different disciplines, or in different fields of the same subject, also needs discussion [102], [103].

Chatterjee et al. [104] studied the citation distribution and found that a few highly cited papers hold most of the citations in both journals and institutions. Based on many studies of the
citation distribution of journals, Kao et al. [105] proposed a stochastic-dominance-analysis-based method to evaluate journals.

D. NETWORK-BASED EVALUATION METHODS AND INDICES
The most used methods for evaluating nodes in a network are PageRank and HITS. As discussed in the previous sections, the HITS algorithm can be used to rank papers, authors, and journals together. There are also PageRank-type methods designed for ranking journals, which have a basic form like

r(J_i) = (1 − λ) x_i + λ Σ_j [ r(J_j) × w(J_j, J_i) / Σ_k w(J_j, J_k) ],   (14)

where x_i is the adaptive damping factor and satisfies Σ_{i=1}^{N} x_i = 1. Generally, the value of x_i is set to 1/N. r(J_i) represents the importance score of journal i [106].

Based on the PageRank algorithm, Chen [23] added expert judgments to the method as a weighting part and optimized the function by Particle Swarm Optimization (PSO). In the same way, Lim et al. [107] used the relevance and importance of the citations between journals to design a weighted PageRank. Zhang proposed the HR-PageRank algorithm to evaluate journal impact via a PageRank weighted according to the author's h-index and the relevance between citing and cited papers [108]. Bohlin et al. [109] studied the different performances of zero-order (the classical Markov model), first-order, and second-order Markov models for ranking journals and found that higher-order Markov models performed better and were more robust.

Some evaluation methods consider the structural position of journals in the journal citation network. Zhang et al. [24] proposed an indicator named the Quality-Structure Index (QSI), which ranks journals by their intrinsic popularity and structural position. The intrinsic popularity is quantified by some frequently used metrics, such as the JIF, Eigenfactor score, and PageRank score. Similarly, Leydesdorff [25] introduced the betweenness centrality of journals in the journal citation network to the assessment task.
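The basic PageRank-type journal update of Eq. (14), with x_i = 1/N and w(J_j, J_i) taken to be the number of citations from journal j to journal i, can be sketched as follows (toy data and names; not any group's released implementation):

```python
def journal_rank(w, lam=0.85, iters=100):
    """w[j][i] = citations from journal j to journal i.
    Iterates Eq. (14) with a uniform damping vector x_i = 1/N."""
    journals = set(w)
    for row in w.values():
        journals |= set(row)
    n = len(journals)
    r = {j: 1.0 / n for j in journals}
    for _ in range(iters):
        # (1 - lam) * x_i term, with x_i = 1/n
        nxt = {j: (1 - lam) / n for j in journals}
        # lam * sum_j r(J_j) * w(J_j, J_i) / sum_k w(J_j, J_k)
        for j, row in w.items():
            total = sum(row.values())
            if total:
                for i, wij in row.items():
                    nxt[i] += lam * r[j] * wij / total
        r = nxt
    return r
```

For example, with three journals where A sends most of its citations to B but C is cited by both A and B, C ends up ranked above B; as in PageRank generally, scores depend on where citations come from, not just how many arrive.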
Su [26] gave a link-based representation of some frequently used metrics for journals, such as the JIF, and proposed a link-based fusion method that fuses several metrics according to the links in and among the paper citation network, authorship network, and paper publishing network. This method provides a new way to consider many metrics together when evaluating academic entities.

Based on the above analysis, the existing journal evaluation methods still have the following problems: (1) most previous studies quantify journal impact based on first-order academic networks; (2) citation inflation influences the real impact of a journal. Therefore, researchers need to explore higher-order academic network analysis and journal impact inflation to resolve these challenging issues of journal evaluation.

V. OPEN ISSUES AND CHALLENGES
In this section, we present several open issues and challenges for further research in this area, including the pattern of collaboration impact, unified evaluation standards, implicit success factor mining, dynamic academic network embedding, and scholarly impact inflation.
A. PATTERN OF COLLABORATION IMPACT
A significant amount of work has focused on quantifying the impact of scholarly papers, scholars, and journals [27], [76], [108]. However, little is known about how the impact of scientific collaboration evolves over time. Previous researchers measured the impact of co-authors by citations, which are easily manipulated, so structured methods for measuring the impact of co-authors are urgently needed in the science community. With the available large-scale datasets on citations and collaborations, it becomes possible to explore the patterns of collaboration impact over scientific careers and their potential relationships with scientists' success. Since structured methods are needed to quantify the impact of co-authors, how to construct the network for measuring collaborative impact and how to model it remain broader challenges. One possible solution is to construct a heterogeneous academic network in which the impact of co-authors is quantified; based on this, researchers can explore the pattern of collaboration impact.
B. UNIFIED EVALUATION STANDARDS
We have mentioned many automatic evaluation methods that try to find high-quality papers among a mass of publications. But these methods can only suggest which papers may be useful; the contents of the recommended papers are not considered by the algorithms. Therefore, considerable effort is still required to find the papers one needs in the research process. Moreover, although there are many automatic evaluation methods, there is no unified evaluation standard to decide which method outperforms the others. A widely accepted ground truth is greatly needed in evaluation systems. To solve this problem, the datasets must be unified first.
C. IMPLICIT SUCCESS FACTOR MINING
In the past, more attention has been given to explicit success factors. In author impact evaluation research, researchers have found explicit success factors such as academic age, institution, research field, and country [110]. However, little is known about the mechanisms of the temporal evolution of success in science, and uncovering the origin of success factors in science is a challenging task. Whether success in science depends on exogenous factors, such as the mentor-student relationship, learning habits, and education level, remains unknown. Actively exploring the relationship between exogenous factors and academic success may provide a method for implicit success factor mining.

D. DYNAMIC ACADEMIC NETWORK EMBEDDING
Many static network embedding methods have been proposed; however, academic networks evolve over time. For example, in citation networks, citing papers and cited papers change dynamically, e.g., new citations are continuously added to the network when authors cite previous research work. To learn the representations of nodes in dynamic scholarly networks, the existing academic network embedding methods need to be run repeatedly, which is time-consuming. Therefore, further study of dynamic scholarly network embedding algorithms remains an open challenge in this area. To obtain efficient representations, a deep feature learning and representation model supported by dynamic academic data may need to be established.
E. SCHOLARLY IMPACT INFLATION
Scholarly impact inflation, which arises from the exponential growth of scholarly papers, affects the real value of scholarly impact, thereby distorting the comparative evaluation of papers, scholars, journals, institutions, and country output across different periods [111]. Scholars can also increase their citations by relying on their friends and co-authors, indicating that citations are easily manipulated. Much work has focused on unraveling the dynamics of citation inflation [30], [112]–[114]. Against the background of citation inflation, how to construct an evaluation network of scholarly impact and how to model it are surprisingly difficult questions, highlighting the broader challenge of evaluating scholarly impact in the science community. One possible solution is to weaken citation inflation through higher-order academic networks.
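As a simple illustration of deflation (a common normalization idea used here for exposition, not a method proposed in the surveyed papers), one can rescale each paper's citation count by the average count for its publication year, making impact comparable across periods with very different citation volumes:

```python
from collections import defaultdict

def deflate(papers):
    """papers: list of (year, citations) records.
    Returns citation counts divided by the average for each paper's year."""
    by_year = defaultdict(list)
    for year, c in papers:
        by_year[year].append(c)
    avg = {y: sum(cs) / len(cs) for y, cs in by_year.items()}
    return [c / avg[y] if avg[y] else 0.0 for y, c in papers]
```

For example, a paper with 10 citations in a year whose papers average 20 and a paper with 1 citation in a year whose papers average 2 both deflate to 0.5, i.e., the same relative impact despite raw counts differing by an order of magnitude.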
VI. CONCLUSION
In this paper, we conduct a comprehensive review of the literature on quantifying success in science, focusing on evaluation indices of scholarly impact. Two changes have taken place in research on quantifying success in science: (1) from unstructured evaluation indices to structured evaluation indices; (2) from single-disciplinary impact assessment to interdisciplinary impact assessment. However, the literature-based analysis leads to the conclusion that although a large number of evaluation indices have been used to address the problems of quantifying success in science, the solutions to some potential issues remain unknown, such as the pattern of collaboration impact, implicit success factor mining, dynamic academic network embedding, and scholarly impact inflation. To solve these challenging issues, researchers can explore higher-order scholarly networks, heterogeneous network analysis and modeling, and academic relationship identification.
REFERENCES

[1] F. Xia, W. Wang, T. M. Bekele, and H. Liu, "Big scholarly data: A survey," IEEE Transactions on Big Data, vol. 3, no. 1, pp. 18–35, 2017.
[2] J. Liu, X. Kong, F. Xia, X. Bai, L. Wang, Q. Qing, and I. Lee, "Artificial intelligence in the 21st century," IEEE Access, vol. 6, pp. 34403–34421, 2018.
[3] W. Wang, J. Liu, F. Xia, I. King, and H. Tong, "Shifu: Deep learning based advisor-advisee relationship mining in scholarly big data," in Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 2017, pp. 303–310.
[4] W. Wang, J. Liu, Z. Yang, X. Kong, and F. Xia, "Sustainable collaborator recommendation based on conference closure," IEEE Transactions on Computational Social Systems, vol. 6, no. 2, pp. 311–322, 2019.
[5] L. He, H. Fang, X. Wang, W. Yuyong, H. Ge, C. Li, C. Chen, Y. Wan, and H. He, "The 100 most-cited articles in urological surgery: A bibliometric analysis," International Journal of Surgery, vol. 75, pp. 74–79, 2020.
[6] F. Xia, H. Liu, I. Lee, and L. Cao, "Scientific article recommendation: Exploiting common author relations and historical preferences," IEEE Transactions on Big Data, vol. 2, no. 2, pp. 101–112, 2016.
[7] F. Xia, Z. Chen, W. Wang, J. Li, and L. T. Yang, "MVCWalker: Random walk-based most valuable collaborators recommendation exploiting academic factors," IEEE Transactions on Emerging Topics in Computing, vol. 2, no. 3, pp. 364–375, 2014.
[8] X. Bai, I. Lee, Z. Ning, A. Tolba, and F. Xia, "The role of positive and negative citations in scientific evaluation," IEEE Access, vol. 5, pp. 17607–17617, 2017.
[9] X. Bai, H. Liu, F. Zhang, Z. Ning, X. Kong, I. Lee, and F. Xia, "An overview on evaluating and predicting scholarly article impact," Information, vol. 8, no. 3, p. 73, 2017.
[10] T. Amjad, Y. Rehmat, A. Daud, and R. Ayaz Abbasi, "Scientific impact of an author and role of self-citations," Scientometrics, vol. 122, no. 2, pp. 915–932, 2020.
[11] X. Bai, F. Zhang, J.
Ni, L. Shi, and I. Lee, "Measure the impact of institution and paper via institution-citation network," IEEE Access, vol. 8, pp. 17548–17555, 2020.
[12] N. A. Ebrahim, H. Salehi, M. A. Embi, F. H. Tanha, H. Gholizadeh, and S. M. Motahar, "Visibility and citation impact," Social Science Electronic Publishing, vol. 7, no. 4, pp. 120–125, 2014.
[13] J. Liu, T. Tang, W. Wang, B. Xu, X. Kong, and F. Xia, "A survey of scholarly data visualization," IEEE Access, vol. 6, pp. 19205–19221, 2018.
[14] P. D. B. Parolo, R. K. Pan, R. Ghosh, B. A. Huberman, K. Kaski, and S. Fortunato, "Attention decay in science," Journal of Informetrics, vol. 9, no. 4, pp. 734–745, 2015.
[15] T. Braun, W. Glanzel, and A. Schubert, "A Hirsch-type index for journals," Scientometrics, vol. 69, no. 1, pp. 169–173, 2006.
[16] E. Garfield, "Citation analysis as a tool in journal evaluation," Science, vol. 178, no. 4060, pp. 471–479, 1972.
[17] Y. Chen, Q. Jin, H. Fang, H. Lei, J. Hu, Y. Wu, J. Chen, C. Wang, and Y. Wan, "Analytic network process: Academic insights and perspectives analysis," Journal of Cleaner Production, vol. 235, pp. 1276–1294, 2019.
[18] L. Page, S. Brin, R. Motwani, and T. Winograd, "The PageRank citation ranking: Bringing order to the web," Stanford Digital Libraries Working Paper, vol. 9, no. 1, pp. 1–14, 1998.
[19] J. M. Kleinberg, "Authoritative sources in a hyperlinked environment," Journal of the ACM, vol. 46, no. 5, pp. 604–632, 1999.
[20] J. E. Hirsch, "An index to quantify an individual's scientific research output," Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 46, pp. 16569–16572, 2005.
[21] L. Egghe, "Theory and practise of the g-index," Scientometrics, vol. 69, no. 1, pp. 131–152, 2006.
[22] S. Alonso, F. J. Cabrerizo, E. Herrera-Viedma, and F. Herrera, "hg-index: A new index to characterize the scientific output of researchers based on the h- and g-indices," Scientometrics, vol. 82, no. 2, pp. 391–400, 2010.
[23] Y. L. Chen and X. H.
Chen, "An evolutionary PageRank approach for journal ranking with expert judgements," Wireless Personal Communications, vol. 54, no. 3, pp. 467–484, 2011.
[24] C. Zhang, X. Liu, Y. Xu, and Y. Wang, "Quality-Structure index: A new metric to measure scientific journal influence," Journal of the American Society for Information Science and Technology, vol. 62, no. 4, pp. 643–653, 2011.
[25] L. Leydesdorff, "Betweenness centrality as an indicator of the interdisciplinarity of scientific journals," Journal of the Association for Information Science & Technology, vol. 58, no. 9, pp. 1303–1319, 2009.
[26] P. Su and Q. Shen, "Link-based methods for bibliometric journal ranking," Soft Computing, vol. 17, no. 12, pp. 2399–2410, 2013.
[27] X. Bai, F. Zhang, J. Hou, I. Lee, X. Kong, A. Tolba, and F. Xia, "Quantifying the impact of scholarly papers based on higher-order weighted citations," PLoS ONE, vol. 13, no. 3, p. e0193192, 2018.
VOLUME 4, 2016