Measure the Impact of Institution and Paper via Institution-citation Network
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.DOI
XIAOMEI BAI, FULI ZHANG, JIN NI, LEI SHI, IVAN LEE
Computing Center, Anshan Normal University, Anshan 114007, China
Information Center, Anshan Normal University, Anshan 114007, China
Adult Education Institute, Anshan Normal University, Anshan 114007, China
Science and Technology Department, Anshan Normal University, Anshan 114007, China
School of Information Technology and Mathematical Sciences, University of South Australia, Adelaide SA 5001, Australia
Corresponding author: Xiaomei Bai (e-mail: [email protected]). This work was partially supported by Liaoning Provincial Key R&D Guidance Project (2018104021) and Liaoning Provincial Natural Fund Guidance Plan (20180550011).
ABSTRACT
This paper investigates the impact of institutions and papers over time based on the heterogeneous institution-citation network. A new model, IPRank, is introduced to measure the impact of institutions and papers simultaneously. This model utilises a heterogeneous structural measure to unveil the impact of institutions and papers, reflecting the effects of citation, institution, and structural measure. To evaluate the performance, the model first constructs a heterogeneous institution-citation network based on the American Physical Society (APS) dataset. Subsequently, PageRank is used to quantify the impact of institutions and papers. Finally, impact scores belonging to the same institution are merged, and the rankings of institutions and papers are calculated. Experimental results show that the IPRank model better identifies universities that host Nobel Prize laureates, demonstrating that the proposed technique well reflects impactful research.
INDEX TERMS
Institution impact, paper impact, institution-citation network.
I. INTRODUCTION
Scientific impact is evaluated at different levels, ranging from high level at national and institutional scales to low level at researcher and paper scales [1]–[3]. Many studies focus on scientific impact measurement, scholarly network analysis, and the success of science [4]–[8]. While many of these studies explore scientific impact at a particular timeframe, there is growing interest in understanding the evolution of scientific impact in the "science of science" [9], [10]. For scientific impact measurement, the citation network is an often-used technique [11], [12], whereas the heterogeneous scholarly network has attracted growing attention recently [13], [14]. Quantifying scientific impact in the heterogeneous scholarly network is closely related to structural measures, citation analysis, and behavioral complexity. A subset of heterogeneous scholarly networks is the evolving network of institutions and papers over time, which forms the structural foundation for advancing scientific discoveries, gauging scientists' performances, ranking universities, and allocating funding. A heterogeneous scholarly network relationship is shown in Figure 1, where nodes I represent research institutions and nodes P represent papers. In Figure 1, a directed link between two papers points from the citing paper to its reference, and the bi-directional links between a paper and its signed institutions indicate that the institution publishes the paper and the paper belongs to the institution. Quantifying paper impact is a longstanding point of research [3], [15]–[18]. Previous studies have mainly focused on unstructured measures or structured measures [19]. Unstructured measures rely on citations of scholarly papers or Altmetrics, including downloads, views, shares, and citations [20]. Citations attracted by a scholarly paper can sometimes be correlated to its age, which favors older publications.
Altmetrics are suitable for quantifying the impact of a paper in the early stage of publication. However, both metrics are easily manipulated by scholars who can artificially increase the number of citations. Compared to unstructured metrics, structured metrics more adequately quantify the impact of a paper. The most representative structured measures are the PageRank and HITS algorithms [21]–[24]. The PageRank algorithm is often used in homogeneous networks such as the citation network and the co-author network [25]. The HITS algorithm is used in heterogeneous scholarly networks such as the paper-author network and the paper-journal network [12].
FIGURE 1: An example of heterogeneous institution-citation network.
Quantifying institution impact has always been a focus of scientific researchers [9], [10], [26]–[30]. Currently, quantifying institutional impact is limited to unstructured metrics and homogeneous structured metrics. Several unstructured methods are widely recognized, such as the Academic Ranking of World Universities (ARWU), the QS World University Rankings (QS), the Times Higher Education World University Rankings (THE), and the Performance Ranking of Scientific Papers for World Universities (NTU) [10], [31]. However, unstructured metrics rely heavily on the number of bibliometric indicators. To develop a structured quantitative method to measure institutional impact, Massucci et al. [11] integrated PageRank into the citation network of institutions. However, despite these significant efforts, the correlation between institutional impact and paper impact in the heterogeneous scholarly network remains unclear.
Possible reasons include: institution impact evaluation is moving from unstructured to structured metrics; and compared to evaluating institution impact in a homogeneous network, evaluating institution impact in a heterogeneous network is a more complicated task. Therefore, we develop a quantitative model, IPRank, to improve the understanding of institution and paper impact in the heterogeneous scholarly network. With the unprecedented expansion of publications and the availability of large-scale datasets on publications, institutions and citations, the analysis of the institution and paper network and their quantification in a heterogeneous network are now possible. In this paper, we address two main questions. First, we construct a heterogeneous institution-citation network and derive the statistical model of the institution-citation network, making it possible to simultaneously quantify the impact of institutions and scholarly papers. Second, we develop a structured measurement based on the institution-citation network by utilizing PageRank to quantify the impact of institutions and scholarly papers.
The rest of this paper is organized as follows. Section II summarizes recent work on the evaluation of institution and paper impact. Section III introduces the proposed IPRank model framework in detail. The experimental results are shown and discussed in Section IV. Section V draws concluding remarks of the study.
II. RELATED WORK
Quantifying the impact of scholarly papers has been extensively investigated. Early studies are mostly based on the number of citations. Garfield proposed using citation counts as the measure of scholarly paper impact [32], and he also developed the Journal Impact Factor (JIF) as the measure of journal impact [33]. However, citation-based approaches have certain limitations; for example, impact factors across different disciplines cannot be unified. Citations as a metric to measure the impact of papers have been controversial, especially due to the existence of questionable citations [34]. To resolve this problem, on-going research has explored structured metrics to quantify paper impact [14], [21], [35]–[37]. These studies are mostly based on scholarly networks, including homogeneous networks (the citation network of papers, the citation network of institutions, and the co-author network) and heterogeneous networks (the paper-author network, the paper-venue network, and the author-venue network). Chen et al. [21] found scientific gems with Google's PageRank algorithm via the citation network. The reason behind it is that important papers attract more citations, including citing papers of high importance, which increase the importance of the cited papers. On the basis of this work, Jiang et al. [38] integrated mutual reinforcement relationships based on the three homogeneous networks and the three heterogeneous networks by applying the PageRank and HITS algorithms. Subsequently, Wang et al. [12] measured the impact of papers by exploiting citations, authors, journals and time information via the homogeneous and heterogeneous scholarly networks mentioned above. Compared to the work of Jiang et al., Wang et al. [12] introduced a time feature to evaluate the impact of papers, favoring recent scholarly papers with higher scores. Inspired by the work of Wang et al. [12] and Ioannidis [39], Bai et al.
[2] proposed COIRank to measure the impact of papers by identifying anomalous citation patterns to adjust citation weights. Liang et al. [14] proposed a novel mutual ranking algorithm based on the heterogeneous academic hypernetwork by employing the mutual reinforcement relationship. Bai et al. [40] developed a higher-order weighted quantum PageRank algorithm based on the behaviour of multiple-step citation flow. The citation dynamics with higher-order dependencies reveal the actual impact, and better distinguish the impact from self-citation. Compared to the evaluation of paper impact, quantification of institutional impact is more complicated [26], [41]–[43]. Previous metrics are mainly based on statistics of features, including researcher-based features (staff winning Nobel Prizes, number of highly cited researchers, international collaboration), paper-based features (articles published in Nature and Science, article index, number of publications, high-quality publications, normalized impact, excellence rate, co-publications), institution-based features (university-industry co-publications), and other features such as availability of research funding and graduation rates [28], [44], [45]. These features are relatively easy to obtain, and they reflect the impact of an institution. However, these quantitative indicators have certain drawbacks. Therefore, structured metrics have been investigated to quantify the impact of institutions [11], [46]. Bai et al. [46] first explored conflict of interest (COI) relationships to discover negative citations and weaken the associated citation strength. Furthermore, the PageRank and HITS algorithms were utilized to measure the impact of papers based on the citation network, paper-author network and paper-journal network. Finally, the institutional impact was calculated from the impact of all publications of the institution. Massucci et al.
[11] studied the citation patterns among universities and used the PageRank algorithm based on the citation network between institutions. In their study, the citation relationships between papers are converted into citation relationships between the signed institutions of the papers. However, while the citation relationship between any two papers is one-to-one, a paper can be signed by multiple institutions, so the citation relationships between institutions are more complicated.
III. METHODS
A. DATA SOURCES AND DATA PRE-PROCESSING
Our experiments are based on the American Physical Society (APS) dataset, which consists of all papers published in Physical Review from 1894 to 2013, spanning the following journals: Physical Review A, B, C, D, E, I, L, ST and Reviews of Modern Physics. This dataset includes the title of each paper, author names, author affiliations, date of publication, and a list of cited papers. In this study, we consider papers and institutions that meet the following criteria: (1) Paper and institution details are complete and in the right format. (2) At least one institution is found for a paper. (3) The first institution associated with each author is retained. (4) Each institution is retained at the first-level unit. For example, we retain Sloane Physics Laboratory, Yale University as Yale University. (5) Institutions with the same name are merged. For example, University of California at Berkeley and California University at Berkeley are merged into University of California, Berkeley. It is worth mentioning that before 1952, the University of California at Berkeley was called the University of California. Therefore, in our research, these two names were unified as the University of California, Berkeley. Through the above pre-processing, a summary of the basic statistics of the APS dataset from 1894 to 2013 is given in Table 1. The entire APS dataset from 1894 to 2013 is used to quantify the long-term impact of institutions and papers. Correspondingly, for examining the short-term impact of institutions and papers, we summarize the information of the APS dataset during different time periods, also shown in Table 1. We choose a five-year period to quantify the impact of institutions and papers, mainly referring to the Global Ranking of Academic Subjects (ARWU-GRAS) ranking of institutions [11].
In addition to counting the number of papers, the number of institutions, the number of links between papers, and the number of links between papers and institutions, we count the number of references of the papers published, including papers published from 1894 to 2013. These references are also used to quantify the short-term impact of institutions and papers. The reason is that the literature cited at any time is attributed to the impact of institutions during this period. For instance, to quantify the impact of an institution from 2009 to 2013, we need to construct an institution-citation network, which contains papers published during this time period, the references of these papers, and the related institutions. A detailed introduction of the institution-citation network is covered in the next section.
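The pre-processing rules above can be sketched in code. This is an illustrative sketch, not the authors' pipeline: the helper names (`first_level_unit`, `normalise`, `institutions_of_paper`), the variant table, and the comma-based affiliation parsing are all simplifying assumptions.

```python
# Illustrative sketch of the pre-processing criteria: keep each author's
# first affiliation, truncate affiliations to their first-level unit,
# and merge known name variants. The variant table below is a toy example.

NAME_VARIANTS = {
    "University of California at Berkeley": "University of California, Berkeley",
    "California University at Berkeley": "University of California, Berkeley",
    "University of California": "University of California, Berkeley",  # pre-1952 name
}

def first_level_unit(affiliation: str) -> str:
    """Keep only the top-level institution, e.g.
    'Sloane Physics Laboratory, Yale University' -> 'Yale University'.
    Here we simply take the last comma-separated component (an assumption)."""
    return affiliation.split(",")[-1].strip()

def normalise(affiliation: str) -> str:
    """Truncate to the first-level unit, then merge known name variants."""
    unit = first_level_unit(affiliation)
    return NAME_VARIANTS.get(unit, unit)

def institutions_of_paper(author_affiliations: list[list[str]]) -> set[str]:
    """One affiliation list per author; retain each author's first affiliation."""
    return {normalise(affs[0]) for affs in author_affiliations if affs}

# Example: a paper whose two authors list one or more affiliations each.
paper = [
    ["Sloane Physics Laboratory, Yale University", "CERN"],
    ["University of California at Berkeley"],
]
print(institutions_of_paper(paper))
# A paper with no affiliation information at all would be discarded (criterion 2).
```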
B. IPRANK MODEL FRAMEWORK
In this section, we introduce the IPRank model (see Figure 2), which is a PageRank-based model for quantifying the impact of institutions and scholarly papers. The framework first constructs the institution-citation network. The PageRank algorithm is then used to quantify the impact of institutions and papers. Finally, we merge the impact of institutions, and rank institutions and papers.
1) Constructing the institution-citation network
There is a good deal of literature in information science dealing with the citation network between papers [2], [21] and the citation network between institutions [11] to quantify the impact of papers and institutions. However, to our knowledge, no detailed construction of an actual institution-citation network has been attempted in the past. In this paper, the institution-citation network is a heterogeneous and directed scholarly network, consisting of two categories of nodes: institutions and papers. In addition, there are two types of links: the citation links between scholarly papers, and the links between institutions and papers.
TABLE 1: Statistical summary of the APS dataset for different time periods.

$$A=\begin{pmatrix}0&0&0&1&1&0&0\\1&0&0&0&1&1&0\\1&0&0&0&0&0&1\\1&0&0&0&0&0&0\\1&1&0&0&0&0&0\\0&1&0&0&0&0&0\\0&0&1&0&0&0&0\end{pmatrix}\;\rightarrow\;B=\begin{pmatrix}0&0&0&1&1/2&0&0\\1/4&0&0&0&1/2&1&0\\1/4&0&0&0&0&0&1\\1/4&0&0&0&0&0&0\\1/4&1/2&0&0&0&0&0\\0&1/2&0&0&0&0&0\\0&0&1&0&0&0&0\end{pmatrix}$$

$$\rightarrow\;PR(i)=(1-\alpha)\frac{1}{N}+\alpha\sum_{j\in IN(i)}B_{ij}\,PR(j)\;\rightarrow\;PR^{(1)}=(1-\alpha)\begin{pmatrix}1/7\\\vdots\\1/7\end{pmatrix}+\alpha B\begin{pmatrix}1/7\\\vdots\\1/7\end{pmatrix}$$
FIGURE 2: IPRank model.
Given a set of institutions I = {I_1, I_2, ..., I_m} and a set of scholarly papers P = {P_1, P_2, ..., P_n}, let E_PP denote the citations between scholarly papers and E_PI denote the relationships between papers and institutions. The heterogeneous institution-citation network can be represented as a graph G = (I ∪ P, E_PP ∪ E_PI). For an institution-citation network with m institutions and n papers, graph G can be represented by the adjacency matrix A:

$$A=\begin{pmatrix}A_{PP}&A_{PI}\\A_{IP}&0\end{pmatrix}\qquad(1)$$

where A_PP represents the citation matrix between papers, and A_PI and A_IP represent the links between institutions and papers. A_PI = A_IP^T, since the links between institutions and papers are symmetric.
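The block structure of Eq. (1) can be assembled directly from the two edge sets. Below is a minimal sketch (not the authors' code), assuming papers are indexed before institutions and that a citation entry is stored in the cited paper's row, which matches the example matrix A shown in Figure 2.

```python
# Sketch of assembling the adjacency matrix of Eq. (1) from the edge sets
# E_PP (citations) and E_PI (paper-institution links). Node order: papers
# first, then institutions; paper-institution links are symmetric, and the
# institution-institution block is zero.

def build_adjacency(n_papers, n_insts, citations, paper_inst):
    """citations: (citing, cited) paper pairs; paper_inst: (paper, inst) pairs.
    Returns an (n+m) x (n+m) 0/1 matrix with blocks [[A_PP, A_PI], [A_IP, 0]]."""
    size = n_papers + n_insts
    A = [[0] * size for _ in range(size)]
    for citing, cited in citations:
        # Store the citation in the cited paper's row, as in Figure 2's matrix,
        # so that importance flows from citing papers to their references.
        A[cited][citing] = 1
    for p, i in paper_inst:
        # Paper-institution links are bidirectional: A_PI and its transpose A_IP.
        A[p][n_papers + i] = 1
        A[n_papers + i][p] = 1
    return A

# The toy network of Figure 2 (0-based indices): P1 cites P2 and P3;
# P1 is signed by I1 and I2, P2 by I2 and I3, P3 by I4.
A = build_adjacency(
    3, 4,
    citations=[(0, 1), (0, 2)],
    paper_inst=[(0, 0), (0, 1), (1, 1), (1, 2), (2, 3)],
)
for row in A:
    print(row)
```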
2) IPRank Model
The motivation of our method is described as follows: (1) If a scholarly paper is cited by many other publications, the paper has high importance. (2) If a scholarly paper with high importance links to other papers, the importance of the linked papers increases accordingly. (3) If an institution publishes many papers and these papers are cited by many other papers, the institution has high importance. (4) If a scholarly paper with high importance is linked to an institution, the importance of the linked institution increases accordingly.
Figure 2 illustrates the IPRank model framework by examining a simple situation: given three papers P_1, P_2 and P_3, paper P_1 is signed by two institutions, I_1 and I_2, paper P_2 by two institutions, I_2 and I_3, and paper P_3 by one institution, I_4. Paper P_1 cites paper P_2 and paper P_3; therefore, a simple citation network can be constructed, which is an unweighted directed graph. According to the relationships between P_1, P_2, P_3 and I_1, I_2, I_3, I_4, the links between them can be added to the citation network; thus, a simple institution-citation network (graph G) is constructed. Let A denote the adjacency matrix of G, and let B denote the transition probability matrix derived from A. The institution-citation network is thus represented by the stochastic matrix B. For a node i, the PageRank score PR(i) is defined as the unique solution of the following formula:
$$PR(i)=(1-\alpha)\frac{1}{N}+\alpha\sum_{j\in IN(i)}B_{ij}\,PR(j)\qquad(2)$$

where PR(i) represents the importance of node i in the institution-citation network, and α (the damping factor) is a constant between 0 and 1, set to 0.85 in our experiments; the value of α follows the original Google PageRank algorithm [21]. N represents the number of nodes in the institution-citation network, j ∈ IN(i) indicates that node j links to node i, and PR(j) represents the importance of node j. The linear algebraic definition of PageRank is equivalent to simulating a random walk: at each step, with probability α the walker moves to a randomly chosen out-neighbor of the current node, and with probability 1 − α it teleports to a node chosen uniformly at random. According to Equation (2), we finally obtain the prestige scores of institutions and papers in the heterogeneous network. The pseudocode of the IPRank model is listed in ALGORITHM 1.
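Eq. (2) can be solved by power iteration. The sketch below runs it on the seven-node toy network of Figure 2 (papers P1–P3, institutions I1–I4); the function name and the column-normalisation of A into B are our reading of the model, not the authors' code.

```python
# Power-iteration sketch of Eq. (2) on the toy institution-citation network.
# B is the column-normalised adjacency matrix, so each node distributes its
# score equally over its links; alpha = 0.85 as in the paper.

def pagerank(A, alpha=0.85, tol=1e-10):
    N = len(A)
    # Column-normalise A to obtain the transition probability matrix B.
    col_sums = [sum(A[i][j] for i in range(N)) for j in range(N)]
    B = [[A[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(N)]
         for i in range(N)]
    pr = [1.0 / N] * N
    while True:
        new = [(1 - alpha) / N + alpha * sum(B[i][j] * pr[j] for j in range(N))
               for i in range(N)]
        if max(abs(a - b) for a, b in zip(new, pr)) < tol:
            return new
        pr = new

# Adjacency matrix of the Figure 2 example (rows list each node's in-links).
A = [
    [0, 0, 0, 1, 1, 0, 0],   # P1 <- I1, I2
    [1, 0, 0, 0, 1, 1, 0],   # P2 <- P1 (citation), I2, I3
    [1, 0, 0, 0, 0, 0, 1],   # P3 <- P1 (citation), I4
    [1, 0, 0, 0, 0, 0, 0],   # I1 <- P1
    [1, 1, 0, 0, 0, 0, 0],   # I2 <- P1, P2
    [0, 1, 0, 0, 0, 0, 0],   # I3 <- P2
    [0, 0, 1, 0, 0, 0, 0],   # I4 <- P3
]
labels = ["P1", "P2", "P3", "I1", "I2", "I3", "I4"]
scores = pagerank(A)
for name, s in sorted(zip(labels, scores), key=lambda x: -x[1]):
    print(f"{name}: {s:.4f}")
```

On this toy network the scores reproduce the ordering discussed in the text: I4 is the top institution, I1 the lowest, and P1 the lowest-ranked paper.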
ALGORITHM 1: Rank institutions and papers
Input: matrix A_PP ∈ R^{n×n}, matrix A_PI ∈ R^{n×m}, matrix A_IP ∈ R^{m×n}
Output: scores PR(i)
Initialize matrix A;
Compute the transition probability matrix B;
Initialize the scores PR(i);
for node i in the institution-citation network do
    step 1: calculate the score PR(i) according to Eq. (2);
    step 2: update the score PR(i);
end
Iterate steps 1 and 2 until convergence;
Return the scores PR(i);

The importance of an institution and the importance of a scholarly paper are their
PR values in the institution-citation network. As expected, papers P_2 and P_3 are cited by paper P_1, and paper P_3 belongs only to institution I_4; compared to paper P_2, which belongs to two institutions, I_2 and I_3, paper P_3 passes all of its score to I_4, so I_4 is the most influential institution among the four institutions. Only paper P_1 is not cited by other papers among the three papers; therefore, the prestige score of paper P_1 is the lowest of the three papers. Since institution I_1 links only to paper P_1, and paper P_1 has a low prestige score, the score of institution I_1 is the lowest among the four institutions. Paper P_1 and paper P_2 each belong to two institutions, and they share the same institution I_2. Since paper P_1 cites paper P_2, the importance of paper P_2 is higher than the importance of paper P_1. Similarly, paper P_2 and paper P_3 are both cited by paper P_1; since paper P_2 is signed by two institutions, I_2 and I_3, while paper P_3 is signed by only one institution, I_4, the importance of institution I_4 is higher than the importance of institution I_2.

IV. RESULTS
We compare the similarity of institution rankings between IPRank and IRank [11]. Both algorithms can be classified as structured metrics; however, IPRank is based on the heterogeneous institution-citation network whereas IRank is based on the homogeneous citation network between institutions. Table 2 shows the Spearman correlation coefficient between IPRank and IRank. According to Table 2, we observe a high correlation between IPRank and IRank for the top 10 to top 100 institutions. In terms of the long-term impact of institutions, the Spearman correlation coefficient between IPRank and IRank ranges from 0.73 to 0.88 for the top 10 to top 100 ranked institutions. In particular, for the top 10 institutions, the Spearman correlation coefficient between IPRank and IRank is the highest, reaching 0.88. In terms of the short-term impact of institutions, the Spearman correlation coefficient of the two algorithms changes relatively little between 1994 and 1998, ranging from 0.87 to 0.93. Compared to the period from 1994 to 1998, in the two periods 1999 to 2003 and 2004 to 2008, the correlation coefficient varies more widely, from 0.62 to 0.92 and from 0.71 to 0.93, respectively. In the five years between 2009 and 2013, the Spearman correlation coefficient between IPRank and IRank is the highest for the top 10 institutions, reaching 0.99, and the lowest for the top 90 institutions, reaching 0.84. We also compare the similarity of paper rankings between the IPRank and IRank algorithms (see Table 3). In terms of long-term paper impact, the correlation coefficient between the two algorithms is generally on the rise for the top 10 to top 100 papers, and ranges from -0.30 to 0.79. During the period from 1994 to 1998, the correlation coefficient between them is higher than 0.58, and all values are positive.
Between 1999 and 2003, for the top 10 to top 50 papers, the correlation coefficient between the IPRank and IRank algorithms is positive and higher than 0.68. During the same period, for the top 60 to top 100 papers, the correlation coefficient is low, ranging from 0.35 to 0.49. Between 2004 and 2008, the correlation coefficient between the IPRank and IRank algorithms shows an upward trend, ranging from -0.18 to 0.73 for the top 10 to top 100 papers. Between 2009 and 2013, the correlation coefficient is less than or equal to 0.5. It can be seen that the correlation coefficient at different periods follows no regular pattern. To test whether the IPRank model correlates with outstanding impact, we rank 35 Nobel Prize papers from 1930 to 2013 on the basis of IPRank and PageRank. To validate the IPRank model, we compare the rankings based on IPRank and PageRank. Experimental results indicate that 80% of Nobel Prize papers rank higher by IPRank than by PageRank. The top-ranked Nobel Prize papers are shown in Table 4, indicating that the IPRank model has a higher correlation with outstanding impact. Similarly, we check the rankings of the Nobel Prize institutions between 1930 and 2013, which are derived from the Nobel Prize papers. Table 5 shows the rankings of ten Nobel Prize institutions based on the IPRank and IRank algorithms. It should be noted that since 1952, the University of California has gradually separated from the University of California, Berkeley as an administrative system, and is no longer a single university. Therefore, for the institution entry University of California, we also renamed it to the University of California, Berkeley. According to Table 5, we observe that several institutions have the same ranking order, and several other institutions have slightly different rankings. The reason behind this is that the importance of an institution is related to the importance of its published scholarly papers.
Simultaneously, the importance of an institution will increase if papers published by the institution are cited by other papers. In general, each institution has a large number of linked papers, and the number of linked papers differs across institutions. Therefore, the ranking difference between the IPRank and IRank algorithms is small for institution rankings. Compared with institutional rankings, the ranking of a paper depends on the impact of its citing papers and institutions. Therefore, the rankings of papers produced by the IPRank and PageRank algorithms are quite different. Figure 3 compares IPRank and PageRank in terms of the recall rates of retrieving 35 Nobel Prize papers among the top N papers. It is observed that the IPRank algorithm consistently
TABLE 2: Spearman correlation coefficient between IPRank and IRank for top N institutions.
TABLE 3: Spearman correlation coefficient between IPRank and IRank for top N papers.
TABLE 4: Comparing the rankings of the IPRank and PageRank algorithms for ten Nobel Prize papers.
DOI of papers         IPRank  PageRank
PhysRev.108.1175           2         4
PhysRevLett.45.494        11        40
PhysRev.70.460            31        34
PhysRev.73.679            35        46
PhysRev.131.2766          38        52
PhysRevLett.30.1346       66       115
PhysRevLett.30.1343       69       107
PhysRevLett.75.3969       74       198
PhysRev.76.769            90        83
PhysRevB.4.3174           99       118
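For illustration, the Spearman rank correlation used in Tables 2 and 3 can be computed in a few lines. Applied to the two columns of Table 4 (after re-ranking the positions 1–10 within the sample), it gives the agreement between the two orderings of these ten papers; this worked example is ours, not a number reported in the paper.

```python
# Sketch of the Spearman rank correlation between two rankings,
# applied to the IPRank and PageRank columns of Table 4.

def spearman(xs, ys):
    """Spearman rho for two lists without ties: 1 - 6*sum(d^2)/(n*(n^2-1))."""
    def to_ranks(vals):
        order = sorted(range(len(vals)), key=lambda k: vals[k])
        ranks = [0] * len(vals)
        for r, k in enumerate(order, start=1):
            ranks[k] = r
        return ranks
    rx, ry = to_ranks(xs), to_ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

iprank = [2, 11, 31, 35, 38, 66, 69, 74, 90, 99]        # Table 4, IPRank column
pgrank = [4, 40, 34, 46, 52, 115, 107, 198, 83, 118]    # Table 4, PageRank column
print(round(spearman(iprank, pgrank), 4))  # -> 0.8788
```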
TABLE 5: Comparing the rankings of the IPRank and PageRank algorithms for ten Nobel Prize institutions.
Institution                             IPRank  PageRank
University of California, Berkeley           1         1
Harvard University                           2         2
Princeton University                         3         3
University of Chicago                        4         6
Cornell University                           5         4
Stanford University                          6         5
Columbia University                          7        13
University of Illinois                       8         8
University of Pennsylvania                  10         7
Massachusetts Institute of Technology       19        35

yields higher recall rates than the PageRank algorithm. Thus, the IPRank algorithm better reflects the impact of Nobel Prize papers. Figure 4 compares IPRank and IRank in terms of the recall rates of identifying Nobel Prize universities among the top N universities. For the top 1 to top 3, top 6 and top 9 universities, both IPRank and IRank retrieve the same number of Nobel Prize universities. For the top 4, top 5, top 7, top 8 and top 10 universities, IPRank consistently yields higher recall
FIGURE 3: Recall performance for retrieving Nobel Prize papers among top N papers.
rates than that of IRank, indicating that the IPRank algorithm better reflects the impact of Nobel Prize institutions. Figure 5 compares IPRank and IRank in terms of the precision rates of retrieving Nobel Prize universities among the top N universities. From the top 1 to top 8 universities, the precision of IPRank is 1. For the top 9 and top 10 universities, the precision of IPRank is less than 1, at 0.88 and 0.90 respectively. In contrast, the precision of IRank fluctuates greatly, ranging from 0.80 to 0.89. The precision of the IPRank algorithm is found to be greater than or equal to that of the IRank algorithm.
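The recall and precision curves of Figures 3–5 reduce to two simple set computations over the top N items of a ranking. A minimal sketch with made-up toy data (the ranking and ground-truth set below are hypothetical, not from the APS experiments):

```python
# Illustrative recall@N and precision@N, as plotted in Figures 3-5:
# given a ranked list and a ground-truth set (e.g. Nobel Prize papers or
# universities), measure both metrics among the top N items.

def recall_at(ranking, relevant, n):
    """Fraction of all relevant items that appear in the top n."""
    return sum(1 for x in ranking[:n] if x in relevant) / len(relevant)

def precision_at(ranking, relevant, n):
    """Fraction of the top n items that are relevant."""
    return sum(1 for x in ranking[:n] if x in relevant) / n

ranking = ["u1", "u2", "u5", "u3", "u9", "u4"]   # hypothetical ranked universities
nobel = {"u1", "u2", "u3", "u4"}                 # hypothetical Nobel Prize set
for n in (2, 4, 6):
    print(n, recall_at(ranking, nobel, n), precision_at(ranking, nobel, n))
```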
FIGURE 4: Recall performance for retrieving Nobel Prize universities among top N universities.
FIGURE 5: Precision performance for retrieving Nobel Prize universities among top N universities.

V. CONCLUSION
This paper investigated a data-driven method to quantify the impact of institutions and papers from the heterogeneous institution-citation network. Unlike most prior studies that utilised the citation network to measure the impact of institutions or papers, this paper proposed IPRank to simultaneously quantify the impact of institutions and papers in a heterogeneous scholarly network. Experimental results showed that the IPRank model was more representative of the outstanding impact of institutions and papers. Comparing the rankings of the IPRank and PageRank algorithms for Nobel Prize papers and institutions, the IPRank model produced a higher ranking in most cases for identifying Nobel Prize-winning papers and institutions, making it an adequate tool for institutional impact assessment.
REFERENCES
[1] L. Bornmann, F. de Moya Anegón, and R. Mutz, "Do universities or research institutions with a specific subject profile have an advantage or a disadvantage in institutional rankings? A latent class analysis with data from the SCImago ranking," Journal of the American Society for Information Science and Technology, vol. 64, no. 11, pp. 2310–2316, 2013.
[2] X. Bai, F. Xia, I. Lee, J. Zhang, and Z. Ning, "Identifying anomalous citations for objective evaluation of scholarly article impact," PloS one, vol. 11, no. 9, p. e0162364, 2016.
[3] X. Bai, F. Zhang, and I. Lee, "Predicting the citations of scholarly paper," Journal of Informetrics, vol. 13, no. 1, pp. 407–418, 2019.
[4] P. Bao and J. Wang, "Identifying your representative work based on credit allocation," in Companion of the The Web Conference 2018 on The Web Conference 2018. International World Wide Web Conferences Steering Committee, 2018, pp. 5–6.
[5] T. Bol, M. de Vaan, and A. van de Rijt, "The Matthew effect in science funding," Proceedings of the National Academy of Sciences, vol. 115, no. 19, pp. 4887–4890, 2018.
[6] H. Lee, "Uncovering the multidisciplinary nature of technology management: journal citation network analysis," Scientometrics, vol. 102, no. 1, pp. 51–75, 2015.
[7] J. Iacovacci, C. Rahmede, A. Arenas, and G. Bianconi, "Functional multiplex PageRank," EPL (Europhysics Letters), vol. 116, no. 2, p. 28004, 2016.
[8] B. Renoust, V. Claver, and J.-F. Baffier, "Multiplex flows in citation networks," Applied Network Science, vol. 2, no. 1, p. 23, 2017.
[9] W. B. Rouse, J. V. Lombardi, and D. D. Craig, "Modeling research universities: Predicting probable futures of public vs. private and large vs. small research universities," Proceedings of the National Academy of Sciences, vol. 115, no. 50, pp. 12582–12589, 2018.
[10] A. Fernández-Cano, E. Curiel-Marin, M. Torralbo-Rodríguez, and M. Vallejo-Ruiz, "Questioning the Shanghai ranking methodology as a tool for the evaluation of universities: An integrative review," Scientometrics, vol. 116, no. 3, pp. 2069–2083, 2018.
[11] F. A. Massucci and D. Docampo, "Measuring the academic reputation through citation networks via PageRank," Journal of Informetrics, vol. 13, no. 1, pp. 185–201, 2019.
[12] Y. Wang, Y. Tong, and M. Zeng, "Ranking scientific articles by exploiting citations, authors, journals, and time information," in Twenty-Seventh AAAI Conference on Artificial Intelligence. AAAI Press, 2013, pp. 933–939.
[13] D. Zhou, S. A. Orshanskiy, H. Zha, and C. L. Giles, "Co-ranking authors and documents in a heterogeneous network," in Seventh IEEE International Conference on Data Mining (ICDM 2007). IEEE, 2007, pp. 739–744.
[14] R. Liang and X. Jiang, "Scientific ranking over heterogeneous academic hypernetwork," in Thirtieth AAAI Conference on Artificial Intelligence. AAAI Press, 2016, pp. 933–939.
[15] D. Wang, C. Song, and A.-L. Barabási, "Quantifying long-term scientific impact," Science, vol. 342, no. 6154, pp. 127–132, 2013.
[16] S. Wang, S. Xie, X. Zhang, Z. Li, P. S. Yu, and X. Shu, "Future influence ranking of scientific literature," in Proceedings of the 2014 SIAM International Conference on Data Mining. SIAM, 2014, pp. 749–757.
[17] Q. Ke, E. Ferrara, F. Radicchi, and A. Flammini, "Defining and identifying sleeping beauties in science," Proceedings of the National Academy of Sciences, vol. 112, no. 24, pp. 7426–7431, 2015.
[18] C. Stegehuis, N. Litvak, and L. Waltman, "Predicting the long-term citation impact of recent publications," Journal of Informetrics, vol. 9, no. 3, pp. 642–657, 2015.
[19] F. Battiston, V. Nicosia, and V. Latora, "The new challenges of multiplex networks: Measures and models," The European Physical Journal Special Topics, vol. 226, no. 3, pp. 401–416, 2017.
[20] H. Piwowar, "Altmetrics: Value all research products," Nature, vol. 493, no. 7431, p. 159, 2013.
[21] P. Chen, H. Xie, S. Maslov, and S. Redner, "Finding scientific gems with Google's PageRank algorithm," Journal of Informetrics, vol. 1, no. 1, pp. 8–15, 2007.
[22] A. Halu, R. J. Mondragón, P. Panzarasa, and G. Bianconi, "Multiplex PageRank," PloS one, vol. 8, no. 10, p. e78293, 2013.
[23] U. Senanayake, M. Piraveenan, and A. Zomaya, "The PageRank-index: Going beyond citation counts in quantifying scientific impact of researchers," PloS one, vol. 10, no. 8, p. e0134794, 2015.
[24] D. Zeitlyn and D. W. Hook, "Perception, prestige and PageRank," PloS one, vol. 14, no. 5, p. e0216783, 2019.
[25] N. Ma, J. Guan, and Y. Zhao, "Bringing PageRank to the citation analysis," Information Processing & Management, vol. 44, no. 2, pp. 800–810, 2008.
[26] A. Molinari and J.-F. Molinari, "Mathematical aspects of a new criterion for ranking scientific institutions based on the h-index," Scientometrics, vol. 75, no. 2, pp. 339–356, 2008.
[27] D. Docampo, “On using the Shanghai ranking to assess the research performance of university systems,” Scientometrics, vol. 86, no. 1, pp. 77–92, 2011.
[28] N. Kapur, N. Lytkin, B.-C. Chen, D. Agarwal, and I. Perisic, “Ranking universities based on career outcomes of graduates,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016, pp. 137–144.
[29] F. Feng, L. Nie, X. Wang, R. Hong, and T.-S. Chua, “Computational social indicators: A case study of Chinese university ranking,” in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2017, pp. 455–464.
[30] M. R. Z. Banadkouki, M. A. Vahdatzad, M. S. Owlia, and M. M. Lotfi, “Ranking Iranian universities: An interpretative structural modeling approach,” Scientometrics, vol. 117, no. 3, pp. 1493–1512, 2018.
[31] M. Dobrota, M. Bulajic, L. Bornmann, and V. Jeremic, “A new approach to the QS university ranking using the composite I-distance indicator: Uncertainty and sensitivity analyses,” Journal of the Association for Information Science and Technology, vol. 67, no. 1, pp. 200–211, 2016.
[32] E. Garfield, “Citation index for science,” Science, vol. 4, pp. 67–70, 1955.
[33] E. Garfield, “Citation analysis as a tool in journal evaluation,” Science, vol. 178, no. 4060, pp. 471–479, 1972.
[34] C. Catalini, N. Lacetera, and A. Oettl, “The incidence and role of negative citations in science,” Proceedings of the National Academy of Sciences, vol. 112, no. 45, pp. 13823–13826, 2015.
[35] J.-F. Molinari and A. Molinari, “A new methodology for ranking scientific institutions,” Scientometrics, vol. 75, no. 1, pp. 163–174, 2008.
[36] C. Su, Y. Pan, Y. Zhen, Z. Ma, J. Yuan, H. Guo, Z. Yu, C. Ma, and Y. Wu, “PrestigeRank: A new evaluation method for papers and journals,” Journal of Informetrics, vol. 5, no. 1, pp. 1–13, 2011.
[37] A. London, T. Németh, A. Pluhár, and T. Csendes, “A local PageRank algorithm for evaluating the importance of scientific articles,” in Annales Mathematicae et Informaticae, vol. 44, 2015, pp. 131–141.
[38] X. Jiang, X. Sun, and H. Zhuge, “Towards an effective and unbiased ranking of scientific literature through mutual reinforcement,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management. ACM, 2012, pp. 714–723.
[39] J. P. Ioannidis, “A generalized view of self-citation: Direct, co-author, collaborative, and coercive induced self-citation,” Journal of Psychosomatic Research, vol. 78, no. 1, pp. 7–11, 2015.
[40] X. Bai, F. Zhang, J. Hou, I. Lee, X. Kong, A. Tolba, and F. Xia, “Quantifying the impact of scholarly papers based on higher-order weighted citations,” PLoS ONE, vol. 13, no. 3, p. e0193192, 2018.
[41] A. Perianes-Rodriguez and J. Ruiz-Castillo, “Multiplicative versus fractional counting methods for co-authored publications. The case of the 500 universities in the Leiden Ranking,” Journal of Informetrics, vol. 9, no. 4, pp. 974–989, 2015.
[42] M. Dobrota and M. Dobrota, “ARWU ranking uncertainty and sensitivity: What if the award factor was excluded?” Journal of the Association for Information Science and Technology, vol. 67, no. 2, pp. 480–482, 2016.
[43] A. Perianes-Rodriguez and J. Ruiz-Castillo, “The impact of classification systems in the evaluation of the research performance of the Leiden Ranking universities,” Journal of the Association for Information Science and Technology, vol. 69, no. 8, pp. 1046–1053, 2018.
[44] C. Daraio, A. Bonaccorsi, and L. Simar, “Rankings and university performance: A conditional multidimensional approach,” European Journal of Operational Research, vol. 244, no. 3, pp. 918–930, 2015.
[45] K. Frenken, G. J. Heimeriks, and J. Hoekman, “What drives university research performance? An analysis using the CWTS Leiden Ranking data,” Journal of Informetrics, vol. 11, no. 3, pp. 859–872, 2017.
[46] X. Bai, I. Lee, Z. Ning, A. Tolba, and F. Xia, “The role of positive and negative citations in scientific evaluation,” IEEE Access, vol. 5, pp. 17607–17617, 2017.