A large-scale comparison of social media coverage and mentions captured by the two altmetric aggregators- Altmetric.com and PlumX
Mousumi Karmakar a, Sumit Kumar Banshal b & Vivek Kumar Singh a,1
a Department of Computer Science, Banaras Hindu University, Varanasi-221005, India
b Department of Computer Science, South Asian University, New Delhi-110021, India
Abstract:
The increased social media attention to scholarly articles has resulted in efforts to create platforms & services to track and measure the social media transactions around scholarly articles in different social platforms (such as Twitter, Blog, Facebook) and academic social networks (such as Mendeley, Academia and ResearchGate). Altmetric.com and PlumX are two popular aggregators that track social media activity around scholarly articles from a variety of social platforms and provide the coverage and transaction data to researchers for various purposes. However, some previous studies have shown that the social media data captured by the two aggregators differ in terms of coverage and magnitude of mentions. This paper aims to revisit the question through a large-scale analysis of social media mentions for a data sample of 1,785,149 publication records (drawn from multiple disciplines, demographies, publishers). The results show that PlumX tracks a wider range of sources and more articles than Altmetric.com. However, the coverage and average mentions of the two aggregators vary across different social media platforms, with Altmetric.com recording higher mentions in Twitter and Blog, and PlumX recording higher mentions in Facebook and Mendeley, for the same set of articles. The coverage and average mentions captured by the two aggregators across different document types, disciplines and publishers are also analyzed.
Keywords:
Academic Social Networks, Altmetrics, Altmetric aggregators, Altmetric.com, PlumX, Social Media Platforms.
Introduction
The newer forms of social media metrics (aka altmetrics) about scholarly articles, collected from different social media platforms and academic social networks, present useful insight about the importance and impact of the articles. Altmetrics are now being collected and analyzed for a variety of purposes, ranging from early impact assessment to measuring the correlations between altmetrics and citations. Several studies have proposed that altmetrics could be an alternative to citations for assessment of the impact of research (such as Costas, Zahedi, & Wouters, 2015; Huang, Wang, & Wu, 2018; Thelwall & Nevill, 2018; Thelwall, 2017, 2018; Haustein et al., 2014). Owing to the increased attention on social media data around scholarly articles, there are now efforts to track and measure them. Relevant data for scholarly articles from popular social platforms (such as Twitter, Blog, Facebook) and academic social networks (such as Mendeley, Academia) are being captured and utilized in different ways. Altmetric.com and PlumX are two popular aggregators that track social media activity around scholarly articles from a variety of social platforms and provide the coverage and transaction data to researchers for various purposes.
1 Corresponding Author. Email: [email protected]
Some recent studies have tried to analyze the coverage and data quality of the two altmetric aggregators for the same set of papers. While some studies (such as Zahedi, Fenner, & Costas, 2015; Zahedi & Costas, 2018a, 2018b; Ortega, 2018a; Bar-Ilan, Halevi, & Milojević, 2019) focused on analyzing the agreement/disagreement as well as differences in values of the social metric scores across different altmetric data providers, several others (such as Meschede & Siebenlist, 2018; Ortega, 2018b, 2018c) tried to understand the correlations among metric values from different aggregators. Two recent studies (Ortega, 2019a, 2020b) tried to analyze the altmetric biases with respect to country, language and subjects, while another study (Ortega, 2019b) focused on inconsistencies in data from different aggregators. Most of the previous studies, however, analyzed smaller data samples, except the relatively recent studies by Ortega (2019a, 2019b, 2020b) that used 100,000 Crossref DOIs for analysis. This paper attempts to revisit the question of coverage variation, magnitude differences and correlations in counts provided by the two aggregators, for the same set of articles. We analyzed a large sample of 1,785,149 publication records (about 17 times larger than the largest data sample used earlier), which constitutes the complete publication record for the whole world indexed in Web of Science for the year 2016. The data sample represents different geographies, publication types, subjects, and publishers. Our study is thus aimed to be a comprehensive large-scale comparison of coverage and mentions tracked by the two altmetric aggregators, both overall and across different document types, disciplines and publishers.
Research Questions
This paper attempts to obtain conclusive and comprehensive answers to the following research questions:
RQ1:
Do the two altmetric aggregators differ significantly in their coverage of social media events for the same set of articles?
RQ2:
What is the quantum of difference in altmetric mentions captured from different social media platforms by the two altmetric aggregators?
RQ3:
Do the altmetric mentions in the two aggregators, across different social media platforms, correlate with each other?
RQ4:
Do the agreements/ disagreements in the altmetric counts captured by the two aggregators differ by document types, disciplines and publishers?
A brief overview of the two altmetric aggregators
An altmetric aggregator is typically a platform which tracks and accumulates various types of data from different social media platforms around scholarly articles, mainly for analysis and reporting purposes. Altmetric.com and PlumX are the two most popular altmetric aggregators. We present below a very brief overview of the two aggregators, emphasizing their coverage of sources and their data collection processes.
Altmetric.com was the first popular altmetric aggregator platform, originating in 2011 through the efforts of Euan Adie and the support of Digital Science. It crawls different social media, news, and blog platforms to collect views, mentions, comments, etc. around a scholarly article. Altmetric.com uses various unique identifiers for scholarly articles, such as PubMed IDs, arXiv IDs, SSRN IDs, RePEc IDs, DOIs, ISBNs, URLs and URIs, and employs different APIs and tools to track events around scholarly articles from various sources. For example, it uses the Mendeley API to capture Mendeley reads, the Wikipedia API for Wikipedia citations, the Facebook (FB) Graph API for FB mentions, the GNIP API for Twitter, etc. Multiple sources of information for the same scholarly article are cross-checked and summed up into a single entry in the respective platforms. It also computes a weighted score based mainly on three factors (volume, sources, and authors) and provides a measure called ‘Altmetric Attention Score’. This score is represented as a colorful donut on the article details page. Altmetric.com offers various services for institutions, funding agencies, researchers, and agencies involved in research & development (R&D). It provides a free, rate-limited API to researchers on request. It also provides an environment to obtain data in different formats, such as JSON and CSV.
PlumX is a suite of products launched by Plum Analytics in 2016, initially with limited coverage. Over time, these metrics evolved significantly in several ways and it is now a quite comprehensive altmetric aggregator. Plum Analytics considers as many as 67 different types of outputs to be tracked, which are named ‘artifacts’. These artifacts include scholarly articles, books, book chapters, and conference articles. In addition, they also include speeches, visual arts, images, figures, etc.
It covers a very wide variety of social platforms, such as Twitter, Facebook, YouTube; online knowledge-sharing mediums, such as StackExchange, Wikipedia, Github; and bibliographical data-based sites, such as Scopus, SciELO, RePEc, etc. It organizes the captured data in five different types of metrics: usages, captures, mentions, citations, and social media. The data for artifacts is collected from multiple platforms using different methods and tools. These include data provider APIs, third-party APIs, FTP data transfers, OAI-PMH harvesting, Web crawlers, and RSS feeds. The data harvesting is updated at different intervals, ranging from daily to monthly, based on the licensing policies of the harvested platforms. Plum Analytics refreshes PlumX every 3-4 hours to keep it up to date. The data can be accessed through end-user interfaces, widgets, and APIs of Plum Analytics.
Footnotes: https://plumanalytics.com/learn/resources/plum-analytics-metrics-audit-log/, accessed on 10 December 2019; https://plumanalytics.com/learn/about-metrics/, accessed on 10 December 2019.
Both the aggregators, Altmetric.com and PlumX, provide metrics based on data collected from various social media, bibliographic and policy document sources. Tables 1 and 2 list the social media, bibliographic and other sources tracked by the two aggregators. Table 1 lists a total of 33 social media sources tracked by the two aggregators. Out of these 33 sources, 14 are captured in Altmetric.com, whereas PlumX tracks 28. The platforms/sources tracked by Altmetric.com are Twitter, Facebook, Youtube, Reddit, F1000, Blog, Mendeley, Stack Overflow, Wikipedia, News, CiteULike, LinkedIn, Google+, and Pinterest. Out of these 14 sources, PlumX tracks all except five: F1000, LinkedIn, Google+, Stack Overflow, and Pinterest. PlumX additionally tracks 19 social media sources that are not tracked by Altmetric.com.
These sources are bit.ly, Figshare, Github, Slideshare, SoundCloud, SourceForge, Vimeo, Stack Exchange, Goodreads, Amazon, Delicious, Dryad, Dspace, SSRN, EBSCO, ePrints, AritiiRead eBooks, Ariti Library, and WorldCat. Table 2 lists the bibliographic and policy document sources tracked by the two aggregators. Here, PlumX has better coverage and also provides citation metrics. PlumX covers a total of 16 sources, whereas Altmetric.com tracks only 6. The two aggregators have only one of these sources in common, namely the policy document source.
Related Work
Several previous studies tried to analyze the data from different altmetric aggregators for different purposes, ranging from assessing their accuracy to finding how much they agree on altmetric counts for the same set of scholarly articles. Zahedi, Fenner, & Costas (2015) explored the agreement/disagreement among metric scores across three altmetric providers, namely Mendeley, Lagotto, and Altmetric.com. They analyzed 30,000 DOIs for the year 2013 in five common sources and examined possible reasons for the differences. They found that Altmetric.com reports more tweets than Lagotto and concluded that the data capture procedure of Altmetric.com, which includes tweets, public retweets, and comments in real time, could be a probable reason for such differences. In later studies, Zahedi & Costas (2018a, 2018b) analyzed 31,437 PLoS ONE DOIs and explored the differences in metrics provided by four aggregators: Crossref Event Data (CED), Altmetric.com, Lagotto, and Plum Analytics. They focused on the process of data collection used by different aggregators and on how different aggregators define metrics from the data collected. The results showed that Mendeley (r>0.8) and Twitter (0.5≤r≤0.9) have good agreement across aggregators, whereas Facebook (0.1≤r≤0.3) and Wikipedia (0.2≤r≤0.8) have the lowest agreement. They attributed this to the methods of tracking and processing data, for example, the effect of direct data collection versus collection through third-party APIs, the aggregation of data based on different versions, identifiers, types, etc., and the impact of the frequency of updates. They recommended that one should not rely only on the aggregators showing a higher count for the metric.
Footnote: https://plumanalytics.com/niso-altmetrics-working-group-on-data-quality/, accessed on 10 December 2019.
Meschede & Siebenlist (2018) explored the relationship between the metrics across the two aggregators PlumX and Altmetric.com (inter-correlation), as well as between the metrics within each aggregator itself (intra-correlation). They took a sample of 5,000 journal articles from six disciplines (‘Computer Science, Engineering and Mathematics’, ‘Natural Sciences’, ‘Multidisciplinary’, ‘Medicine and Health Sciences’, ‘Arts, Humanities and Social Sciences’ and ‘Life Sciences’) and analyzed them for the eight sources common to both aggregators (‘Facebook’, ‘Blogs’, ‘Google+’, ‘News’, ‘Reddit’, ‘Twitter’, ‘Wikipedia’ and ‘Mendeley’). The study showed that PlumX (99%) has higher overall coverage of the data chosen for analysis than Altmetric.com (39%). The intra-correlations between the metrics within the same platform were weak. They further observed that PlumX and Altmetric.com are highly inter-correlated in terms of Mendeley and Wikipedia (with correlation coefficient values of 0.97 and 0.82, respectively) but weakly correlated for the other sources: Facebook (0.29), Blogs (0.46), Google+ (0.28), News (0.11), Twitter (0.49), and Reddit (0.41). Ortega (2018a) analyzed the difference in altmetric indicator counts in Crossref Event Data, Altmetric.com, and PlumX, using a sample of 67,000 papers. For each platform, the difference in metrics across aggregators was quantified in terms of counting differences. The counting difference was computed by taking the sum of the differences in metrics provided by two aggregators at the document level and dividing it by the number of publications that have non-zero altmetric events and occur in both aggregators. The study concluded that different aggregators should be used for data from different platforms, such as PlumX for Mendeley reads and Altmetric.com for tweets, news & blogs. In another study, Ortega (2018b) grouped different altmetrics into three groups (social media, usage, and citations) using principal component analysis (PCA).
In this study, data from Altmetric.com, Scopus and PlumX for a set of 3,793 articles published in 2013 was used. Considering that the earlier studies provided evidence that some specific aggregators perform better for some specific data sources, the author collected different indicators from different aggregators. These included tweets, Facebook mentions, news, blogs, etc. from Altmetric.com; citations from Scopus; Wikipedia mentions from CED; and views & Mendeley reads from PlumX. Results showed that instead of using a single metric, such as the altmetric score, one should consider the relatedness of metrics and their impact across different disciplines for evaluating research. Ortega (2018c) examined the emergence and evolution of five altmetrics (downloads, views, tweets, readers, and blog mentions) along with bibliometric citations from the publication date of a document. The study also investigated the evolution of the relationships among these metrics by analyzing 5,185 papers from PlumX on a month-to-month basis. The results showed that over a document’s entire life cycle, altmetric mentions appear quickly, whereas citations appear slowly. Based on the relationship analysis of metrics, the study suggested that reader counts influence citations. A series of studies (Ortega, 2019a, 2019b, 2020b) analyzed the coverage of news and blog sources in three aggregators, namely Crossref Event Data, Altmetric.com, and PlumX, by taking 100,000 Crossref DOIs. The results showed that the overlap of these sources across aggregators is comparatively low (Ortega, 2019a). For example, Altmetric.com has a higher coverage for blogs (37.8%), but only 7.8% of the publication set is commonly covered in the three aggregators. The coverage in one aggregator might be high for the same set of articles, but the low overlap ratio shows that the sources covered by the aggregators vary widely.
The main objective of the study in Ortega (2020b) was to explore altmetric biases with respect to country, language and subjects with a dataset of 100,000 DOIs. The author retrieved the sources which covered the randomly selected publication set and categorized them based on their regions, language and interest level. The study showed that Altmetric.com is the most heterogeneous aggregator geographically and linguistically. However, PlumX has more coverage of local news events, particularly for the USA. The conclusions serve as evidence that English is the most prevalent language. News and blog sources are mostly from general interest, social science and humanities disciplines. From this same dataset, Ortega (2019b) extracted the blog and news links to verify the validity, coverage, and presence of the tracked blog mentions and news mentions of the scholarly articles. There were 51,000 news & blog links found in this extraction process, which were checked for their existence, and it was found that almost one-third of the links are broken. This elaborate longitudinal study concluded that these mentions should be audited periodically, as the aggregators are dependent on third-party providers. Bar-Ilan, Halevi, & Milojević (2019) analyzed altmetric data of 2,728 JASIST articles and reviews, provided by Mendeley, Altmetric.com and PlumX at two different points in time, 2017 and 2018. They observed an increase over time in the overlap in coverage of documents with Mendeley readers across the three sources. There were 874 papers commonly covered in all sources in 2017, which increased to 1,021 papers in 2018. Further, an increase in Mendeley reader counts and citations was also observed. They suggested using more than one aggregator to obtain altmetric indicators and comparing them in order to get reliable altmetrics.
Ortega (2020a) performed a meta-analysis over a set of 107 altmetric articles related to five altmetric aggregators, namely Altmetric.com, Mendeley, PlumX, Lagotto, and ImpactStory, published during 2012-2019. The dataset consisted of papers that had either computed or published data useful in the computation of three metrics: coverage, platform-wise coverage, and average mentions. The usage percentage of all the aggregators was explored. Altmetric.com (54%) was found to be the most prevalent provider, followed by Mendeley (18%) and PlumX (17%). The analysis showed that Altmetric.com tracks more events for Twitter, News, and Blogs, whereas PlumX performs well in the FB and Mendeley platforms. The results exhibited a gradual increase in tweet capture by PlumX.
Data
Since the focus of the work is comparing the coverage of research publications and altmetrics provided in two popular social media aggregators, Altmetric.com and PlumX, we analyzed the variation in coverage for the whole world’s research output for the year 2016. The complete set of research publications indexed in Web of Science (WoS) for the year 2016 was downloaded. The download was performed in the month of Sep. 2019. Since WoS does not allow downloading more than 100,000 records at a time, the data was collected based on Web of Science Categories (WC). The WC-based data collection has an inherent problem of duplication, since in WoS a paper is generally tagged under many WCs. Due to this duplication, a total of 3,545,720 records were initially obtained for all WCs taken together, comprising the standard 67 fields, including TI (title), PY (publication year), DI (DOI), DT (document type), SO (publication name), DE (author keywords), and AB (abstract). After the removal of duplicate entries and some erroneous records, we were left with 1,785,149 publication records. The altmetric data for the publication records found in the two aggregators, Altmetric.com and PlumX, was obtained thereafter.
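The de-duplication of records downloaded under several WCs can be sketched as follows. This is only a minimal illustration under stated assumptions, not the script used in the study: each record is assumed to be a dict keyed by WoS field tags, and the key used to detect duplicates (the DOI when present, otherwise title plus publication year) is an assumption, since the text does not state which key was used. The DOIs and titles shown are hypothetical.

```python
def deduplicate(records):
    """Keep the first occurrence of each record across all WC downloads."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("DI") or "").strip().lower()
        title = (rec.get("TI") or "").strip().lower()
        # Assumed key: DOI when present, otherwise title plus publication year.
        key = ("doi", doi) if doi else ("title", title, rec.get("PY"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records: the same paper downloaded under two WCs.
downloads = [
    {"DI": "10.1000/abc", "TI": "Paper A", "PY": "2016", "WC": "Physics, Applied"},
    {"DI": "10.1000/ABC", "TI": "Paper A", "PY": "2016", "WC": "Materials Science"},
    {"DI": "", "TI": "Paper B", "PY": "2016", "WC": "History"},
]
unique_records = deduplicate(downloads)
print(len(unique_records))
```

Lower-casing the DOI before comparison is a design choice made here because DOIs are case-insensitive; records without a DOI fall back to a weaker key and may need manual checking.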
In order to obtain altmetric data from Altmetric.com, a DOI lookup was performed for all the DOIs in the WoS data. Out of 1,785,149 publication records, a total of 902,990 records were found to be covered by Altmetric.com, which is about 50.58% of the total data. The data obtained from Altmetric.com had 46 fields, including DOI, Title, Twitter mentions, Facebook mentions, News mentions, Altmetric Attention Score, OA Status, Subjects (FoR), Publication Date, URI, etc. The data from Altmetric.com was downloaded in the month of Sep. 2019. Since we did not have API access for PlumX, we contacted the PlumX team to provide us with access to PlumX data for the 1,785,149 publication records we had. They agreed to provide us with data and created dashboard access for us for the concerned publication records. Out of the 1,785,149 publication records, a total of 1,661,477 publication records were found to be covered in PlumX, which constitutes about 93.07% of the whole data. PlumX provides metrics in five categories, from a wide range of source platforms. The PlumX data was downloaded in the month of Nov. 2019. This data included fields like DOI, Title, Year, Repo URL, Researcher Name(s), Captures:Readers:Mendeley, Social Media:Tweets:Twitter, Social Media:Shares, Likes & Comments:Facebook, etc. This data also has a field named Plum stable url, which redirects to the page from where one can get the actual tweets, blogs, etc. For our analysis, we have analyzed data for four platforms (Twitter, Facebook, Mendeley, and Blog) as covered in both the aggregators.
Methodology
In this exploratory analysis, the data obtained from three different data sources have been analyzed on six aspects, for the two aggregators: variation in coverage, difference in magnitude of mentions, correlations in mention values, variation across document types, variation by discipline and variation across publishers.
First of all, the coverage of scholarly articles in different social media platforms by the two aggregators was compared. The altmetric data for the articles in consideration was obtained from the two aggregators corresponding to the four social media platforms: Twitter, Facebook, Blogs and Mendeley. The percentage of articles covered in the four social media platforms as per the data from the two aggregators was identified.
Secondly, the magnitude of mentions in the four social media platforms for the articles was analyzed and the difference in magnitude of mentions in the data drawn from the two aggregators was computed. Statistical measures (mean and median) were computed for the differences in values from each of these platforms.
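The article-level difference described above can be sketched as follows. This is a minimal illustration with hypothetical DOIs and counts; the sign convention (Altmetric.com count minus PlumX count, so that a positive value means Altmetric.com recorded more mentions) follows the way the differences are interpreted in the Results.

```python
from statistics import mean, median

# Hypothetical tweet counts per DOI, as reported by each aggregator.
altmetric_tweets = {"doi-1": 12, "doi-2": 3, "doi-3": 40, "doi-4": 5}
plumx_tweets = {"doi-1": 10, "doi-2": 3, "doi-3": 25, "doi-5": 7}


def mention_differences(altmetric, plumx):
    """Altmetric.com count minus PlumX count for DOIs present in both."""
    common = sorted(altmetric.keys() & plumx.keys())
    return {doi: altmetric[doi] - plumx[doi] for doi in common}


diffs = mention_differences(altmetric_tweets, plumx_tweets)
values = list(diffs.values())
print(diffs)
print(mean(values), median(values))
```

Only DOIs present in both dictionaries enter the comparison, which mirrors the restriction to commonly covered articles.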
Thirdly, the correlation between mention values for different social media platforms, as drawn from the two aggregators, was computed. For computing correlations, the options were Pearson Correlation and Spearman Rank Correlation. Since previous studies (such as Thelwall & Nevill, 2018) have observed that altmetric data are highly skewed, we have used Spearman Rank Correlation, which is more suitable for such skewed data. The Spearman Rank Correlation Coefficient (SRCC) was computed between the different types of mentions available from the two aggregators. The built-in function ‘corr’ available in the pandas module of the Python programming language was used for this purpose, with the value ‘spearman’ passed as a parameter to the function. The value of SRCC lies between -1 and +1, with positive values indicating positive correlation, a value of 0 indicating no correlation, and negative values indicating negative correlation.
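To make the SRCC computation concrete, the sketch below implements it with the Python standard library only: values are replaced by their ranks (ties receive the average rank) and the Pearson correlation of the ranks is taken. It is an illustrative re-implementation rather than the pandas call used in the study, and the tweet counts are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical tweet counts for five commonly covered articles.
altmetric_tweets = [120, 4, 15, 0, 33]
plumx_tweets = [20, 5, 4, 1, 9]
rho = spearman(altmetric_tweets, plumx_tweets)
print(round(rho, 3))
```

Because only ranks are used, a single article with tens of thousands of tweets has no more influence on the coefficient than any other top-ranked article, which is why the rank-based measure suits skewed counts.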
Fourthly, the difference in coverage and mentions captured by the two aggregators for the four social media platforms was analyzed across different document types. The document type of articles was taken from the ‘DT’ tag in the WoS record file. The values for ‘DT’ include journal articles, proceedings papers, book chapters, reviews, book reviews, editorial material, etc. The variation in coverage levels and magnitude of mentions for different social media platforms was thus obtained for these document types.
Fifthly, the difference in coverage and mentions across different disciplines was computed by grouping the publication records into different disciplines. Each publication record was grouped into one of the fourteen major disciplinary categories as per the scheme proposed in (Rupika et al., 2016). The Web of Science Category (WC) field of each publication record was examined and, based on its value, the publication record was assigned to one of the fourteen broad disciplinary categories. These fourteen broad disciplinary categories are as follows: Agriculture (AGR), Art & Humanities (AH), Biology (BIO), Chemistry (CHE), Engineering (ENG), Environment Science (ENV), Geology (GEO), Information Sciences (INF), Material Science (MAR), Mathematics (MAT), Medical Science (MED), Multidisciplinary (MUL), Physics (PHY) and Social Science (SS). This disciplinary grouping into 14 broad categories, instead of the more than 255 categories of Web of Science, made the analysis more manageable and easier to understand. The variations in coverage and magnitude of mentions were thus computed for publication records in each of the fourteen disciplinary groups.
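The grouping step can be sketched as a lookup from WC values to discipline codes. The mapping fragment below is purely illustrative and is not the table of Rupika et al. (2016), which is not reproduced here; the rule of taking the first listed WC that has a mapping is likewise an assumption, since the text does not say how records tagged under several WCs were resolved.

```python
from collections import Counter

# Illustrative fragment only (assumed): the actual WC-to-discipline table
# follows Rupika et al. (2016) and is not given in this text.
WC_TO_DISCIPLINE = {
    "Agronomy": "AGR",
    "History": "AH",
    "Cell Biology": "BIO",
    "Chemistry, Organic": "CHE",
    "Engineering, Civil": "ENG",
    "Mathematics, Applied": "MAT",
    "Oncology": "MED",
    "Multidisciplinary Sciences": "MUL",
    "Physics, Applied": "PHY",
    "Sociology": "SS",
}


def assign_discipline(wc_field, mapping=WC_TO_DISCIPLINE):
    """Return the code of the first listed WC that has a mapping, else None."""
    for category in wc_field.split(";"):
        code = mapping.get(category.strip())
        if code is not None:
            return code
    return None


# Hypothetical WC field values for three records.
wc_values = ["Physics, Applied; Chemistry, Organic", "Oncology", "Sociology; History"]
discipline_counts = Counter(assign_discipline(wc) for wc in wc_values)
print(discipline_counts)
```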
Finally, the difference in journal coverage and mentions across the two aggregators was computed for the 16 most frequent and well-known publishers, namely ‘Springer’, ‘Nature’, ‘PLoS’, ‘Elsevier’, ‘IEEE’, ‘Wiley’, ‘Taylor & Francis’, ‘ACM’, ‘IOP’, ‘Oxford University Press’, ‘Sage’, ‘Hindawi’, ‘Cell Press’, ‘MDPI’, ‘Cambridge University Press’, and ‘Emerald’. The publisher information was obtained from the ‘PU’ tag of WoS records. In the PU field, the same publisher may be present in different forms. For example, Elsevier has variants such as Elsevier Science Ltd, Elsevier Masson, and Elsevier Science Bv. All such variants represent the same parent publisher, Elsevier. To capture all such variants of a publisher, we employed partial string matches in the PU field. In this way, all publication records for the different publishers were obtained, and differences in their coverage and in mention counts were computed across the two aggregators.
Results
The altmetric data captured by the two aggregators for the large sample of articles was analysed to identify differences in coverage and magnitude of mentions across different platforms. The differences in mentions captured by the two aggregators were also analysed across different disciplines, document types and publishers.
Difference in coverage of the two aggregators
First of all, the difference in coverage of the two altmetric aggregators was identified. It was observed that out of the total set of 1,785,149 publication records, a total of 902,990 records were covered by Altmetric.com (about 50.58% of the total data), and a total of 1,661,477 publication records were covered by PlumX (about 93.07% of the total data).
Figure 1 shows the overlap of coverage of the two aggregators. It can be seen that a total of 879,981 articles are commonly covered by the two aggregators. About 97.5% of articles covered by Altmetric.com are also covered by PlumX, whereas Altmetric.com covers only 53% of articles tracked by PlumX. The PlumX aggregator has 47% of articles uniquely covered. Thus, it is observed that PlumX has a higher overall coverage of articles (including uniquely covered articles) as compared to Altmetric.com. We have tried to find out whether the difference in coverage of the two aggregators is similar across different platforms.
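The coverage and overlap percentages reported above follow directly from the four record counts, as the short check below shows. It uses only the counts stated in the text; the variable names are illustrative.

```python
# Record counts as reported in the text.
total_records = 1_785_149
altmetric_covered = 902_990
plumx_covered = 1_661_477
both_covered = 879_981


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


coverage_altmetric = pct(altmetric_covered, total_records)
coverage_plumx = pct(plumx_covered, total_records)
altmetric_also_in_plumx = pct(both_covered, altmetric_covered)
plumx_also_in_altmetric = pct(both_covered, plumx_covered)

# Articles covered by exactly one aggregator, and by at least one.
plumx_only = plumx_covered - both_covered
altmetric_only = altmetric_covered - both_covered
covered_by_either = altmetric_covered + plumx_covered - both_covered

print(coverage_altmetric, coverage_plumx)
print(altmetric_also_in_plumx, plumx_also_in_altmetric)
print(plumx_only, altmetric_only, covered_by_either)
```

At the precision used in the text, these values agree with the reported 50.58%, 93.07%, 97.5% and 53%, and the PlumX-only share (781,496 of 1,661,477) corresponds to the reported 47%.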
Figure 2 shows a bar chart of article coverage of the two aggregators for four different platforms- Twitter, Facebook, Mendeley and Blog. It can be observed that PlumX has a better coverage for Mendeley platform, whereas Altmetric.com has an edge over PlumX in coverage in the Twitter and Blog platforms. The coverage for Facebook platform of the two aggregators is almost similar. The magnitude of coverage difference between the two aggregators is highest for Mendeley and lowest for Facebook. Thus, while PlumX has an overall higher coverage of articles, Altmetric.com has better coverage in two of the four platforms analysed.
Difference in magnitude of mentions
The mean and median values of the number of mentions for the four platforms, as tracked by the two aggregators, were computed.
Table 3 shows the number of articles tracked by the two aggregators across the different platforms, along with the mean and median values of mentions. It can be observed that the mean value of mentions for the Twitter and Blog platforms is higher in Altmetric.com, whereas the mean value of mentions for Facebook and Mendeley is higher in PlumX. In the case of the Facebook platform, PlumX has significantly higher mean and median values of mentions than Altmetric.com. It may, however, be noted that these values are for different numbers of articles tracked by the two aggregators. A more useful comparison of the value of mentions would require comparing the mentions for the set of articles commonly covered by the two aggregators. Therefore, we have compared the values of mentions captured by the two aggregators across different platforms for the same set of commonly covered articles. The difference in mention value for each paper between Altmetric.com and PlumX is computed.
Figure 3 shows the mean of differences in mentions in the four platforms as tracked by the two aggregators. It is observed that in case of Twitter and Blog platforms, the mean value of differences is positive indicating that Altmetric.com captures higher number of mentions as compared to PlumX in these platforms. In the Facebook and Mendeley platforms, PlumX appears to have higher number of mentions tracked as compared to Altmetric.com. In order to gain further insight into the difference in magnitude of mentions, we have also plotted the frequency of differences in mentions across different platforms by the two aggregators.
Figure 4 (a) – (d) present the frequency values of differences in mentions in the two aggregators for the Twitter, Facebook, Mendeley and Blog platforms, respectively.
Figure 4(a) shows the histogram for the differences in the Twitter platform. These differences are for 565,445 commonly covered articles for the Twitter platform, with at least one tweet captured in both the aggregators. It can be seen that a good percentage (approximately 50%) of papers have a tweet difference equal to zero, indicating that both aggregators record the same number of tweets for these papers. However, the distribution leans towards the positive side, indicating that Altmetric.com captures more tweets than PlumX for a good number of the articles. We looked at some examples to verify this and found it to be valid. One example paper, titled “When the Great Power Gets a Vote: The Effects of Great Power Electoral Interventions on Election Results”, has 24,318 tweets captured by Altmetric.com, whereas PlumX captured only 493 tweets for this paper.
Figure 4(b) shows the histogram of article-level differences in the Facebook platform. Here the plot is created for 71,437 commonly covered articles in the Facebook platform that have non-zero FB mentions in both aggregators. It is observed that in approximately 16% of the articles, the difference in mentions is zero. However, the distribution clearly leans towards the negative side, indicating that PlumX captures more mentions per article than Altmetric.com for the majority of the articles. One example is the article titled “CRISPR gene-editing tested in a person for the first time”, which has 62,290 mentions captured by PlumX but only 341 mentions captured by Altmetric.com.
The histogram for the differences in mentions in the Mendeley platform is shown in Figure 4(c). Here, the plot is made for a total of 830,520 commonly covered articles in the Mendeley platform that have at least one read recorded in both the aggregators. In this case too, the pattern leans more towards the negative side, indicating that PlumX captures more reads per article than Altmetric.com for the majority of the articles. About 25% of articles have the same number of reads recorded by the two aggregators. One example is the article titled “Mastering the game of Go with deep neural networks and tree search”, which has 39,621 reads captured by PlumX but only 7,900 reads captured by Altmetric.com.
Figure 4(d) plots the histogram of the differences in Blog mentions for the 14,387 articles commonly covered for the Blog platform, with non-zero mentions captured by both aggregators. In this case, more than 40% of the articles have a difference equal to zero. The distribution, however, is skewed towards the positive side, indicating that Altmetric.com captures more mentions than PlumX for a good number of articles. One example is the article titled “Planet Hunters IX. KIC 8462852 – where's the flux?”, which has 95 mentions captured by Altmetric.com but only 12 mentions captured by PlumX. Thus, a perusal of Figures 4(a) to (d) indicates that Altmetric.com captures more mentions per article for the Twitter and Blog platforms, whereas PlumX captures more mentions per article for the Mendeley and Facebook platforms.
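The per-article difference statistics discussed above can be illustrated with a short sketch. It is not the authors' code: the function name `difference_summary` and the toy counts are assumptions made for illustration. Only the definition of the difference (Altmetric.com minus PlumX, restricted to articles with non-zero mentions in both aggregators) is taken from the text.

```python
from collections import Counter

def difference_summary(altmetric, plumx):
    """Summarise per-article differences (Altmetric.com - PlumX).

    `altmetric` and `plumx` map an article id to its mention count on one
    platform. Only articles with non-zero counts in both are compared.
    """
    common = [a for a in altmetric
              if altmetric[a] > 0 and plumx.get(a, 0) > 0]
    diffs = [altmetric[a] - plumx[a] for a in common]
    n = len(diffs)
    if n == 0:
        raise ValueError("no commonly covered articles")
    return {
        "n_common": n,
        "mean_diff": sum(diffs) / n,                 # sign shows who records more
        "share_zero": sum(d == 0 for d in diffs) / n,
        "share_positive": sum(d > 0 for d in diffs) / n,
        "share_negative": sum(d < 0 for d in diffs) / n,
        "histogram": Counter(diffs),                 # difference -> no. of articles
    }

# Hypothetical toy data (article id -> tweets); not taken from the study.
altmetric_tweets = {"a1": 10, "a2": 3, "a3": 5, "a4": 0, "a5": 7}
plumx_tweets = {"a1": 4, "a2": 3, "a3": 6, "a4": 2, "a6": 1}
summary = difference_summary(altmetric_tweets, plumx_tweets)
```

In this toy case only `a1`, `a2` and `a3` are commonly covered, so the mean difference and the share of zero differences are computed over three articles.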
Correlations in mentions
We have also computed the correlation between the mention values for different platforms across the two aggregators. The Spearman Rank Correlation Coefficient (SRCC) between mentions is computed for the articles commonly covered by the two aggregators.
Table 4 shows the SRCC values for the article mentions across the two aggregators. The correlation values for the Twitter and Mendeley platforms are 0.823 and 0.95, respectively, indicating strong correlation. For the Facebook and Blog platforms, these values are 0.272 and 0.424, respectively, indicating lower correlation. Thus, there is more agreement between the two aggregators in the mention-based ranks of articles for the Twitter and Mendeley platforms, while the mention values differ in a less systematic manner for the other two platforms. The correlations between different platforms across the two aggregators are also shown; each of these is less than 0.5, indicating weak positive rank correlations across the platforms in the two aggregators.
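For readers who want to reproduce this kind of comparison, the following is a minimal sketch of the Spearman rank correlation, computed as the Pearson correlation of average ranks so that tied counts are handled. The function names and the toy counts are assumptions for illustration; the study does not state which implementation was used.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (var_x * var_y) ** 0.5

# Hypothetical read counts for five commonly covered articles.
altmetric_reads = [12, 40, 7, 3, 25]
plumx_reads = [15, 38, 9, 4, 90]
rho = spearman(altmetric_reads, plumx_reads)
```

Because the coefficient depends only on ranks, a large gap in magnitude (such as 25 versus 90 reads in the toy data) affects it only insofar as it changes the ordering of the articles.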
Variations across document types
It would be interesting to check whether the coverage and mentions in different platforms as captured by the two aggregators vary across different document types. We have, therefore, analysed the coverage and mentions for the articles of different document types. These document types correspond to Article, Book, Book Chapter, Proceedings Paper and Review, as defined by the Web of Science.
Table 5 shows the coverage and average mentions for the Twitter, Facebook, Mendeley and Blog platforms for the different document types. Altmetric.com has better coverage in the Twitter and Blog platforms for almost all document types. In terms of average mention values too, Altmetric.com has higher values across almost all document types in the Twitter and Blog platforms. PlumX is found to have better coverage and average mention values across almost all document types in the Facebook and Mendeley platforms. Thus, looking at the results across document types, the overall trend of better coverage by Altmetric.com for Twitter and Blog and by PlumX for Facebook and Mendeley appears to hold across different document types.
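A grouped breakdown of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions: the record layout (a `"group"` key plus one count per platform) and the function name `coverage_by_group` are hypothetical, and the average is assumed to be taken over articles with at least one mention, in line with the "only mentions > 0" convention of Table 3.

```python
from collections import defaultdict

def coverage_by_group(records, platform):
    """Per-group coverage (articles with >0 mentions) and their average mentions.

    `records` is an iterable of dicts with a "group" key (e.g. document type)
    and one integer count per platform; missing counts are treated as 0.
    Returns {group: (no. of covered articles, mean mentions of those articles)}.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        mentions = rec.get(platform, 0)
        if mentions > 0:
            counts[rec["group"]] += 1
            totals[rec["group"]] += mentions
    return {g: (counts[g], totals[g] / counts[g]) for g in counts}

# Hypothetical records from one aggregator; not taken from the study.
records = [
    {"group": "Article", "twitter": 4},
    {"group": "Article", "twitter": 0},
    {"group": "Article", "twitter": 8},
    {"group": "Review", "twitter": 10},
]
by_type = coverage_by_group(records, "twitter")
```

Running the same function on each aggregator's records, and with the group key set to discipline or publisher instead of document type, gives the kind of side-by-side comparison reported in the tables.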
Variations across disciplines
The variations in coverage and mentions between the two aggregators are also analysed across different disciplines. We have grouped the data into fourteen broad disciplines.
Table 6 presents the coverage and average mention values in the four platforms. In the Twitter platform, Altmetric.com has better coverage and higher mention values than PlumX for almost all disciplines. In the Facebook platform, PlumX has higher mention values for almost all disciplines. The coverage, however, is not higher for PlumX in Facebook for all disciplines, as MED, AH, SS, BIO, and AGR have higher coverage by Altmetric.com. In the Mendeley platform, PlumX has higher coverage in all disciplines, but in terms of reads, PlumX captures higher reads only for the MED, SS, BIO, GEO, and MUL disciplines. In the Blog platform, Altmetric.com has better coverage than PlumX across almost all disciplines, and the average mention values of Altmetric.com are higher except for the PHY, ENV, MAT and ENG disciplines. Thus, the analysis of data across different disciplines shows an overall trend of better coverage of Twitter and Blog by Altmetric.com and of Facebook and Mendeley by PlumX, except for some disciplines where slightly different patterns are observed.
Variations across Publishers
We have also tried to see if the patterns of variations in coverage and mentions in the two aggregators change across different publishers. In order to analyse this, articles for the 16 most frequent publishers in the data are identified and analysed.
Table 7 presents the coverage and average mention values for these publishers in the four platforms for the two aggregators. In terms of the number of journals for which data is covered, PlumX has an edge over Altmetric.com. For example, PlumX covers 76 more journals of Springer than Altmetric.com, 34 more journals of Elsevier and 31 more journals of Taylor & Francis. It can be further observed that PlumX covers more than 90% of publication records for all publishers except Cambridge Univ Press (84.1%), whereas the coverage of Altmetric.com varies significantly, between 25% and 86%. Altmetric.com has a minimum coverage of about 25% for IEEE and a highest coverage of 86.88% for PLoS publications. In terms of coverage and mentions for the four platforms, Altmetric.com has higher coverage in Twitter for almost all publishers. In the Facebook platform, Altmetric.com shows higher coverage for all publishers except PLoS, Hindawi, and MDPI. In the Mendeley platform, the coverage and average reads captured by PlumX are higher for all publishers except Springer, IEEE, Taylor & Francis, ACM and Hindawi. In the Blog platform, Altmetric.com has in general better coverage and average mention values than PlumX, though for the IEEE, IOP, Hindawi, and Emerald publishers PlumX captures more mentions. Thus, in general it is observed that Altmetric.com has better coverage of Twitter and PlumX has better coverage of Facebook, irrespective of the publisher. However, for the Mendeley and Blog platforms, the coverage and mention values of the two aggregators do not show the same patterns for all the publishers.
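The publisher-level coverage percentages can be understood as the share of a publisher's records for which an aggregator returns any data. The sketch below illustrates that calculation; the function name `coverage_percent`, the use of sets of record identifiers, and the toy data are assumptions for illustration rather than a description of the authors' pipeline.

```python
def coverage_percent(publisher_records, tracked_ids):
    """Percentage of each publisher's records that an aggregator tracks.

    `publisher_records` maps publisher -> set of record ids (e.g. DOIs);
    `tracked_ids` is the set of ids the aggregator returned data for.
    Publishers with no records are skipped to avoid division by zero.
    """
    return {
        pub: round(100 * len(ids & tracked_ids) / len(ids), 2)
        for pub, ids in publisher_records.items() if ids
    }

# Hypothetical identifiers; not taken from the study.
publisher_records = {
    "P1": {"d1", "d2", "d3", "d4"},
    "P2": {"d5", "d6"},
}
tracked_by_altmetric = {"d1", "d5"}
tracked_by_plumx = {"d1", "d2", "d3", "d5", "d6"}

altmetric_cov = coverage_percent(publisher_records, tracked_by_altmetric)
plumx_cov = coverage_percent(publisher_records, tracked_by_plumx)
```

Platform-level coverage can be obtained in the same way by restricting `tracked_ids` to records with at least one mention on the platform of interest.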
Discussion
This article presents a comparative analysis of two well-known altmetric aggregators, namely Altmetric.com and PlumX, for four platforms: Twitter, Facebook, Mendeley and Blog. The variations in coverage and mention values captured by the two aggregators for different platforms are analyzed across different document types, disciplines and publishers as well. The results show that PlumX has an overall higher and wider coverage than Altmetric.com, with PlumX tracking about 93% of articles as compared to Altmetric.com tracking about 50% of articles. Some previous studies (Meschede & Siebenlist, 2018; Ortega, 2018b; 2019a; Zahedi & Costas, 2018b) have also found that PlumX has higher coverage of articles, with close to 95% of articles tracked. However, the coverage difference is not the same across all four platforms. For the Twitter and Blog platforms, Altmetric.com has better coverage than PlumX. Ortega (2018a) found a similar pattern for Twitter, with Altmetric.com having better tracking of the Twitter platform than PlumX. For the Blog platform, Ortega (2020b) noted that Altmetric.com has better coverage than PlumX. For the Mendeley platform, PlumX has higher coverage than Altmetric.com. One possible reason for this may be that PlumX and Mendeley are from the same parent company, which may allow better integration of the data capture process. The coverage level of the Facebook platform by the two aggregators is found to be quite similar, with PlumX having an edge over Altmetric.com. One possible reason for Altmetric.com recording slightly fewer Facebook mentions is that it has a policy of recording posts only from public pages (https://help.altmetric.com/support/solutions/articles/6000060968-what-outputs-and-sources-does-altmetric-track-, accessed on 16 November, 2020). Ortega (2020a) has found that in collecting mentions from Facebook and Mendeley, Altmetric.com has performed poorly as compared to PlumX.
It was found in previous studies (Zahedi & Costas, 2018b; Ortega, 2018a) that, in general, Altmetric.com captures more mentions per article for the Twitter and Blog platforms, whereas PlumX captures more mentions per article for the Mendeley and Facebook platforms. In terms of correlations between the two aggregators for mentions in the same platform, Twitter and Mendeley achieve higher correlation values, indicating a similar magnitude of mentions captured by the two aggregators for the commonly covered set of articles. The correlation values are in the lower range for the Facebook and Blog platforms, indicating higher differences in mentions captured by the two aggregators in these platforms. The findings for Mendeley agree with previous studies, where this platform was noted to have the highest inter-correlations across the aggregators (Zahedi & Costas, 2018b). However, for the Twitter platform, the result contradicts the finding of Meschede & Siebenlist (2018), where it was found to be in the lower similarity group along with other platforms like Facebook and Blog. The variations of coverage and mention values of the two aggregators in the four platforms across different document types, disciplines and publishers show interesting patterns. Out of these three aspects, only discipline has been explored earlier, by Ortega (2020b) for Blog and News mentions. The variations by publisher and document type have not been explored earlier. The results show that, in general, Altmetric.com has better coverage and higher mention values than PlumX for the Twitter platform across different document types, disciplines and publishers. Similarly, PlumX is seen to have better coverage and higher mention values than Altmetric.com for Facebook, irrespective of the document type, discipline and publisher.
However, for the Mendeley and Blog platforms, the coverage and average mention values of the two aggregators do not show a consistent pattern across all the document types, disciplines and publishers. In some cases, PlumX has better coverage and higher mentions than Altmetric.com, while in several other cases Altmetric.com has better coverage and higher mentions. Thus, the variations in coverage and mentions across document types, disciplines and publishers are more clearly seen for the Mendeley and Blog platforms. The present study, thus, presents a comprehensive account of variations in coverage and mention values captured by the two aggregators, Altmetric.com and PlumX, across four different platforms. Further, the study is perhaps the first effort to have analysed the variations across different document types and publishers. The analytical results are interesting and useful, with some contradicting and several others agreeing with the findings of previous studies, as illustrated above. The results have practical implications in terms of suggesting the use of a specific aggregator for data from different platforms.
Conclusion
The article presents the following useful results and conclusions.
Firstly, PlumX has an overall higher coverage than Altmetric.com and tracks a wider number of platforms.
Secondly, Altmetric.com captures more mentions per article for the Twitter and Blog platforms, whereas PlumX captures more mentions per article for the Mendeley and Facebook platforms.
Thirdly, Altmetric.com and PlumX agree more in their mention values for the Twitter and Mendeley platforms, but mention values differ more for the Facebook and Blog platforms, as observed from the correlation values.
Fourthly, Altmetric.com is found to have better coverage of Twitter whereas PlumX has better coverage of Facebook, across different document types, disciplines and publishers. For the Mendeley and Blog platforms, variations in patterns of coverage and magnitude of mentions are observed across different document types, disciplines and publishers. Overall, the analytical results present a comprehensive account of the variations in coverage and mentions of the two aggregators across four different platforms.
Acknowledgement
The authors would like to acknowledge the support of Stacy Konkiel, Director of Research Relations at Digital Science, for providing access to Altmetric.com, and Stephanie Faulkner, Director of Product Management and Operations at Elsevier Research Metrics, for providing dashboard access to PlumX data for our research work.
References
Bar-Ilan, J., Halevi, G., & Milojević, S. (2019). Differences between Altmetric Data Sources – A Case Study. Journal of Altmetrics, 2(1), 1. DOI: http://doi.org/10.29024/joa.
Costas, R., Zahedi, Z., & Wouters, P. (2015). Do “altmetrics” correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective. Journal of the Association for Information Science and Technology, 66(10), 2003-2019.
Haustein, S., Peters, I., Bar-Ilan, J., Priem, J., Shema, H., & Jens, T. (2014). Coverage and adoption of altmetrics sources in the bibliometric community. Scientometrics, (2), 1145–1163.
Huang, W., Wang, P., & Wu, Q. (2018). A correlation comparison between Altmetric Attention Scores and citations for six PLOS journals. PloS one, 13(4), e0194962.
Meschede, C., & Siebenlist, T. (2018). Cross-metric compatibility and inconsistencies of altmetrics. Scientometrics, (1), 283-297.
Ortega, J. L. (2018a). Reliability and accuracy of altmetric providers: a comparison among Altmetric.com, PlumX and Crossref Event Data. Scientometrics, 116(3), 2123-2138.
Ortega, J. L. (2018b). Disciplinary differences of the impact of altmetric. FEMS microbiology letters, 365(7), fny049.
Ortega, J. L. (2018c). The life cycle of altmetric impact: A longitudinal study of six metrics from PlumX. Journal of Informetrics, 12(3), 579-589.
Ortega, J. L. (2019a). The coverage of blogs and news in the three major altmetric data providers. In , Rome, Italy.
Ortega, J. L. (2019b). Availability and Audit of Links in Altmetric Data Providers: Link Checking of Blogs and News in Altmetric.com, Crossref Event Data and PlumX. Journal of Altmetrics, 2(1), 4. DOI: http://doi.org/10.29024/joa.
Ortega, J. L. (2020a). Altmetrics data providers: A meta-analysis review of the coverage of metrics and publication. El profesional de la información (EPI), (1).
Ortega, J. L. (2020b). Blogs and news sources coverage in altmetrics data providers: a comparative analysis by country, language, and subject. Scientometrics, (1), 555-572.
Rupika, Uddin, A., & Singh, V. K. (2016). Measuring the university–industry–government collaboration in Indian research output. Current Science, 1904-1909.
Thelwall, M. (2017). Are Mendeley reader counts high enough for research evaluations when articles are published? Aslib Journal of Information Management, (2), 174–183. https://doi.org/10.1108/AJIM-01-2017-0028
Thelwall, M. (2018). Early Mendeley readers correlate with later citation counts. Scientometrics, (3), 1231–1240. https://doi.org/10.1007/s11192-018-2715-9
Thelwall, M., & Nevill, T. (2018). Could scientists use Altmetric.com scores to predict longer term citation counts? Journal of Informetrics, (1), 237–248.
Zahedi, Z., & Costas, R. (2018a). Challenges in the quality of social media data across altmetric data aggregators. Proceedings of .
Zahedi, Z., & Costas, R. (2018b). General discussion of data quality challenges in social media metrics: Extensive comparison of four major altmetric data aggregators. PloS one, 13(5), e0197326.
Zahedi, Z., Fenner, M., & Costas, R. (2015). Consistency among altmetrics data provider/aggregators: What are the challenges? Proceedings of Altmetrics workshop, 9 October 2015, Amsterdam Science Park, Amsterdam.
TABLES
Table 1: Social Media Sources Tracked by the two aggregators
Source(s) Altmetric.com PlumX
Twitter x x
Facebook x x
YouTube x x
Reddit x x
F1000 x
bit.ly x
Blog x x
Figshare x
GitHub x
Mendeley x x
Slideshare x
SoundCloud x
SourceForge x
Vimeo x
Stack Exchange x
Stack Overflow x
Wikipedia x x
News x x
Goodreads x
Amazon x
Delicious (historical only) x
CiteULike (historical only) x x
Dryad x
DSpace x
SSRN x
EBSCO x
ePrints x
Airiti iRead eBooks x
Airiti Library x
WorldCat x
LinkedIn x
Google+ x
Pinterest x
Table 2: Bibliographic Impact & Policy Implementation Sources tracked by the two aggregators
Source(s) Altmetric.com PlumX
Dynamed Plus Topics x
Airiti Academic Citation Index x
National Institute for Health and Care Excellence (NICE) x
OJS Journals x
Open Syllabus x
Patent Citations x
Peer Reviews x
PLOS x
Policy document source x x
PubMed Central Europe x
PubMed Clinical Guidelines x
PubMedCentral (for PLOS articles only) x
RePEc x
SciELO x
Scopus x
USPTO x
Web of Science x
CrossRef x
Bepress x
CABI x
Dimensions x
Table 3: Statistics of mentions for different platforms in the two aggregators (only mentions > 0)
Columns: Platform; Altmetric.com (902,990): No. of Articles, Mean, Median; PlumX (1,661,477): No. of Articles, Mean, Median
Rows: Twitter, FB, Mendeley, Blog
Table 4: Spearman Rank Correlation between mention counts in the two aggregators
PlumX: Twitter, FB, Mendeley, Blog
Altmetric.com: Twitter, FB, Mendeley, Blog
Table 5: Document type wise distribution of mentions in the two aggregators
Document Type | Aggregator | Twitter articles | Twitter avg. mentions | FB articles | FB avg. mentions | Mendeley articles | Mendeley avg. mentions | Blog articles | Blog avg. mentions
Article | Altmetric.com | 545,842 | 8.286 | 140,542 | 2.217 | 691,290 | 33.073 | 55,730 | 1.951
Article | PlumX | 468,520 | 6.377 | 142,275 | 37.497 | 1,225,475 | 34.218 | 16,910 | 1.646
Book | Altmetric.com | 6,055 | 5.801 | 1,184 | 1.671 | 6,502 | 28.305 | 331 | 1.344
Book | PlumX | 4,174 | 6.692 | 715 | 23.793 | 12,759 | 23.719 | 148 | 1.405
Book Chapter | Altmetric.com | 2,001 | 5.978 | 307 | 1.651 | 3,259 | 48.541 | 191 | 1.435
Book Chapter | PlumX | 2,043 | 5.38 | 516 | 13.791 | 3,714 | 62.155 | 104 | 1.413
Proceedings Paper | Altmetric.com | 7,089 | 5.784 | 1,796 | 1.744 | 10,832 | 28.63 | 486 | 1.523
Proceedings Paper | PlumX | 5,763 | 4.61 | 1,785 | 22.769 | 28,649 | 24.647 | 182 | 1.434
Review | Altmetric.com | 58,628 | 12.067 | 17,925 | 2.831 | 67,511 | 68.622 | 5,631 | 1.703
Review | PlumX | 52,497 | 9.417 | 17,799 | 42.82 | 94,469 | 81.303 | 2,790 | 1.565
Table 6: Discipline wise distribution of mentions in the two aggregators
Discipline | Aggregator | Twitter articles | Twitter avg. mentions | FB articles | FB avg. mentions | Mendeley articles | Mendeley avg. mentions | Blog articles | Blog avg. mentions
MED | Altmetric.com | 301,704 | 10.505 | 96,605 | 2.587 | 361,663 | 33.583 | 23,116 | 1.946
MED | PlumX | 266,853 | 7.748 | 78,164 | 49.002 | 516,696 | 40.221 | 9,405 | 1.527
PHY | Altmetric.com | 72,979 | 3.842 | 11,623 | 1.466 | 93,002 | 21.201 | 9,053 | 1.488
PHY | PlumX | 53,269 | 3.292 | 14,049 | 30.972 | 204,802 | 19.522 | 1,143 | 1.85
CHE | Altmetric.com | 55,197 | 3.628 | 9,020 | 1.401 | 72,337 | 31.303 | 3,575 | 1.45
CHE | PlumX | 45,810 | 3.081 | 12,155 | 22.095 | 161,466 | 27.715 | 934 | 1.226
ENV | Altmetric.com | 40,432 | 6.476 | 9,010 | 1.844 | 53,018 | 38.809 | 4,808 | 1.719
ENV | PlumX | 35,354 | 5.585 | 10,501 | 28.799 | 99,051 | 37.798 | 1,476 | 1.732
AH | Altmetric.com | 10,194 | 5.999 | 2,816 | 1.571 | 12,081 | 12.248 | 814 | 1.478
AH | PlumX | 6,963 | 5.221 | 1,898 | 30.347 | 25,738 | 11.274 | 287 | 1.307
SS | Altmetric.com | 114,537 | 9.84 | 36,918 | 1.961 | 136,109 | 42.023 | 12,726 | 1.84
SS | PlumX | 95,173 | 8.11 | 28,557 | 28.906 | 192,979 | 49.824 | 4,988 | 1.603
INF | Altmetric.com | 10,854 | 6.278 | 1,658 | 1.486 | 20,271 | 36.987 | 621 | 1.588
INF | PlumX | 7,423 | 5.735 | 3,202 | 18.685 | 65,892 | 28.539 | 221 | 1.534
BIO | Altmetric.com | 109,393 | 8.239 | 27,229 | 2.056 | 130,175 | 38.03 | 11,260 | 1.783
BIO | PlumX | 102,863 | 6.201 | 26,304 | 33.449 | 181,715 | 41.593 | 3,314 | 1.438
MAR | Altmetric.com | 30,368 | 3.573 | 4,800 | 1.503 | 45,706 | 31.258 | 1,662 | 1.812
MAR | PlumX | 23,781 | 3.009 | 6,854 | 25.608 | 136,295 | 26.78 | 536 | 1.278
MAT | Altmetric.com | 10,772 | 6.138 | 1,227 | 1.516 | 15,289 | 24.91 | 611 | 1.624
MAT | PlumX | 8,433 | 4.769 | 2,251 | 27.961 | 38,415 | 19.445 | 158 | 1.741
GEO | Altmetric.com | 55,972 | 5.952 | 10,348 | 1.884 | 79,222 | 29.949 | 4,713 | 1.64
GEO | PlumX | 48,856 | 4.806 | 12,651 | 26.889 | 143,839 | 31.889 | 1,421 | 1.53
ENG | Altmetric.com | 18,616 | 3.336 | 2,904 | 1.476 | 40,115 | 33.537 | 888 | 1.568
ENG | PlumX | 13,705 | 3.044 | 6,128 | 25.164 | 151,743 | 26.519 | 405 | 1.696
MUL | Altmetric.com | 56,535 | 25.09 | 19,163 | 3.775 | 64,478 | 48.764 | 9,856 | 2.886
MUL | PlumX | 51,136 | 18.882 | 26,362 | 99.871 | 84,051 | 60.863 | 3,634 | 2.117
AGR | Altmetric.com | 29,005 | 4.707 | 7,986 | 1.934 | 38,359 | 33.279 | 2,360 | 1.427
AGR | PlumX | 25,774 | 3.85 | 7,467 | 24.109 | 78,858 | 31.689 | 819 | 1.319
Table 7: Publisher wise summary of values of Altmetric.com and PlumX
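Publisher | Aggregator | No. of journals | Coverage (%) | Twitter coverage (%) | Twitter avg. mentions | FB coverage (%) | FB avg. mentions | Mendeley coverage (%) | Mendeley avg. mentions | Blog coverage (%) | Blog avg. mentions
Springer | Altmetric.com | 1,512 | 44.47 | 73.5 | 4.98 | 16.4 | 1.66 | 97.5 | 26.24 | 5.1 | 1.45
Springer | PlumX | 1,588 | 93.01 | 28.7 | 3.97 | 10.4 | 28.98 | 94.1 | 24.18 | 0.7 | 1.41
Nature | Altmetric.com | 125 | 79.44 | 86.9 | 31.83 | 29.7 | 4.41 | 98.6 | 64.37 | 18.2 | 2.98
Nature | PlumX | 126 | 97.74 | 63 | 23.55 | 25.2 | 144.87 | 95.5 | 81.84 | 5.1 | 2.22
PLoS | Altmetric.com | 8 | 86.88 | 88 | 12.28 | 22.3 | 2.3 | 98.6 | 36.47 | 10.2 | 2.1
PLoS | PlumX | 8 | 98.46 | 72 | 8.96 | 43.1 | 47 | 96.4 | 48.08 | 3 | 1.57
Elsevier | Altmetric.com | 1,836 | 43.10 | 72.3 | 6.67 | 16.5 | 2.01 | 97.7 | 39.85 | 4.9 | 1.67
Elsevier | PlumX | 1,867 | 94.46 | 31.3 | 5.57 | 12.5 | 27.07 | 94.1 | 40.08 | 0.9 | 1.43
IEEE | Altmetric.com | 163 | 24.98 | 37.7 | 3.31 | 5 | 1.69 | 96.7 | 32.02 | 1.6 | 1.68
IEEE | PlumX | 167 | 91.36 | 3.7 | 2.34 | 1 | 17.62 | 97.9 | 24.78 | 0.2 | 4.78
Wiley | Altmetric.com | 1,317 | 64.28 | 79.8 | 7.10 | 24.4 | 1.88 | 97.8 | 31.77 | 6.8 | 1.62
Wiley | PlumX | 1,322 | 93.16 | 49.8 | 6.64 | 5 | 31.11 | 93.7 | 36.15 | 1.9 | 1.53
Taylor & Francis | Altmetric.com | 1,353 | 45.83 | 73.9 | 6.29 | 15.8 | 1.88 | 96.2 | 25.22 | 4.3 | 1.51
Taylor & Francis | PlumX | 1,374 | 91.42 | 31.5 | 5.3 | 3.2 | 14.35 | 92.9 | 23.88 | 0.9 | 1.29
ACM | Altmetric.com | 31 | 54.78 | 75.8 | 5.12 | 19.3 | 1.25 | 94.7 | 23.85 | 8.7 | 1.29
ACM | PlumX | 32 | 94.52 | 23 | 4.35 | 6.7 | 14.59 | 92.2 | 23.22 | 0.7 | 1
IOP | Altmetric.com | 69 | 54.13 | 79 | 6.13 | 13 | 1.67 | 97.3 | 18.03 | 24 | 1.63
IOP | PlumX | 71 | 90.22 | 29.1 | 6.8 | 8.1 | 28.69 | 92.2 | 19.16 | 1.1 | 1.86
Oxford Univ Press | Altmetric.com | 325 | 68.66 | 86.4 | 10.29 | 21.9 | 2.03 | 96.3 | 33.87 | 14.1 | 1.54
Oxford Univ Press | PlumX | 327 | 92.17 | 45 | 5.73 | 10.5 | 23.49 | 88.7 | 41.15 | 1.7 | 1.38
SAGE | Altmetric.com | 660 | 61.05 | 81.1 | 8.81 | 20.3 | 2.02 | 97.2 | 32.70 | 8.7 | 1.81
SAGE | PlumX | 663 | 94.73 | 41.2 | 8.31 | 3.5 | 25.91 | 92.4 | 36.3 | 2.3 | 1.58
Hindawi | Altmetric.com | 87 | 35.21 | 68.6 | 3.1 | 13.3 | 2.27 | 98.3 | 27.22 | 3.1 | 1.15
Hindawi | PlumX | 97 | 93.17 | 26.9 | 2.76 | 13.8 | 20.09 | 95.7 | 23.85 | 0.6 | 1.19
Cell | Altmetric.com | 38 | 70.01 | 90.7 | 22.23 | 32.5 | 2.93 | 96.2 | 91.59 | 20.5 | 2.65
Cell | PlumX | 38 | 94.9 | 64.6 | 14.84 | 24.6 | 54.76 | 80.1 | 120.58 | 4.5 | 1.54
MDPI | Altmetric.com | 55 | 69.34 | 74.3 | 5.45 | 10.3 | 2.49 | 99.3 | 32.30 | 3.2 | 1.44
MDPI | PlumX | 55 | 97.35 | 44.1 | 4.84 | 22.2 | 32.9 | 98 | 34.51 | 1.5 | 1.32
Cambridge Univ Press | Altmetric.com | 305 | 46.50 | 75.1 | 7.01 | 22.8 | 1.70 | 91.4 | 25.61 | 8.3 | 1.55
Cambridge Univ Press | PlumX | 311 | 84.1 | 25.8 | 4.65 | 7.5 | 40.68 | 83 | 26.71 | 0.7 | 1.29
Emerald | Altmetric.com | 89 | 30.49 | 48.3 | 3.42 | 8.1 | 1.61 | 97.8 | 51.13 | 1.7 | 1.31
Emerald | PlumX | 91 | 92.52 | 15.5 | 2.94 | 0.6 | 1 | 98.5 | 58.48 | 0.4 | 4.37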
FIGURES
Figure 1: Number of Records found in PlumX and Altmetric.com (regions: Altmetric.com, PlumX, Overlap)
Figure 2: Coverage Percentages for various platforms in the two aggregators
Figure 3: Mean of differences of mentions (Altmetric.com - PlumX) for different platforms
(a) Twitter (b) Facebook (c) Mendeley (d) Blog