Featured Research

Digital Libraries

Author Impact: Evaluations, Predictions, and Challenges

Author impact evaluation and prediction play a key role in determining rewards, funding, and promotion. In this paper, we first introduce the background of author impact evaluation and prediction. Second, we review recent developments in author impact evaluation, including data collection, data pre-processing, data analysis, feature selection, algorithm design, and algorithm evaluation. Third, we provide an in-depth literature review of author impact predictive models and common evaluation metrics. Finally, we look into representative research issues, including author impact inflation, unified evaluation standards, the academic success gene, identification of the origins of hot streaks, and higher-order academic network analysis. This paper should help researchers obtain a broader understanding of author impact evaluation and prediction, and it provides future research directions.

Author name disambiguation of bibliometric data: A comparison of several unsupervised approaches

Adequately disambiguating author names in bibliometric databases is a precondition for conducting reliable analyses at the author level. In the case of bibliometric studies that include many researchers, it is not possible to disambiguate each single researcher manually. Several approaches have been proposed for author name disambiguation but there has not yet been a comparison of them under controlled conditions. In this study, we compare a set of unsupervised disambiguation approaches. Unsupervised approaches specify a model to assess the similarity of author mentions a priori instead of training a model with labelled data. In order to evaluate the approaches, we applied them to a set of author mentions annotated with a ResearcherID, this being an author identifier maintained by the researchers themselves. Apart from comparing the overall performance, we take a more detailed look at the role of the parametrization of the approaches and analyse the dependence of the results on the complexity of the disambiguation task. It could be shown that all of the evaluated approaches produce better results than those that can be obtained by using only author names. In the context of this study, the approach proposed by Caron and van Eck (2014) produced the best results.

Authorship analysis of specialized vs diversified research output

The present work investigates the relations between amplitude and type of collaboration (intramural, extramural domestic or international) and output of specialized versus diversified research. By specialized or diversified research, we mean within or beyond the author's dominant research topic. The field of observation is the scientific production over five years from about 23,500 academics. The analyses are conducted at the aggregate and disciplinary level. The results lead to the conclusion that in general, the output of diversified research is no more frequently the fruit of collaboration than is specialized research. At the level of the particular collaboration types, international collaborations weakly underlie the specialized kind of research output; on the contrary, extramural domestic and intramural collaborations are weakly associated with diversified research. While the weakness of association remains, exceptions are observed at the level of the individual disciplines.

AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels

Authors of research papers in mathematics and other math-heavy disciplines commonly employ the Mathematics Subject Classification (MSC) scheme to search for relevant literature. The MSC is a hierarchical alphanumerical classification scheme that allows librarians to specify one or multiple codes for publications. Digital libraries in mathematics, as well as reviewing services such as zbMATH and Mathematical Reviews (MR), rely on these MSC labels in their workflows to organize the abstracting and reviewing process. In particular, the coarse-grained classification determines the subject editor who is responsible for the actual reviewing process. In this paper, we investigate the feasibility of automatically assigning a coarse-grained primary classification using the MSC scheme, by regarding the problem as a multi-class classification machine learning task. We find that our method achieves an F1-score of over 77%, which is remarkably close to the agreement of zbMATH and MR (F1-score of 81%). Moreover, we find that the method's confidence score allows for reducing the effort by 86% compared to the manual coarse-grained classification effort while maintaining a precision of 81% for automatically classified articles.
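
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition for a single class; the function name and the counts are invented for illustration, and the abstract does not say how the per-class scores were averaged across MSC classes.

```python
def f1_from_counts(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1-score for one class from its confusion counts."""
    if true_pos == 0:
        # No correct predictions: precision and recall are 0 or undefined,
        # so the score is conventionally taken as 0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not taken from the paper.
example_f1 = f1_from_counts(true_pos=8, false_pos=2, false_neg=2)
```

With these made-up counts, precision and recall are both 0.8, so the F1-score is also 0.8.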

Banana for scale: Gauging trends in academic interest by normalising publication rates to common and innocuous keywords

Many academics use yearly publication numbers to quantify academic interest for their research topic. While such visualisations are ubiquitous in grant applications, manuscript introductions, and review articles, they fail to account for the rapid growth in scientific publications. As a result, any search term will likely show an increase in supposed "academic interest". One proposed solution is to normalise yearly publication rates by field size, but this is arduous and difficult. Here, we propose a simpler index that normalises keywords of interest by a ubiquitous and innocuous keyword, such as "banana". Alternatively, one could opt for field-specific keywords or hierarchical structures (e.g. PubMed's Medical Subject Headings, MeSH) to compute "interest market share". Using this approach, we uncovered plausible trends in academic interest in examples from the medical literature. In neuroimaging, we found that not the supplementary motor area (as was previously claimed), but the prefrontal cortex is the most interesting part of the brain. In cancer research, we found a contemporary preference for cancers with high prevalence and clinical severity, and notable declines in interest for more treatable or likely benign neoplasms. Finally, we found that interest in respiratory viral infections spiked when strains showed potential for pandemic involvement, with SARS-CoV-2 and the COVID-19 pandemic being the most extreme example. In sum, the time is ripe for a quick and easy method to quantify trends in academic interest for anecdotal purposes. We provide such a method, along with software for researchers looking to implement it in their own writing.
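
The normalisation idea can be sketched in a few lines: divide the yearly publication count for the keyword of interest by the count for the reference keyword in the same year. This is only an illustration of that ratio, not the authors' software; the function name is hypothetical and the counts are invented.

```python
def normalised_interest(keyword_counts: dict, reference_counts: dict) -> dict:
    """Divide yearly keyword counts by the reference keyword's counts.

    Years where the reference keyword has no publications are skipped,
    because the ratio would be undefined.
    """
    index = {}
    for year, count in keyword_counts.items():
        reference = reference_counts.get(year, 0)
        if reference > 0:
            index[year] = count / reference
    return index


# Invented yearly counts for a topic and for the reference keyword "banana".
topic_counts = {2018: 120, 2019: 150, 2020: 200}
banana_counts = {2018: 1000, 2019: 1250, 2020: 2000}

interest = normalised_interest(topic_counts, banana_counts)
```

In this made-up example the raw topic counts rise every year, yet the normalised index stays flat from 2018 to 2019 and falls in 2020, which is the kind of correction the abstract argues for.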

Being published successfully or getting arXived? The importance of social capital and interdisciplinary collaboration for getting printed in a high impact journal in Physics

The structure of collaboration is known to be of great importance for the success of scientific endeavors. In particular, various types of social capital employed in co-authored work and projects bridging disciplinary boundaries have attracted researchers' interest. Almost all previous studies, however, use samples with an inherent survivor bias, i.e., they focus on papers that have already been published. In contrast, our article examines the chances for getting a working paper published by using a unique dataset of 245,000 papers uploaded to arXiv. ArXiv is a popular preprint platform in Physics which allows us to construct a co-authorship network from which we can derive different types of social capital and interdisciplinary teamwork. To emphasize the 'normal case' of community-specific standards of excellence, we assess publications in Physics' high impact journals as success. Utilizing multilevel event history models, our results reveal that already a moderate number of persistent collaborations spanning at least two years is the most important social antecedent of getting a manuscript published successfully. In contrast, inter- and subdisciplinary collaborations decrease the probability of publishing in an eminent journal in Physics, which can only partially be mitigated by scientists' social capital.

Berlin: A Quantitative View of the Structure of Institutional Scientific Collaborations

This paper examines the structure of scientific collaborations in a large European metropolitan area. It aims to identify strategic coalitions among organizations in Berlin as a specific case with high institutional and sectoral diversity. By adopting a global, regional and organization-based approach, we provide a quantitative, exploratory and macro view of this diversity. We use publication data from 1996-2017 with at least one organization located in Berlin. We further investigate four members of the Berlin University Alliance (BUA) through their self-represented research profiles, comparing them with empirical results of OECD disciplines. Using a bipartite network modeling framework, we are able to move beyond the uncontested trend towards team science and increasing internationalization. Our results show that BUA members shape the structure of scientific collaborations in the region. However, they are not collaborating cohesively in all disciplines. Larger divides exist in some disciplines, e.g., Agricultural Sciences and Humanities. Only Medical and Health Sciences have cohesive intraregional collaborations, which signals the success of regional cooperation established in 2003. We explain possible underlying factors shaping the observed trends and sectoral and intra-regional groupings. A major methodological contribution of this paper is evaluating coverage and accuracy of different organization name disambiguation techniques.

Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web

With the adoption of Semantic Web technologies, an increasing number of vocabularies and ontologies have been developed in different domains, ranging from Biology to Agronomy or Geosciences. However, many of these ontologies are still difficult to find, access and understand by researchers due to a lack of documentation, URI resolving issues, versioning problems, etc. In this chapter we describe guidelines and best practices for creating accessible, understandable and reusable ontologies on the Web, using standard practices and pointing to existing tools and frameworks developed by the Semantic Web community. We illustrate our guidelines with concrete examples, in order to help researchers implement these practices in their future vocabularies.

Beyond the Western Core-Periphery Model: Analysing Scientific Mobility and Collaboration in the Middle East and North Africa

This study investigates the scientific mobility and international collaboration networks in the Middle East and North Africa (MENA) region between 2008 and 2017. The main goal is to establish mobility and collaboration profiles at the region and country levels. By using affiliation metadata available in scientific publications, we track international scientific mobility and collaboration networks in the region. Three complementary approaches allow us to obtain a detailed characterization of scientific mobility. First, we study the mobility flows for each country to uncover the main destinations and origins of mobile scholars. Results reveal geographical, cultural, historical, and linguistic proximities. Cooperation and exchange programs also help to explain some of the observed flows. Second, we introduce mobile scientists' academic age. The average academic age of migrant scholars in MENA between 2008 and 2017 was about 12.4 years. For most countries, immigrants are relatively younger than emigrants, except for Iran, Palestine, Lebanon, and Turkey. Scholars who migrated to Gulf Cooperation Council (GCC) countries, Jordan and Morocco were on average 1.5 years younger than emigrants from the same countries. The academic age group 6-to-10 years is the most common for both emigrant and immigrant scholars. Third, we analyse gender differences among scholars. We observe a clear gender gap in terms of scientific mobility: male scholars represent the largest group of migrants in MENA countries. We conclude by discussing the policy relevance of the scientific mobility and collaboration aspects, as well as limitations and further research.

Bibliometric analysis of the world scientific production in Chemical Engineering during 2000-2011. Part 2: Analysis of the 1,000 most cited publications

A comprehensive bibliometric analysis of scientific production in the Chemical Engineering area has been carried out using the Web of Science database for the period 2000-2011 through three complementary studies. Part 2 demonstrated a displacement of the most cited publications towards the Far East, driven especially by China; however, this displacement is less pronounced than that observed for total scientific production (Part 1). The United States is still the country with the highest number of articles among the 1,000 most cited (31.5%), well above what would be expected from its number of publications, followed by Germany (8.4%) and China (7.5%). International collaboration, at least globally, does not seem to be an important factor in producing highly cited papers. In fact, only two of the top 25 most cited papers were international collaborations (8%). Furthermore, a large share of reviews among the 1,000 most cited papers (65%) has been observed. Although most of the institutions with the largest number of publications among the most cited in the area are from the United States, the two institutions with the most highly cited papers are CNRS (France) and CSIC (Spain). The most cited papers are highly concentrated in a few journals: around half of the most cited papers were published in five journals. Generally, the most cited papers are published in journals with high impact factors; however, there is also a significant number of highly cited papers published in journals with a low impact factor or none at all.
