Featured Research

Digital Libraries

Metrics and peer review agreement at the institutional level

In the past decades, many countries have started to fund academic institutions based on the evaluation of their scientific performance. In this context, peer review is often used to assess scientific performance, and bibliometric indicators have been suggested as an alternative. A recurrent question is whether peer review and metrics tend to yield similar outcomes. In this paper, we study the agreement between bibliometric indicators and peer review at the institutional level. We also quantify the internal agreement of peer review at the institutional level. We find that the level of agreement is generally higher at the institutional level than at the publication level. Overall, for certain fields of science, the agreement between metrics and peer review is on par with the internal agreement between two reviewers. This suggests that for some fields, bibliometric indicators may be considered as an alternative to peer review for national research assessment exercises.

Digital Libraries

Might Europe one day again be a global scientific powerhouse? Analysis of ERC publications suggests it will not be possible without changes in research policy

Numerous EU documents praise the excellence of EU research without empirical evidence and contrary to academic studies. We investigated research performance in two fields of high socioeconomic importance, advanced technology and basic medical research, in two sets of European countries: Germany, France, Italy, and Spain (GFIS), and the UK, the Netherlands, and Switzerland (UKNCH). Despite historical and geographical proximity, research performance in GFIS is much lower than in UKNCH, and well below the world average. Funding from the European Research Council (ERC) greatly improves performance both in GFIS and UKNCH, but ERC-GFIS publications are less cited than ERC-UKNCH publications. We conclude that research performance in GFIS and in other EU countries is intrinsically low even when it is generously funded. The technological and economic future of the EU depends on improving research, which requires structural changes in research policy within the EU, and in most EU countries.

Digital Libraries

Mining the online infosphere: A survey

The evolution of AI-based systems and applications has pervaded everyday life, and these systems now make decisions that have momentous impact on individuals and society. With the staggering growth of online data, often termed the Online Infosphere, it has become paramount to monitor the infosphere to ensure social good, as AI-based decisions are severely dependent on it. The goal of this survey is to provide a comprehensive review of some of the most important research areas related to the infosphere, focusing on the technical challenges and potential solutions. The survey also outlines some of the important future directions. We begin with discussions of the collaborative systems that have emerged within the infosphere, with a special focus on Wikipedia. Next, we demonstrate how the infosphere has been instrumental in the growth of scientific citations and collaborations, thus fueling interdisciplinary research. Finally, we illustrate the issues related to the governance of the infosphere, such as tackling (a) rising hateful and abusive behavior and (b) bias and discrimination in different online platforms and news reporting.

Digital Libraries

Mining university rankings: Publication output and citation impact as their basis

World university rankings have become well-established tools that students, university managers and policy makers read and use. Each ranking claims to have a unique methodology capable of measuring the 'quality' of universities. The purpose of this paper is to analyze to what extent these different rankings measure the same phenomenon and what it is that they are measuring. For this, we selected a total of seven world university rankings and performed a principal component analysis. After establishing that, despite their methodological differences, they all load onto a single component, we hypothesized that bibliometric indicators could explain what is being measured. Our analyses show that ranking scores from any of the seven league tables under study can be explained by the number of publications and citations received by the institution. We conclude by discussing policy implications and opportunities for how a nuanced and responsible use of rankings can help decision making at the institutional level.
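
As a rough illustration of the "single component" check described above, the sketch below estimates how much of the total variance the first principal component carries. It is not the authors' procedure: the three rankings and their scores are made-up toy values, and the function names (`standardize`, `correlation_matrix`, `first_component_share`) are hypothetical. The largest eigenvalue of the correlation matrix is found by power iteration, which assumes the starting vector is not orthogonal to the leading eigenvector (true when the rankings are positively correlated, as here).

```python
import math


def standardize(columns):
    """Scale each column to zero mean and unit (population) variance."""
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        result.append([(x - mean) / std for x in col])
    return result


def correlation_matrix(columns):
    """Pearson correlation matrix of the given columns."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component_share(columns, iterations=500):
    """Share of total variance carried by the first principal component.

    The eigenvalues of a p x p correlation matrix sum to p, so the share
    is (largest eigenvalue) / p. The largest eigenvalue is approximated
    by power iteration.
    """
    corr = correlation_matrix(columns)
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vec = [x / eigenvalue for x in product]
    return eigenvalue / p


# Toy scores (hypothetical) that three rankings assign to five institutions.
ranking_a = [95, 88, 76, 64, 50]
ranking_b = [90, 85, 80, 60, 55]
ranking_c = [98, 80, 75, 70, 45]

share = first_component_share([ranking_a, ranking_b, ranking_c])
print(f"first component explains {share:.1%} of the variance")
```

A share close to 1 would indicate that the rankings largely move together, which is the pattern the abstract reports for the seven real rankings.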

Digital Libraries

Modeling Updates of Scholarly Webpages Using Archived Data

The vastness of the web imposes a prohibitive cost on building large-scale search engines with limited resources. Crawl frontiers thus need to be optimized to improve the coverage and freshness of crawled content. In this paper, we propose an approach for modeling the dynamics of change in the web using archived copies of webpages. To evaluate its utility, we conduct a preliminary study on the scholarly web using 19,977 seed URLs of authors' homepages obtained from their Google Scholar profiles. We first obtain archived copies of these webpages from the Internet Archive (IA), and estimate when their actual updates occurred. Next, we apply maximum likelihood to estimate their mean update frequency (λ) values. Our evaluation shows that λ values derived from a short history of archived data provide a good estimate of the true update frequency in the short term, and that our method provides better estimates of updates at a fraction of the resources required by the baseline models. Based on this, we demonstrate the utility of archived data for optimizing the crawling strategy of web crawlers, and uncover important challenges that inspire future research directions.
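
The maximum likelihood step mentioned above can be sketched under a simplifying assumption that the abstract does not state: that page updates follow a homogeneous Poisson process and that the update times are fully known. In that case the estimate of λ is the number of updates divided by the length of the observation window. The function name `poisson_rate_mle` and the timestamps are hypothetical, and the paper's actual estimator may differ.

```python
from datetime import datetime


def poisson_rate_mle(update_times, window_start, window_end):
    """MLE of the mean update rate (lambda) under a homogeneous Poisson model.

    For a Poisson process observed over a window of length T that contains
    n updates, the likelihood lambda**n * exp(-lambda * T) is maximized at
    lambda = n / T. The rate is returned in updates per day.
    """
    window_days = (window_end - window_start).total_seconds() / 86400.0
    if window_days <= 0:
        raise ValueError("observation window must have positive length")
    # Count only the updates that fall inside the observation window.
    n_updates = sum(1 for t in update_times if window_start <= t <= window_end)
    return n_updates / window_days


# Hypothetical update timestamps estimated from archived copies of one page.
estimated_updates = [
    datetime(2020, 1, 4),
    datetime(2020, 1, 11),
    datetime(2020, 1, 18),
    datetime(2020, 1, 25),
]
rate_per_day = poisson_rate_mle(
    estimated_updates, datetime(2020, 1, 1), datetime(2020, 1, 29)
)
print(f"lambda = {rate_per_day:.4f} updates/day")
print(f"mean interval = {1 / rate_per_day:.1f} days")
```

A crawler could use 1/λ as a rough revisit interval for the page, which is one way archived data might inform a crawl schedule.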

Digital Libraries

NLP Scholar: An Interactive Visual Explorer for Natural Language Processing Literature

As part of the NLP Scholar project, we created a single unified dataset of NLP papers and their meta-information (including citation numbers), by extracting and aligning information from the ACL Anthology and Google Scholar. In this paper, we describe several interconnected interactive visualizations (dashboards) that present various aspects of the data. Clicking on an item within a visualization or entering query terms in the search boxes filters the data in all visualizations in the dashboard. This allows users to search for papers in the area of their interest, published within specific time periods, published by specified authors, etc. The interactive visualizations presented here, and the associated dataset of papers mapped to citations, have additional uses as well, including understanding how the field is growing (both overall and across sub-areas) and quantifying the impact of different types of papers on subsequent publications.

Digital Libraries

Nature, Science, and PNAS -- Disciplinary profiles and impact

Nature, Science, and PNAS are the three most prestigious general-science journals, and Nature and Science are among the most influential journals overall, based on the journal Impact Factor (IF). In this paper we perform automatic classification of ~50,000 articles in these journals (published in the period 2005-2015) into 14 broad areas, to explore disciplinary profiles and to determine their field-specific IFs. We find that in all three journals the articles from Bioscience, Astronomy, and Geosciences are over-represented, with other areas being under-represented, some of them severely. Discipline-specific IFs in these journals vary greatly, for example, between 18 and 46 for Nature. We find that the areas that have the highest disciplinary IFs are not the ones that contribute the most articles. We also find that publishing in these three journals brings prestige to articles in all areas, but at different levels, the least being for Astronomy. When the field-specific IFs of Nature, Science and PNAS are compared with those of other top journals in the six largest areas (Bioscience, Medicine, Geosciences, Physics, Astronomy, and Chemistry), these three journals are always among the top seven journals, with Nature being at the very top for all fields except Medicine.
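
To make the idea of a field-specific IF concrete, the sketch below applies the standard two-year Impact Factor ratio (citations received in a census year by items published in the two preceding years, divided by the number of those items) separately to each field. This is an assumption about how such a figure could be computed, not the paper's documented method; the record layout, the function name `field_impact_factors`, and all counts are made up.

```python
from collections import defaultdict


def field_impact_factors(articles, year):
    """Two-year impact-factor-style ratio per field for a census year.

    Each article is a dict with 'field', 'pub_year', and 'citations_by_year'
    (a mapping from citing year to citation count). Only articles published
    in the two years before `year` contribute.
    """
    cites = defaultdict(int)
    items = defaultdict(int)
    for art in articles:
        if art["pub_year"] in (year - 1, year - 2):
            items[art["field"]] += 1
            cites[art["field"]] += art["citations_by_year"].get(year, 0)
    return {field: cites[field] / items[field] for field in items}


# Toy records (hypothetical) for articles in a single journal.
articles = [
    {"field": "Physics", "pub_year": 2013, "citations_by_year": {2015: 30}},
    {"field": "Physics", "pub_year": 2014, "citations_by_year": {2015: 50}},
    {"field": "Astronomy", "pub_year": 2014, "citations_by_year": {2015: 20}},
    # Published too early for the 2015 window, so it is ignored.
    {"field": "Astronomy", "pub_year": 2012, "citations_by_year": {2015: 99}},
]

field_ifs = field_impact_factors(articles, 2015)
print(field_ifs)
```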

Digital Libraries

Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view

Timely access to accurate scientific literature is critical in the battle against the ongoing COVID-19 pandemic. This unprecedented public health risk has motivated research towards understanding the disease in general, identifying drugs to treat the disease, developing potential vaccines, etc. This has given rise to a rapidly growing body of literature whose number of publications doubles every 20 days as of May 2020. Providing medical professionals with the means to quickly analyze the literature and discover growing areas of knowledge is necessary for addressing their questions and information needs. In this study we analyze the LitCovid collection, 13,369 COVID-19 related articles found in PubMed as of May 15th, 2020, with the purpose of examining the landscape of the literature and presenting it in a format that facilitates information navigation and understanding. We do that by applying state-of-the-art named entity recognition (NER), classification, clustering and other NLP techniques. By applying NER tools, we capture relevant bioentities (such as diseases, internal body organs, etc.) and assess the strength of their relationship with COVID-19 by the extent to which they are discussed in the corpus. We also collect a variety of symptoms and co-morbidities discussed in reference to COVID-19. Our clustering algorithm identifies topics represented by groups of related terms, and computes clusters corresponding to documents associated with the topic terms. Among the topics, we observe several that persist over multiple weeks and have numerous associated documents, as well as several that appear as emerging topics with fewer documents. All the tools and data are publicly available, and this framework can be applied to any literature collection. Taken together, these analyses produce a comprehensive, synthesized view of COVID-19 research to facilitate knowledge discovery from literature.
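
The 20-day doubling time quoted above implies a daily growth rate that is easy to work out. The sketch below does the arithmetic, assuming idealized exponential growth; the helper names are hypothetical, and applying the doubling time to the LitCovid count is purely illustrative, not a forecast made in the source.

```python
def daily_growth_rate(doubling_days):
    """Constant daily growth rate implied by a given doubling time."""
    return 2 ** (1 / doubling_days) - 1


def projected_count(initial, days, doubling_days):
    """Count after `days`, assuming exponential growth continues unchanged."""
    return initial * 2 ** (days / doubling_days)


rate = daily_growth_rate(20)
print(f"implied daily growth: {rate:.2%}")

# Illustration only: what the 13,369 articles would become after one more
# doubling period if the same growth applied to this collection.
after_20_days = projected_count(13369, 20, 20)
print(f"after 20 days: {after_20_days:.0f} articles")
```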

Digital Libraries

Neural Embeddings of Scholarly Periodicals Reveal Complex Disciplinary Organizations

Understanding the structure of knowledge domains is one of the foundational challenges in science of science. Here, we propose a neural embedding technique that leverages the information contained in the citation network to obtain continuous vector representations of scientific periodicals. We demonstrate that our periodical embeddings encode nuanced relationships between periodicals as well as the complex disciplinary and interdisciplinary structure of science, allowing us to make cross-disciplinary analogies between periodicals. Furthermore, we show that the embeddings capture meaningful "axes" that encompass knowledge domains, such as an axis from "soft" to "hard" sciences or from "social" to "biological" sciences, which allow us to quantitatively ground periodicals on a given dimension. By offering novel quantification in science of science, our framework may in turn facilitate the study of how knowledge is created and organized.
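
One simple way to place a periodical on an axis such as "soft" to "hard" is to project its embedding onto the line between two pole vectors. The sketch below is an assumption about how this could be done, not the paper's stated method (which may, for example, use cosine similarity instead); the two-dimensional vectors and the names `pole_soft`, `pole_hard`, and `axis_position` are hypothetical.

```python
def subtract(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def axis_position(vector, pole_a, pole_b):
    """Position of `vector` along the axis from pole_a to pole_b.

    Returns 0.0 at pole_a and 1.0 at pole_b; values in between indicate
    where the orthogonal projection of `vector` falls on the axis.
    """
    axis = subtract(pole_b, pole_a)
    return dot(subtract(vector, pole_a), axis) / dot(axis, axis)


# Toy 2-d embeddings (hypothetical); real embeddings have many dimensions.
pole_soft = [1.0, 0.0]
pole_hard = [0.0, 1.0]
journal = [0.5, 0.5]

position = axis_position(journal, pole_soft, pole_hard)
print(f"position on soft-to-hard axis: {position:.2f}")
```

In practice each pole would likely be an average over several representative periodicals rather than a single vector, which makes the axis less sensitive to any one journal.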

Digital Libraries

New Research Trends in Unconventional Oil and Gas Environmental Issue: A Bibliometric Analysis

With the boom in unconventional gas production, balancing environmental pollution risk against the economics of unconventional gas has become a common dilemma around the world. The aim of this study is to elucidate the research on environmental issues arising from the development of the unconventional oil and gas industry. To achieve this goal, we present a bibliometric overview of this field from 1990 to 2018. First, this study outlines a basic statistical analysis of journals, publications, authors, institutions and documents. Second, VOSviewer is employed to visualize collaborative relationships and show the links between different authors, institutions, regions and journals. Finally, document bibliographic coupling, co-occurrence and keyword burst detection are analyzed to reveal emerging trends and hot topics. The results indicate that the United States was the most productive country and cooperated the most with other countries, followed by China, while the China University of Petroleum is the most productive institution in the world, with 105 publications. Additionally, most articles were classified as energy fuels, environmental sciences and geosciences multidisciplinary. Furthermore, based on the emerging trends analysis, we conclude that hydraulic fracturing technology has become a hot topic; other popular research topics include energy policy and regulation of unconventional gas development, greenhouse gas emissions, and energy and water consumption in unconventional gas life cycle assessment.
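
Bibliographic coupling, one of the analyses named above, links two documents by the references they share. The sketch below counts shared references for every pair of documents. It uses the raw count as the coupling strength, which is an assumption: tools such as VOSviewer may apply their own normalization. The document and reference identifiers are made up.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Coupling strength for every pair of documents.

    `references` maps a document id to the collection of references it
    cites. The strength of a pair is the number of references both
    documents cite; pairs with no shared reference are omitted.
    """
    strengths = {}
    for (doc_a, refs_a), (doc_b, refs_b) in combinations(
        sorted(references.items()), 2
    ):
        shared = len(set(refs_a) & set(refs_b))
        if shared:
            strengths[(doc_a, doc_b)] = shared
    return strengths


# Toy reference lists (hypothetical identifiers).
reference_lists = {
    "doc1": {"r1", "r2", "r3"},
    "doc2": {"r2", "r3", "r4"},
    "doc3": {"r5"},
}

coupling = bibliographic_coupling(reference_lists)
print(coupling)
```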
