Featured Research

Digital Libraries

Archive Assisted Archival Fixity Verification Framework

The number of public and private web archives has increased, and we implicitly trust content delivered by these archives. Fixity is checked to ensure an archived resource has remained unaltered since the time it was captured. Some web archives do not allow users to access fixity information and, more importantly, even if fixity information is available, it is provided by the same archive from which the archived resources are requested. In this research, we propose two approaches, namely Atomic and Block, to establish and check fixity of archived resources. In the Atomic approach, the fixity information of each archived web page is stored in a JSON file (or a manifest), and published in a well-known web location (an Archival Fixity server) before it is disseminated to several on-demand web archives. In the Block approach, we first batch together fixity information of multiple archived pages in a single binary-searchable file (or a block) before it is published and disseminated to archives. In both approaches, the fixity information is not obtained directly from archives. Instead, we compute the fixity information (e.g., hash values) based on the playback of archived resources. One advantage of the Atomic approach is the ability to verify fixity of archived pages even in the absence of the Archival Fixity server. The Block approach requires pushing fewer resources into archives, and it performs fixity verification faster than the Atomic approach. On average, it takes about 1.25X, 4X, and 36X longer to disseminate a manifest to this http URL, this http URL, and this http URL, respectively, than this http URL, while it takes 3.5X longer to disseminate a block to this http URL than this http URL. The Block approach is 4.46X faster than the Atomic approach at verifying the fixity of archived pages.
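The difference between the two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of SHA-256, the manifest field names (`uri_m`, `sha256`), and the in-memory sorted list standing in for the binary-searchable block file are all assumptions made for illustration.

```python
import bisect
import hashlib
import json


def fixity_hash(playback_body: bytes) -> str:
    """Hash the bytes obtained from the playback of an archived page.

    SHA-256 is an assumption; the abstract only says "hash values".
    """
    return hashlib.sha256(playback_body).hexdigest()


def make_manifest(uri_m: str, playback_body: bytes) -> str:
    """Atomic approach: one JSON manifest per archived page.

    The field names are hypothetical.
    """
    record = {"uri_m": uri_m, "sha256": fixity_hash(playback_body)}
    return json.dumps(record, sort_keys=True)


def make_block(pages: dict) -> list:
    """Block approach: batch many pages into one list sorted by URI."""
    return sorted((uri, fixity_hash(body)) for uri, body in pages.items())


def lookup_in_block(block: list, uri_m: str):
    """Binary-search the sorted block for the stored hash of one URI."""
    # A 1-tuple sorts just before any 2-tuple that starts with the same URI.
    i = bisect.bisect_left(block, (uri_m,))
    if i < len(block) and block[i][0] == uri_m:
        return block[i][1]
    return None


def verify(expected_hash: str, playback_body: bytes) -> bool:
    """Fixity holds if the current playback still hashes to the stored value."""
    return expected_hash == fixity_hash(playback_body)
```

In this sketch a manifest carries everything needed to check one page, while a block trades that independence for a single file that can be searched in logarithmic time.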

Digital Libraries

Archiving and referencing source code with Software Heritage

Software, and software source code in particular, is widely used in modern research. It must be properly archived, referenced, described and cited in order to build a stable and long-lasting corpus of scientific knowledge. In this article we show how the Software Heritage universal source code archive provides a means to fully address the first two concerns, by seamlessly archiving all publicly available software source code, and by providing intrinsic persistent identifiers that allow referencing it at various granularities in a way that is both convenient and effective. We call upon the research community to adopt this approach widely.
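An identifier is "intrinsic" when it can be recomputed from the object itself rather than assigned by a registry. The abstract does not spell out the scheme, so the sketch below rests on an assumption drawn from the published SWHID format: a file-content identifier is the prefix `swh:1:cnt:` followed by the SHA-1 that Git computes for the same bytes as a blob. The function name is hypothetical.

```python
import hashlib


def swhid_for_content(data: bytes) -> str:
    """Compute a content-level identifier from the file bytes alone.

    Assumption: the hash is Git's blob hash, i.e. SHA-1 over the header
    b"blob <length>\\0" followed by the raw content.
    """
    header = b"blob %d\x00" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


# Anyone holding the same bytes derives the same identifier,
# without contacting the archive.
print(swhid_for_content(b""))
```

The same format, as far as this sketch assumes, covers coarser granularities (directories, revisions, releases, snapshots) by changing the object-type tag, which is one way to read "various granularities" above.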

Digital Libraries

Are Female Scientists Less Inclined to Publish Alone? The Gender Solo Research Gap

Solo research is a result of individual authorship decisions which accumulate over time, accompanying academic careers. This research is the first to comprehensively study the gender solo research gap within a whole national system: We examine the gap through individual publication portfolios constructed for each internationally visible Polish university professor. Solo research is a special case of academic publishing where scientists compete individually, sending clear signals about their research ability. Solo research has been expected to disappear for half a century, but it continues to exist. Our focus is on how male and female scientists of various biological ages, age groups, academic positions, institutions, and institutional types make use of, and benefit from, solo publishing. We tested the hypothesis that male and female scientists differ in their use of solo publishing, and we termed this difference the gender solo research gap. The highest share of solo research for both genders is noted for middle-aged scientists working as associate professors rather than for young scientists as in previous studies. The low journal prestige level of female solo publications may suggest female propensity to choose less competitive publication outlets. In our unique biographical, administrative, publication, and citation database (Polish Science Observatory), we have metadata on all Polish scientists present in Scopus (N=25,463) and on their 158,743 Scopus-indexed articles published in 2009-2018, including 18,900 solo articles.

Digital Libraries

Are University Rankings Statistically Significant? A Comparison among Chinese Universities and with the USA

Purpose: We address the question of whether differences are statistically significant in the rankings of universities. We propose methods measuring the statistical significance among different universities and illustrate the results by empirical data. Design/methodology/approach: Based on z-testing and overlapping confidence intervals, and using data about 205 Chinese universities included in the Leiden Rankings 2020, we argue that three main groups of Chinese research universities can be distinguished. Findings: When the sample of 205 Chinese universities is merged with the 197 US universities included in Leiden Rankings 2020, the results similarly indicate three main groups: high, middle, low. Using these data (Leiden Rankings and Web-of-Science), the z-scores of the Chinese universities are significantly below those of the US universities albeit with some overlap. Research limitations: We show empirically that differences in ranking may be due to changes in the data, the models, or the modeling effects on the data. The scientometric groupings are not always stable when we use different methods. R&D policy implications: Differences among universities can be tested for their statistical significance. The statistics relativize the values of decimals in the rankings. One can operate with a scheme of low/middle/high in policy debates and leave the more fine-grained rankings of individual universities to operational management and local settings. Originality/value: In the discussion about the rankings of universities, the question of whether differences are statistically significant is, in our opinion, insufficiently addressed.
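The two tests named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes the ranked indicator is a proportion (for example, the share of a university's papers among the most cited), uses a pooled two-proportion z-test and a normal-approximation interval, and the counts in the usage lines are invented.

```python
import math


def two_proportion_z(top1: int, n1: int, top2: int, n2: int) -> float:
    """z-statistic for the difference between two proportions (pooled SE)."""
    p1, p2 = top1 / n1, top2 / n2
    pooled = (top1 + top2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def confidence_interval(top: int, n: int, z: float = 1.96) -> tuple:
    """Normal-approximation interval around a proportion (95% by default)."""
    p = top / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)


def intervals_overlap(a: tuple, b: tuple) -> bool:
    """Two universities are not distinguishable if their intervals overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented counts: highly cited papers out of total papers for two universities.
z = two_proportion_z(150, 1000, 100, 1000)
overlap = intervals_overlap(confidence_interval(150, 1000),
                            confidence_interval(100, 1000))
print(z, overlap)
```

Under this reading, universities whose intervals overlap fall into the same group, which is how a fine-grained ranking collapses into a small number of bands such as high, middle, and low.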

Digital Libraries

Are nationally oriented journals indexed in Scopus becoming more international? The effect of publication language and access modality

An exploratory, descriptive analysis is presented of the national orientation of scientific, scholarly journals as reflected in the affiliations of publishing or citing authors. It calculates for journals covered in Scopus an Index of National Orientation (INO), and analyses the distribution of INO values across disciplines and countries, and the correlation between INO values and journal impact factors. The study did not find solid evidence that journal impact factors are good measures of journal internationality in terms of the geographical distribution of publishing or citing authors, as the relationship between a journal's national orientation and its citation impact is found to be inverse U-shaped. In addition, journals publishing in English are not necessarily internationally oriented in terms of the affiliations of publishing or citing authors; in social sciences and humanities, the USA also has its nationally oriented literatures. The paper examines the extent to which nationally oriented journals that entered Scopus in earlier years have become more international in recent years. It is found that in the study set about 40 per cent of such journals do reveal traces of internationalization, while the use of English as publication language and an Open Access (OA) status are important determinants.
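The abstract does not define the INO, so the following sketch assumes one plausible definition: the percentage of a journal's papers that have at least one author affiliated with the country that contributes most often to that journal. The function name and the sample data are hypothetical.

```python
from collections import Counter


def index_of_national_orientation(papers: list) -> tuple:
    """Return (dominant country, INO in per cent) for one journal.

    `papers` holds one collection of affiliation country codes per paper.
    Assumed definition: share of papers with at least one author from the
    most frequently contributing country.
    """
    if not papers:
        raise ValueError("at least one paper is required")
    # Count each country at most once per paper.
    counts = Counter(c for countries in papers for c in set(countries))
    if not counts:
        raise ValueError("no affiliation countries found")
    country, n_papers = counts.most_common(1)[0]
    return country, 100.0 * n_papers / len(papers)


# Invented example: four papers, three of them with a Polish affiliation.
sample = [{"PL"}, {"PL", "DE"}, {"PL"}, {"US"}]
print(index_of_national_orientation(sample))
```

With a definition of this kind, a value near 100 indicates a journal dominated by one country, and a falling value over time would correspond to the internationalization the study looks for.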

Digital Libraries

Are papers addressing certain diseases perceived where these diseases are prevalent? The proposal to use Twitter data as social-spatial sensors

We propose to use Twitter data as social-spatial sensors. This study deals with the question whether research papers on certain diseases are perceived by people in regions (worldwide) that are especially concerned by the diseases. Since (some) Twitter data contain location information, it is possible to spatially map the activity of Twitter users referring to certain papers (e.g., dealing with tuberculosis). The resulting maps reveal whether heavy activity on Twitter is correlated with large numbers of people having certain diseases. In this study, we focus on tuberculosis, human immunodeficiency virus (HIV), and malaria, since the World Health Organization ranks these diseases as the top three causes of death worldwide by a single infectious agent. The results of the social-spatial Twitter maps (and additionally performed regression models) reveal the usefulness of the proposed sensor approach. One receives an impression of how research papers on the diseases have been perceived by people in regions that are especially concerned by the diseases. Our study demonstrates a promising approach for using Twitter data for research evaluation purposes beyond simple counting of tweets.

Digital Libraries

Associations between author-level metrics in subsequent time periods

Understanding the dynamics of authors is relevant to predict and quantify performance in science. While the relationship between recent and future citation counts is well-known, many relationships between scholarly metrics at the author level remain unknown. In this context, we performed an analysis of author-level metrics extracted from subsequent periods, focusing on visibility, productivity and interdisciplinarity. First, we investigated how metrics controlled by the authors (such as reference diversity and productivity) affect their visibility and citation diversity. We also explored the relation between authors' interdisciplinarity and citation counts. The analysis in a subset of Physics papers revealed that there is no strong correlation between authors' productivity and future visibility for most of the authors. A higher fraction of strong positive correlations, though, was found for those with a lower number of publications. We also found that reference diversity computed at the author level may positively impact authors' future visibility. The analysis of metrics impacting future interdisciplinarity suggests that productivity may play a role only for low-productivity authors. We also found a surprisingly strong positive correlation between reference diversity and interdisciplinarity, suggesting that an increase in diverse citing behavior may be related to a future increase in authors' interdisciplinarity. Finally, interdisciplinarity and visibility were found to be moderately positively associated: significant positive correlations were observed for 30% of authors with lower productivity.

Digital Libraries

Attitudes toward Open Access, Open Peer Review, and Altmetrics among Contributors to Spanish Scholarly Journals

This paper aims to gain a better understanding of the perspectives of contributors to Spanish academic journals regarding open access, open peer review, and altmetrics. It also explores how age, gender, professional experience, career history, and perception and use of social media influence authors' opinions toward these developments in scholarly publishing. A sample of contributors (n=1254) to Spanish academic journals was invited to participate in a survey about the aforementioned topics. The response rate was 24 per cent (n=295). Contributors to Spanish scholarly journals held a favourable opinion of open access but were more cautious about open peer review and altmetrics. Younger and female scholars were more reluctant to accept open peer review practices. A positive attitude toward social networks did not necessarily translate into enthusiasm for emerging trends in scholarly publishing. Despite this, ResearchGate users were more aware of altmetrics.

Digital Libraries

Attributing and Referencing (Research) Software: Best Practices and Outlook from Inria

Software is a fundamental pillar of modern scientific research, not only in computer science, but actually across all fields and disciplines. However, there is a lack of adequate means to cite and reference software, for many reasons. An obvious first reason is software authorship, which can range from a single developer to a whole team, and can even vary in time. The panorama is even more complex than that, because many roles can be involved in software development: software architect, coder, debugger, tester, team manager, and so on. Arguably, the researchers who have invented the key algorithms underlying the software can also claim a part of the authorship. And there are many other reasons that make this issue complex. We provide in this paper a contribution to the ongoing efforts to develop proper guidelines and recommendations for software citation, building upon the internal experience of Inria, the French research institute for digital sciences. As a central contribution, we make three key recommendations. (1) We propose a richer taxonomy for software contributions with a qualitative scale. (2) We claim that it is essential to put the human at the heart of the evaluation. And (3) we propose to distinguish citation from reference.

Digital Libraries

Author Growth Outstrips Publication Growth in Computer Science and Publication Quality Correlates with Collaboration

Although the computer science community successfully harnessed exponential increases in computer performance to drive societal and economic change, the exponential growth in publications is proving harder to accommodate. To gain a deeper understanding of publication growth and inform how the computer science community should handle this growth, we analyzed publication practices from several perspectives: ACM-sponsored publications in the ACM Digital Library as a whole; subdisciplines captured by ACM's Special Interest Groups (SIGs); ten top conferences; institutions; four top U.S. departments; authors; faculty; and PhDs between 1990 and 2012. ACM publishes a large fraction of all computer science research. We first summarize how we believe our main findings inform (1) expectations on publication growth, (2) how to distinguish research quality from output quantity, and (3) the evaluation of individual researchers. We then further motivate the study of computer science publication practices and describe our methodology and results in detail.
