Featured Research

Digital Libraries

Global Distribution of Google Scholar Citations: A Size-independent Institution-based Analysis

Most currently available schemes for performance-based ranking of universities or research organizations, such as Quacquarelli Symonds (QS), Times Higher Education (THE), and the Shanghai-based Academic Ranking of World Universities (ARWU), use a variety of criteria that include productivity, citations, awards, reputation, etc., while Leiden and Scimago use only bibliometric indicators. In these cases, research performance evaluation is based on bibliometric data from Web of Science or Scopus, which are commercial, paid databases. Their coverage includes peer-reviewed journals and conference proceedings. Google Scholar (GS), on the other hand, provides a free and open alternative for obtaining citations of papers available on the net (though it is not clear exactly which journals are covered). Citations are collected automatically from the net and also added to self-created individual author profiles under Google Scholar Citations (GSC). This data was used by Webometrics Lab, Spain, to create a ranked list of 4000+ institutions in 2016, based on citations from only the top 10 individual GSC profiles in each organization. (GSC excludes the top paper for reasons explained in the text; the simple selection procedure makes the ranked list size-independent, as claimed by the Cybermetrics Lab.) Using this data (Transparent Ranking, TR, 2016), we find the regional and country-wise distribution of GS-TR citations. The size-independent ranked list is subdivided into deciles of 400 institutions each, and the number of institutions and citations of each country is obtained for each decile. We test for correlation between the institutional ranks in GS-TR and in the other ranking schemes for the top 20 institutions.
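
The decile split and rank comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `decile_tally` and `spearman_rho` and the input layout are hypothetical, and the abstract does not say which correlation test was used, so Spearman's rank correlation is an assumption here.

```python
from collections import defaultdict

def decile_tally(ranked, block=400):
    """Tally institutions and citations per country within each rank block.

    ranked: list of (institution, country, citations), best rank first.
    block:  institutions per decile (400 in the abstract).
    Returns {decile: {country: [n_institutions, total_citations]}}.
    """
    tally = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for pos, (_name, country, cites) in enumerate(ranked):
        decile = pos // block + 1          # positions 0..399 -> decile 1, etc.
        cell = tally[decile][country]
        cell[0] += 1                       # one more institution
        cell[1] += cites                   # add its citations
    return tally

def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rank lists of equal length.

    Assumption: Spearman is used only as an example of a rank correlation.
    """
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

For the top-20 comparison, `ranks_a` and `ranks_b` would hold the positions of the same 20 institutions in two ranking schemes.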

Digital Libraries

Global Research Trends in the Modern Language Journal from 1999 to 2018: A Data-Driven Analysis

The present study conducts a scientometric analysis of the Modern Language Journal literature from 1999 to 2018, based on the Web of Science database (2018). A total of 2564 items were retrieved using "Modern Language Journal" as the publication-name search term. The number of publications per year shows no consistent growth in research activity pertaining to the journal over the study period. The annual distribution of publications, number of authors, institutional productivity, country-wise publications, and citations are analyzed. Highly productive authors, institutions, and countries are identified. The results reveal that the maximum number of papers (179) was published in 1999. Byrnes H is the most productive author, with 51 publications, and Kramsch C is the most cited author in the field, with 543 global citations. The USA contributed the largest share of publications (38.26%), and the most productive institution was the University of Iowa.

Digital Libraries

Go Wide, Go Deep: Quantifying the Impact of Scientific Papers through Influence Dispersion Trees

Despite a long history of use of citation count as a measure to assess the impact or influence of a scientific paper, the evolution of follow-up work inspired by the paper and their interactions through citation links have rarely been explored to quantify how the paper enriches the depth and breadth of a research field. We propose a novel data structure, called the Influence Dispersion Tree (IDT), to model the organization of follow-up papers and their dependencies through citations. We also propose the notion of an ideal IDT for every paper and show that an ideal (highly influential) paper should increase the knowledge of a field vertically and horizontally. By exploring the structural properties of the IDT, we derive a suite of metrics, namely the Influence Dispersion Index (IDI) and the Normalized Influence Divergence (NID), to quantify the influence of a paper. Our theoretical analysis shows that an ideal IDT configuration should have equal depth and breadth (and thus minimize the NID value). We establish the superiority of NID as an influence measure in two experimental settings. First, on a large real-world bibliographic dataset, we show that NID outperforms raw citation count as an early predictor of the number of new citations a paper will receive within a certain period after publication. Second, we show that NID is superior to the raw citation count at identifying the papers recognized as highly influential through a Test of Time Award among all their contemporary papers (published in the same venue). We conclude that, to quantify the influence of a paper on a research field, one should consider not only the total citation count but also how the citing papers are organized among themselves. For reproducibility, the code and datasets used in this study are being made available to the community.
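
To make the "vertical and horizontal" notion concrete, the sketch below measures the depth and breadth of a generic rooted tree. It is only an illustration under stated assumptions: the abstract does not define how an IDT is built or how IDI and NID are computed, so none of that is reproduced here. The function name `depth_and_breadth`, the child-list input, and the choice of "breadth = largest number of nodes on one level" are all hypothetical.

```python
def depth_and_breadth(children, root):
    """Return (depth, breadth) of a rooted tree.

    children: dict mapping a node to the list of its child nodes
              (assumed to describe a tree, i.e. no cycles).
    depth:    number of edges on the longest root-to-leaf path.
    breadth:  largest number of nodes found on any single level
              (an assumed definition, not taken from the paper).
    """
    depth, breadth = 0, 1
    level = [root]
    while True:
        # Collect every node one step further from the root.
        nxt = [c for node in level for c in children.get(node, [])]
        if not nxt:
            return depth, breadth
        depth += 1
        breadth = max(breadth, len(nxt))
        level = nxt
```

Under this reading, a paper whose follow-up work forms a single long chain has large depth and breadth 1, while a paper cited only by unrelated works has depth 1 and large breadth.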

Digital Libraries

Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations

New sources of citation data have recently become available, such as Microsoft Academic, Dimensions, and the OpenCitations Index of CrossRef open DOI-to-DOI citations (COCI). Although these have been compared to the Web of Science (WoS), Scopus, or Google Scholar, there is no systematic evidence of their differences across subject categories. In response, this paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study. Google Scholar found 88% of all citations, many of which were not found by the other sources, and nearly all citations found by the remaining sources (89%-94%). A similar pattern held within most subject categories. Microsoft Academic is the second largest overall (60% of all citations), including 82% of Scopus citations and 86% of Web of Science citations. In most categories, Microsoft Academic found more citations than Scopus and WoS (182 and 223 subject categories, respectively), but had coverage gaps in some areas, such as Physics and some Humanities categories. After Scopus, Dimensions is fourth largest (54% of all citations), including 84% of Scopus citations and 88% of WoS citations. It found more citations than Scopus in 36 categories, more than WoS in 185, and displays some coverage gaps, especially in the Humanities. Following WoS, COCI is the smallest, with 28% of all citations. Google Scholar is still the most comprehensive source. In many subject categories Microsoft Academic and Dimensions are good alternatives to Scopus and WoS in terms of coverage.
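
The two kinds of percentages reported above (share of all citations found by a source, and share of one source's citations also found by another) can be sketched with set operations. This is an illustrative sketch only; the function name `coverage_report` and the input layout (one set of citation identifiers per source) are assumptions, not the study's actual pipeline.

```python
def coverage_report(found):
    """Compute overall and pairwise citation coverage.

    found: dict mapping source name -> set of citation identifiers
           found by that source (hypothetical input layout).
    Returns (overall, overlap) where
      overall[s]    = share of all citations (union of sources) found by s
      overlap[a][b] = share of b's citations that a also found
    """
    union = set().union(*found.values())
    overall = {s: len(c) / len(union) for s, c in found.items()}
    overlap = {
        a: {
            b: len(found[a] & found[b]) / len(found[b])
            for b in found
            if b != a and found[b]       # skip self and empty sources
        }
        for a in found
    }
    return overall, overlap
```

With this layout, a statement such as "Microsoft Academic includes 82% of Scopus citations" corresponds to one `overlap[a][b]` entry.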

Digital Libraries

Granularity of algorithmically constructed publication-level classifications of research publications: Identification of specialties

In this work, which builds on and uses the outcome of an earlier study on topic identification in an algorithmically constructed publication-level classification (ACPLC), we address the issue of how to algorithmically obtain a classification of topics (containing articles) in which the classes correspond to specialties. The methodology we propose, which is similar to the one used in the earlier study, uses journals and their articles to construct a baseline classification. The underlying assumption of our approach is that journals of a particular size and focus have a scope that corresponds to specialties. By measuring the similarity between (1) the baseline classification and (2) multiple classifications obtained by topic clustering using different values of a resolution parameter, we have identified a best-performing ACPLC. In two case studies, we could identify the subject foci of the specialties involved, and these subject foci were relatively easy to distinguish. Further, the class size variation of the best-performing ACPLC is moderate, and only a small proportion of the articles belong to very small classes. For these reasons, we conclude that the proposed methodology is suitable for determining the specialty granularity level of an ACPLC.

Digital Libraries

Growth and dynamics of Econophysics: A bibliometric and network analysis

Digitization of publications, advances in communication technology, and the availability of bibliographic data have made it easier for researchers to study the growth and dynamics of any discipline. We present a study of "Econophysics" metadata for 2000-2019, extracted from the Web of Science, managed by Clarivate Analytics. The study highlights the growth and dynamics of the discipline through measures such as the number of publications, citations to publications, contributions from other disciplines, institutional participation, country-wise spread, etc. We investigate the impact of self-citations on citations at five-year intervals. We also determine the contribution of other disciplines by analyzing the cited references. Results from micro-, meso-, and macro-level analyses of collaborations show that the distributions of author collaborations and author affiliations follow a power law. Thus, a small number of authors, from a few institutions, produce most of the papers. We find that China leads in the number of authors and the number of papers; however, it has a larger share of national than international collaboration, whereas the USA has a larger share of international collaboration. Finally, we demonstrate the evolution of the author collaboration and affiliation networks from 2000 to 2019. Overall, the analysis reveals the "small-world" property of the network, with an average path length of 5. This study can therefore serve as a basis for understanding the growth and dynamics of the Econophysics network both qualitatively and quantitatively.
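
The average path length quoted above is a standard network measure. The sketch below shows one way to compute it for a small unweighted collaboration graph using breadth-first search. The function name `average_path_length` and the adjacency-list input are illustrative assumptions, and the authors' own tooling is not stated in the abstract.

```python
from collections import deque

def average_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs.

    adj: dict mapping each node to a list of its neighbours
         (undirected, unweighted graph; every node must be a key).
    Pairs with no connecting path are ignored.
    """
    total = pairs = 0
    for src in adj:
        # Breadth-first search gives shortest distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1           # reachable nodes other than src
    return total / pairs if pairs else 0.0
```

This all-pairs approach is quadratic in the number of nodes, so for a large co-authorship network a sampled or library-based computation would be more practical.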

Digital Libraries

Growth rates of modern science: A latent piecewise growth curve approach to model publication numbers from established and new literature databases

Growth of science is a prevalent issue in science of science studies. In recent years, two new bibliographic databases have been introduced that can be used to study growth processes in science going back centuries: Dimensions from Digital Science and Microsoft Academic. In this study, we used publication data from these new databases and added publication data from two established databases (Web of Science from Clarivate Analytics and Scopus from Elsevier) to investigate scientific growth processes from the beginning of the modern science system until today. We estimated regression models that simultaneously included the publication counts from the four databases. The results of the unrestricted growth of science calculations show that the overall growth rate amounts to 4.02% with a doubling time of 16.8 years. As the comparison of various segmented regression models in the current study revealed, the model with five segments fits the publication data best. We demonstrated that these segments with different growth rates can be interpreted very well, since they are related to phases of economic (e.g., industrialization) and/or political (e.g., Second World War) developments. In this study, we additionally analyzed scientific growth in two broad fields and the relationship between scientific and economic growth in the UK. We focused on this country since long time series for publication counts and economic growth indices were available.

Digital Libraries

HIVE-4-MAT: Advancing the Ontology Infrastructure for Materials Science

This paper introduces HIVE-4-MAT (Helping Interdisciplinary Vocabulary Engineering for Materials Science), an automatic linked data ontology application. It covers the contextual background for materials science and shared ontology infrastructures, and reviews the knowledge extraction and indexing process. HIVE-4-MAT's vocabulary browsing, term search and selection, and knowledge extraction and indexing are reviewed, along with plans to integrate named entity recognition. The conclusion highlights next steps with relation extraction to support better ontologies.

Digital Libraries

Heavy-tailed distribution of the number of publications within scientific journals

The community of scientists is characterized by their need to publish in peer-reviewed journals, in an attempt to avoid the "perish" side of the famous maxim. Accordingly, almost all researchers have authored some scientific articles. Scholarly publications offer at least two benefits for the study of the scientific community as a social group. First, they attest to some form of relation between scientists (collaboration, mentoring, heritage, ...), which is useful for determining and analyzing social subgroups. Second, most of them are recorded in large databases that are easily accessible and include a lot of pertinent information, easing the quantitative and qualitative study of the scientific community. Understanding the underlying dynamics driving the creation of knowledge in general, and of scientific publication in particular, is of interest from the social science point of view and can also contribute to maintaining a high level of research by identifying good and bad practices in science. In this manuscript, we attempt to advance this understanding through a statistical analysis of publications within peer-reviewed journals. Namely, we show that the distribution of the number of articles published by an author in a given journal is heavy-tailed, but has a lighter tail than a power law. Moreover, we observe some anomalies in the data that pinpoint underlying dynamics of the scholarly publication process.
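
A common first step when inspecting a tail like the one described above is the empirical complementary cumulative distribution (CCDF) of per-author article counts. The sketch below is a minimal illustration with a hypothetical function name `ccdf`. It is not the statistical analysis used in the manuscript, which the abstract does not detail.

```python
from collections import Counter

def ccdf(counts):
    """Empirical CCDF of a list of positive integer counts.

    counts: number of articles per author in one journal
            (hypothetical input layout).
    Returns a list of (k, P(X >= k)) for each distinct k, ascending.
    """
    n = len(counts)
    freq = Counter(counts)
    out = []
    remaining = n                       # authors with count >= current k
    for k in sorted(freq):
        out.append((k, remaining / n))
        remaining -= freq[k]
    return out
```

On a log-log plot, a pure power law would show up as a straight line in this CCDF, while a lighter tail bends downward at large k.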

Digital Libraries

Highly cited references in PLOS ONE and their in-text usage over time

In this article, we describe highly cited publications in a PLOS ONE full-text corpus. For these publications, we analyse the citation contexts with respect to their position in the text and their age at the time of citing. By taking the perspective of highly cited papers, we can distinguish them based on their citation context even if we do not have any other information source or metrics. We describe the top cited references based on how, when and in which context they are cited. The focus of this study is on a time perspective to explain the nature of the reception of highly cited papers. We have found that these references are distinguishable by the IMRaD sections in which they are cited. Furthermore, we show that the section usage of highly cited papers is time-dependent: the longer the citation interval, the higher the probability that a reference is cited in a method section.
