
Publication


Featured research published by Staša Milojević.


Journal of the Association for Information Science and Technology | 2010

Power law distributions in information science: Making the case for logarithmic binning

Staša Milojević

A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1–5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.


Proceedings of the National Academy of Sciences of the United States of America | 2014

Principles of scientific research team formation and evolution

Staša Milojević

Significance: Science is an activity with far-reaching implications for modern society. Understanding how the social organization of science and its fundamental unit, the research team, forms and evolves is therefore of critical significance. Previous studies uncovered important properties of the internal structure of teams, but little attention has been paid to their most basic property: size. This study fills this gap by presenting a model that successfully explains how team sizes in various fields have evolved over the past half century. This model is based on two principles: (i) smaller (core) teams form according to a Poisson process, and (ii) larger (extended) teams begin as core teams but subsequently accumulate new members through the process of cumulative advantage based on productivity.

Research teams are the fundamental social unit of science, and yet there is currently no model that describes their basic property: size. In most fields, teams have grown significantly in recent decades. We show that this is partly due to the change in the character of team size distribution. We explain these changes with a comprehensive yet straightforward model of how teams of different sizes emerge and grow. This model accurately reproduces the evolution of empirical team size distribution over the period of 50 years. The modeling reveals that there are two modes of knowledge production. The first and more fundamental mode employs relatively small, “core” teams. Core teams form by a Poisson process and produce a Poisson distribution of team sizes in which larger teams are exceedingly rare. The second mode employs “extended” teams, which started as core teams, but subsequently accumulated new members proportional to the past productivity of their members. Given time, this mode gives rise to a power-law tail of large teams (10–1,000 members), which features in many fields today. Based on this model, we construct an analytical functional form that allows the contribution of different modes of authorship to be determined directly from the data and is applicable to any field. The model also offers a solid foundation for studying other social aspects of science, such as productivity and collaboration.
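The two modes described in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: the function names, the "1 + Poisson" core size (so that no team is empty), the use of paper counts as a stand-in for productivity, and all parameter values are assumptions made purely for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_teams(n_teams=1000, lam=2.0, growth_steps=500, seed=0):
    """Toy two-mode model: Poisson core teams, then cumulative-advantage growth."""
    rng = random.Random(seed)
    # Core mode: size = 1 + Poisson(lam). The "+1" is an assumption, not from the paper.
    sizes = [1 + sample_poisson(lam, rng) for _ in range(n_teams)]
    # Hypothetical productivity proxy: every team starts with one paper.
    papers = [1] * n_teams
    for _ in range(growth_steps):
        # Extended mode: a new member joins a team chosen with probability
        # proportional to that team's past productivity (cumulative advantage).
        team = rng.choices(range(n_teams), weights=papers, k=1)[0]
        sizes[team] += 1
        papers[team] += 1
    return sizes


if __name__ == "__main__":
    sizes = simulate_teams()
    print("largest team:", max(sizes), "mean size:", sum(sizes) / len(sizes))
```

With `growth_steps=0` the sizes follow the shifted Poisson distribution alone. Increasing `growth_steps` concentrates new members in a few already-productive teams, which is the qualitative mechanism the abstract credits for the heavy tail of large teams.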


Journal of the Association for Information Science and Technology | 2013

Citation content analysis (CCA): A framework for syntactic and semantic analysis of citation content

Guo Zhang; Ying Ding; Staša Milojević

This study proposes a new framework for citation content analysis (CCA), for syntactic and semantic analysis of citation content that can be used to better analyze the rich sociocultural context of research behavior. This framework could be considered the next generation of citation analysis. The authors briefly review the history and features of content analysis in traditional social sciences and its previous application in library and information science (LIS). Based on critical discussion of the theoretical necessity of a new method as well as the limits of citation analysis, the nature and purposes of CCA are discussed, and potential procedures to conduct CCA, including principles to identify the reference scope, a two-dimensional (citing and cited) and two-module (syntactic and semantic) codebook, are provided and described. Future work and implications are also suggested.


Journal of Informetrics | 2013

Accuracy of simple, initials-based methods for author name disambiguation

Staša Milojević

There are a number of solutions that perform unsupervised name disambiguation based on the similarity of bibliographic records or common coauthorship patterns. Whether the use of these advanced methods, which are often difficult to implement, is warranted depends on whether the accuracy of the most basic disambiguation methods, which only use the author's last name and initials, is sufficient for a particular purpose. We derive realistic estimates for the accuracy of simple, initials-based methods using simulated bibliographic datasets in which the true identities of authors are known. Based on the simulations in five diverse disciplines we find that the first initial method already correctly identifies 97% of authors. An alternative simple method, which takes all initials into account, is typically two times less accurate, except in certain datasets that can be identified by applying a simple criterion. Finally, we introduce a new name-based method that combines the features of first initial and all initials methods by implicitly taking into account the last name frequency and the size of the dataset. This hybrid method reduces the fraction of incorrectly identified authors by 10–30% over the first initial method.
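The two simple methods compared in this abstract can be sketched as grouping keys. This is a minimal illustration, assuming names arrive as "Last, Given Names" strings; the function names and sample records are hypothetical, and the hybrid method is not implemented because the abstract does not specify its rule.

```python
from collections import defaultdict


def _split_name(name):
    """Split 'Last, Given M.' into a lowercase last name and a list of initials."""
    last, _, given = name.partition(",")
    initials = [part[0].lower() for part in given.replace(".", " ").split()]
    return last.strip().lower(), initials


def first_initial_key(name):
    """First initial method: last name plus the first initial only."""
    last, initials = _split_name(name)
    return (last, initials[0] if initials else "")


def all_initials_key(name):
    """All initials method: last name plus every initial present."""
    last, initials = _split_name(name)
    return (last, "".join(initials))


def group_records(names, key_func):
    """Treat all records that share a key as one author."""
    groups = defaultdict(list)
    for name in names:
        groups[key_func(name)].append(name)
    return dict(groups)


records = ["Smith, John A.", "Smith, J.", "Smith, Jane B.", "Smith, John"]
by_first = group_records(records, first_initial_key)
by_all = group_records(records, all_initials_key)
print(len(by_first), "author(s) by first initial;", len(by_all), "by all initials")
```

The sample shows the two failure modes: the first initial key merges John and Jane Smith into one author, while the all initials key splits "Smith, John A." from "Smith, John" even though they may be the same person.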


Scientific Reports | 2013

Social Dynamics of Science

Xiaoling Sun; Jasleen Kaur; Staša Milojević; Alessandro Flammini; Filippo Menczer

The birth and decline of disciplines are critical to science and society. How do scientific disciplines emerge? No quantitative model to date allows us to validate competing theories on the different roles of endogenous processes, such as social collaborations, and exogenous events, such as scientific discoveries. Here we propose an agent-based model in which the evolution of disciplines is guided mainly by social interactions among agents representing scientists. Disciplines emerge from splitting and merging of social communities in a collaboration network. We find that this social model can account for a number of stylized facts about the relationships between disciplines, scholars, and publications. These results provide strong quantitative support for the key role of social interactions in shaping the dynamics of science. While several “science of science” theories exist, this is the first account for the emergence of disciplines that is validated on the basis of empirical data.


Journal of Informetrics | 2014

Referenced Publication Years Spectroscopy applied to iMetrics: Scientometrics, Journal of Informetrics, and a relevant subset of JASIST

Loet Leydesdorff; Lutz Bornmann; Werner Marx; Staša Milojević

We have developed a (freeware) routine for “Referenced Publication Years Spectroscopy” (RPYS) and apply this method to the historiography of “iMetrics,” that is, the junction of the journals Scientometrics, Informetrics, and the relevant subset of JASIST (approx. 20%) that shapes the intellectual space for the development of information metrics (bibliometrics, scientometrics, informetrics, and webometrics). The application to information metrics (our own field of research) provides us with the opportunity to validate this methodology, and to add a reflection about using citations for the historical reconstruction. The results show that the field is rooted in individual contributions of the 1920s to 1950s (e.g., Alfred J. Lotka), and was then shaped intellectually in the early 1960s by a confluence of the history of science (Derek de Solla Price), documentation (e.g., Michael M. Kessler's “bibliographic coupling”), and “citation indexing” (Eugene Garfield). Institutional development at the interfaces between science studies and information science has been reinforced by the new journal Informetrics since 2007. In a concluding reflection, we return to the question of how the historiography of science using algorithmic means—in terms of citation practices—can be different from an intellectual history of the field based, for example, on reading source materials.


Journal of the Association for Information Science and Technology | 2014

arXiv E-prints and the journal of record: An analysis of roles and relationships

Vincent Larivière; Cassidy R. Sugimoto; Benoit Macaluso; Staša Milojević; Blaise Cronin; Mike Thelwall

Since its creation in 1991, arXiv has become central to the diffusion of research in a number of fields. Combining data from the entirety of arXiv and the Web of Science (WoS), this article investigates (a) the proportion of papers across all disciplines that are on arXiv and the proportion of arXiv papers that are in the WoS, (b) the elapsed time between arXiv submission and journal publication, and (c) the aging characteristics and scientific impact of arXiv e‐prints and their published version. It shows that the proportion of WoS papers found on arXiv varies across the specialties of physics and mathematics, and that only a few specialties make extensive use of the repository. Elapsed time between arXiv submission and journal publication has shortened but remains longer in mathematics than in physics. In physics, mathematics, as well as in astronomy and astrophysics, arXiv versions are cited more promptly and decay faster than WoS papers. The arXiv versions of papers—both published and unpublished—have lower citation rates than published papers, although there is almost no difference in the impact of the arXiv versions of published and unpublished papers.


Scientometrics | 2013

Information metrics (iMetrics): a research specialty with a socio-cognitive identity?

Staša Milojević; Loet Leydesdorff

“Bibliometrics”, “scientometrics”, “informetrics”, and “webometrics” can all be considered as manifestations of a single research area with similar objectives and methods, which we call “information metrics” or iMetrics. This study explores the cognitive and social distinctness of iMetrics with respect to the general information science (IS), focusing on a core of researchers, shared vocabulary and literature/knowledge base. Our analysis investigates the similarities and differences between four document sets. The document sets are drawn from three core journals for iMetrics research (Scientometrics, Journal of the American Society for Information Science and Technology, and Journal of Informetrics). We split JASIST into document sets containing iMetrics and general IS articles. The volume of publications in this representation of the specialty has increased rapidly during the last decade. A core of researchers that predominantly focus on iMetrics topics can thus be identified. This core group has developed a shared vocabulary as exhibited in high similarity of title words and one that shares a knowledge base. The research front of this field moves faster than the research front of information science in general, bringing it closer to Price’s dream.


PLOS ONE | 2012

How Are Academic Age, Productivity and Collaboration Related to Citing Behavior of Researchers?

Staša Milojević

References are an essential component of research articles and therefore of scientific communication. In this study we investigate referencing (citing) behavior in five diverse fields (astronomy, mathematics, robotics, ecology and economics) based on 213,756 core journal articles. At the macro level we find: (a) a steady increase in the number of references per article over the period studied (50 years), which in some fields is due to a higher rate of usage, while in others reflects longer articles and (b) an increase in all fields in the fraction of older, foundational references since the 1980s, with no obvious change in citing patterns associated with the introduction of the Internet. At the meso level we explore current (2006–2010) referencing behavior of different categories of authors (21,562 total) within each field, based on their academic age, productivity and collaborative practices. Contrary to some previous findings and expectations we find that senior researchers use references at the same rate as their junior colleagues, with similar rates of re-citation (use of same references in multiple papers). The high Modified Price Index (MPI, which measures the speed of the research front more accurately than the traditional Price Index) of senior authors indicates that their research has a cutting-edge aspect similar to that of their younger colleagues. In all fields both the productive researchers and especially those who collaborate more use a significantly lower fraction of foundational references and have much higher MPI and lower re-citation rates, i.e., they are the ones pushing the research front regardless of researcher age. This paper introduces improved bibliometric methods to measure the speed of the research front, disambiguate lead authors in co-authored papers and decouple measures of productivity and collaboration.
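Two of the measures named in this abstract can be sketched in a few lines. The example below computes the traditional Price Index (the share of references that are at most five years old) and a simple re-citation rate. The Modified Price Index is not implemented because the abstract does not define it. The function names, the inclusive five-year boundary, and the exact counting rule for re-citation are assumptions for illustration.

```python
def price_index(citing_year, reference_years, window=5):
    """Fraction of references published within `window` years of the citing paper.

    The inclusive boundary (age <= window) is an assumption; conventions vary.
    """
    if not reference_years:
        return 0.0
    recent = sum(1 for year in reference_years if 0 <= citing_year - year <= window)
    return recent / len(reference_years)


def recitation_rate(papers):
    """Share of an author's references that already appeared in an earlier paper.

    `papers` is a chronologically ordered list of reference-ID lists.
    This counting rule is a hypothetical reading of "re-citation".
    """
    seen, repeated, total = set(), 0, 0
    for refs in papers:
        for ref in set(refs):  # count each reference once per paper
            total += 1
            if ref in seen:
                repeated += 1
        seen.update(refs)
    return repeated / total if total else 0.0


print(price_index(2010, [2009, 2006, 2000, 1985]))
print(recitation_rate([["a", "b"], ["b", "c"], ["a", "c", "d"]]))
```

A higher Price Index indicates a reference list weighted toward recent literature, and a lower re-citation rate indicates an author who draws on new sources from paper to paper.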


Journal of Nanoparticle Research | 2012

Multidisciplinary cognitive content of nanoscience and nanotechnology

Staša Milojević

This article examines the cognitive evolution and disciplinary diversity of nanoscience/nanotechnology (nano research) as expressed through the terminology used in titles of nano journal articles. The analysis is based on the NanoBank bibliographic database of 287,106 nano articles published between 1981 and 2004. We perform multifaceted analyses of title words, focusing on 100 most frequent words or phrases (terms). Hierarchical clustering of title terms reveals three distinct time periods of cognitive development of nano research: formative (1981–1990), early (from 1991 to 1998), and current (after 1998). The early period is characterized by the introduction of thin film deposition techniques, while the current period is characterized by the increased focus on carbon nanotube and nanoparticle research. We introduce a method to identify disciplinary components of nanotechnology. It shows that the nano research is being carried out in a number of diverse parent disciplines. Currently, only 5% of articles are published in dedicated nano-only journals. We find that some 85% of nano research today is multidisciplinary. The case study of the diffusion of several nano-specific terms (e.g., “carbon nanotube”) shows that concepts spread from the initially few disciplinary components to the majority of them in a time span of around a decade. Hierarchical clustering of disciplinary components reveals that the cognitive content of current nanoscience can be divided into nine clusters. Some clusters account for a large fraction of nano research and are identified with such parent disciplines as the condensed matter and applied physics, materials science, and analytical chemistry. Other clusters represent much smaller parts of nano research, but are as cognitively distinct. In the decreasing order of size, these fields are: polymer science, biotechnology, general chemistry, surface science, and pharmacology. The cognitive content of research published in nano-only journals is the closest to nano research published in condensed matter and applied physics journals.

Collaboration


Top Co-Authors

Ying Ding
Indiana University Bloomington

Selma Sabanovic
Indiana University Bloomington

Katy Börner
Indiana University Bloomington

Filippo Radicchi
Indiana University Bloomington

Jasleen Kaur
Indiana University Bloomington