Publication


Featured research published by Kevin W. Boyack.


Journal of Informetrics | 2011

Approaches to understanding and measuring interdisciplinary scientific research (IDR): A review of the literature

Caroline S. Wagner; J. David Roessner; Kamau Bobb; Julie Thompson Klein; Kevin W. Boyack; Joann Keyton; Ismael Rafols; Katy Börner

Interdisciplinary scientific research (IDR) extends and challenges the study of science on a number of fronts, including creating output science and engineering (S&E) indicators. This literature review began with a narrow search for quantitative measures of the output of IDR that could contribute to indicators, but the authors expanded the scope of the review as it became clear that differing definitions, assessment tools, evaluation processes, and measures all shed light on different aspects of IDR. Key among these broader aspects are (a) the importance of incorporating the concept of knowledge integration and (b) the recognition that integration can occur within a single mind as well as among a team. Existing output measures alone cannot adequately capture this process. Among the quantitative measures considered, bibliometrics (co-authorships, co-inventors, collaborations, references, citations and co-citations) are the most developed, but leave considerable gaps in understanding of the social dynamics that lead to knowledge integration. Emerging measures in network dynamics (particularly betweenness centrality and diversity) and entropy are promising as indicators, but their use requires sophisticated interpretation. Combinations of quantitative measures and qualitative assessments being applied within evaluation studies appear to reveal IDR processes but carry burdens of expense, intrusion, and lack of reproducibility year-upon-year. This review is a first step toward providing a more holistic view of measuring IDR, although research and development is needed before metrics can adequately reflect the actual phenomenon of IDR.
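As a rough illustration of the entropy-style measures mentioned in this abstract, the sketch below computes the Shannon entropy of the subject categories cited by a single paper. This is only one possible reading of "entropy as an indicator"; the function name, the category labels, and the choice of base-2 logarithms are assumptions made for the example and are not taken from the review.

```python
import math
from collections import Counter


def shannon_entropy(categories):
    """Shannon entropy (in bits) of a list of category labels.

    Higher values mean the labels are spread more evenly over more
    categories, which is one simple proxy for disciplinary diversity.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical reference lists: each label is the subject category of one cited work.
focused_paper = ["chemistry", "chemistry", "chemistry", "chemistry"]
diverse_paper = ["chemistry", "biology", "physics", "sociology"]

focused_score = shannon_entropy(focused_paper)
diverse_score = shannon_entropy(diverse_paper)
```

A paper citing a single category scores zero, while one citing four categories equally scores two bits. As the abstract cautions, such numbers need careful interpretation before they are read as evidence of knowledge integration.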


Journal of the Association for Information Science and Technology | 2010

Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?

Kevin W. Boyack; Richard Klavans

A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1–5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.


Journal of the Association for Information Science and Technology | 2002

Domain visualization using VxInsight for science and technology management

Kevin W. Boyack; Brian N. Wylie; George S. Davidson

We present the application of our knowledge visualization tool, VxInsight, to enable domain analysis for science and technology management within the enterprise. Data mining from sources of bibliographic information is used to define subsets of information relevant to a technology domain. Relationships between the individual objects (e.g., articles) are identified using citations, descriptive terms, or textual similarities. Objects are then clustered using a force-directed placement algorithm to produce a terrain view of the many thousands of objects. A variety of features that allow exploration and manipulation of the landscapes and that give detail on demand, enable quick and powerful analysis of the resulting landscapes. Examples of domain analyses used in S&T management at Sandia are given.


PLOS ONE | 2011

Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches

Kevin W. Boyack; David Newman; Russell J. Duhon; Richard Klavans; Michael Patek; Joseph R. Biberstine; Bob Schijvenaars; André Skupin; Nianli Ma; Katy Börner

Background We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches were comprised of five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models – BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. 
Conclusions PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts.
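To make two of the steps in this abstract more concrete, the sketch below builds tf-idf vectors for a few toy documents, computes cosine similarities, and keeps only the top-n most similar documents per document. It is a minimal illustration under stated assumptions, not the authors' pipeline: the weighting (raw term frequency times log inverse document frequency), the tie-breaking rule, the function names, and the toy texts are all choices made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_n_similar(vectors, n):
    """For each document, keep the n most similar other documents."""
    result = []
    for i, u in enumerate(vectors):
        scored = [(cosine(u, v), j) for j, v in enumerate(vectors) if j != i]
        # Highest similarity first; ties broken by document index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        result.append([(j, s) for s, j in scored[:n]])
    return result


# Toy corpus standing in for titles and abstracts.
docs = [
    "gene expression in yeast".split(),
    "yeast gene regulation".split(),
    "citation analysis of journals".split(),
]
vectors = tfidf_vectors(docs)
neighbors = top_n_similar(vectors, 1)
```

Here `neighbors[0]` holds the single nearest neighbor of the first document. At the scale described in the abstract (over two million records), an all-pairs loop like this would be impractical, so the sketch only shows the logic of the filtering step.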


Visualization and Data Analysis | 2011

OpenOrd: An Open-Source Toolbox for Large Graph Layout

Shawn Martin; W. Michael Brown; Richard Klavans; Kevin W. Boyack

We document an open-source toolbox for drawing large-scale undirected graphs. This toolbox is based on a previously implemented closed-source algorithm known as VxOrd. Our toolbox, which we call OpenOrd, extends the capabilities of VxOrd to large graph layout by incorporating edge-cutting, a multi-level approach, average-link clustering, and a parallel implementation. At each level, vertices are grouped using force-directed layout and average-link clustering. The clustered vertices are then re-drawn and the process is repeated. When a suitable drawing of the coarsened graph is obtained, the algorithm is reversed to obtain a drawing of the original graph. This approach results in layouts of large graphs which incorporate both local and global structure. A detailed description of the algorithm is provided in this paper. Examples using datasets with over 600K nodes are given. Code is available at www.cs.sandia.gov/~smartin.


PLOS ONE | 2012

Design and Update of a Classification System: The UCSD Map of Science

Katy Börner; Richard Klavans; Michael Patek; Angela M. Zoss; Joseph R. Biberstine; Robert P. Light; Vincent Larivière; Kevin W. Boyack

Global maps of science can be used as a reference system to chart career trajectories, the location of emerging research frontiers, or the expertise profiles of institutes or nations. This paper details data preparation, analysis, and layout performed when designing and subsequently updating the UCSD map of science and classification system. The original classification and map use 7.2 million papers and their references from Elsevier’s Scopus (about 15,000 source titles, 2001–2005) and Thomson Reuters’ Web of Science (WoS) Science, Social Science, Arts & Humanities Citation Indexes (about 9,000 source titles, 2001–2004), for about 16,000 unique source titles. The updated map and classification adds six years (2005–2010) of WoS data and three years (2006–2008) from Scopus to the existing category structure, increasing the number of source titles to about 25,000. To our knowledge, this is the first time that a widely used map of science was updated. A comparison of the original 5-year and the new 10-year maps and classification system shows (i) an increase in the total number of journals that can be mapped by 9,409 journals (social sciences had an 80% increase, humanities a 119% increase, medical sciences a 32% increase, and natural sciences a 74% increase), (ii) a simplification of the map by assigning all but five highly interdisciplinary journals to exactly one discipline, (iii) a more even distribution of journals over the 554 subdisciplines and 13 disciplines when calculating the coefficient of variation, and (iv) a better reflection of journal clusters when compared with paper-level citation data. When evaluating the map with a listing of desirable features for maps of science, the updated map is shown to have higher mapping accuracy, easier understandability as fewer journals are multiply classified, and higher usability for the generation of data overlays, among others.
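The abstract uses the coefficient of variation to judge how evenly journals are spread across categories. The sketch below shows that statistic on made-up journal counts. The numbers are purely illustrative, and the use of the population standard deviation is an assumption, since the abstract does not say which variant was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean.

    A lower value means the counts are more evenly distributed.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(counts) / m


# Hypothetical journals-per-discipline counts for two classification schemes.
uneven_scheme = [5, 15, 40, 100]
even_scheme = [35, 40, 40, 45]

cv_uneven = coefficient_of_variation(uneven_scheme)
cv_even = coefficient_of_variation(even_scheme)
```

Both toy schemes contain 160 journals, so the comparison isolates evenness: the scheme with the smaller coefficient distributes its journals more uniformly over the categories.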


Journal of the Association for Information Science and Technology | 2003

Indicator-assisted evaluation and funding of research: visualizing the influence of grants on the number and citation counts of research papers

Kevin W. Boyack; Katy Börner

This article reports research on analyzing and visualizing the impact of governmental funding on the amount and citation counts of research publications. For the first time, grant and publication data appear interlinked in one map. We start with an overview of related work and a discussion of available techniques. As a concrete example, grant and publication data from Behavioral and Social Science Research, one of four extramural research programs at the National Institute on Aging (NIA), are analyzed and visualized using the VxInsight® visualization tool. The analysis also illustrates existing problems related to the quality and existence of data, data analysis, and processing. The article concludes with a list of recommendations on how to improve the quality of grant-publication maps and a discussion of research challenges for indicator-assisted evaluation and funding of research.


Scientometrics | 2009

Mapping the structure and evolution of chemistry research

Kevin W. Boyack; Katy Börner; Richard Klavans

How does our collective scholarly knowledge grow over time? What major areas of science exist and how are they interlinked? Which areas are major knowledge producers; which ones are consumers? Computational scientometrics (the application of bibliometric/scientometric methods to large-scale scholarly datasets) and the communication of results via maps of science might help us answer these questions. This paper presents the results of a prototype study that aims to map the structure and evolution of chemistry research over a 30-year time frame. Information from the combined Science (SCIE) and Social Science (SSCI) Citation Indexes from 2002 was used to generate a disciplinary map of 7,227 journals and 671 journal clusters. Clusters relevant to studying the structure and evolution of chemistry were identified using JCR categories and were further clustered into 14 disciplines. The changing scientific composition of these 14 disciplines and their knowledge exchange via citation linkages was computed. Major changes in the dominance, influence, and role of Chemistry, Biology, Biochemistry, and Bioengineering over these 30 years are discussed. The paper concludes with suggestions for future work.


IEEE Symposium on Information Visualization | 2001

Cluster stability and the use of noise in interpretation of clustering

George S. Davidson; Brian N. Wylie; Kevin W. Boyack

A clustering and ordination algorithm suitable for mining extremely large databases, including those produced by microarray expression studies, is described and analyzed for stability. Data from a yeast cell cycle experiment with 6000 genes and 18 experimental measurements per gene are used to test this algorithm under practical conditions. The process of assigning database objects to an X,Y coordinate, ordination, is shown to be stable with respect to random starting conditions, and with respect to minor perturbations in the starting similarity estimates. Careful analysis of the way clusters typically co-locate, versus the occasional large displacements under different starting conditions, is shown to be useful in interpreting the data. This extra stability information is lost when only a single cluster is reported, which is currently the accepted practice. However, it is believed that the approaches presented here should become a standard part of best practices in analyzing computer clustering of large data collections.


PLOS ONE | 2014

Estimates of the Continuously Publishing Core in the Scientific Workforce

John P. A. Ioannidis; Kevin W. Boyack; Richard Klavans

Background The ability of a scientist to maintain a continuous stream of publication may be important, because research requires continuity of effort. However, there is no data on what proportion of scientists manages to publish each and every year over long periods of time. Methodology/Principal Findings Using the entire Scopus database, we estimated that there are 15,153,100 publishing scientists (distinct author identifiers) in the period 1996–2011. However, only 150,608 (<1%) of them have published something in each and every year in this 16-year period (uninterrupted, continuous presence [UCP] in the literature). This small core of scientists with UCP are far more cited than others, and they account for 41.7% of all papers in the same period and 87.1% of all papers with >1000 citations in the same period. Skipping even a single year substantially affected the average citation impact. We also studied the birth and death dynamics of membership in this influential UCP core, by imputing and estimating UCP-births and UCP-deaths. We estimated that 16,877 scientists would qualify for UCP-birth in 1997 (no publication in 1996, UCP in 1997–2012) and 9,673 scientists had their UCP-death in 2010. The relative representation of authors with UCP was enriched in Medical Research, in the academic sector and in Europe/North America, while the relative representation of authors without UCP was enriched in the Social Sciences and Humanities, in industry, and in other continents. Conclusions The proportion of the scientific workforce that maintains a continuous uninterrupted stream of publications each and every year over many years is very limited, but it accounts for the lion’s share of researchers with high citation impact. This finding may have implications for the structure, stability and vulnerability of the scientific workforce.
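The definition of uninterrupted, continuous presence (UCP) in this abstract can be sketched as a simple membership test: an author qualifies if they published in every year of the window. The example below is a minimal sketch with hypothetical author identifiers and a shortened three-year window; how the study actually resolved author identities or queried Scopus is not described here and is not modeled.

```python
def has_ucp(years_published, start, end):
    """True if the author published in every year from start to end inclusive."""
    return all(year in years_published for year in range(start, end + 1))


def ucp_core(publication_years, start, end):
    """Return the set of author ids with uninterrupted presence in the window."""
    return {
        author
        for author, years in publication_years.items()
        if has_ucp(years, start, end)
    }


# Hypothetical data: author id -> set of years with at least one publication.
publication_years = {
    "author_a": {1996, 1997, 1998},
    "author_b": {1996, 1998},        # skipped 1997
    "author_c": {1997, 1998, 1999},  # no publication in 1996
}

core = ucp_core(publication_years, 1996, 1998)
share = len(core) / len(publication_years)
```

With the abstract's real window (1996 to 2011), the same test would be applied over sixteen years, and the resulting share corresponds to the reported figure of fewer than 1% of publishing scientists.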

Collaboration


Top Co-Authors

Katy Börner

Indiana University Bloomington

Brian N. Wylie

Sandia National Laboratories

George S. Davidson

Sandia National Laboratories

Henry Small

University City Science Center

Lyle H. Ungar

University of Pennsylvania

David K. Johnson

Sandia National Laboratories