Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Mike Thelwall is active.

Publication


Featured research published by Mike Thelwall.


Journal of the Association for Information Science and Technology | 2012

Sentiment strength detection for the social web

Mike Thelwall; Kevan Buckley; Georgios Paltoglou

Sentiment analysis is concerned with the automatic extraction of sentiment-related information from text. Although most sentiment analysis addresses commercial tasks, such as extracting opinions from product reviews, there is increasing interest in the affective dimension of the social web, and Twitter in particular. Most sentiment analysis algorithms are not ideally suited to this task because they exploit indirect indicators of sentiment that can reflect genre or topic instead. Hence, such algorithms used to process social web texts can identify spurious sentiment patterns caused by topics rather than affective phenomena. This article assesses an improved version of the algorithm SentiStrength for sentiment strength detection across the social web that primarily uses direct indications of sentiment. The results from six diverse social web data sets (MySpace, Twitter, YouTube, Digg, RunnersWorld, BBCForums) indicate that SentiStrength 2 is successful in the sense of performing better than a baseline approach for all data sets in both supervised and unsupervised cases. SentiStrength is not always better than machine-learning approaches that exploit indirect indicators of sentiment, however, and is particularly weaker for positive sentiment in news-related discussions. Overall, the results suggest that, even unsupervised, SentiStrength is robust enough to be applied to a wide variety of different social web contexts.
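SentiStrength reports two strengths per text: positive on a 1 to 5 scale and negative on a -1 to -5 scale, driven primarily by direct lexical indicators of sentiment. As a rough illustrative sketch only (the tiny lexicon, booster list, and function below are invented for illustration; the real tool uses far larger term lists plus rules for negation, emoticons, repeated letters, and spelling correction):

```python
# Illustrative lexicon and booster list; NOT SentiStrength's actual term lists.
LEXICON = {"love": 3, "great": 2, "good": 2, "hate": -4, "awful": -3, "bad": -2}
BOOSTERS = {"very": 1, "really": 1}

def sentiment_strength(text):
    """Return (positive, negative) strengths: +1..+5 and -1..-5."""
    pos, neg = 1, -1  # neutral baseline on SentiStrength's dual scale
    boost = 0
    for word in text.lower().split():
        if word in BOOSTERS:
            boost += BOOSTERS[word]  # strengthen the next sentiment word
            continue
        score = LEXICON.get(word, 0)
        if score > 0:
            pos = max(pos, min(5, score + boost))
        elif score < 0:
            neg = min(neg, max(-5, score - boost))
        boost = 0  # boosters only apply to the immediately following word
    return pos, neg
```

For example, `sentiment_strength("i really love it")` yields a boosted positive strength while leaving the negative strength at its neutral baseline; the dual scale lets a single text score highly on both dimensions at once.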


PLOS ONE | 2013

Do altmetrics work? Twitter and ten other social web services

Mike Thelwall; Stefanie Haustein; Vincent Larivière; Cassidy R. Sugimoto

Altmetric measurements derived from the social web are increasingly advocated and used as early indicators of article impact and usefulness. Nevertheless, there is a lack of systematic scientific evidence that altmetrics are valid proxies of either impact or utility although a few case studies have reported medium correlations between specific altmetrics and citation rates for individual journals or fields. To fill this gap, this study compares 11 altmetrics with Web of Science citations for 76 to 208,739 PubMed articles with at least one altmetric mention in each case and up to 1,891 journals per metric. It also introduces a simple sign test to overcome biases caused by different citation and usage windows. Statistically significant associations were found between higher metric scores and higher citations for articles with positive altmetric scores in all cases with sufficient evidence (Twitter, Facebook wall posts, research highlights, blogs, mainstream media and forums) except perhaps for Google+ posts. Evidence was insufficient for LinkedIn, Pinterest, question and answer sites, and Reddit, and no conclusions should be drawn about articles with zero altmetric scores or the strength of any correlation between altmetrics and citations. Nevertheless, comparisons between citations and metric values for articles published at different times, even within the same year, can remove or reverse this association and so publishers and scientometricians should consider the effect of time when using altmetrics to rank articles. Finally, the coverage of all the altmetrics except for Twitter seems to be low and so it is not clear if they are prevalent enough to be useful in practice.
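The abstract describes its "simple sign test" only at a high level. One common way to run such a test (a sketch under the assumption that articles are compared in matched pairs, e.g. pairs published close together in time so citation and usage windows are similar) is to count how often the article with the higher altmetric score also gains more citations, and compare that count against a fair coin with a binomial test:

```python
from math import comb

def sign_test(pairs):
    """Sign test for paired articles.

    pairs: (citations of the higher-altmetric article,
            citations of the lower-altmetric article) per matched pair.
    Ties are dropped. Returns (wins, losses, two-sided binomial p-value).
    """
    wins = sum(1 for hi, lo in pairs if hi > lo)
    losses = sum(1 for hi, lo in pairs if hi < lo)
    n = wins + losses
    if n == 0:
        return 0, 0, 1.0
    k = max(wins, losses)
    # P(X >= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return wins, losses, min(1.0, 2 * tail)
```

Because each pair only contributes a sign, the test is insensitive to how large the citation counts are, which is what makes it robust to differing citation and usage windows.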


Journal of Informetrics | 2009

Sentiment analysis: A combined approach

Rudy Prabowo; Mike Thelwall

Sentiment analysis is an important current research area. This paper combines rule-based classification, supervised learning and machine learning into a new combined method. This method is tested on movie reviews, product reviews and MySpace comments. The results show that a hybrid classification can improve the classification effectiveness in terms of micro- and macro-averaged F1. F1 is a measure that takes both the precision and recall of a classifier’s effectiveness into account. In addition, we propose a semi-automatic, complementary approach in which each classifier can contribute to other classifiers to achieve a good level of effectiveness.
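The evaluation metric here, F1, balances precision and recall, and the micro/macro distinction matters when classes are imbalanced: macro averages per-class F1 scores equally, while micro pools the counts so frequent classes dominate. A minimal stdlib sketch (labels below are made up, not the paper's data):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (per-class F1 dict, micro-averaged F1, macro-averaged F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but true class was t
            fn[t] += 1
    per_class = {}
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        per_class[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    # Macro: average the per-class F1 scores (each class weighted equally).
    macro = sum(per_class.values()) / len(labels)
    # Micro: pool TP/FP/FN across classes before computing F1.
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro_p = TP / (TP + FP) if TP + FP else 0.0
    micro_r = TP / (TP + FN) if TP + FN else 0.0
    micro = (2 * micro_p * micro_r / (micro_p + micro_r)
             if micro_p + micro_r else 0.0)
    return per_class, micro, macro
```

With single-label data, micro-averaged F1 reduces to accuracy, whereas macro-averaged F1 can drop sharply if any one class (say, a rare sentiment category) is classified poorly.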


Information Processing and Management | 2004

Search engine coverage bias: evidence and possible causes

Liwen Vaughan; Mike Thelwall

Commercial search engines are now playing an increasingly important role in Web information dissemination and access. Of particular interest to business and national governments is whether the big engines have coverage biased towards the US or other countries. In our study we tested for national biases in three major search engines and found significant differences in their coverage of commercial Web sites. The US sites were much better covered than the others in the study: sites from China, Taiwan and Singapore. We then examined the possible technical causes of the differences and found that the language of a site does not affect its coverage by search engines. However, the visibility of a site, measured by the number of links to it, affects its chance to be covered by search engines. We conclude that the coverage bias does exist but this is due not to deliberate choices of the search engines but occurs as a natural result of cumulative advantage effects of US sites on the Web. Nevertheless, the bias remains a cause for international concern.


Journal of Medical Internet Research | 2011

Online Interventions for Social Marketing Health Behavior Change Campaigns: A Meta-Analysis of Psychological Architectures and Adherence Factors

Brian Cugelman; Mike Thelwall; Phil Dawes

Background: Researchers and practitioners have developed numerous online interventions that encourage people to reduce their drinking, increase their exercise, and better manage their weight. Motivations to develop eHealth interventions may be driven by the Internet’s reach, interactivity, cost-effectiveness, and studies that show online interventions work. However, when designing online interventions suitable for public campaigns, there are few evidence-based guidelines, taxonomies are difficult to apply, many studies lack impact data, and prior meta-analyses are not applicable to large-scale public campaigns targeting voluntary behavioral change.

Objectives: This meta-analysis assessed online intervention design features in order to inform the development of online campaigns, such as those employed by social marketers, that seek to encourage voluntary health behavior change. A further objective was to increase understanding of the relationships between intervention adherence, study adherence, and behavioral outcomes.

Methods: Drawing on systematic review methods, a combination of 84 query terms were used in 5 bibliographic databases with additional gray literature searches. This resulted in 1271 abstracts and papers; 31 met the inclusion criteria. In total, 29 papers describing 30 interventions were included in the primary meta-analysis, with the 2 additional studies qualifying for the adherence analysis. Using a random effects model, the first analysis estimated the overall effect size, including groupings by control conditions and time factors. The second analysis assessed the impacts of psychological design features that were coded with taxonomies from evidence-based behavioral medicine, persuasive technology, and other behavioral influence fields. These separate systems were integrated into a coding framework model called the communication-based influence components model. Finally, the third analysis assessed the relationships between intervention adherence and behavioral outcomes.

Results: The overall impact of online interventions across all studies was small but statistically significant (standardized mean difference effect size d = 0.19, 95% confidence interval [CI] = 0.11 - 0.28, P < .001, number of interventions k = 30). The largest impact with a moderate level of efficacy was exerted from online interventions when compared with waitlists and placebos (d = 0.28, 95% CI = 0.17 - 0.39, P < .001, k = 18), followed by comparison with lower-tech online interventions (d = 0.16, 95% CI = 0.00 - 0.32, P = .04, k = 8); no significant difference was found when compared with sophisticated print interventions (d = –0.11, 95% CI = –0.34 to 0.12, P = .35, k = 4), though online interventions offer a small effect with the advantage of lower costs and larger reach. Time proved to be a critical factor, with shorter interventions generally achieving larger impacts and greater adherence. For psychological design, most interventions drew from the transtheoretical approach and were goal orientated, deploying numerous influence components aimed at showing users the consequences of their behavior, assisting them in reaching goals, and providing normative pressure. Inconclusive results suggest a relationship between the number of influence components and intervention efficacy. Despite one contradictory correlation, the evidence suggests that study adherence, intervention adherence, and behavioral outcomes are correlated.

Conclusions: These findings demonstrate that online interventions have the capacity to influence voluntary behaviors, such as those routinely targeted by social marketing campaigns. Given the high reach and low cost of online technologies, the stage may be set for increased public health campaigns that blend interpersonal online systems with mass-media outreach. Such a combination of approaches could help individuals achieve personal goals that, at an individual level, help citizens improve the quality of their lives and at a state level, contribute to healthier societies.


Journal of the Association for Information Science and Technology | 2001

Extracting macroscopic information from Web links

Mike Thelwall

Much has been written about the potential and pitfalls of macroscopic Web-based link analysis, yet there have been no studies that have provided clear statistical evidence that any of the proposed calculations can produce results over large areas of the Web that correlate with phenomena external to the Internet. This article attempts to provide such evidence through an evaluation of Ingwersen's (1998) proposed external Web Impact Factor (WIF) for the original use of the Web: the interlinking of academic research. In particular, it studies the case of the relationship between academic hyperlinks and research activity for universities in Britain, a country chosen for its variety of institutions and the existence of an official government rating exercise for research. After reviewing the numerous reasons why link counts may be unreliable, it demonstrates that four different WIFs do, in fact, correlate with the conventional academic research measures. The WIF delivering the greatest correlation with research rankings was the ratio of Web pages with links pointing at research-based pages to faculty numbers. The scarcity of links to electronic academic papers in the data set suggests that, in contrast to citation analysis, this WIF is measuring the reputations of universities and their scholars, rather than the quality of their publications.
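The best-performing WIF here is a simple ratio, and its agreement with research rankings is a rank correlation question. A stdlib-only sketch (the university figures below are invented for illustration, and the standard Spearman formula shown assumes no tied values):

```python
def web_impact_factor(linking_pages, faculty):
    """WIF variant from the article: count of Web pages containing links
    to a university's research-based pages, divided by faculty numbers."""
    return linking_pages / faculty

def spearman_rho(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2)/(n(n^2-1)); no ties."""
    n = len(x)
    rank = lambda v: sorted(range(n), key=lambda i: -v[i])
    rx, ry = [0] * n, [0] * n
    for r, i in enumerate(rank(x), 1):
        rx[i] = r  # rank 1 = largest value
    for r, i in enumerate(rank(y), 1):
        ry[i] = r
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical universities: (pages linking to research pages, faculty count)
wifs = [web_impact_factor(p, f) for p, f in [(12000, 1200), (8000, 500), (3000, 600)]]
ratings = [4.0, 5.0, 3.0]  # hypothetical research rating exercise scores
rho = spearman_rho(wifs, ratings)
```

Dividing by faculty numbers normalises for institution size, so a small, heavily linked-to university can outrank a large one on the WIF even with fewer absolute inlinks.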


Journal of the Association for Information Science and Technology | 2014

Tweeting biomedicine: An analysis of tweets and citations in the biomedical literature

Stefanie Haustein; Isabella Peters; Cassidy R. Sugimoto; Mike Thelwall; Vincent Larivière

Data collected by social media platforms have been introduced as new sources for indicators to help measure the impact of scholarly research in ways that are complementary to traditional citation analysis. Data generated from social media activities can be used to reflect broad types of impact. This article aims to provide systematic evidence about how often Twitter is used to disseminate information about journal articles in the biomedical sciences. The analysis is based on 1.4 million documents covered by both PubMed and Web of Science and published between 2010 and 2012. The number of tweets containing links to these documents was analyzed and compared to citations to evaluate the degree to which certain journals, disciplines, and specialties were represented on Twitter and how far tweets correlate with citation impact. With less than 10% of PubMed articles mentioned on Twitter, its uptake is low in general but differs between journals and specialties. Correlations between tweets and citations are low, implying that impact metrics based on tweets are different from those based on citations. A framework using the coverage of articles and the correlation between Twitter mentions and citations is proposed to facilitate the evaluation of novel social‐media‐based metrics.


Scientometrics | 2012

Validating online reference managers for scholarly impact measurement

Xuemei Li; Mike Thelwall; Dean Giustini

This paper investigates whether CiteULike and Mendeley are useful for measuring scholarly influence, using a sample of 1,613 papers published in Nature and Science in 2007. Traditional citation counts from the Web of Science (WoS) were used as benchmarks to compare with the number of users who bookmarked the articles in one of the two free online reference manager sites. Statistically significant correlations were found between the user counts and the corresponding WoS citation counts, suggesting that this type of influence is related in some way to traditional citation-based scholarly impact but the number of users of these systems seems to be still too small for them to challenge traditional citation indexes.


Journal of Computer-Mediated Communication | 2006

Hyperlink Analyses of the World Wide Web: a Review

Han Woo Park; Mike Thelwall

We have recently witnessed the growth of hyperlink studies in the field of Internet research. Although investigations have been conducted across many disciplines and topics, their approaches can be largely divided into hyperlink network analysis (HNA) and Webometrics. This article is an extensive review of the two analytical methods, and a reflection on their application. HNA casts hyperlinks between Web sites (or Web pages) as social and communicational ties, applying standard techniques from Social Networks Analysis to this new data source. Webometrics has tended to apply much simpler techniques combined with a more in-depth investigation into the validity of hypotheses about possible interpretations of the results. We conclude that hyperlinks are a highly promising but problematic new source of data that can be mined for previously hidden patterns of information, although much care must be taken in the collection of raw data and in the interpretation of the results. In particular, link creation is an unregulated phenomenon and so it would not be sensible to assume that the meaning of hyperlinks in any given context is evident, without a systematic study of the context of link creation, and of the relationship between link counts, among other measurements. Social Networks Analysis tools and techniques form an excellent resource for hyperlink analysis, but should only be used in conjunction with improved techniques for data collection, validation and interpretation.


Synthesis Lectures on Information Concepts, Retrieval, and Services | 2009

Introduction to Webometrics: Quantitative Web Research for the Social Sciences

Mike Thelwall

Webometrics is concerned with measuring aspects of the web: web sites, web pages, parts of web pages, words in web pages, hyperlinks, web search engine results. The importance of the web itself as a communication medium and for hosting an increasingly wide array of documents, from journal articles to holiday brochures, needs no introduction. Given this huge and easily accessible source of information, there are limitless possibilities for measuring or counting on a huge scale (e.g., the number of web sites, the number of web pages, the number of blogs) or on a smaller scale (e.g., the number of web sites in Ireland, the number of web pages in the CNN web site, the number of blogs mentioning Barack Obama before the 2008 presidential campaign). This book argues that it can be useful for social scientists to measure aspects of the web and explains how this can be achieved on both a small and large scale. The book is intended for social scientists with research topics that are wholly or partly online (e.g., social networks, news, political communication) and social scientists with offline research topics with an online reflection, even if this is not a core component (e.g., diaspora communities, consumer culture, linguistic change). The book is also intended for library and information science students in the belief that the knowledge and techniques described will be useful for them to guide and aid other social scientists in their research. In addition, the techniques and issues are all directly relevant to library and information science research problems. Table of Contents: Introduction / Web Impact Assessment / Link Analysis / Blog Searching / Automatic Search Engine Searches: LexiURL Searcher / Web Crawling: SocSciBot / Search Engines and Data Reliability / Tracking User Actions Online / Advanced Techniques / Summary and Future Directions

Collaboration


Dive into Mike Thelwall's collaborations.

Top Co-Authors

Kayvan Kousha
University of Wolverhampton

David Wilkinson
Information Technology University

Georgios Paltoglou
University of Wolverhampton

Kevan Buckley
University of Wolverhampton

Jonathan M. Levitt
University of Wolverhampton

Pardeep Sud
University of Wolverhampton

Ruth Fairclough
University of Wolverhampton

Liz Price
Information Technology University

Liwen Vaughan
University of Western Ontario

Gareth Harries
Information Technology University