Thomas Lansdall-Welfare
University of Bristol
Publication
Featured research published by Thomas Lansdall-Welfare.
international world wide web conferences | 2012
Thomas Lansdall-Welfare; Vasileios Lampos; Nello Cristianini
Large-scale analysis of social media content allows for real-time discovery of macro-scale patterns in public opinion and sentiment. In this paper we analyse a collection of 484 million tweets generated by more than 9.8 million users from the United Kingdom over the past 31 months, a period marked by economic downturn and some social tensions. Our findings, besides corroborating our choice of method for the detection of public mood, also present intriguing patterns that can be explained in terms of events and social changes. On the one hand, the time series we obtain show that periodic events such as Christmas and Halloween evoke similar mood patterns every year. On the other hand, we see that a significant increase in negative mood indicators coincides with the announcement of the cuts to public spending by the government, and that this effect is still lasting. We also detect events such as the riots of summer 2011, as well as a possible calming effect coinciding with the run-up to the royal wedding.
Digital journalism | 2013
Ilias N. Flaounas; Omar Ali; Thomas Lansdall-Welfare; Tijl De Bie; Nicholas Alexander Mosdell; Justin Matthew Wren Lewis; Nello Cristianini
News content analysis is usually preceded by a labour-intensive coding phase, where experts extract key information from news items. The cost of this phase imposes limitations on the sample sizes that can be processed, and therefore on the kinds of questions that can be addressed. In this paper we describe an approach that incorporates text-analysis technologies for the automation of some of these tasks, enabling us to analyse data sets that are many orders of magnitude larger than those normally used. The patterns detected by our method include: (1) similarities in writing style among several outlets, which reflect reader demographics; (2) gender imbalance in media content and its relation with topic; (3) the relationship between topic and popularity of articles.
Proceedings of the National Academy of Sciences of the United States of America | 2017
Thomas Lansdall-Welfare; James Thompson; Justin Matthew Wren Lewis; FindMyPast Newspaper Team; Nello Cristianini
Significance: The use of large datasets has revolutionized the natural sciences and is widely believed to have the potential to do so with the social and human sciences. Many digitization efforts are underway, but the high-throughput methods of data production have not yet led to a comparable output in analysis. A notable exception has been the previous statistical analysis of the content of historical books, which started a debate about the limitations of using big data in this context. This study moves the debate forward using a large corpus of historical British newspapers and tools from artificial intelligence to extract macroscopic trends in history and culture, including gender bias, geographical focus, technology, and politics, along with accurate dates for specific events.
Previous studies have shown that it is possible to detect macroscopic patterns of cultural change over periods of centuries by analyzing large textual time series, specifically digitized books. This method promises to empower scholars with a quantitative and data-driven tool to study culture and society, but its power has been limited by the use of data from books and simple analytics based essentially on word counts. This study addresses these problems by assembling a vast corpus of regional newspapers from the United Kingdom, incorporating very fine-grained geographical and temporal information that is not available for books. The corpus spans 150 years and is formed by millions of articles, representing 14% of all British regional outlets of the period. Simple content analysis of this corpus allowed us to detect specific events, like wars, epidemics, coronations, or conclaves, with high accuracy, whereas the use of more refined techniques from artificial intelligence enabled us to move beyond counting words by detecting references to named entities.
These techniques allowed us to observe both a systematic underrepresentation and a steady increase of women in the news during the 20th century and the change of geographic focus for various concepts. We also estimate the dates when electricity overtook steam and trains overtook horses as a means of transportation, both around the year 1900, along with observing other cultural transitions. We believe that these data-driven approaches can complement the traditional method of close reading in detecting trends of continuity and change in historical corpora.
international conference on big data | 2014
Thomas Lansdall-Welfare; Giuseppe Alessandro Veltri; Nello Cristianini
The contents of English-language online news over 5 years have been analyzed to explore the impact of the Fukushima disaster on the media coverage of nuclear power. This big data study, based on millions of news articles, involves the extraction of narrative networks, association networks, and sentiment time series. The key finding is that media attitude towards nuclear power has significantly changed in the wake of the Fukushima disaster, in terms of sentiment and in terms of framing, showing a long-lasting effect that does not appear to recover before the end of the period covered by this study. In particular, we find that the media discourse has shifted from one of public debate about nuclear power as a viable option for energy supply needs to a re-emergence of the public views of nuclear power and the risks associated with it. The methodology used presents an opportunity to leverage big data for corpus analysis and opens up new possibilities in social scientific research.
PLOS ONE | 2016
Sen Jia; Thomas Lansdall-Welfare; Cynthia Carter; Nello Cristianini
Feminist news media researchers have long contended that masculine news values shape journalists’ quotidian decisions about what is newsworthy. As a result, it is argued, topics and issues traditionally regarded as primarily of interest and relevance to women are routinely marginalised in the news, while men’s views and voices are given privileged space. When women do show up in the news, it is often as “eye candy,” thus reinforcing women’s value as sources of visual pleasure rather than residing in the content of their views. To date, evidence to support such claims has tended to be based on small-scale, manual analyses of news content. In this article, we report on findings from our large-scale, data-driven study of gender representation in online English-language news media. We analysed both words and images so as to give a broader picture of how gender is represented in online news. The corpus of news content examined consists of 2,353,652 articles collected over a period of six months from more than 950 different news outlets. From this initial dataset, we extracted 2,171,239 references to named persons and 1,376,824 images, resolving the gender of names and faces using automated computational methods. We found that males were represented more often than females in both images and text, but in proportions that changed across topics, news outlets and mode. Moreover, the proportion of females was consistently higher in images than in text, for virtually all topics and news outlets; women were more likely to be represented visually than they were mentioned as a news actor or source. Our large-scale, data-driven analysis offers important empirical evidence of macroscopic patterns in news content concerning the way men and women are represented.
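The comparison described in this abstract (share of females by topic and by mode, text versus image) can be illustrated with a minimal sketch. It assumes that gender labels have already been resolved upstream; the function name `female_share` and the toy observations are hypothetical and are not taken from the study.

```python
from collections import Counter

def female_share(observations):
    """Map (topic, mode) -> fraction of observations labelled 'F'.

    Each observation is a (topic, mode, gender) tuple, where mode is
    'text' or 'image' and gender is 'F' or 'M' (assumed pre-resolved).
    """
    totals, females = Counter(), Counter()
    for topic, mode, gender in observations:
        totals[(topic, mode)] += 1
        if gender == "F":
            females[(topic, mode)] += 1
    return {key: females[key] / totals[key] for key in totals}

# Toy data for illustration only; not results from the paper.
toy = [
    ("sport", "text", "M"), ("sport", "text", "M"),
    ("sport", "text", "M"), ("sport", "text", "F"),
    ("sport", "image", "M"), ("sport", "image", "F"),
]
shares = female_share(toy)
```

Keying the counters on the (topic, mode) pair keeps the text and image proportions for the same topic side by side, which is the comparison the abstract reports.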
international conference on data mining | 2016
Sen Jia; Thomas Lansdall-Welfare; Nello Cristianini
When analysing human activities using data mining or machine learning techniques, it can be useful to infer properties such as the gender or age of the people involved. This paper focuses on the sub-problem of gender recognition, which has been studied extensively in the literature, with two main problems remaining unsolved: how to improve the accuracy on real-world face images, and how to generalise the models to perform well on new datasets. We address these problems by collecting five million weakly labelled face images, and performing three different experiments, investigating: the performance difference between convolutional neural networks (CNNs) of differing depths and a support vector machine approach using local binary pattern features on the same training data, the effect of contextual information on classification accuracy, and the ability of convolutional neural networks and large amounts of training data to generalise to cross-database classification. We report record-breaking results on both the Labeled Faces in the Wild (LFW) dataset, achieving an accuracy of 98.90%, and the Images of Groups (GROUPS) dataset, achieving an accuracy of 91.34% for cross-database gender classification.
international conference on data mining | 2016
Fabon Dzogang; Thomas Lansdall-Welfare; Nello Cristianini
Understanding changes in the mood and mental health of large populations is a challenge, with the need for large numbers of samples to uncover any regular patterns within the data. The use of data generated by online activities of healthy individuals offers the opportunity to perform such observations on the large scales and for the long periods that are required. Various studies have previously examined circadian fluctuations of mood in this way. In this study, we investigate seasonal fluctuations in mood and mental health by analyzing the access logs of Wikipedia pages and the content of Twitter in the UK over a period of four years. By using standard methods of Natural Language Processing, we extract daily indicators of negative affect, anxiety, anger and sadness from Twitter and compare this with the overall daily traffic to Wikipedia pages about mental health disorders. We show that both negative affect on Twitter and access to mental health pages on Wikipedia follow an annual cycle, both peaking during the winter months. Breaking this down into specific moods and pages, we find that peak access to the Wikipedia page for Seasonal Affective Disorder coincides with the peak period for the sadness indicator in Twitter content, with both most over-expressed in November and December. A period of heightened anger and anxiety on Twitter partly overlaps with increased information seeking about stress, panic and eating disorders on Wikipedia in the late winter and early spring. Finally, we compare Twitter mood indicators with various weather time series, finding that negative affect and anger can be partially explained in terms of the climatic temperature and photoperiod, sadness can be partially explained by the photoperiod and the perceived change in the photoperiod, while anxiety is partially explained by the level of precipitation.
Using these multiple sources of data allows us to have access to inexpensive, although indirect, information about collective variations in mood over long periods of time, in turn helping us to begin to separate out the various possible causes of these fluctuations.
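The abstract does not say which NLP methods produce the daily indicators. One common approach, shown here purely as an assumption, is to score each day by the fraction of tokens that appear in an affect word list and then average the scores by calendar month. The word list `SADNESS_WORDS`, both function names and the toy texts are hypothetical.

```python
from collections import defaultdict

# Hypothetical word list; a real study would use a validated affect lexicon.
SADNESS_WORDS = {"sad", "lonely", "miserable", "grief"}

def daily_indicator(texts, lexicon):
    """Fraction of tokens in one day's texts that belong to the lexicon."""
    tokens = [tok for text in texts for tok in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(tok in lexicon for tok in tokens) / len(tokens)

def monthly_means(daily_scores):
    """Average (month, score) pairs into one mean score per calendar month."""
    buckets = defaultdict(list)
    for month, score in daily_scores:
        buckets[month].append(score)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}

# Toy data for illustration only: (month number, that day's texts).
toy_days = [
    (12, ["so sad and lonely today", "dark again"]),
    (12, ["feeling miserable"]),
    (6, ["lovely sunny day", "off to the beach"]),
]
scores = [(month, daily_indicator(texts, SADNESS_WORDS)) for month, texts in toy_days]
by_month = monthly_means(scores)
```

Normalising by the number of tokens per day keeps the indicator comparable across days with very different posting volumes.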
PLOS ONE | 2016
Fabon Dzogang; Thomas Lansdall-Welfare; FindMyPast Newspaper Team; Nello Cristianini
We address the problem of observing periodic changes in the behaviour of a large population, by analysing the daily contents of newspapers published in the United States and United Kingdom from 1836 to 1922. This is done by analysing the daily time series of the relative frequency of the 25K most frequent words for each country, resulting in the study of 50K time series for 31,755 days. Behaviours that are found to be strongly periodic include seasonal activities, such as hunting and harvesting. A strong connection with natural cycles is found, with a pronounced presence of fruits, vegetables, flowers and game. Periodicities dictated by religious or civil calendars are also detected and show a different wave-form than those provoked by weather. States that can be revealed include the presence of infectious disease, with clear annual peaks for fever, pneumonia and diarrhoea. Overall, 2% of the words are found to be strongly periodic, and the period most frequently found is 365 days. Comparisons between UK and US, and between modern and historical news, reveal how the fundamental cycles of life are shaped by the seasons, but also how this effect has been reduced in modern times.
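How periodicity was measured is not stated in this abstract. The sketch below is one plausible way to test a daily relative-frequency series for a dominant period: project the mean-centred series onto a sinusoid at each candidate period and compare the resulting power. The function names, the candidate periods and the synthetic series are all assumptions made for illustration.

```python
import math

def relative_frequency(word_counts, total_counts):
    """Daily relative frequency: occurrences of a word / all words that day."""
    return [c / t if t else 0.0 for c, t in zip(word_counts, total_counts)]

def power_at_period(series, period):
    """Spectral power of the mean-centred series at one period (in days)."""
    n = len(series)
    mean = sum(series) / n
    re = sum((x - mean) * math.cos(2 * math.pi * i / period)
             for i, x in enumerate(series))
    im = sum((x - mean) * math.sin(2 * math.pi * i / period)
             for i, x in enumerate(series))
    return (re * re + im * im) / n

def dominant_period(series, candidate_periods):
    """Candidate period with the highest power."""
    return max(candidate_periods, key=lambda p: power_at_period(series, p))

# Synthetic four-year series with an annual cycle; not data from the paper.
days = 365 * 4
totals = [10_000] * days
counts = [round(50 + 30 * math.cos(2 * math.pi * d / 365)) for d in range(days)]
freq = relative_frequency(counts, totals)
best = dominant_period(freq, [7, 30, 91, 182, 365])
```

Testing a short list of candidate periods keeps the sketch in the standard library; a full periodogram over all frequencies would normally use an FFT.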
international world wide web conferences | 2015
Sen Jia; Thomas Lansdall-Welfare; Nello Cristianini
Analysing the representation of gender in news media has a long history within the fields of journalism, media and communication. Typically this can be performed by measuring how often people of each gender are mentioned within the textual content of news articles. In this paper, we adopt a different approach, classifying the faces in images of news articles into their respective gender. We present a study on
intelligent data analysis | 2017
Sen Jia; Thomas Lansdall-Welfare; Nello Cristianini
885,573