Publication


Featured research published by Andrew J. Reagan.


Proceedings of the National Academy of Sciences of the United States of America | 2015

Human language reveals a universal positivity bias

Peter Sheridan Dodds; Eric M. Clark; Suma Desu; Morgan R. Frank; Andrew J. Reagan; Jake Ryland Williams; Lewis Mitchell; Kameron Decker Harris; Isabel M. Kloumann; James P. Bagrow; Karine Megerdoomian; Matthew T. McMahon; Brian F. Tivnan; Christopher M. Danforth

Significance: The most commonly used words of 24 corpora across 10 diverse human languages exhibit a clear positive bias, a big data confirmation of the Pollyanna hypothesis. The study’s findings are based on 5 million individual human scores and pave the way for the development of powerful language-based tools for measuring emotion.

Using human evaluation of 100,000 words spread across 24 corpora in 10 languages diverse in origin and culture, we present evidence of a deep imprint of human sociality in language, observing that (i) the words of natural human language possess a universal positivity bias, (ii) the estimated emotional content of words is consistent between languages under translation, and (iii) this positivity bias is strongly independent of frequency of word use. Alongside these general regularities, we describe interlanguage variations in the emotional spectrum of languages that allow us to rank corpora. We also show how our word evaluations can be used to construct physical-like instruments for both real-time and offline measurement of the emotional content of large-scale texts.


EPJ Data Science | 2016

The emotional arcs of stories are dominated by six basic shapes

Andrew J. Reagan; Lewis Mitchell; Dilan Patrick Kiley; Christopher M. Danforth; Peter Sheridan Dodds

Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture’s evolution through its texts using a ‘big data’ lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories and forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,327 stories from Project Gutenberg’s fiction collection, we find a set of six core emotional arcs which form the essential building blocks of complex emotional trajectories. We strengthen our findings by separately applying matrix decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads.


PLOS ONE | 2015

Climate Change Sentiment on Twitter: An Unsolicited Public Opinion Poll

Emily M. Cody; Andrew J. Reagan; Lewis Mitchell; Peter Sheridan Dodds; Christopher M. Danforth

The consequences of anthropogenic climate change are extensively debated through scientific papers, newspaper articles, and blogs. Newspaper articles may lack accuracy, while the severity of findings in scientific papers may be too opaque for the public to understand. Social media, however, is a forum where individuals of diverse backgrounds can share their thoughts and opinions. As consumption shifts from old media to new, Twitter has become a valuable resource for analyzing current events and headline news. In this research, we analyze tweets containing the word “climate” collected between September 2008 and July 2014. Through use of a previously developed sentiment measurement tool called the Hedonometer, we determine how collective sentiment varies in response to climate change news, events, and natural disasters. We find that natural disasters, climate bills, and oil-drilling can contribute to a decrease in happiness while climate rallies, a book release, and a green ideas contest can contribute to an increase in happiness. Words uncovered by our analysis suggest that responses to climate change news are predominately from climate change activists rather than climate change deniers, indicating that Twitter is a valuable resource for the spread of climate change awareness.


Scientific Reports | 2017

Forecasting the onset and course of mental illness with Twitter data

Andrew Reece; Andrew J. Reagan; Katharina L. M. Lix; Peter Sheridan Dodds; Christopher M. Danforth; Ellen J. Langer

We developed computational models to predict the emergence of depression and Post-Traumatic Stress Disorder in Twitter users. Twitter data and details of depression history were collected from 204 individuals (105 depressed, 99 healthy). We extracted predictive features measuring affect, linguistic style, and context from participant tweets (N = 279,951) and built models using these features with supervised learning algorithms. Resulting models successfully discriminated between depressed and healthy content, and compared favorably to general practitioners’ average success rates in diagnosing depression, albeit in a separate population. Results held even when the analysis was restricted to content posted before first depression diagnosis. State-space temporal analysis suggests that onset of depression may be detectable from Twitter data several months prior to diagnosis. Predictive results were replicated with a separate sample of individuals diagnosed with PTSD (N_users = 174, N_tweets = 243,775). A state-space time series model revealed indicators of PTSD almost immediately post-trauma, often many months prior to clinical diagnosis. These methods suggest a data-driven, predictive approach for early screening and detection of mental illness.


PLOS ONE | 2017

The Lexicocalorimeter: Gauging public health through caloric input and output on social media

Sharon E. Alajajian; Jake Ryland Williams; Andrew J. Reagan; Stephen C. Alajajian; Morgan R. Frank; Lewis Mitchell; Jacob Lahne; Christopher M. Danforth; Peter Sheridan Dodds

We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the “caloric content” of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food and activity related phrases, and respectively assigning them with sourced estimates of caloric intake and expenditure. We show that for Twitter, our naive measures of “caloric input”, “caloric output”, and the ratio of these measures are all strong correlates with health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities; is tunable to specific health and well-being measures such as diabetes rates; has the capability of providing a real-time signal reflecting a population’s health; and has the potential to be used alongside traditional survey data in the development of public policy and collective self-awareness. Because our Lexicocalorimeter is a linear superposition of principled phrase scores, we also show we can move beyond correlations to explore what people talk about in collective detail, and assist in the understanding and explanation of how population-scale conditions vary, a capacity unavailable to black-box type methods.
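As a rough illustration of the "linear superposition of phrase scores" idea in this abstract, the sketch below scores a handful of texts against two small phrase tables. Everything in it is an assumption made for illustration: the phrases, the kcal values, the function names, the naive substring matching, and the choice to express the ratio as output over input. It is not the authors' instrument or data.

```python
# All phrases and kcal values below are made-up placeholders, not the paper's tables.
FOOD_KCAL = {"pizza": 285.0, "apple": 95.0, "ice cream": 270.0}
ACTIVITY_KCAL = {"running": 600.0, "walking": 250.0, "watching tv": 70.0}


def count_phrases(texts, table):
    """Count occurrences of each phrase in the table (naive substring matching)."""
    counts = {phrase: 0 for phrase in table}
    for text in texts:
        lowered = text.lower()
        for phrase in table:
            counts[phrase] += lowered.count(phrase)
    return counts


def average_kcal(texts, table):
    """Frequency-weighted mean of the phrase scores: a linear superposition."""
    counts = count_phrases(texts, table)
    total = sum(counts.values())
    if total == 0:
        return None  # no matching phrases, so the measure is undefined
    return sum(counts[p] * table[p] for p in table) / total


# Hypothetical example texts standing in for tweets.
tweets = [
    "Pizza then ice cream tonight",
    "Went running this morning",
    "apple for lunch, then walking home",
]

caloric_input = average_kcal(tweets, FOOD_KCAL)
caloric_output = average_kcal(tweets, ACTIVITY_KCAL)
caloric_ratio = caloric_output / caloric_input  # assumed direction: output over input

print("caloric input:", caloric_input)
print("caloric output:", caloric_output)
print("ratio:", caloric_ratio)
```

Because each measure is a weighted sum over phrases, the contribution of any single phrase can be read off directly, which is the kind of transparency the abstract contrasts with black-box methods.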


Physical Review E | 2017

Simon's fundamental rich-get-richer model entails a dominant first-mover advantage

Peter Sheridan Dodds; David Rushing Dewhurst; Fletcher F. Hazlehurst; Colin M. Van Oort; Lewis Mitchell; Andrew J. Reagan; Jake Ryland Williams; Christopher M. Danforth

Herbert Simon's classic rich-get-richer model is one of the simplest empirically supported mechanisms capable of generating heavy-tail size distributions for complex systems. Simon argued analytically that a population of flavored elements growing by either adding a novel element or randomly replicating an existing one would afford a distribution of group sizes with a power-law tail. Here, we show that, in fact, Simon's model does not produce a simple power-law size distribution as the initial element has a dominant first-mover advantage, and will be overrepresented by a factor proportional to the inverse of the innovation probability. The first group's size discrepancy cannot be explained away as a transient of the model, and may therefore be many orders of magnitude greater than expected. We demonstrate how Simon's analysis was correct but incomplete, and expand our alternate analysis to quantify the variability of long-term rankings for all groups. We find that the expected time for a first replication is infinite, and show how an incipient group must break the mechanism to improve their odds of success. We present an example of citation counts for a specific field that demonstrates a first-mover advantage consistent with our revised view of the rich-get-richer mechanism. Our findings call for a reexamination of preceding work invoking Simon's model and provide an expanded understanding going forward.
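The growth rule described in this abstract (add a novel element, or replicate a randomly chosen existing one) can be simulated in a few lines. The following is a minimal sketch under stated assumptions: the function name, the parameter name rho for the innovation probability, the seed, and the run length are all illustrative choices, not taken from the paper.

```python
import random
from collections import Counter


def simulate_simon(n_steps, rho, seed=0):
    """Grow a population one element per step.

    With probability rho a new group is founded (innovation); otherwise an
    existing element is picked uniformly at random and copied, so groups
    gain members in proportion to their current size.
    """
    rng = random.Random(seed)
    elements = [0]  # group label of each element; group 0 is the initial element
    next_label = 1
    for _ in range(n_steps):
        if rng.random() < rho:
            elements.append(next_label)  # innovation: found a new group
            next_label += 1
        else:
            elements.append(rng.choice(elements))  # replication
    return Counter(elements)


sizes = simulate_simon(n_steps=20000, rho=0.01)
print("number of groups:", len(sizes))
print("three largest groups (label, size):", sizes.most_common(3))
```

Comparing the size of group 0 with the next-largest groups over several seeds gives an informal feel for the first-mover advantage the abstract describes; a single run is only suggestive.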


Proceedings of the National Academy of Sciences of the United States of America | 2015

Reply to Garcia et al.: Common mistakes in measuring frequency-dependent word characteristics.

Peter Sheridan Dodds; Eric M. Clark; Suma Desu; Morgan R. Frank; Andrew J. Reagan; Jake Ryland Williams; Lewis Mitchell; Kameron Decker Harris; Isabel M. Kloumann; James P. Bagrow; Karine Megerdoomian; Matthew T. McMahon; Brian F. Tivnan; Christopher M. Danforth

The concerns expressed by Garcia et al. (1) are misplaced due to a range of misconceptions about word usage frequency, word rank, and expert-constructed word lists such as LIWC (Linguistic Inquiry and Word Count) (2). We provide a complete response in our paper's online appendices (3). Garcia et al. (1) suggest that the set of function words in the LIWC dataset (2) show a wide spectrum of average happiness with positive skew (figure 1A in ref. 1) when, according to their interpretation, these words should exhibit a Dirac δ function located at neutral (h_avg = 5 on a 1–9 scale). However, many words tagged as function words in the LIWC dataset readily elicit an emotional response in raters as exemplified by “greatest” (h_avg = 7.26), “best” (h_avg = 7.26), “negative” (h_avg = 2.42), and “worst” (h_avg = 2.10). In our study (3), basic function words that are expected to be neutral, such as “the” (h_avg = 4.98) and “to” (h_avg = 4.98), were appropriately scored as such. Moreover, no meaningful statement about biases can be made for sets of words chosen without frequency of use properly incorporated.
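To make the closing point about frequency of use concrete, the sketch below compares a plain mean over a word list with a usage-weighted mean. The six happiness scores are the ones quoted in this reply; the usage counts and function names are hypothetical, chosen only to show how common neutral words dominate a frequency-weighted average.

```python
# Average happiness scores quoted in the reply (1-9 scale).
H_AVG = {
    "greatest": 7.26,
    "best": 7.26,
    "negative": 2.42,
    "worst": 2.10,
    "the": 4.98,
    "to": 4.98,
}

# Hypothetical usage counts, invented for illustration only.
COUNTS = {"the": 1000, "to": 500, "best": 20, "greatest": 5, "negative": 3, "worst": 2}


def unweighted_mean(scores):
    """Mean over the word list, treating every word as equally important."""
    return sum(scores.values()) / len(scores)


def frequency_weighted_mean(scores, counts):
    """Mean over word usage: each word counts as often as it is used."""
    total = sum(counts[word] for word in scores)
    return sum(scores[word] * counts[word] for word in scores) / total


plain = unweighted_mean(H_AVG)
weighted = frequency_weighted_mean(H_AVG, COUNTS)
print("unweighted mean:", round(plain, 3))
print("frequency-weighted mean:", round(weighted, 3))
```

The two averages differ because the weighted version reflects how often each word actually appears, which is the distinction the reply argues must be respected when discussing bias.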


PLOS ONE | 2018

Divergent discourse between protests and counter-protests: #BlackLivesMatter and #AllLivesMatter

Ryan J. Gallagher; Andrew J. Reagan; Christopher M. Danforth; Peter Sheridan Dodds

Since the shooting of Black teenager Michael Brown by White police officer Darren Wilson in Ferguson, Missouri, the protest hashtag #BlackLivesMatter has amplified critiques of extrajudicial killings of Black Americans. In response to #BlackLivesMatter, other Twitter users have adopted #AllLivesMatter, a counter-protest hashtag whose content argues that equal attention should be given to all lives regardless of race. Through a multi-level analysis of over 860,000 tweets, we study how these protests and counter-protests diverge by quantifying aspects of their discourse. We find that #AllLivesMatter facilitates opposition between #BlackLivesMatter and hashtags such as #PoliceLivesMatter and #BlueLivesMatter in such a way that historically echoes the tension between Black protesters and law enforcement. In addition, we show that a significant portion of #AllLivesMatter use stems from hijacking by #BlackLivesMatter advocates. Beyond simply injecting #AllLivesMatter with #BlackLivesMatter content, these hijackers use the hashtag to directly confront the counter-protest notion of “All lives matter.” Our findings suggest that the Black Lives Matter movement was able to grow, exhibit diverse conversations, and avoid derailment on social media by making discussion of counter-protest opinions a central topic of #AllLivesMatter, rather than the movement itself.


PLOS ONE | 2016

Tracking Climate Change through the Spatiotemporal Dynamics of the Teletherms, the Statistically Hottest and Coldest Days of the Year.

Peter Sheridan Dodds; Lewis Mitchell; Andrew J. Reagan; Christopher M. Danforth

Instabilities and long term shifts in seasons, whether induced by natural drivers or human activities, pose great disruptive threats to ecological, agricultural, and social systems. Here, we propose, measure, and explore two fundamental markers of location-sensitive seasonal variations: the Summer and Winter Teletherms—the on-average annual dates of the hottest and coldest days of the year. We analyse daily temperature extremes recorded at 1218 stations across the contiguous United States from 1853–2012, and observe large regional variation with the Summer Teletherm falling up to 90 days after the Summer Solstice, and 50 days for the Winter Teletherm after the Winter Solstice. We show that Teletherm temporal dynamics are substantive with clear and in some cases dramatic shifts reflective of system bifurcations. We also compare recorded daily temperature extremes with output from two regional climate models finding considerable though relatively unbiased error. Our work demonstrates that Teletherms are an intuitive, powerful, and statistically sound measure of local climate change, and that they pose detailed, stringent challenges for future theoretical and computational models.
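The definition given in this abstract, the on-average date of the hottest day of the year, can be sketched as a day-of-year average followed by a maximum. This is a deliberately simplified illustration: the function name, the record layout, and the synthetic temperatures are assumptions, and the published analysis may smooth or otherwise process the station data in ways not shown here.

```python
import math
from collections import defaultdict


def summer_teletherm(records):
    """Return the day of year whose mean daily maximum temperature is highest.

    records: iterable of (year, day_of_year, daily_max_temperature) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _year, day, t_max in records:
        totals[day] += t_max
        counts[day] += 1
    means = {day: totals[day] / counts[day] for day in totals}
    return max(means, key=means.get)


# Synthetic data: three years of a seasonal cycle peaking on day 200,
# with a small year-to-year offset. Not real station data.
records = [
    (year, day, 20.0 + 10.0 * math.cos(2 * math.pi * (day - 200) / 365) + 0.1 * year)
    for year in range(3)
    for day in range(1, 366)
]

hottest_day = summer_teletherm(records)
print("summer teletherm (day of year):", hottest_day)
```

The Winter Teletherm would follow the same pattern with daily minimum temperatures and a minimum in place of the maximum.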


PLOS ONE | 2014

Collective Philanthropy: Describing and Modeling the Ecology of Giving

William L. Gottesman; Andrew J. Reagan; Peter Sheridan Dodds

Reflective of income and wealth distributions, philanthropic gifting appears to follow an approximate power-law size distribution as measured by the size of gifts received by individual institutions. We explore the ecology of gifting by analysing data sets of individual gifts for a diverse group of institutions dedicated to education, medicine, art, public support, and religion. We find that the detailed forms of gift-size distributions differ across but are relatively constant within charity categories. We construct a model for how a donor's income affects their giving preferences in different charity categories, offering a mechanistic explanation for variations in institutional gift-size distributions. We discuss how knowledge of gift-size distributions may be used to assess an institution's gift-giving profile, to help set fundraising goals, and to design an institution-specific giving pyramid.

Collaboration


Dive into Andrew J. Reagan's collaborations.

Top Co-Authors

Morgan R. Frank

Massachusetts Institute of Technology

Suma Desu

Massachusetts Institute of Technology
