David Stillwell
University of Cambridge
Publication
Featured research published by David Stillwell.
Proceedings of the National Academy of Sciences of the United States of America | 2013
Michal Kosinski; David Stillwell; Thore Graepel
We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait “Openness,” prediction accuracy is close to the test–retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy.
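The phrase "correctly discriminates ... in 88% of cases" can be read as a pairwise ranking accuracy, which is equivalent to the area under the ROC curve. The abstract above does not name the metric, so that reading is an assumption here. Under that assumption, the following minimal sketch computes the statistic from model scores. The function name `pairwise_auc` and the toy scores are hypothetical and are not taken from the paper.

```python
from itertools import product


def pairwise_auc(scores_pos, scores_neg):
    """Share of (positive, negative) pairs where the positive case scores higher.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for pos, neg in product(scores_pos, scores_neg):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores, invented for illustration only.
group_a = [0.9, 0.8, 0.4]  # cases that truly belong to the target class
group_b = [0.7, 0.3, 0.2]  # cases that do not

auc = pairwise_auc(group_a, group_b)
print(round(auc, 3))  # 8 of the 9 pairs are ordered correctly -> 0.889
```

A value of 0.5 would mean the scores order pairs no better than chance, and 1.0 would mean every pair is ordered correctly.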
Proceedings of the National Academy of Sciences of the United States of America | 2015
Wu Youyou; Michal Kosinski; David Stillwell
Significance: This study compares the accuracy of personality judgment, a ubiquitous and important social-cognitive activity, between computer models and humans. Using several criteria, we show that computers’ judgments of people’s personalities based on their digital footprints are more accurate and valid than judgments made by their close others or acquaintances (friends, family, spouse, colleagues, etc.). Our findings highlight that people’s personalities can be predicted automatically and without involving human social-cognitive skills.
Abstract: Judging others’ personalities is an essential skill in successful social living, as personality is a key driver behind people’s interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants’ Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy.
Journal of Personality and Social Psychology | 2015
Gregory Park; H. Andrew Schwartz; Johannes C. Eichstaedt; Margaret L. Kern; Michal Kosinski; David Stillwell; Lyle H. Ungar; Martin E. P. Seligman
Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid personality measures: they agreed with self-reports and informant reports of personality, added incremental validity over informant reports, adequately discriminated between traits, exhibited patterns of correlations with external criteria similar to those found with self-reported personality, and were stable over 6-month intervals. Analysis of predictive language can provide rich portraits of the mental life associated with traits. This approach can complement and extend traditional methods, providing researchers with an additional measure that can quickly and cheaply assess large groups of participants with minimal burden.
Conference on Computer Supported Cooperative Work | 2012
Daniele Quercia; Renaud Lambiotte; David Stillwell; Michal Kosinski; Jon Crowcroft
We study the relationship between Facebook popularity (number of contacts) and personality traits on a large number of subjects. We test to what extent two prevalent viewpoints hold: that popular users (those with many social contacts) are the ones whose personality traits either predict many offline (real world) friends or predict a propensity to maintain superficial relationships. We find that the predictor for number of friends in the real world (Extraversion) is also a predictor for number of Facebook contacts. We then test whether people who have many social contacts on Facebook are the ones who are able to adapt themselves to new forms of communication, present themselves in likable ways, and have a propensity to maintain superficial relationships. We show that there is no statistical evidence to support such a conjecture.
Journal of Personality and Social Psychology | 2013
Peter J. Rentfrow; Samuel D. Gosling; Markus Jokela; David Stillwell; Michal Kosinski; Jeff Potter
There is overwhelming evidence for regional variation across the United States on a range of key political, economic, social, and health indicators. However, a substantial body of research suggests that activities in each of these domains are typically influenced by psychological variables, raising the possibility that psychological forces might be the mediating or causal factors responsible for regional variation in the key indicators. Thus, the present article examined whether configurations of psychological variables, in this case personality traits, can usefully be used to segment the country. Do regions emerge that can be defined in terms of their characteristic personality profiles? How are those regions distributed geographically? And are they associated with particular patterns of key political, economic, social, and health indicators? Results from cluster analyses of 5 independent samples totaling over 1.5 million individuals identified 3 robust psychological profiles: Friendly & Conventional, Relaxed & Creative, and Temperamental & Uninhibited. The psychological profiles were found to cluster geographically and displayed unique patterns of associations with key geographical indicators. The findings demonstrate the value of a geographical perspective in unpacking the connections between microlevel processes and consequential macrolevel outcomes.
Machine Learning | 2014
Michal Kosinski; Pushmeet Kohli; David Stillwell; Thore Graepel
Individual differences in personality affect users’ online activities as much as they do in the offline world. This work, based on a sample of over a third of a million users, examines how users’ behaviour in the online environment, captured by their website choices and Facebook profile features, relates to their personality, as measured by the standard Five Factor Model personality questionnaire. Results show that there are psychologically meaningful links between users’ personalities, their website preferences and Facebook profile features. We show how website audiences differ in terms of their personality, present the relationships between personality and Facebook profile features, and show how an individual’s personality can be predicted from Facebook profile features. We conclude that predicting a user’s personality profile can be applied to personalize content, optimize search results, and improve online advertising.
Empirical Methods in Natural Language Processing | 2014
Maarten Sap; Gregory Park; Johannes C. Eichstaedt; Margaret L. Kern; David Stillwell; Michal Kosinski; Lyle H. Ungar; Hansen Andrew Schwartz
Demographic lexica have potential for widespread use in social science, economic, and business applications. We derive predictive lexica (words and weights) for age and gender using regression and classification models from word usage in Facebook, blog, and Twitter data with associated demographic labels. The lexica, made publicly available, achieved state-of-the-art accuracy in language-based age and gender prediction over Facebook and Twitter, and were evaluated for generalization across social media genres as well as in limited message situations.
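To make "words and weights" concrete, here is a minimal sketch of how a weighted lexicon might be applied to a piece of text. It assumes the common form in which the estimate is an intercept plus the sum of each word's weight times its relative frequency. The abstract does not spell out the formula, and the function name `apply_lexicon`, the toy weights, and the intercept below are hypothetical, not values from the published lexica.

```python
import re
from collections import Counter


def apply_lexicon(text, weights, intercept=0.0):
    """Score text as: intercept + sum(weight[word] * count(word) / total_words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        # With no words to count, only the intercept remains.
        return intercept
    counts = Counter(tokens)
    total = len(tokens)
    score = intercept
    for word, weight in weights.items():
        # Counter returns 0 for words that never occur in the text.
        score += weight * counts[word] / total
    return score


# Hypothetical weights and intercept, invented for illustration only.
toy_age_lexicon = {"homework": -20.0, "mortgage": 30.0}
toy_intercept = 25.0

estimate = apply_lexicon("mortgage paperwork again", toy_age_lexicon, toy_intercept)
print(estimate)  # 25 + 30 * (1/3) = 35.0
```

Normalising by the total word count keeps estimates comparable between users who write a lot and users who write very little.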
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality | 2014
H. Andrew Schwartz; Johannes C. Eichstaedt; Margaret L. Kern; Gregory Park; Maarten Sap; David Stillwell; Michal Kosinski; Lyle H. Ungar
Depression is typically diagnosed as being present or absent. However, depression severity is believed to be continuously distributed rather than dichotomous. Severity may vary for a given patient daily and seasonally as a function of many variables ranging from life events to environmental factors. Repeated population-scale assessment of depression through questionnaires is expensive. In this paper we use survey responses and status updates from 28,749 Facebook users to develop a regression model that predicts users’ degree of depression based on their Facebook status updates. Our user-level predictive accuracy is modest, significantly outperforming a baseline of average user sentiment. We use our model to estimate user changes in depression across seasons, and find, consistent with literature, users’ degree of depression most often increases from summer to winter. We then show the potential to study factors driving individuals’ level of depression by looking at its most highly correlated language features.
Assessment | 2014
Margaret L. Kern; Johannes C. Eichstaedt; H. Andrew Schwartz; Lukasz Dziurzynski; Lyle H. Ungar; David Stillwell; Michal Kosinski; Stephanie M. Ramones; Martin E. P. Seligman
Objective: We present a new open language analysis approach that identifies and visually summarizes the dominant naturally occurring words and phrases that most distinguished each Big Five personality trait. Method: Using millions of posts from 69,792 Facebook users, we examined the correlation of personality traits with online word usage. Our analysis method consists of feature extraction, correlational analysis, and visualization. Results: The distinguishing words and phrases were face valid and provide insight into processes that underlie the Big Five traits. Conclusion: Open-ended data driven exploration of large datasets combined with established psychological theory and measures offers new tools to further understand the human psyche.
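The feature extraction and correlational analysis steps can be sketched in a simplified form: compute each user's relative word frequencies, then correlate each word's frequency with a trait score across users. This sketch uses a plain Pearson correlation with no covariates and no multiple-comparison correction, both of which a full analysis may include. All function names and the toy posts and scores are hypothetical.

```python
import math
import re
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def relative_frequencies(text):
    """Feature extraction: map each word to its share of the user's tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(tokens).items()}


def word_trait_correlations(posts, trait_scores):
    """Correlate every word's relative frequency with the trait, highest first."""
    freqs = [relative_frequencies(post) for post in posts]
    vocabulary = sorted(set().union(*freqs))
    correlations = {
        word: pearson([f.get(word, 0.0) for f in freqs], trait_scores)
        for word in vocabulary
    }
    return sorted(correlations.items(), key=lambda item: item[1], reverse=True)


# Hypothetical users and trait scores, invented for illustration only.
posts = [
    "party party tonight",
    "party with friends",
    "reading alone tonight",
    "reading a book alone",
]
extraversion = [4.5, 4.0, 2.0, 1.5]

ranked = word_trait_correlations(posts, extraversion)
print(ranked[0][0])  # word most positively associated with the trait
```

The words at the top and bottom of the ranked list would be the candidates for a visual summary such as a word cloud.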
Developmental Psychology | 2014
Margaret L. Kern; Johannes C. Eichstaedt; H. Andrew Schwartz; Gregory Park; Lyle H. Ungar; David Stillwell; Michal Kosinski; Lukasz Dziurzynski; Martin E. P. Seligman
We introduce a new method, differential language analysis (DLA), for studying human development in which computational linguistics are used to analyze the big data available through online social media in light of psychological theory. Our open vocabulary DLA approach finds words, phrases, and topics that distinguish groups of people based on 1 or more characteristics. Using a data set of over 70,000 Facebook users, we identify how word and topic use vary as a function of age and compile cohort specific words and phrases into visual summaries that are face valid and intuitively meaningful. We demonstrate how this methodology can be used to test developmental hypotheses, using the aging positivity effect (Carstensen & Mikels, 2005) as an example. While in this study we focused primarily on common trends across age-related cohorts, the same methodology can be used to explore heterogeneity within developmental stages or to explore other characteristics that differentiate groups of people. Our comprehensive list of words and topics is available on our web site for deeper exploration by the research community.