Salvatore Giorgi
University of Pennsylvania
Publication
Featured research published by Salvatore Giorgi.
Meeting of the Association for Computational Linguistics | 2016
Lucie Flekova; Jordan Carpenter; Salvatore Giorgi; Lyle H. Ungar; Daniel Preoţiuc-Pietro
User traits disclosed through written text, such as age and gender, can be used to personalize applications such as recommender systems or conversational agents. However, human perception of these traits is not perfectly aligned with reality. In this paper, we conduct a large-scale crowdsourcing experiment on guessing age and gender from tweets. We systematically analyze the quality and possible biases of these predictions. We identify the textual cues that lead to misassessments of traits or make annotators more or less confident in their choice. Our study demonstrates that differences between real and perceived traits are noteworthy and elucidates inaccurately used stereotypes in human perception.
Social Psychological and Personality Science | 2017
Jordan Carpenter; Daniel Preoţiuc-Pietro; Lucie Flekova; Salvatore Giorgi; Courtney Hagan; Margaret L. Kern; Anneke Buffone; Lyle H. Ungar; Martin E. P. Seligman
People associate certain behaviors with certain social groups. These stereotypical beliefs consist of both accurate and inaccurate associations. Using large-scale, data-driven methods with social media as a context, we isolate stereotypes by using verbal expression. Across four social categories—gender, age, education level, and political orientation—we identify words and phrases that lead people to incorrectly guess the social category of the writer. Although raters often correctly categorize authors, they overestimate the importance of some stereotype-congruent signal. Findings suggest that data-driven approaches might be a valuable and ecologically valid tool for identifying even subtle aspects of stereotypes and highlighting the facets that are exaggerated or misapplied.
Conference on Information and Knowledge Management | 2016
Daniel Preoţiuc-Pietro; Jordan Carpenter; Salvatore Giorgi; Lyle H. Ungar
Interest in research into the darker traits of human nature is growing, especially in the context of increased social media usage, which allows users to express themselves to a wider online audience. We study the extent to which the standard model of dark personality, the dark triad of narcissism, psychopathy and Machiavellianism, is related to observable Twitter behavior such as platform usage, posted text and profile image choice. Our results show that we can map various behaviors to psychological theory and study new aspects related to social media usage. Finally, we build a machine learning algorithm that predicts the dark triad of personality in out-of-sample users with reliable accuracy.
Empirical Methods in Natural Language Processing | 2016
Laura Smith; Salvatore Giorgi; Rishi Solanki; Johannes C. Eichstaedt; H. Andrew Schwartz; Muhammad Abdul-Mageed; Anneke Buffone; Lyle H. Ungar
We investigate whether psychological well-being translates across English and Spanish Twitter by building and comparing source-language and automatically translated weighted lexica in English and Spanish. We find that the source-language models perform substantially better than the machine-translated versions. Moreover, manually correcting translation errors does not improve model performance, suggesting that meaningful cultural information is being lost in translation. Further work is needed to clarify when automatic translation of well-being lexica is effective and how it can be improved for cross-cultural analysis.
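As a rough illustration of what applying and translating a weighted lexicon can look like, the sketch below scores a text as a weighted sum of relative word frequencies and carries weights over to another language through a word-to-word mapping. The function names, the toy weights, and the tiny English-to-Spanish mapping are all hypothetical and chosen for illustration; they are not the lexica or the translation procedure used in the paper.

```python
from collections import Counter


def lexicon_score(text, lexicon, intercept=0.0):
    """Score a text as intercept + sum(weight * relative frequency of word).

    `lexicon` maps a word to a weight. Tokenization here is plain
    whitespace splitting, which is a simplifying assumption.
    """
    tokens = text.lower().split()
    if not tokens:
        return intercept
    counts = Counter(tokens)
    total = len(tokens)
    return intercept + sum(
        weight * counts[word] / total
        for word, weight in lexicon.items()
        if word in counts
    )


def translate_lexicon(lexicon, word_map):
    """Carry weights over to another language via a word-to-word mapping.

    Words without a translation are dropped, which is one way
    information can be lost when a lexicon is translated.
    """
    return {
        word_map[word]: weight
        for word, weight in lexicon.items()
        if word in word_map
    }


# Toy, hypothetical weights and mapping (not from the paper).
english_lexicon = {"happy": 1.0, "sad": -1.0, "yay": 0.5}
en_to_es = {"happy": "feliz", "sad": "triste"}
spanish_lexicon = translate_lexicon(english_lexicon, en_to_es)

english_score = lexicon_score("happy happy sad day", english_lexicon)
spanish_score = lexicon_score("feliz feliz triste hoy", spanish_lexicon)
```

In this toy setup the word "yay" has no entry in the mapping, so its weight disappears from the translated lexicon. A lexicon learned directly on source-language text would not depend on such a mapping.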
PLOS ONE | 2018
Brenda Curtis; Salvatore Giorgi; Anneke Buffone; Lyle H. Ungar; Robert D. Ashford; Jessie Hemmons; Dan Summers; Casey Hamilton; H. Andrew Schwartz
Objectives: The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether excessive alcohol consumption rates can be predicted by the words being posted from each county. Methods: Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis. Results: Twitter language data captures cross-sectional patterns of excessive alcohol consumption beyond that of sociodemographic factors (e.g. age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g. ‘ready gettin leave’) can explain much of the variance associated between socioeconomics and excessive alcohol consumption. Conclusions: Twitter data can be used to predict public health concerns such as excessive drinking. Using mediation analysis in conjunction with predictive modeling allows for a high portion of the variance associated with socioeconomic status to be explained.
Psychological Assessment | 2018
Jeremy D. W. Clifton; Joshua D. Baker; Crystal L. Park; David B. Yaden; Alicia B. W. Clifton; Paolo Terni; Jessica L. Miller; Guang Zeng; Salvatore Giorgi; H. Andrew Schwartz; Martin E. P. Seligman
Beck’s insight that beliefs about one’s self, future, and environment shape behavior transformed depression treatment. Yet environment beliefs remain relatively understudied. We introduce a set of environment beliefs, called primal world beliefs or primals, that concern the world’s overall character (e.g., the world is interesting, the world is dangerous). To create a measure, we systematically identified candidate primals (e.g., analyzing tweets, historical texts, etc.); conducted exploratory factor analysis (N = 930) and two confirmatory factor analyses (N = 524; N = 529); examined sequence effects (N = 219) and concurrent validity (N = 122); and conducted test-retests over 2 weeks (n = 122), 9 months (n = 134), and 19 months (n = 398). The resulting 99-item Primals Inventory (PI-99) measures 26 primals with three overarching beliefs (Safe, Enticing, and Alive; mean α = .93) that typically explain ∼55% of the common variance. These beliefs were normally distributed; stable (2-week, 9-month, and 19-month test-retest results averaged .88, .75, and .77, respectively); strongly correlated with many personality and wellbeing variables (e.g., Safe and optimism, r = .61; Enticing and depression, r = −.52; Alive and meaning, r = .54); and explained more variance in life satisfaction, transcendent experience, trust, and gratitude than the Big Five (3%, 3%, 6%, and 12% more variance, respectively). In sum, the PI-99 showed strong psychometric characteristics, primals plausibly shape many personality and wellbeing variables, and a broad research effort examining these relationships is warranted.
Meeting of the Association for Computational Linguistics | 2017
Fatemeh Almodaresi; Lyle H. Ungar; Vivek Kulkarni; Mohsen Zakeri; Salvatore Giorgi; H. Andrew Schwartz
Natural language processing has increasingly moved from modeling documents and words toward studying the people behind the language. This move to working with data at the user or community level has presented the field with different characteristics of linguistic data. In this paper, we empirically characterize various lexical distributions at different levels of analysis, showing that, while most features are decidedly sparse and non-normal at the message-level (as with traditional NLP), they follow the central limit theorem to become much more Log-normal or even Normal at the user- and county-levels. Finally, we demonstrate that modeling lexical features for the correct level of analysis leads to marked improvements in common social scientific prediction tasks.
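The effect of aggregation described in this abstract can be sketched with a small simulation. The code below is an assumption-laden toy, not the paper's analysis: it models whether a single word appears in a message as a Bernoulli draw with a made-up probability, then averages over each simulated user's messages and compares sparsity and skewness at the two levels. All names and parameter values are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate(n_users=500, msgs_per_user=200, p_word=0.05, seed=0):
    """Return (message-level, user-level) values for one word feature.

    Message level: 1.0 if the word occurs in the message, else 0.0.
    User level: the mean of that indicator over the user's messages.
    """
    rng = random.Random(seed)
    message_level, user_level = [], []
    for _ in range(n_users):
        msgs = [1.0 if rng.random() < p_word else 0.0
                for _ in range(msgs_per_user)]
        message_level.extend(msgs)
        user_level.append(sum(msgs) / msgs_per_user)
    return message_level, user_level


def zero_fraction(values):
    """Share of observations that are exactly zero (a sparsity measure)."""
    return sum(1 for v in values if v == 0.0) / len(values)


message_level, user_level = simulate()
message_skew = skewness(message_level)
user_skew = skewness(user_level)
message_zeros = zero_fraction(message_level)
user_zeros = zero_fraction(user_level)
```

Under these assumptions the message-level feature is mostly zeros and strongly right-skewed, while the user-level average has few or no zeros and a skewness much closer to zero, which is the qualitative pattern the central limit theorem predicts for averages of many messages.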
Empirical Methods in Natural Language Processing | 2017
H. Andrew Schwartz; Salvatore Giorgi; Maarten Sap; Patrick Crutchley; Lyle H. Ungar; Johannes C. Eichstaedt
Work-in-Progress | 2015
Lucie Flekova; Daniel Preoţiuc-Pietro; Jordan Carpenter; Salvatore Giorgi; Lyle H. Ungar
Empirical Methods in Natural Language Processing | 2018
Mohammadzaman Zamani; H. Andrew Schwartz; Veronica E. Lynn; Salvatore Giorgi; Niranjan Balasubramanian