Gender Bias, Social Bias and Representation: 70 Years of B^Hollywood
A PREPRINT
Kunal Khadilkar∗
Carnegie Mellon University
[email protected]

Ashiqur R. KhudaBukhsh∗
Carnegie Mellon University
[email protected]

Tom M. Mitchell
Carnegie Mellon University
[email protected]
February 19, 2021

ABSTRACT
With an outreach in more than 90 countries, a market share of 2.1 billion dollars, and a target audience base of at least 1.2 billion people, Bollywood, aka the Mumbai film industry, is a formidable entertainment force. While the number of lives Bollywood can potentially touch is massive, no comprehensive NLP study on the evolution of social and gender biases in Bollywood dialogues exists. Via a substantial corpus of movie dialogues spanning a time horizon of 70 years, we seek to understand the portrayal of women, study subtle social signals in a broader context, and analyze the evolving trends in geographic and religious representation in India. Our argument is simple: popular movie content reflects social norms and beliefs in some form or shape. In this project, we propose to analyze such trends over 70 years of Bollywood movies, contrasting them with their Hollywood counterparts and critically acclaimed world movies.

Keywords Gender Bias · Social Bias · Bollywood · Hollywood
What types of social biases can we analyze and detect through the lens of a diachronic corpus of popular entertainment?
In this paper, we focus on Bollywood, aka the Mumbai film industry, and analyze a curated corpus of film subtitles for the last 70 years. While Bollywood is an entertainment industry worth billions and has a target audience of 1.2 billion people, little or no work exists that has analyzed the wide range of social biases and signals that can be uncovered from this rich corpus. In this work, we contrast our findings with an analogous corpus of Hollywood and, for a specific subset of research questions, we dig deeper and look into world movies.

Our primary focus is gender bias. As shown in Table 1, several Bollywood movies are riddled with sexist and misogynist dialogues. It is thus not surprising that cutting-edge NLP methods would reveal some of these existing biases. We are, however, interested in a broader research question:
In a developing nation, what kind of social insights can be gleaned from popular entertainment?
Is it possible to understand subtle gender biases such as son's preference? Can we track the evolving nature of retrograde social practices like dowry?

Our second focus in this work is broader representation questions such as geographic representation, religious representation, and caste representation. In our mixed-method analyses, we identify that (1) some of the gender biases observed in Bollywood are very much present in its Western counterpart; (2) a positive trend of reduced biases is witnessed with the progress of time; and (3) a similar trend is observed in religious and geographic representation, with considerable scope for improved diversity and inclusion.

∗ Kunal Khadilkar and Ashiqur R. KhudaBukhsh are equal contribution first authors.
"Akeli ladki khuli tijori ki tarah hoti hai" (Jab We Met). Translation: "A girl who is alone is like an open treasure." Generated Movie Revenue: ≈ $14,899,137

"Marriage se pehle ladkiyan sex object hoti hain, aur marriage ke baad they object to sex!" (Kambakkht Ishq). Translation: "Before marriage, girls are sex objects, and after marriage, girls object to sex." Generated Movie Revenue: ≈ $17,531,586

"Tu ladki ke peeche bhagega, ladki paise ke peeche bhagegi. Tu paise ke piche bhagega, ladki tere peeche bhagegi" (Wanted). Translation: "You are chasing the girls, while the girls are chasing money. If you start chasing money, girls will automatically chase you." Generated Movie Revenue: ≈ $27,630,059

Table 1: Illustrative examples of misogynistic dialogues present in blockbuster Bollywood movies (movie names are presented in parentheses). The dialogues are in Romanized Hindi, followed by their approximate English translations.

We aim to answer the following research questions:
RQ 1: How is gender bias reflected through movie dialogues in the Bollywood and Hollywood movie industries?

RQ 2: How do award-winning foreign feature films compare with Bollywood and Hollywood in addressing gender equality? Does genre make any difference?

RQ 3: Is beauty associated with fair skin in the movie dialogues describing women?

RQ 4: Does Bollywood reflect the well-documented son's preference in medical and social science research? How has the sentiment around retrograde social practices such as dowry evolved?

RQ 5: Which geographical areas have been consistently underrepresented in the Indian film industry?

RQ 6: Can we gain an insight into the religious representation of a country through a film corpus spanning 70 years?

RQ 7: Can we extract economic signals through popular film dialogues?

RQ 8: How are religions perceived in movies? Can we track evolving national priorities from popular entertainment?
Studies analyzing gender stereotypes across different languages [1] and detecting bias in word embeddings [2] use books and news data sources for their analyses. Existing lines of work on uncovering entertainment industry bias focus on a single Bollywood movie [3] or a small subset of movies [4]. [5] focused on plot points and film information taken from Wikipedia. We consider a different and potentially richer data set of film subtitles spanning 70 years. We contrast our work with Hollywood and award-winning world movies, and our analyses cover a broader set of aspects such as retrograde social practices, uncovering subtler biases, and highlighting geographical and religious underrepresentation. Unlike previous work on movie subtitles [1], our focus is on Bollywood content, largely ignored by the information science research community so far.

In terms of our dataset, our work is most similar to [6], which looked at a smaller corpus of Bollywood subtitles. Our work contrasts with [6] in the following key ways. First, our treatment of the analyses of gender bias (1) includes diachronic word embedding analysis and word embedding association tests (WEAT); (2) is grounded in well-established lexicons; (3) looks into subtler signals such as son's preference; (4) tracks retrograde social practices such as dowry; and (5) considers additional data sources (e.g., world movies). Second, our work tackles important questions on geographic and religious representation unaddressed by [6]. Finally, we look into questions related to economic signals and show promising computational creativity results.

In this paper, we explore a wide array of NLP techniques to analyze our research questions through the lens of popular movies. We use:

1. Simple count-based statistics relying on highly popular lexicons [7] and gender representation studies [8, 9];
2. Cloze tests, an analysis technique that has a solid grounding in the psycholinguistics literature [10, 11]. To the best of our knowledge, for the first time, we explore a recent technique [12] previously used to mine political insights [13] in the context of uncovering social biases. Through a series of cloze tests on a language model [14] fine-tuned on our data sets, we present our findings;
3. Analysis of aligned diachronic word embedding spaces using recently proposed techniques [15];
4. Free-form text completion using GPT-2 [16], for a novel task of tracking economic signals.

We employ this broad suite of NLP techniques on a novel domain of popular entertainment.

Geographical and community representation in India has been studied by various political and social scientists. [17] showcase the population misrepresentation in political settings, while [18] focuses on the underrepresentation of north-east India in mainstream newspapers. Our work complements this research [18] and presents corroborating evidence from a very different data source.
We construct the following two data sets of movie subtitles.
1. Bollywood movies, D_bolly: We consider movies spanning seven decades (1950–2020). For each decade, we retrieved subtitles [19] of 100 movies (700 total films).
2. Hollywood movies, D_holly: Similar to Bollywood movies, we considered 100 top-grossing movies from each of the seven decades (700 total films).

Overall, D_bolly and D_holly consist of 1.1M dialogues (6.2M tokens) and 1M dialogues (5.4M tokens), respectively. For a subset of our analyses, we divide our corpus into three buckets presented in Table 2.

Corpus          Industry    Time Period
D_bolly^old     Bollywood   1950–1969
D_bolly^mid     Bollywood   1970–1999
D_bolly^new     Bollywood   2000–2020
D_holly^old     Hollywood   1950–1969
D_holly^mid     Hollywood   1970–1999
D_holly^new     Hollywood   2000–2020

Table 2: Dataset split for Bollywood and Hollywood.

We further collect 150 movies which have been nominated for the
Best International Feature Film award at the Oscars from 1970 onwards, for a subset of our analyses.
RQ 1: How is gender bias reflected through movie dialogues in the Bollywood and Hollywood movie industries?
Figure 1: MPR in D_bolly and D_holly.

Following extensive literature on gendered pronouns' relative distributions and their implications [8, 9], we consider a simple measure of gender representation: the relative occurrence of pronouns of each gender (men: he, him; women: she, her). Let N_w denote the number of times a token w appears in a corpus. We define the Male Pronoun Ratio (MPR) as follows:
MPR = (N_he + N_him) / (N_he + N_him + N_she + N_her) × 100

Figure 1 plots the MPR of our decade-wise movie data sets and contrasts it with the MPR computed using Google n-grams. Our results indicate that even now, both Bollywood and Hollywood exhibit comparable skew in gendered pronoun usage.
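Given the definition above, computing MPR amounts to counting four pronoun tokens; a minimal sketch:

```python
import re
from collections import Counter

def male_pronoun_ratio(text):
    """MPR = 100 * (N_he + N_him) / (N_he + N_him + N_she + N_her)."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    male = counts["he"] + counts["him"]
    female = counts["she"] + counts["her"]
    return 100.0 * male / (male + female)

print(male_pronoun_ratio("He said she saw him and her there. He left."))  # 60.0
```

In practice the same counting is applied per decade to each subtitle corpus.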
When presented with a sentence (or a sentence stem) with a missing word, a cloze task [10] is essentially a fill-in-the-blank task. For instance, in the cloze task "In the [MASK], it is very sunny", summer is a likely completion for the missing word. Given a cloze test, BERT, a well-known language model [14], outputs a series of tokens ranked by probability. In fact, in the above cloze test, the top three tokens (ranked by probability) predicted by BERT_base are: summer, winter, and spring. Recent lines of research have explored BERT's masked query prediction for (1) knowledge base extraction [12] and (2) mining political insights [13].
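The ranking step behind a cloze test can be illustrated with a minimal sketch: softmax the language model's logits at the [MASK] position and sort the vocabulary by probability. The toy vocabulary and logits below are illustrative stand-ins, not BERT's actual outputs; in practice a fill-mask pipeline over a fine-tuned BERT performs this step.

```python
import numpy as np

def rank_mask_fillers(logits, vocab, k=3):
    """Softmax the logits at the [MASK] position and return the top-k
    (token, probability) pairs, most probable first."""
    z = np.exp(logits - np.max(logits))   # numerically stable softmax
    probs = z / z.sum()
    order = np.argsort(probs)[::-1][:k]
    return [(vocab[i], float(probs[i])) for i in order]

# Toy stand-in for a model's output at the masked position of
# "In the [MASK], it is very sunny."
vocab = ["summer", "winter", "spring", "car"]
logits = np.array([4.0, 2.5, 2.0, -1.0])
print(rank_mask_fillers(logits, vocab))
```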
Cloze (woman):
- BERT_base: man, widow, woman, doctor, slave, soldier, bachelor, merchant, farmer, lawyer, servant [ ]
- BERT_{D_bolly^old}: prostitute, servant, woman, slave, bachelor, doctor, lawyer, man, widow, maid, worker [ ]
- BERT_{D_bolly^new}: doctor, woman, servant, lawyer, maid, hindu, nurse, teacher, gardener, lady, man [ ]
- BERT_{D_holly^old}: woman, slave, servant, nurse, lady, man, teacher, lawyer, peasant, maid, wife [ ]
- BERT_{D_holly^new}: woman, lawyer, doctor, nurse, teacher, man, writer, secretary, prostitute, professional, carpenter [ ]

Cloze (man):
- BERT_base: man, soldier, gentleman, farmer, merchant, woman, slave, bachelor, doctor, carpenter, servant [ ]
- BERT_{D_bolly^old}: man, gentleman, lawyer, servant, doctor, farmer, worker, craftsman, slave, criminal [ ]
- BERT_{D_bolly^new}: doctor, lawyer, policeman, man, farmer, bachelor, gardener, servant, soldier, mechanic, builder [ ]
- BERT_{D_holly^old}: carpenter, policeman, lawyer, soldier, farmer, gentleman, servant, man, peasant, slave, doctor [ ]
- BERT_{D_holly^new}: man, lawyer, soldier, doctor, carpenter, gentleman, clergyman, farmer, writer, craftsman, minister [ ]

Table 3: Cloze test results. Predicted tokens are ranked by decreasing probability. The number in square brackets represents the average valence score (obtained from [7]) calculated for the answers to the cloze test.

Our cloze test results are summarized in Table 3. We observe that the completion results for both genders across both movie industries improve over time. In order to quantify the completion results, we consider a well-known lexicon of emotional valence ratings [7] of nearly 14,000 English words. The valence scores of these words are presented on a scale of 1 to 10, with 10 indicating highly positive and 1 indicating highly negative. For example, the emotional valence scores of happy and sad are 8.47 and 2.10, respectively. For a given data set and cloze test pair, we compute the average valence score of the completions (listed in square brackets in Table 3). We further note that comparing completion words across the two genders can be technically difficult simply because there can be potential bias in the scores themselves. For instance, the valence scores for man and woman are 5.42 and 7.09, respectively. Hence we restrict ourselves to comparing within a specific gender for a given movie industry.
         Bollywood    Hollywood
Women    22%          7%
Men      5%           15%

Table 4: Percent increase in average valence score for cloze test completions between old movies and new movies.

Table 4 lists the percentage increase in the valence score of the completions for each gender across the two movie industries. We note that for both Bollywood and Hollywood, the valence scores for both genders improved over time. However, for Bollywood, we notice that the rate of increase for women is substantially more pronounced than that for men. This observation aligns with the continual fight for gender equality in India [20] and major movements that have mobilised voices for women's right to work [21], financial independence [22], and marital laws [23].
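The valence scoring above reduces to a lexicon lookup plus an average; a minimal sketch, where the two lexicon entries are the example scores quoted in the text (the full lexicon [7] has ~14,000 words):

```python
# Mini valence lexicon; happy and sad carry the scores quoted in the text.
VALENCE = {"happy": 8.47, "sad": 2.10}

def avg_valence(completions, lexicon=VALENCE):
    """Average valence (1-10 scale) over completions found in the lexicon."""
    scores = [lexicon[w] for w in completions if w in lexicon]
    return sum(scores) / len(scores) if scores else float("nan")

def percent_increase(old, new):
    """Percent increase between old-era and new-era averages (as in Table 4)."""
    return 100.0 * (new - old) / old

print(avg_valence(["happy", "sad", "unknown"]))  # (8.47 + 2.10) / 2 = 5.285
```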
The meanings of words and the contexts in which they are used change over time [24]. The language spoken in a community is representative of the cultural norms and customs followed in that region. Existing hypotheses [25] indicate that word frequency may play a role in changing the meaning of words over time: the meanings of less-frequent words are more susceptible to drift than those of highly frequent words. [15] provide a robust multilingual approach to align diachronic word embeddings using orthogonal Procrustes. We follow the same method to align the different sub-corpora for Bollywood and Hollywood. In addition to the four sub-corpora D_bolly^old, D_bolly^new, D_holly^old, and D_holly^new, we consider the time period 1970–1999 and construct two additional sub-corpora, D_bolly^mid and D_holly^mid. We train word2vec [26] with SGNS (Skip-Gram with Negative Sampling) to create embeddings for each of the buckets mentioned above. Let W^(t) ∈ R^{d×|V|} be the matrix of word embeddings learnt for period t over vocabulary V. Following
(a) Woman over the years (b) Man over the years
Figure 2: Nearest neighbors of man and woman over the years. The overall average valence of the nearest neighbors according to the lexicon provided in [7] for a given time period is presented in blue font.

[15], we align the word embeddings using the top 10,000 common tokens present across time periods t and t+1 by optimizing:

R_t = argmin_{Q : Q^T Q = I} ||Q W^(t) − W^(t+1)||_F        (1)

where R_t ∈ R^{d×d}.

We focus on the portrayal of women and men using these aligned embeddings. We observe that the valence scores for both genders across both movie industries show a similar pattern. The scores are lowest during the 1970–1999 time period, and the valence scores for the newer movies are better than those for the older movies. The dip in the valence scores during 1970–1999 in India can be ascribed to a social and cultural crisis influenced by an unstable political climate (the assassinations of two prime ministers [27, 28]), two major wars between India and Pakistan [29, 30], and a large overlap with the pre-economic-liberalization period [31].

While valence scores give us an indication of the difference in valence (degree of pleasantness) between the two genders on language model cloze tests, we quantify the bias in our corpus using the more popular Word Embedding Association Test (WEAT) [32]. For this analysis, we train GloVe [33] embeddings on the sub-corpora, as given in [32], to understand the evolution of gender bias through time. Following WEAT, we consider two equal-sized sets of occupations, S_1 and S_2, and two sets of attribute words, A_1 and A_2. The similarity of two words x and y is given by the cosine similarity of the corresponding word embeddings, cos(w_x, w_y).
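The alignment objective in Eq. (1) has a closed-form solution via SVD (orthogonal Procrustes); a minimal NumPy sketch, with a random orthogonal matrix standing in for genuine diachronic drift:

```python
import numpy as np

def align_embeddings(W_t, W_next):
    """Orthogonal Procrustes (Eq. 1): orthogonal Q minimizing
    ||Q W_t - W_next||_F, obtained from the SVD of W_next @ W_t.T."""
    U, _, Vt = np.linalg.svd(W_next @ W_t.T)
    return U @ Vt

# Sanity check: a rotated copy of an embedding matrix is mapped back onto it.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 100))                   # d = 5 dims, |V| = 100 words
Rot, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
Q = align_embeddings(W, Rot @ W)
print(np.allclose(Q @ W, Rot @ W))  # True
```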
As given in [32], the differential association of a word c with word sets A_1 and A_2 is given by:

g(c, A_1, A_2, w) = mean_{a∈A_1} cos(w_c, w_a) − mean_{b∈A_2} cos(w_c, w_b)        (2)

Further, the WEAT score is calculated as:

B_weat(w) = [mean_{s∈S_1} g(s, A_1, A_2, w) − mean_{t∈S_2} g(t, A_1, A_2, w)] / std-dev_{s∈S_1∪S_2} g(s, A_1, A_2, w)        (3)

Here, the occupation sets S_1 and S_2, taken from [34], are:

S_1 = ["maestro", "skipper", "protege", "philosopher", "captain", "architect", "financier", "warrior", "broadcaster", "magician", "pilot", "boss"]
S_2 = ["homemaker", "nurse", "receptionist", "librarian", "socialite", "hairdresser", "nanny", "bookkeeper", "stylist", "housekeeper", "designer", "counselor"]

And the attribute word sets A_1 and A_2 are:

A_1 = ["he", "man", "male"]
A_2 = ["she", "woman", "female"]
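Equations (2) and (3) can be implemented directly; the 2-d embeddings below are fabricated purely for illustration, with male-leaning occupations placed near the male attribute words:

```python
import numpy as np

def cos_sim(u, v):
    """Cosine similarity of two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def assoc(c, A1, A2, E):
    """Eq. (2): differential association of word c with attribute sets A1, A2."""
    return (np.mean([cos_sim(E[c], E[a]) for a in A1])
            - np.mean([cos_sim(E[c], E[b]) for b in A2]))

def weat(S1, S2, A1, A2, E):
    """Eq. (3): difference of the mean associations of the two occupation
    sets, normalized by the std-dev of associations over S1 union S2."""
    g = {w: assoc(w, A1, A2, E) for w in S1 + S2}
    num = np.mean([g[s] for s in S1]) - np.mean([g[t] for t in S2])
    return num / np.std([g[w] for w in S1 + S2])

# Fabricated toy embedding space, biased by construction.
E = {"he": np.array([1.0, 0.0]), "man": np.array([1.0, 0.1]),
     "she": np.array([0.0, 1.0]), "woman": np.array([0.1, 1.0]),
     "boss": np.array([0.9, 0.2]), "pilot": np.array([1.0, 0.3]),
     "nurse": np.array([0.2, 0.9]), "nanny": np.array([0.3, 1.0])}
score = weat(["boss", "pilot"], ["nurse", "nanny"],
             ["he", "man"], ["she", "woman"], E)
print(score > 0)  # True: the toy space associates male occupations with male words
```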
Industry     Old      Mid      New
Bollywood    0.601    0.559    0.543
Hollywood    0.456    0.443    0.410

Table 5: Average WEAT scores for Bollywood and Hollywood across different time periods. A larger value indicates greater gender bias.

As shown in Table 5, we find that the average WEAT scores for both industries reduced over time. Compared to Bollywood, for any given time period, Hollywood exhibits less gender bias.
RQ 2: How do award-winning foreign feature films compare with Bollywood and Hollywood in addressing gender equality? Does genre make any difference?
Along with performing a comparative study between Bollywood and Hollywood, we are further interested in comparing the WEAT scores of Bollywood and Hollywood with those of a set of critically acclaimed world movies. To this end, we construct a corpus of 150 movies nominated in the foreign film category at the Academy Awards.

Industry        WEAT score
Bollywood       0.523
Hollywood       0.504
World movies

Table 6: WEAT scores for Bollywood, Hollywood, and world movies.

Our results indicate that the average WEAT score obtained for nominated foreign feature films is the lowest compared to the average WEAT scores for Bollywood and Hollywood.

Adventure/Action and Romance are the two most popular genres across different industries, with hundreds of films released every year. Action films generally tend to be male dominated compared to Romance films. We explore this hypothesis using WEAT for the given genres. For this analysis, we consider four token-balanced sub-corpora, G_bolly^action, G_bolly^romance, G_holly^action, and G_holly^romance, each containing 150 films. All the films were post-1990 and were chosen based on the genre lists/tags given by IMDb and Google.

             Romance    Action
Hollywood    0.079
Bollywood    0.343

Table 7: WEAT scores for Romance and Action films.

Table 7 shows that the gender bias in Action movies is considerably more pronounced than that in Romance movies. Across both industries and movie genres, Hollywood Action films exhibit the most bias.
RQ 3: Is beauty associated with fair skin in the movie dialogues describing women?
We first present our cloze test results with the probe "A beautiful woman should have [MASK] skin." in Table 8. We note that while the BERT_base model predicts soft in place of [MASK], all BERT models fine-tuned on the film corpora predict fair as the top choice. Figure 3 visualizes the nearest neighbors of beautiful in our aligned embedding spaces of the Hollywood and Bollywood sub-corpora.
BERT_base: soft, beautiful, pale, tanned, smooth
BERT_{D_bolly^old}: fair, no, pale, tanned, tan
BERT_{D_bolly^new}: fair, tanned, golden, smooth, pale
BERT_{D_holly^old}: fair, pale, blue, golden, gold
BERT_{D_holly^new}: fair, pale, tanned, golden, dark

Table 8: Cloze test results for the probe "A beautiful woman should have [MASK] skin."

Figure 3: Nearest neighbors of beautiful over the years.

As shown in Figure 3 and Table 8, the age-old affinity toward lighter skin in Indian culture [35, 36, 37] is reflected through the consistent presence of fair among the nearest neighbors in all three Bollywood sub-corpora. Although our cloze tests indicate that Hollywood also exhibits a bias towards lighter skin color, our diachronic word embedding analysis reveals that the bias is possibly less pronounced than in Bollywood.
RQ 4: Does Bollywood reflect the well-documented son's preference in medical and social science research? How has the sentiment around retrograde social practices such as dowry evolved?
Occupational stereotypes aside, a diachronic corpus may reveal subtler forms of bias. In our next analysis, we seek to analyze two seemingly disconnected aspects: son's preference and the perception of dowry. Son's preference in India is a well-documented phenomenon, and skewed sex ratios, female feticide, and higher child mortality rates for girls have attracted policymakers' attention [38, 39, 40, 41]. In order to prevent female feticide, in 1994 the Parliament of India enacted the Pre-Conception and Pre-Natal Diagnostic Techniques (PCPNDT) Act, also known as the Prohibition of Sex Selection Act, which effectively rendered prenatal sex discernment illegal.

Similarly, certain retrograde practices such as dowry can influence son's preference, as a girl child might be looked upon as a financial burden [42]. The dowry system has plagued Indian society for a long time [43]. Dowry refers to a transaction of tangible financial objects in the form of durable goods, cash, and real or movable property that the bride's family gives to the bridegroom, his parents, and his relatives as a condition of the marriage. Although dowry has been legally prohibited in India since 1961 [44], the practice has continued well after its legal prohibition and has a strong link to social crises such as female foeticide [45], domestic abuse and violence [46, 47], and dowry deaths [48]. However, while the practice has continued, recent studies have reported positive changes in society, where the general attitude towards the system has become negative [49].
A popular Bollywood plot point is the introduction of a child into the family. Approximately 1 in 10 collected movies had a scene involving the birth of a child. We were curious to analyze: when a child is born in a Bollywood movie, is it a boy or a girl? We retrieve the dialogues talking about childbirth using a template-based approach, searching for the keywords 'birth', 'baby', 'pregnant', 'pregnancy', and 'congratulations', as well as the phrases "It's a boy" and "It's a girl". We annotated the retrieved dialogues related to childbirth and performed a temporal analysis.
Let N_w denote the number of times a dialogue mentioning the baby's gender w appears in a corpus. We define the Male Birth Ratio (MBR) as follows:

MBR = N_boy / (N_boy + N_girl) × 100

Table 9 suggests that the family dynamics portrayed in Bollywood movies have shown a considerable shift, with the MBR being 73.9 in old movies and almost achieving parity (54.5) in newer movies.
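The retrieval-and-count pipeline can be sketched as follows; the keyword and phrase templates are the ones given in the text, while the sample dialogues and gender labels are illustrative, not actual corpus data:

```python
# Keyword and phrase templates from the text.
KEYWORDS = ["birth", "baby", "pregnant", "pregnancy", "congratulations"]
PHRASES = ["it's a boy", "it's a girl"]

def childbirth_dialogues(dialogues):
    """Template-based retrieval of dialogues mentioning childbirth."""
    hits = []
    for d in dialogues:
        low = d.lower()
        if any(k in low for k in KEYWORDS) or any(p in low for p in PHRASES):
            hits.append(d)
    return hits

def male_birth_ratio(labels):
    """MBR = 100 * N_boy / (N_boy + N_girl) over annotated gender labels."""
    boys, girls = labels.count("boy"), labels.count("girl")
    return 100.0 * boys / (boys + girls)

sample = ["It's a boy!", "Congratulations on the baby.", "Where is the station?"]
print(childbirth_dialogues(sample))                     # first two dialogues
print(male_birth_ratio(["boy", "boy", "girl", "boy"]))  # 75.0
```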
       Old     Mid     New
MBR    73.9            54.5

Table 9: MBR across time periods.

Figure 4: Nearest neighbors of dowry over the years.
As shown in Figure 4, we observe that while nouns such as money, debt, jewellery, fees, and loan are the nearest neighbors in older films, indicating compliance with this practice, modern films exhibit non-compliance (e.g., guts and refused) and indicate some of the consequences of such non-compliance (e.g., divorce and trouble) in the form of nearest neighbors. We find that films indeed provide a snapshot of the cultural values of a particular country, allowing us to gauge the progress of a nation over time.

RQ 5: Which geographical areas have been consistently underrepresented in the Indian film industry?
We now shift our focus to two broader representational aspects: geographic and religious representation. Of the 28 states in India, we present an analysis of the relative representation of each state and of major Indian cities. From the beginning, Bollywood has had its roots in Mumbai, and Delhi is India's capital; hence it is not surprising that these cities are mentioned heavily across all time periods (see Table 10). Figures 5(a) and 5(b) compare the geographic representation in the most recent 20 years with the rest of our corpus spanning 1950–2000. We observe that, while initially based around major hotspots such as Delhi, Goa, and cities like Mumbai, recent Bollywood content is geographically more inclusive. However, a key point we highlight in Figure 5(c) is that, in line with prior research on the underrepresentation of North Eastern states in news content [18], there is severe underrepresentation of Bollywood content surrounding these North Eastern states. There have been zero mentions of the states of Manipur, Arunachal Pradesh, Meghalaya, Tripura, and Mizoram in over 500 movies across 70 years.
Figure 5: (a) Geographical representation in films from 1950 to 1999. (b) Geographical representation in films post-2000. (c) States never mentioned in our corpus in the last 70 years. The base maps used for this plot are sourced from the Government of India. The authors are aware that these maps include disputed territories; these maps do not constitute judgments on existing disputes.
Old                   Mid                   New
Bombay/Mumbai (55)    Bombay/Mumbai (73)    Bombay/Mumbai (91)
Agra (50)             Delhi (46)            Delhi (63)
Delhi (30)            Agra (46)             Agra (42)
Daman (22)            Daman (14)            Bangalore/Bengaluru (11)
Lucknow (14)          Rampur (13)           Pune (10)
Mathura (6)           Lucknow (12)          Hyderabad (9)
Srinagar (6)          Pune (9)              Amritsar (9)
Jalgaon (4)           Bangalore (8)         Lucknow (8)
Allahabad (4)         Nagpur (7)            Chandigarh (6)

Table 10: City mentions in movies.
RQ 6: Can we gain an insight into the religious representation of a country through a film corpus spanning 70 years?
Most-frequent surnames: Singh, Krishna, Khan, Rai, Ali, Kapoor, Sharma, Mohan, Prasad, Khanna, Shah, Lal, Thakur, Dev, Shekhar, Chaudhary, Gandhi, Verma, Gupta, Prakash, Rana, Nath, Patel, Pandey, Roy, Pandit, Saxena, Mathur, Roshan, Bachchan, Pal, Mehta, Narayan, Das, Rode, Dayal, Mehra, Bhagat, Shastri, Chandra, Patil, Banerjee, Tilak, Rao, Tripathi, Yadav, Kumari, Suman, Mukherjee, Bhatia, Acharya, Chatterjee, Rehman, Iyer

Table 11: The top surnames occurring in Bollywood movies (in decreasing order of frequency).

India is a diverse country with 6 main religions, 22 major languages, and more than 700 dialects. According to the 2011 census [50], the religious distribution among the Indian population is: 79.8% follow Hinduism, 14.2% adhere to Islam, 1.72% adhere to Sikhism, 2.3% adhere to Christianity, 0.7% adhere to Buddhism, and 0.37% adhere to Jainism. There has been discussion in the past regarding the religious stereotypes portrayed in Bollywood [51], and we build upon this by studying the religious representation present in our large corpus of Bollywood subtitles. The surnames appearing in the movies (e.g., Mrs. Kapoor, Mr. Khan) were annotated manually by two annotators, with each surname given one label from the following list: Hindu, Muslim, Sikh, Christian, Parsi, and Multiple. The annotators achieved a Cohen's κ score of 0.8879, indicating high inter-rater agreement. The discrepancies were resolved by the annotators through a follow-up adjudication process and by consulting relevant literature. Figures 6, 7, and 8 provide the religion distribution obtained from the movies. We note that: (1) the distribution is more or less consistent with the census
numbers; (2) the representation of other religions has increased in recent years; and (3) the representation of Muslims is slightly less than the community's population share.

Figure 6: Religious representation in Old movies: Hindu (81.27), Sikh (7.26), Muslim (6.16), Multiple (3.74), Parsi (1.32), Christian (0.22).

Figure 7: Religious representation in Mid movies: Hindu (83.54), Sikh (8.14), Muslim (5.43), Multiple (2.07), Parsi (0.63), Christian (0.15).

Figure 8: Religious representation in New movies: Hindu (79.52), Sikh (8.06), Muslim (7.81), Multiple (3.84), Christian (0.49), Parsi (0.24).

Surnames of doctors: Kapoor, Chopra, Khurana, Tripathi, Kapoor, Ansari, Awasthi, Kothari, Mathur, Puri, Nayak, Bhalerao, Sawant, Tandon, Swamy, Banerjee, Verma, Rana, Ruby, Singh, Shrivastav, Khanna, Bhandari, Tiwari, Saxena, Shinde, Mehta, Goenka, Kumar, Goswami

Table 12: Surnames of doctors in Bollywood movies.

Table 12 lists the surnames of the doctors occurring in Bollywood movies. To retrieve these surnames from the subtitles, we employ a template-based approach, searching for keywords like 'Dr.' and 'doctor' in our corpus. While broader religious representation is observed in our overall results, the observed representation for the medical profession is quite skewed, with a large number of surnames being Brahmin (the uppermost caste in the Hindu caste system in India).
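The inter-rater agreement reported for the surname annotation (Cohen's κ) can be computed as follows; the short label sequences below are illustrative, not the actual annotation data:

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    expected = sum(ca[l] * cb[l] for l in ca.keys() | cb.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

k = cohen_kappa(["Hindu", "Hindu", "Muslim", "Muslim"],
                ["Hindu", "Hindu", "Muslim", "Hindu"])
print(round(k, 2))  # 0.5
```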
10 Economic Signals
RQ 7: Can we extract economic signals through popular film dialogues?
By looking at a popular entertainment corpus of a developing nation, we are able to showcase the evolution of gender bias, evolving attitudes towards social evils, and geographic and religious representation. Can we detect economic signals as well from Bollywood dialogues? 100 rupees in 1958 are equivalent to 8,117.22 rupees in 2020 [52]. We seek to understand whether language models can capture these noisy signals. GPT-2 [16], a popular language model with more than 100 million parameters, has achieved state-of-the-art results for text completion, zero-shot transfer learning, etc. GPT-2 has been widely used for generating free-form text, e.g., to create artificial newsletters and poems [53]. We noticed that the most common dialogues expressing monetary figures or large amounts of money were generally associated with ransom. For example, a sample dialogue is "We have kidnapped your kid, the ransom amount is 2 million Rupees."

To understand this change at a large scale, we fine-tune GPT-2 on three Bollywood sub-corpora, each belonging to films from a different time period, for the end goal of text/dialogue generation. To these fine-tuned models, we input the sentence "The ransom amount is" and analyze the text generated by each model. Table 13 showcases the average amount across 100 generated samples from the fine-tuned models. We note that while our predicted values overestimate the inflation rate, the ransom amounts capture the general increasing pattern and have increased significantly over time.

                                    Old        Mid           New
Predicted Average Ransom Amount     594,805    10,959,940    29,688,280
Inflation adjusted                  -          2,194,830     21,000,280

Table 13: Average amounts for text completion results on the input sentence "The ransom amount is" using fine-tuned GPT-2 models. The inflation-adjusted values for 594,805 INR in 1960 are presented in the bottom row.
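The averaging step over generated samples can be sketched as follows; the two completions are illustrative stand-ins for fine-tuned GPT-2 outputs (the fine-tuning itself is not shown):

```python
import re
import statistics

def average_amount(samples):
    """Average the first monetary figure found in each generated completion."""
    amounts = [int(m.group().replace(",", ""))
               for m in (re.search(r"\d[\d,]*", s) for s in samples) if m]
    return statistics.mean(amounts)

# Illustrative completions for the prompt "The ransom amount is".
samples = ["The ransom amount is 2,000,000 rupees.",
           "The ransom amount is 500,000 rupees, nothing less."]
print(average_amount(samples))  # 1250000
```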
11 Evolving National Priorities as Reflected in Movies and Responsible Censorship
RQ 8: How are religions perceived in movies? Can we track evolving national priorities from popular entertainment?
India has faced two major partitions over the last 70 years, which have resulted in considerable religious turmoil and riots [54]. The Censor Board is a governing body which, along with giving each movie a certification, has the ability to remove offensive or controversial content or, in some extreme cases, completely ban films from being screened in theatres. With religion being a contentious topic in India, offensive terms surrounding it are also discouraged, and this has been constant throughout the years. To validate this hypothesis, we look at the word 'religion' and how it has evolved over the years. Figure 10 indicates that 'religion' is always accompanied by neutral or mild terms, and movie dialogues have stayed away from using extreme or hateful terms surrounding religion.

Along with examining the nearest neighbors of 'religion', we wanted to understand the discussion surrounding the two biggest religions in India, 'Hindu' and 'Muslim' (Figures 9(a), 9(b)). We contrast our findings on the movie corpus with prior research involving raw social media data [13]. While negative words like ruthless, shameless, and traitor creeping up in newer movies might indicate religious polarization, we notice that words like terrorists found in social media data [13] do not surface among the nearest neighbors. Along with the word embedding analysis, we analyze BERT cloze tests for the probes (1) "Hindus are [MASK]" and (2) "Muslims are [MASK]". For both probes, we do not notice completions such as terrorists or fools previously reported in [13]. This indicates that while recent social media analyses might indicate religious polarization, the film certification board has largely ensured that movie content does not reflect such a divide.
BERT D oldbolly BERT D newbolly corruption, poverty, malaria, pollu-tion, hunger, terrorism, unemployment,drought, famine, war, tourism poverty, love, war, hunger, unemploy-ment, india, famine, money, marriage, ed-ucation, kashmir poverty, pakistan, kashmir, terrorism, cor-ruption, india, drugs, dowry, unemploy-ment, hunger, rape Table 14: Cloze test results for
The biggest problem in India is [MASK] . Similar to
BERT ’s cloze test applications to uncover gender and racial bias showcased in earlier sections, we employ
BERT to analyze evolving national priorities using the following two cloze tests.1. For Bollywood:
The biggest problem in India is [MASK] .11ender Bias, Social Bias and Representation: 70 Years of B H ollywood A P
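At inference time, each cloze test reduces to ranking the model's predicted distribution over the vocabulary at the [MASK] position and keeping the top completions. A minimal, library-free sketch of that ranking step; the function name `topk_completions` and the four-word toy vocabulary are illustrative assumptions, and real logits would come from a fine-tuned masked language model.

```python
import math

def topk_completions(vocab, logits, k=3):
    """Softmax the masked-position logits and return the k most probable
    (token, probability) pairs, as a fill-mask query would."""
    m = max(logits)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(vocab, probs), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Toy vocabulary and logits standing in for BERT's ~30k-token output layer.
vocab = ["poverty", "corruption", "cricket", "unemployment"]
logits = [4.1, 3.7, 0.2, 3.9]
print(topk_completions(vocab, logits))
```

Running the same probe through the base model and each era-specific fine-tuned model yields the columns of Tables 14 and 15.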
Figure 9: Nearest neighbors of Hindu and Muslim over the years ((a) Hindu; (b) Muslim).
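The nearest-neighbor analysis behind Figure 9 amounts to ranking vocabulary words by cosine similarity to a query word, repeated separately in each era's embedding space. A minimal sketch under toy assumptions: the 3-dimensional vectors and helper names are illustrative, and real vectors would be trained on each sub-corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest_neighbors(word, embeddings, k=2):
    """Rank every other vocabulary word by similarity to `word`."""
    sims = [(w, cosine(embeddings[word], vec))
            for w, vec in embeddings.items() if w != word]
    return sorted(sims, key=lambda p: p[1], reverse=True)[:k]

# Toy 3-d vectors; real embeddings would be trained per era.
emb = {"religion": [0.9, 0.1, 0.2],
       "faith":    [0.8, 0.2, 0.1],
       "temple":   [0.7, 0.3, 0.3],
       "cricket":  [0.1, 0.9, 0.8]}
print(nearest_neighbors("religion", emb))
```

Repeating this query per decade, as in Figures 9 and 10, shows how a word's neighborhood drifts over time.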
2. For Hollywood: The biggest problem in America is [MASK].

Table 15: Cloze test results for "The biggest problem in America is [MASK]."
BERT base: poverty, corruption, unemployment, crime, terrorism, racism, pollution, hunger, war, cancer, inequality
BERT D_oldholly: war, poverty, money, unemployment, slavery, immigration, alcoholism, education, imperialism, russia, hunger
BERT D_newholly: poverty, slavery, immigration, unemployment, money, war, racism, hunger, communism, america, education

We observe that dynamic political conditions are reflected in the completion results in Tables 14 and 15 (e.g., Kashmir, Pakistan, and Russia) [29, 30, 55]. We also note that the list of ongoing problems in the U.S. contains the major issue over which the 2020 election was fought: racism.
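One simple way to read Tables 14 and 15 is as a set comparison: completions that appear only in the newer model hint at emerging priorities, and those that drop out hint at receding ones. A small sketch over Table 14's Bollywood completions; the variable names are illustrative.

```python
# Top completions from Table 14 for "The biggest problem in India is [MASK]."
old_bolly = {"poverty", "love", "war", "hunger", "unemployment", "india",
             "famine", "money", "marriage", "education", "kashmir"}
new_bolly = {"poverty", "pakistan", "kashmir", "terrorism", "corruption",
             "india", "drugs", "dowry", "unemployment", "hunger", "rape"}

# Set differences surface what newly appears vs. what recedes between eras.
emerged = sorted(new_bolly - old_bolly)
receded = sorted(old_bolly - new_bolly)
print("emerged:", emerged)
print("receded:", receded)
```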
12 Future Directions and Conclusion
While our current work focused on answering research questions related to inclusion and bias, our corpus may find use in several other interesting applications. Here, we outline an exploratory analysis on computational creativity.

In recent years, the emergence of neural architectures and language models like GPT-2 [16], with millions of parameters, has resulted in rapid advancements on various NLP tasks. These models have proved effective at generating artificial poems [56, 57], stories, and news articles with just a few epochs of fine-tuning. We employ a similar strategy to generate dialogues based on the subtitles collected from more than 2,000 films, using GPT-2 [16]. To the best of our knowledge, ours is the first attempt to automatically generate artificial Bollywood dialogues with human evaluations conducted by annotators with professional experience in the Bollywood industry.

Figure 10: Nearest neighbors of religion over the years.
Table 16: Machine-generated subtitles using the GPT-2 model. For each input probe, we show the free-form text completion generated by the LM fine-tuned on our corpus.

Probe: The ransom for kidnapping
Generated: The ransom for kidnapping hasn't been paid yet. We'll ask for the ransom at the next hearing. Then call up Mr. Trivedi and ask him for the ransom at the next hearing. Fine. Then we'll ask for the next hearing at our doorstep. And he'll refuse. Yes, tell me. Come. Sir, please listen to me.

Probe: Your eyes are so
Generated: Your eyes are so beautiful. They don't let me sleep. "They are just dreams." My heart is singing. "Don't let it go, beloved." "Don't let it go, beloved."

Probe: What does the police have to say about this incident?
Generated: What does the police have to say about this incident? What do you mean? They don't have a witness, madam. No case can be filed under section 302. They only record the statement of the complainants. The police is much more capable than this. Their power and reach is much more impressive.
Human evaluation of the generated dialogues: We employed two annotators, either working professionally or closely associated with the Bollywood film industry, to classify each dialogue into one of two labels, Real or Generated. We provided a list of 5 real dialogue snippets taken from movies, along with 6 artificially generated subtitles. One annotator labelled 4 of the 6 artificial dialogues as real, while the other labelled 3 of the 6 artificial dialogues as real, showcasing the efficacy of recent advances in language models at generating human-like movie dialogues.

In this paper, we analyzed how social biases and subtle gender biases are reflected in diachronic corpora of popular entertainment. Our research indicates that our NLP methods are capable of uncovering important social signals. Our results demonstrate that societal changes do indeed get reflected in popular content. But does popular entertainment also influence society in turn? A recent movie on acid attacks, Chhapak, was inspired by the true story of an acid attack survivor who set up an NGO and was a recipient of the International Women of Courage award [58]. Her biopic and her initiative, Stop Acid Sale, when released, triggered regulatory legislation that made it difficult to buy certain types of acids without legal authorization. Devising NLP methods to identify how popular entertainment influences society will be a worthy future research challenge.
References

[1] Molly Lewis and Gary Lupyan. Gender stereotypes are reflected in the distributional structure of 25 languages. Nature Human Behaviour, pages 1–8, 2020.
[2] Nikhil Garg, Londa Schiebinger, Dan Jurafsky, and James Zou. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16):E3635–E3644, 2018.
[3] Sushmita Chatterjee. 'English Vinglish' and Bollywood: what is 'new' about the 'new woman'? Gender, Place & Culture, 23(8):1179–1192, 2016.
[4] Subuhi Khan and Laramie Taylor. Gender policing in mainstream Hindi cinema: A decade of central female characters in top-grossing Bollywood movies. International Journal of Communication, 12:22, 2018.
[5] Nishtha Madaan, Sameep Mehta, Taneea S. Agrawaal, Vrinda Malhotra, Aditi Aggarwal, and Mayank Saxena. Analyzing gender stereotyping in Bollywood movies. CoRR, abs/1710.04117, 2017.
[6] Kunal Khadilkar and Ashiqur R. KhudaBukhsh. An unfair affinity toward fairness: Characterizing 70 years of social biases in B^Hollywood (student abstract). In The Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, page To Appear. AAAI Press, 2021.
[7] Amy Beth Warriner, Victor Kuperman, and Marc Brysbaert. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45(4):1191–1207, 2013.
[8] Jean M Twenge, W Keith Campbell, and Brittany Gentile. Male and female pronoun use in U.S. books reflects women's status, 1900–2008. Sex Roles, 67(9-10):488–493, 2012.
[9] Marie Gustafsson Sendén, Sverker Sikström, and Torun Lindholm. "She" and "he" in news media messages: Pronoun use reflects gender biases in semantic contexts. Sex Roles, 72(1-2):40–49, 2015.
[10] Wilson L Taylor. "Cloze procedure": A new tool for measuring readability. Journalism Quarterly, 30(4):415–433, 1953.
[11] Nathaniel Smith and Roger Levy. Cloze but no cigar: The complex relationship between cloze, corpus, and subjective probabilities in language processing. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 33, 2011.
[12] Fabio Petroni, Tim Rocktäschel, Sebastian Riedel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, and Alexander Miller. Language models as knowledge bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2463–2473, Hong Kong, China, November 2019. Association for Computational Linguistics.
[13] Shriphani Palakodety, Ashiqur R. KhudaBukhsh, and Jaime G. Carbonell. Mining insights from large-scale corpora using fine-tuned language models. In Proceedings of the Twenty-Fourth European Conference on Artificial Intelligence (ECAI-20), pages 1890–1897, 2020.
[14] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, pages 4171–4186, June 2019.
[15] William L. Hamilton, Jure Leskovec, and Dan Jurafsky. Diachronic word embeddings reveal statistical laws of semantic change. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1489–1501, Berlin, Germany, August 2016. Association for Computational Linguistics.
[16] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. 2019.
[17] Jashodhara Mukherjee. From Kabir Singh to Housefull 4, these sexist film dialogues from 2019 need to be cancelled.
[18] Haoginlen Chongloi. Portrayal of northeast India in mainstream media: A case of underrepresentation and misinterpretation. International Journal of Research in Social Sciences, 7(5).
[19] Pierre Lison and Jörg Tiedemann. OpenSubtitles2016: Extracting large parallel corpora from movie and TV subtitles. 2016.
[20] Purusottam Nayak and Bidisha Mahanta. Women empowerment in India. Bulletin of Political Economy, 5(2):155–183, 2012.
[21] Martha Chen. A Matter of Survival: Women's Right to Employment in India and Bangladesh, volume 38. Oxford: Clarendon Press, 1995.
[22] Meenu Goyal and Jai Parkash. Women entrepreneurship in India: problems and prospects. International Journal of Multidisciplinary Research, 1(5):195–207, 2011.
[23] Shalu Nigam. Understanding justice delivery system from the perspective of women litigants as victims of domestic violence in India (specifically in the context of Section 498-A, IPC). Occasional Paper, (35), 2005.
[24] Jing Yi Xie, Renato Ferreira Pinto Junior, Graeme Hirst, and Yang Xu. Text-based inference of moral sentiment change. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4654–4663, November 2019.
[25] Joan Bybee. Frequency of Use and the Organization of Language. December 2006.
[26] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 3111–3119. Curran Associates, Inc., 2013.
[27] Robert L Hardgrave Jr. India in 1984: Confrontation, assassination, and succession. Asian Survey, 25(2):131–144, 1985.
[28] DR Kaarthikenyan and Radhavinod Raju. Rajiv Gandhi Assassination. Sterling Publishers Pvt. Ltd, 2008.
[29] Victoria Schofield. Kashmir in Conflict: India, Pakistan and the Unending War. Bloomsbury Publishing, 2010.
[30] Sumantra Bose. Kashmir: Roots of Conflict, Paths to Peace. Harvard University Press, 2009.
[31] Jørgen Dige Pedersen. Explaining economic liberalization in India: state and society perspectives. World Development, 28(2):265–282, 2000.
[32] Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, and Richard S. Zemel. Understanding the origins of bias in word embeddings. CoRR, abs/1810.03611, 2018.
[33] Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, Doha, Qatar, October 2014. Association for Computational Linguistics.
[34] Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings, 2016.
[35] Ncoza C Dlova, Saja H Hamed, Joyce Tsoka-Gwegweni, and Anneke Grobler. Skin lightening practices: an epidemiological study of South African women of African and Indian ancestries. British Journal of Dermatology, 173:2–9, 2015.
[36] Sriya Chattopadhyay. Fair-Unfair: Prevalence of Colorism in Indian Matrimonial Ads and Married Women's Perceptions of Skin-Tone Bias in India. PhD thesis, Bowling Green State University, 2019.
[37] Anwesha Madhukalya. These dialogues from Bollywood blockbusters are so sexist that you'll want to pull your hair out, 2020.
[38] Rohini Pande, Anju Malhotra, Sanyukta Mathur, Manisha Mehta, Margaret A Lycette, Sarah Degnan Kambou, Veronica Magar, Jill Gay, Heidi Lary, et al. Son preference and daughter neglect in India. 2006.
[39] P. N. Mari Bhat and A. J. Francis Zavier. Fertility decline and gender bias in northern India. Demography, 40(4):637–657, 2003.
[40] Rebeca Echávarri. Gender bias in sex ratio at birth: The case of India. February 2006.
[41] Sucharita Sinha Mukherjee. Women's empowerment and gender bias in the birth and survival of girls in urban India. Feminist Economics, 19(1):1–28, 2013.
[42] Nadia Diamond-Smith, Nancy Luke, and Stephen McGarvey. 'Too many girls, too much dowry': son preference and daughter aversion in rural Tamil Nadu, India. Culture, Health & Sexuality, 10(7):697–708, 2008.
[43] Sonia Dalmia and Pareena G Lawrence. The institution of dowry in India: Why it continues to prevail. The Journal of Developing Areas, pages 71–93, 2005.
[44] R Jaganmohan Rao. Dowry system in India: a socio-legal approach to the problem. Journal of the Indian Law Institute, 15(4):617–625, 1973.
[45] Devaki Monani Ghansham. Female foeticide and the dowry system in India. In Townsville International Women's Conference, James Cook University, Australia, 2002.
[46] Priya R Banerjee. Dowry in 21st-century India: the sociocultural face of exploitation. Trauma, Violence, & Abuse, 15(1):34–40, 2014.
[47] Mudita Rastogi and Paul Therly. Dowry and its link to violence against women in India: Feminist psychological perspectives. Trauma, Violence, & Abuse, 7(1):66–77, 2006.
[48] Nehaluddin Ahmad. Dowry deaths (bride burning) in India and abetment of suicide: a socio-legal appraisal. JE Asia & Int'l L., 1:275, 2008.
[49] Padma Srinivasan and Gary R Lee. The dowry system in northern India: Women's attitudes and social change. Journal of Marriage and Family, 66(5):1108–1117, 2004.
[50] Religion census 2011.
[51] Haris Zargar. How Bollywood furthers India's Hindu nationalism, 2020.
[52] Inflation calculator - Indian rupee.
[53] Gwern. GPT-2 neural network poetry, March 2019.
[54] Sriya Iyer and Anand Shrivastava. Religious riots and electoral politics in India. Journal of Development Economics, 131:104–122, 2018.
[55] Walter LaFeber and Brian Abbott. America, Russia, and the Cold War, 1945–1975. Wiley, 1972.
[56] Tim Van de Cruys. Automatic poetry generation from prosaic text. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2471–2480, Online, July 2020. Association for Computational Linguistics.
[57] Rajat Agarwal and Katharina Kann. Acrostic poem generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1230–1240, Online, November 2020. Association for Computational Linguistics.
[58] Acid attack victim Laxmi to receive International Women of Courage award. Firstpost, March 2014.
A APPENDIX
A.1 Implementation Details
Experiments are conducted on a Google Colab Pro instance, using the Tesla V100 and P100 GPUs provided by Google in the Colab notebook.

We follow the standard preprocessing steps recommended for fine-tuning the BERT language model. For our task, we use the bert-base-uncased pretrained English model, with the following parameter details: 12 transformer layers, hidden state length of 768, 12 attention heads, and 110M overall parameters. The pre-trained model is fine-tuned on the target corpus using the training parameters below.

• Batch size: 16
• Maximum sequence length: 128
• Maximum predictions per sequence: 20
• Fine-tuning steps: 10,000
• Warmup steps: 10
• Learning rate: 2e-5

For fine-tuning the language model for free-form text completion tasks, we use the smallest GPT-2 model with 124M parameters, trained for 10,000 steps. The results showcased in Table 16 are obtained from the fine-tuned model by setting length=250 and temperature=0.9.
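The decoding setup above (temperature=0.9) can be illustrated with a minimal, library-free sketch of temperature sampling; the vocabulary and logits below are toy stand-ins for GPT-2's next-token distribution, and `sample_next` is an illustrative helper, not the actual generation code.

```python
import math
import random

def sample_next(vocab, logits, temperature=0.9, rng=random):
    """Draw one next token: divide logits by the temperature, softmax,
    then sample from the resulting distribution. Generation repeats this
    step up to the maximum length (250 tokens in our setup)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    r, acc = rng.random(), 0.0
    for tok, e in zip(vocab, exps):
        acc += e / total
        if r <= acc:
            return tok
    return vocab[-1]                         # guard against rounding

random.seed(0)
vocab = ["ransom", "hearing", "doorstep"]
print(sample_next(vocab, [2.0, 1.0, 0.5]))
```

Lower temperatures concentrate probability on the top token; temperature=0.9 keeps generations close to the model's preferences while allowing some variety across the 100 samples per probe.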
A.2 BERT Fine Tuning Results
Cloze probe 1:
BERT base: man, widow, woman, doctor, slave, soldier, bachelor, merchant, farmer, lawyer, servant
BERT D_oldbolly: prostitute, servant, woman, slave, bachelor, doctor, lawyer, man, widow, maid, worker
BERT D_newbolly: doctor, woman, servant, lawyer, maid, hindu, nurse, teacher, gardener, lady, man
BERT D_oldholly: woman, slave, servant, nurse, lady, man, teacher, lawyer, peasant, maid, wife
BERT D_newholly: woman, lawyer, doctor, nurse, teacher, man, writer, secretary, prostitute, professional, carpenter

Cloze probe 2:
BERT base: man, soldier, gentleman, farmer, merchant, woman, slave, bachelor, doctor, carpenter, servant
BERT D_oldbolly: man, gentleman, lawyer, lawyer, servant, doctor, farmer, worker, craftsman, slave, criminal
BERT D_newbolly: doctor, lawyer, policeman, man, farmer, bachelor, gardener, servant, soldier, mechanic, builder
BERT D_oldholly: carpenter, policeman, lawyer, soldier, farmer, gentleman, servant, man, peasant, slave, doctor
BERT D_newholly: man, lawyer, soldier, doctor, carpenter, gentleman, clergyman, farmer, writer, craftsman, minister

Cloze probe 3:
BERT base: soft, beautiful, pale, tanned, smooth
BERT D_oldbolly: fair, no, pale, tanned, tan
BERT D_newbolly: fair, tanned, golden, smooth, pale
BERT D_oldholly: fair, pale, blue, golden, gold
BERT D_newholly: fair, pale, tanned, golden, dark