Auditing Digital Platforms for Discrimination in Economic Opportunity Advertising
Sara Kingsley, Clara Wang, Alex Mikhalenko, Proteeti Sinha, Chinmay Kulkarni
Digital platforms, including social networks, are major sources of economic information. Evidence suggests that digital platforms display different socioeconomic opportunities to different demographic groups. Our work addresses this issue by presenting a methodology and software to audit digital platforms for bias and discrimination. To demonstrate, we conducted an audit of the Facebook platform and advertising network. Between October 2019 and May 2020, we collected 141,063 ads from the Facebook Ad Library API. Using machine learning classifiers, each ad was automatically labeled by its primary marketing category (housing, employment, credit, political, other). For each category, we analyzed the distribution of ad content by age group and gender. From the audit findings, we consider and present the limitations, needs, infrastructure and policies that would enable researchers to conduct more systematic audits in the future, and we advocate for why this work must be done. We also discuss how biased distributions impact what socioeconomic opportunities people have, especially when, on digital platforms, some demographic groups are disproportionately excluded from the population(s) that receive content regulated by law.

CCS Concepts: • Theory of computation → Design and analysis of algorithms; • Human-centered computing → Collaborative and social computing systems and tools; Social networking sites; • Applied computing → Evidence collection, storage and analysis; Law.

Additional Key Words and Phrases: discrimination, audit, digital platforms, civil rights
ACM Reference Format:
Sara Kingsley, Clara Wang, Alexandra Mikhalenko, Proteeti Sinha, and Chinmay Kulkarni. 2020. Auditing Digital Platforms for Discrimination in Economic Opportunity Advertising. In MD4SG 2020. ACM, New York, NY, USA, 29 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
Social networks are a major source of information about hiring and economic opportunities; they shape the choices people have and, more generally, who gets what in society. The question is whether certain demographics in society are presented with opportunities that others are not. Digital platforms, including social networks, employ algorithms that decide, often in real time, what content is displayed to users, based on data and assumptions about users and the relevance of what is displayed in the content of the ad creative itself. As a result, answering the question of whether demographic groups are disproportionately shown some opportunities requires examining hundreds of thousands of algorithmic decisions; i.e., conducting platform-wide algorithmic audits.

Digital platform audits have generally been hard to conduct [23, 34, 35]. First, much of the data about content and users is available only to platform operators, not third-party auditors [31, 34, 35]. Second, even when some data is available, auditors need to augment it with detailed categorical information of interest. For example, if a platform operator gives researchers access to all employment ads published on the digital platform, a meaningful audit still requires engineering features that provide information about the employment opportunities, i.e., what kind of job is being advertised. Without this information, an auditor might not discover certain biases, such as a platform showing ads for high-skilled (and high-paying) jobs to men, and low-skilled (and low-paying) jobs to women. Third, and finally, audits must be replicable so findings can be verified by other researchers [35].

Our paper presents a methodology and software for auditing advertising activity on digital platforms. Specifically, we build an auditing toolkit for the Facebook platform. Facebook currently provides a programmatic API to access ads. Our audit leverages the Facebook API to get relevant ads and data about their distribution among user demographics. Between October 2019 and May 2020, we collected more than 141,063 advertisements.

For the audit, the API data was augmented with category labels, which were automatically inferred by classifiers we developed. These labels allow researchers to rapidly investigate ad content distribution along different categories of interest. Our classifiers, which we make available open-source, were used to classify ads by marketing categories that are regulated by law or policy (i.e., housing, employment, credit, political, and other). Once labeled, the demographic distributions of the advertisements were analyzed for bias. Our audit goes beyond previous experiments, and yields a statistical distribution of advertising content among users of social media by demographic.

Policy implications. We note that advertisers are prohibited from displaying credit or job opportunities disproportionately across demographic groups under current U.S. law (primarily through the Equal Credit Opportunity Act [18, 30] for credit opportunities, and the Civil Rights Act of 1964 [15] regulations for jobs). As a result, Facebook (and other social networks) ban targeting people by age and gender when advertisements are for housing, employment or credit (so-called HEC ads).
However, our auditing results show differences in the distribution of ads even for HEC ads. If advertisers published their ads as HEC advertisers, then the observed differences, if not a result of direct demographic targeting, must have been achieved through other means, such as the platform optimizing the distribution (e.g., to maximize clicks). It is also possible that advertisers did not mark themselves as HEC advertisers, or otherwise managed to circumvent the platform's rules. Finally, it is also possible that Facebook only includes some ads selectively in its API, and other ads (which we were not allowed to access) corrected any biases in distribution.

In any case, the audit suggests several implications for future policy work. First, if advertisers circumvented platform rules, stricter checks on advertisers (including spot-checking ads) may be warranted. On the other hand, if advertisers were compliant, it would suggest that policies and targeting prohibitions are insufficient, and that algorithms for ad distribution must be modified.

This paper draws on prior work in three related areas: studies of algorithmic biases in advertising; methods for uncovering statistical biases using audits in non-digital contexts; and legal judgments that suggest policy implications for digital platform audits.

Ad explanations

Perhaps the simplest method for discovering whether ads are disproportionately distributed to certain demographics is analyzing "ad explanations" provided by platforms. Ad explanations inform users of some, but not all, of the reasons why they were shown a particular ad. Explanations also signal some information about biases in distribution, and have been used as evidence in legal cases. For example, in legal proceedings against Facebook, the Communication Workers of America [28] used screenshots of ad explanations as evidence that the platform allowed advertisers to target employment ads based on users' age (Figure 1).

Ad explanations have limited ability to measure distributional biases. Explanations tend not to disclose all the traits advertisers used to target people [5, 17]. In addition, Facebook users are not always aware that ads have links to explanations about why they were shown a given advertisement. Even when users are aware of these links, they have reported being "confused by the ambiguity of the explanations" [17]. That is, ad explanations do not always make transparent to users the targeting methods or the reasons why a platform sent them a particular ad.
Fig. 1. Employment Ad and Ad Explanation from the Communication Workers of America civil rights lawsuit against Facebook. Available online: . Last accessed: Tuesday, June 30, 2020.
Planned experiments
Planned experiments are a method that allows us to learn how advertisers and digital platforms could target users [4, 5, 21, 39]. In a planned experiment, researchers create ads with specifically controlled attributes. For example, researchers create otherwise identical ads, but change one factor in the treatment ad while leaving that factor unchanged in the control ad. After collecting data about the distribution of the ads, any observed differences are used to infer how the changed factor in the treatment ad correlates with deviations from the control. Hence, by virtue of the research design, planned experiments can uncover the targeting methods that make it possible for advertisers to reach specific demographics, as well as the platform optimizations (which advertisers do not control) that result in disparate allocations of advertised economic opportunities [4].

In Ali (2019) and Sapiezynski (2019), the researchers published experimental ads to Facebook using the Look-alike Audience (LAL) tool. On the Facebook ad portal, the LAL tool allows advertisers to upload lists of and data about customers who have the user traits the advertiser wants to target [4]. After publishing the experimental ads, Ali et al. analyzed the distribution by gender and race (U.S. voter data was used to infer race). In their study, bias was observed. More importantly, the finding of bias indicated that the LAL tool could be used to target Facebook users by their demographic traits.

In another planned experiment, Lambrecht and Tucker (2019) published STEM employment ads on Facebook [21]. The design of the experimental ad content was varied to observe how that affected which demographics Facebook would send the ads to. Their study found that more men were shown ads for STEM employment opportunities. Older women were found to be particularly disadvantaged in the ad distribution. The finding suggests that digital platforms rely on stereotypes ("men prefer computing jobs more than women") and/or the biases of labor markets ("more men are employed in computing jobs") to decide how to optimize the distribution of content among users.

Ultimately, despite their utility in identifying problematic features (e.g., Look-alike Audience tools, certain optimizations), planned experiments cannot describe the statistical distributions of the ads that a platform serves as a whole. This is because advertising experiments are typically only able to collect data about those whom the experimental ads reached, and not about the demographics that were shown ads published by other advertisers.

In this section, different theories of discrimination from economics and law are briefly described, and particular applications to digital contexts are discussed.

Statistical discrimination
In the economics literature, statistical discrimination describes the behavior of firms (or agents) when they base hiring and employment decisions on information they have about an entire demographic. The idea is that this discrimination is rational [19]; that is, firms or economic agents act on what they believe or assume about an entire group when making decisions where they have incomplete information about individuals. Economic theories of statistical discrimination suggest that decision-makers who statistically discriminate do not do so because of animus or an "intrinsic adversity to any particular group per se" but merely to improve the perceived quality of decisions [19]. While there is substantial evidence that human decision-makers do have habitual, implicit biases that do not improve the quality of the decision [12], this theory is helpful for thinking about the ways that algorithms distribute opportunities. Algorithms need not have an "animus" towards particular individuals, but may still lead to biased decisions, especially if the inputs to computational systems are themselves biased.

Legally Protected Attributes

In US law, demographic traits that should not be used in making decisions about regulated economic opportunities are called "protected" attributes (cf. Table 1) [13, 14]. US law generally disallows disparate treatment based on protected attributes for outcomes relating to employment, housing, credit, and other economic opportunities.

Table 1. Legally Protected Demographic Attributes in the US

Age: persons age 40 years and older.
Sex or Gender: includes persons who identify as non-binary and/or transgender.
Sexual Orientation: includes LGBPQ+.
Race: includes persons who identify as belonging to more than one race.
Color: includes discrimination based on complexion.
Disability: includes physical, neurological, medical and mental health identities.
Genetic Information: not limited to information known from genetic testing.
Pregnancy status: pregnancy discrimination is also a form of gender discrimination.
Ethnicity: includes persons who identify as Hispanic, Latinx, Jewish.
National origin: includes discrimination based on first spoken language.
Religion: includes discrimination against any religious group or belief.
Political affiliation: includes third parties.
Military or veteran status: includes reserve members, ROTC.
Citizen status: includes persons who are not legally recognized citizens.
Association: it is unlawful to discriminate against persons based on their association with a protected class.
Witness or Whistleblower: it is unlawful to discriminate against persons because they were a victim or witness of discrimination and reported it.

Related to the current work, under US law it is unlawful to discriminate against persons in marketing for certain economic opportunities (e.g., housing, employment, credit) on the basis of demographic attributes that are legally protected.
Disparate treatment
Disparate treatment is a form of discrimination in which decisions consider or account for an individual's demographic attributes in ways that adversely affect their opportunities [22]. For example, if an employer only offers paid maternity leave to female employees, this adversely treats men employed at the company. Offering paid parental leave only to one gender, or only to persons who identify as male or female, is discriminatory; such a policy intentionally uses a legally protected demographic trait (gender identity) to determine eligibility for an employment benefit.

Disparate impact
Even policies and decisions that are facially neutral because they are uniformly applied to every demographic can still have different impacts on individuals from different demographic groups [22]. For example, an employer may deny paid parental leave to new parents, and apply this policy uniformly to all its employees. However, this policy could have a disparate impact on women, who may need to take more unpaid time off for parenting than men. Such policies are said to have disparate impact, and are illegal in the US when they discriminate against legally protected demographic classes (see Table 1).

Proxy Discrimination
Proxy targeting is when attributes of users that are statistically correlated with protected demographic traits are used to include or exclude them from the target population [33]. Proxy targeting can thus be used as an implicit method to decide who will receive ads even when legally protected demographic traits are not used explicitly by a platform's algorithm to make ad decisions. In the United States, it is unlawful to use proxies for legally protected demographic attributes to discriminate against protected classes in marketing for certain economic opportunities (e.g., housing, employment, credit). For example, because US zip codes (postal codes) are so strongly correlated with race and ethnicity, it is unlawful to exclude persons from receiving ads for housing opportunities based on their zip code. Similarly, targeting "new college graduates", "fresh" or "young professionals" is an illegal proxy for age discrimination [3]. Proxy targeting need not be limited to demographic variables: requiring that applicants be able to "stand up for long periods of time", "walk" or be "quick on their feet", when recruiting for a desk job that does not require the use of legs or feet, could constitute discrimination on the basis of disability.

Bias auditing in non-digital contexts
Audits have traditionally been used to uncover housing [7] and employment discrimination. Such audit studies use either paired-testing or correspondence audits, or resume-study research designs [20]. In both paired-testing [7] and correspondence audits [8], near-identical candidates apply to jobs or housing; ideally, candidates are equal in every respect except the targeted demographic trait (e.g., race or gender) [7, 20]. Data is then collected (such as whether candidates are offered a job or qualify for housing), which yields evidence of discrimination, if any. Resume studies are similar, except that they create resumes for two fictitious candidates (materially identical in every respect, except the target demographic) and apply to the same jobs. Again, data such as interview offers are used to evidence discrimination. Unlike paired testing, because the resumes are fictitious, they can be used to test a wider variety of traits (even where a matched pair of individuals is hard to find).

These auditing methods have been used as legal evidence of discrimination in many employment and housing cases. Resume studies in particular can surface statistical discrimination. We posit that digital auditing methods such as ours can yield similar benefits in digital contexts.

Algorithm audits

Artificial intelligence (AI) audits are referred to as "algorithm audits" [34] and are described as "the collection and analysis of outcomes from a fixed algorithm or defined model within a system" [34].

(1) Input analysis investigates whether the data used to train algorithms are biased or discriminatory. Studies in the medical literature, for instance, have been conducted with research participants who were all men. Despite this, the findings have been used to construct diagnostic criteria for entire populations. This has led to disparate outcomes in identifying heart disease and autism among women. Using unrepresentative data to build models or decision-making criteria for entire populations is problematic. For example, researchers Joy Buolamwini and Timnit Gebru (2018) conducted an audit of facial recognition software [10]. In response, some of the investigated companies implied in public statements that the observed errors in classifying the faces of Black men and women were a result of using non-representative data to build the technology [34]. In alignment with what Buolamwini and Gebru have evidenced, as well as other researchers in the machine learning community, it is worth stating that data is rarely the only source of bias in sociotechnical software applications, machines or systems.

(2) Output analysis investigates whether the outcomes of a system are biased or discriminatory. Researchers analyze whether the algorithms used in a system produce disproportionate impact among demographics. An algorithm, and its inputs, can be perfectly unbiased mathematically speaking, while the outputs affect populations differently.

Sock puppet audits

A sock puppet audit replaces the actors in traditional audits with scripts that pretend to be users or create fake traffic to a specific site [7, 38]. Using fake data, a sock puppet audit aims to test whether an algorithm is biased. For example, one study found racial discrimination on Airbnb by using a sock puppet audit method in which users who were identical in all aspects except name applied to various hosts [32].
The study discovered that applicants with names used more often by African-American individuals were rejected at a much higher rate than those with names used more by White Americans [32]. One downside of sock puppet auditing is the large amount of data needed to demonstrate that a program or service is discriminatory. While this is made easier with the help of computer programs, it may cause problems later if researchers violate the Computer Fraud and Abuse Act or the Terms of Service of the platform [38].

Policy context

In this section, we outline the policy context against discrimination, and potential policy benefits of digital audits. To keep our discussion focused, we outline the policy context in the United States, but other countries and regions (such as the EU) have policies with similar intent.

Legal protections against discrimination.
In the United States, protections against discrimination are guaranteed under the Civil Rights Act of 1964 [6], and follow-up legislation including the Fair Housing Act [29] and the Equal Credit Opportunity Act [11]. Some states have additional protections against discrimination. Overall, these laws prevent discrimination based on demographic traits, such as race, religion, sex, disability, national origin, and attributes such as genetic information. These laws consider differences in opportunity to be acts of discrimination, just as they do differences in the actual benefits provided to individuals. For instance, under the Equal Credit Opportunity Act, it is not necessary to refuse credit to people based on demographics for conduct to be considered discriminatory; it is sufficient for the creditor to have advertised to or discouraged people from applying for credit based on their protected demographic traits. Similar protections exist for employment and housing opportunities. As such, we consider advertisements to be a means of conveying opportunities; showing people fewer employment or credit ads in essence robs them of economic opportunity.

Reactive legal frameworks.
Enforcement of anti-discrimination regulation in the United States is largely reactive. For instance, job seekers must demonstrate that a difference in employment opportunities exists at a particular employer, rather than employers having to guarantee that their practices are non-discriminatory. The difficulty and cost of proving discrimination have made it difficult to systematically address discriminatory practices. However, if digital audits can be conducted automatically or at low cost, it may be possible to monitor practices efficiently and, ultimately, to identify where systematic change is required.

Civil Rights regulations imply special protections for opportunities in housing, employment, and credit (the "HEC" categories). Recent litigation has operationalized these protections on social networks, through a combination of technology and platform rules for ad targeting [2, 28]. In a March 2019 settlement, Facebook established "a separate advertising portal for creating housing, employment, and credit ("HEC") ads" that has "limited targeting options" for advertisers [2]. Through this technological operationalization, Facebook removed options that permit advertisers to explicitly target users by gender, age, "multicultural affinity", and zip-code-level or geographic targeting of less than 15 miles in radius.

Table 2. Advertising Class Definitions
Class: Definition

Housing: Advertises real estate property for rent or sale. Excludes mortgage, home financing or loans.
Employment: Recruits applicants for job openings. Includes job fairs, employment agent listings, employment services.
Credit: Advertises credit (e.g., cards, rates, records); loans (e.g., auto, student, home, personal, business); and insurance (e.g., life, medical, home, car, pet, disaster).
Political: Any ad by or about political candidates, campaigns, elected officials, elections, or policy/political agendas.
Other: Any ad not defined by the other categories.

Facebook's operationalization of the settlement agreement also modified the "Lookalike Audience tool" [2]. The modification made it so advertisers cannot explicitly target demographics based on legally protected attributes. Finally, Facebook's operationalization of the settlement agreement created new platform rules. The instituted rules state that advertisers must create and publish HEC ads on the HEC portal [2].

However, advertiser targeting is not the only source of disproportionate ad distribution: platform optimizations may cause biases too [4, 40]. Unlike ad targeting, platform optimizations do not apply to a particular advertisement or advertiser; rather, they lead to statistical biases. As a result, they may not be prevented through rules limiting advertisers. Audits such as ours, along with established metrics, may help operationalize protections against such statistical biases.

System description

Our system has three components: an ad-querying and logging script that interacts with the Ad Library and downloads ads; a set of classifiers for adding meta-information to collected ads; and a database that allows researchers to query collected ads.

Query script. Our Python-based script queries and downloads ads from the Ad Library API. The appendix lists the fields and parameters used for querying the API. In our script, the parameter "search terms" was used to conduct keyword requests for data. Facebook's API currently seems to limit requests to around 2,000 ads per keyword. (According to Facebook, the official limit is 5,000 ads per API request; however, we found that the API often returns an error and asks developers to reduce the amount of data they are requesting if more than 2,000 ads are requested at a time.) In addition, the script also allows users to specify Facebook pages of advertisers of interest (such as Monster.com for job ads). Our script currently only requests data for ads displayed in the United States, but requests both active and inactive ads (i.e., both ads currently running on Facebook platforms and those no longer being delivered). For each query, we save the exact text of the ads returned, the URLs of any ad images (but not the images themselves), and the distribution of impressions by age, gender and geography. We also logged the search terms used in each data request in our database, along with the date of the request.
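To make the query loop concrete, the following is a minimal sketch of how such a script can page through the Ad Library API using the requests library. The endpoint, field names and parameters follow the public Ad Library API documentation as we understood it in 2020; they are assumptions for illustration, not a verbatim excerpt of our script, and access_token must be a valid Ad Library API token.

import time
import requests

API_URL = "https://graph.facebook.com/v7.0/ads_archive"  # API version circa 2020

# Fields we log for each ad (assumed names from the 2020 Ad Library API docs).
FIELDS = ",".join([
    "id", "page_name", "funding_entity", "ad_creative_body",
    "ad_creation_time", "ad_delivery_start_time", "ad_delivery_stop_time",
    "impressions", "demographic_distribution", "region_distribution",
])

def fetch_ads(search_term, access_token, per_page=250, max_ads=2000):
    """Page through Ad Library results for one keyword, yielding raw ad records."""
    params = {
        "access_token": access_token,
        "search_terms": search_term,
        "ad_reached_countries": "['US']",  # US-only, matching our audit
        "ad_active_status": "ALL",         # both active and inactive ads
        "fields": FIELDS,
        "limit": per_page,
    }
    url, fetched = API_URL, 0
    while url and fetched < max_ads:       # stay under the ~2,000-ad practical cap
        resp = requests.get(url, params=params)
        resp.raise_for_status()
        payload = resp.json()
        for ad in payload.get("data", []):
            if fetched >= max_ads:
                return
            fetched += 1
            yield ad
        # The API returns a pre-built URL for the next page of results, if any.
        url = payload.get("paging", {}).get("next")
        params = {}                        # the 'next' URL already encodes parameters
        time.sleep(1)                      # crude rate limiting between requests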
Augmentation. The Ad Library allows keyword searches for specific topics, but these searches are not always relevant for legal/policy applications. For instance, when searching for "jobs" or "hiring" while only wanting to view ads recruiting applicants for employment, one discovers that one must wade through many political and other advertisements, such as those depicted in Figure 2. For legal or policy utility, downloaded ads need additional information.

We allow users to enrich ads with this additional information through augmentation. One useful augmentation system is a set of classifiers that automatically classifies each ad into policy-relevant categories. We include a set of classifiers in our system, as described in Table 2. As detailed in Section 4, these classifiers allow for reasonably nuanced classification: if an advertisement is about employment policy, for example, the instance is classified as political, because the ad pertains to a political agenda or policy. On the other hand, an ad for a job position in an elected representative's office is classified as an employment opportunity, not a political ad (Figure 2). This distinction will hopefully guide policy and legal workers when using our database, and help them identify the advertisements pertinent to their work. A different augmentation system we are currently developing labels objects in ad images, for instance, "dog", "woman", "construction equipment", etc.

Note that Facebook already has an internal system to identify, moderate, and remove (ad) content that violates policy or law. However, our method and software allow for independent identification by external auditors, with standard performance metrics. Facebook has not released data about its internal system. Furthermore, the goals of auditors are inherently different from the business goals of Facebook; auditors' goals include social justice concerns such as hate crimes, discrimination, genocide, sexism, misogyny, gender harassment, racism, racial harassment, xenophobia, anti-religious sentiment, ableism, ageism, anti-Semitism, white supremacy, and violence. (Authors' note: United States domestic, and not only international, terrorist events are a problem on Facebook. We note there is sometimes bias in the reporting of what constitutes a terrorist event in the United States generally, and ask that our readers not take any single cited account as characteristic or definitional of all terrorist violence.) In addition, whatever system Facebook uses to monitor or moderate advertising content, that system is not usable by external reviewers or auditors who want to track inorganic content (i.e., advertisements) on the platform for legal, policy or human rights reasons. Our software and methods will hopefully fill this gap.
Database. Our database includes numerous tables containing the original raw data retrieved from the Facebook API, tables that include the enriched data, and engineered features that augment the API data. For example, the database contains time-series tables with information about when an advertisement was created, and also about how long ads were displayed to users on the Facebook platform. From these tables, users of our audit toolkit can analyze ads around certain policy or legal events, such as the date when the Facebook settlement agreement with civil rights organizations took effect. In addition, the database contains the development, training and test data-sets used to develop our classifiers. Researchers may access and use these data tables to verify and replicate our results. Finally, the database has tables which contain ads only for a specified HEC class (e.g., housing, employment, credit). Users may download and/or browse these tables to monitor and explore information about HEC ads published on Facebook platforms. For a full list of database column information, as well as the tables that will be made available to our audit toolkit users, please refer to the appendix of this paper.

Fig. 2. Our classifiers categorize ads for policy/legal utility. An employment ad (C) recruits applicants for job openings at an employer. In contrast, political and other ads (A, B, D) sometimes inform users that an employer is creating jobs.

The audit

Below, we describe the specific audit we conducted using the system described above. For the audit, data was collected from Facebook's Ad Library API. For each ad collected from the API, Facebook provided us with data about how the ad was distributed among demographic groups and by geographic region. Hence, the data allowed us to evaluate how the ads were distributed.

We augmented the data from the Ad Library by automatically classifying each ad by a primary marketing category (housing, employment, credit, political, and other). To train the models used to classify the ads, we manually added class labels to 3,767 ad campaign instances. The labels were defined to describe an ad's intent or type of message. In addition to the primary classes, we built a logical ("Rules") algorithm to produce labels for ad sub-classes or types. For example, credit ads include sub-types such as student loans, debt relief, automobile loans, and home loans or mortgages. Defining sub-classes allows us to identify distributional differences for particular kinds of ads within each class, differences which might otherwise cancel each other out (e.g., some demographics may only see ads for debt relief, but not home loans).
Table 3. Advertising Objective Options

Objective: Facebook Description

Brand Awareness: Reach people more likely to pay attention to your ads and increase awareness for your brand.
Reach: Show your ad to the maximum number of people.
Traffic: Send more people to a destination such as a website, app or Messenger conversation.
Engagement: Get more people to see and engage with your post or Page.
App Installs: Send people to the app store where they can download your app.
Video Views: Promote videos to raise awareness about your brand.
Lead Generation: Collect lead information from people interested in your business.
Messages: Get more people to have conversations with your business in Messenger, WhatsApp or Instagram Direct.
Conversions: Get people to take valuable actions on your website, app or in Messenger, such as adding payment info or making a purchase.
Catalog Sales: Create ads that automatically show items from your catalog based on your target audience.
Store Traffic: Create ads to generate traffic to your physical store location.
Table 4. Delivery Start Year, Number of Ad Campaigns

Year | Ad Campaigns
…

Accounting for classification performance.
There are two sources of potential error in our audit. First, our classification models may have labeled ads incorrectly. Therefore, we evaluate the models' performance in predicting advertising classes using multiple standard measures of goodness: accuracy, precision and recall. For most primary classes, the Naive Bayes models had precision and recall measures above 90 percent. For credit and employment ads, statistical measures of goodness neared 97-99 percent. Performance results held when our Naive Bayes models were tested on additional data-sets of labeled data (Table 7). These measures suggest that the classifiers classify the collected ads adequately for auditing the distribution.

A second source of error is that we may have omitted some relevant keywords and advertisers when querying the Ad Library. This source of error is extremely hard to measure, as it deals in unknown unknowns. However, we tried to minimize these errors by expanding the set of keywords used over time, based on the copy of the ads we retrieved from the Ad Library. We did not notice any substantive changes to the overall statistical distributions we observed based on the inclusion/exclusion of particular keywords, so this source of error is likely small.
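As a sketch of the evaluation step, per-class precision, recall and F1 can be computed with a standard metrics routine. scikit-learn is an assumption here, as the paper does not name the library used:

from sklearn.metrics import classification_report

# y_true: hand-annotated labels; y_pred: model predictions, each one of
# "credit", "employment", "housing", "political", "other".
def evaluate(y_true, y_pred):
    """Print per-class precision, recall and F1, as reported in Table 7."""
    print(classification_report(y_true, y_pred, digits=3))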
Table 5. Advertising Class Distribution, Hand-annotated data from October to December 2019.

Class | Percent | Example Ad Text
Housing | … | …
Employment | … | …
Credit | … | "… finance your out-of-pocket health expense."
Political | … | …
Other | … | "… they need to purchase nutritious foods. Discover if you qualify for the federal nutrition program and learn how to get started on your application today."

Manually Annotated Data-set (October to December 2019)
From October 2019 through December 2019, we collected ads from the Ad Library, which were then manually classified. The labeled data was used to train a machine learning model to automatically label each instance in our database by advertising class (i.e., employment, credit, housing, political or other).

Manually Annotated Data-set (January 2020)
In January 2020, we collected additional ad campaigns from the Facebook Ad Library API and hand-annotated labels for those advertising instances. This data was used to evaluate the Naive Bayes model trained on data from Oct. - Dec. 2019, and to train and evaluate a new model.
We primarily use a naive Bayes model to classify ads according to whether the advertisements were for housing, employment, credit, political, or other opportunities. (The "other" class corresponds to "Uncategorized" ads as described by Facebook. However, these are better seen as opportunities other than housing, employment, credit or politics, so "other" seems more descriptive.) In addition, we use a rules-based model for subcategory classification. Both models use the same features, namely the text in the body of the advertisement, expressed as a bag-of-words of unigrams and bigrams. These tokens are used as-is, and are not stemmed or lemmatized.

Naive Bayes
The Naïve Bayes (NB) model uses prior probabilities of term frequencies, smoothing, and a multinomial distribution. Ads were classified directly as one of the credit, employment, housing, political, or other classes. Table 7 shows model performance.
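A minimal sketch of such a classifier, assuming scikit-learn (the paper does not specify an implementation): unigram and bigram counts feed a multinomial Naive Bayes model, with add-one smoothing standing in for the unspecified smoothing above.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Bag-of-words over unigrams and bigrams; tokens are used as-is,
# with no stemming or lemmatization, matching the feature description.
classifier = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), lowercase=True),
    MultinomialNB(alpha=1.0),  # Laplace smoothing (assumed)
)

# ad_texts: list of ad body strings; labels: one of "credit",
# "employment", "housing", "political", "other" per ad.
def train(ad_texts, labels):
    classifier.fit(ad_texts, labels)

def predict(ad_texts):
    """Assign each ad body one primary marketing category."""
    return classifier.predict(ad_texts)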
Table 6. Advertising Class Distribution, Hand-annotated data from January 2020.

Class | Percent | Example Ad Text
Housing | … | …
Employment | … | …
Credit | … | "…es. See if you qualify - http://bitly.com/PA…"
Political | … | …
Other | … | "…ed experts is huge. Learn about the flexibility of our ONLINE program here!"

Rules Model
Our Rules model is primarily used for sub-category classification. It uses a matching algorithm to determine the class value of an instance. The Rules algorithm takes a vector of terms and searches for those terms in the main document or body text of an advertisement. If matching terms are found, the algorithm applies the class label tied to that vector of terms; otherwise, it applies a label indicating that the instance does not belong to the class. For each category, the classifier uses a "term-topic" vector that contains unigrams and bigrams that commonly occur in the text of ad instances belonging to a particular class. Ads are classified as belonging to the class if they contain any of the terms in this term-topic vector, and as not belonging to the class otherwise. Note that this classifier may output more than one label for an ad, as all rules are run in parallel.
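A minimal sketch of the matching step; the term-topic entries below are illustrative stand-ins, not our learned vectors:

import re

# Hypothetical term-topic vectors for two credit sub-classes.
TERM_TOPICS = {
    "student_loans": ["student loan", "student debt", "tuition financing"],
    "debt_relief":   ["debt relief", "consolidate debt", "settle your debt"],
}

def rule_labels(ad_body):
    """Return every sub-class whose term-topic vector matches the ad body.

    All rules run in parallel, so an ad can receive multiple labels,
    or none if no term matches."""
    text = ad_body.lower()
    labels = []
    for label, terms in TERM_TOPICS.items():
        if any(re.search(r"\b" + re.escape(term) + r"\b", text) for term in terms):
            labels.append(label)
    return labels

# Example: rule_labels("Consolidate debt today and settle your debt fast!")
# -> ["debt_relief"]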
Table 7. Performance on Test Data for Naive Bayes model for ads from 2019 and 2020

Class | Precision (2019) | Recall (2019) | F1 (2019) | Precision (2020) | Recall (2020) | F1 (2020)
credit | … | … | … | … | … | …
employment | … | … | … | … | … | …
housing | … | … | … | … | … | …
political | … | … | … | … | … | …
other | … | … | … | … | … | …

A rules-based classifier (with the same features) was also created for each category of ads. We describe its use below.

The Naive Bayes classifier has generally high performance. However, for policy/legal audits, precision is more important than recall: an audit should establish whether there is statistical discrimination for ads that are clearly in a particular category, rather than try to include every ad that might be so classified. Therefore, for our reported analysis, we created an additional rules model for each category of ads (like our sub-categorization model, this may produce multiple labels for an instance). Overall, this results in a stricter inclusion test for each category. For example, ads that are more about credit opportunities for housing (e.g., mortgages, home improvement loans) than about housing opportunities are more likely to be labeled as credit ads. Because the rules model is only used as a filter for the final analysis, it can only increase the precision measures from Table 7 (while possibly reducing recall).

For our analysis of sub-classes of credit ads, while we only report findings based on this stricter inclusion test, whether or not ads were filtered did not significantly change the findings. Please inquire with the authors for findings based on analyses that: (a) only classified the ads using the Naive Bayes model, (b) only classified the ads using the Rules model, and (c) only included ads in the analysis if the Naive Bayes model and the rules-based model agreed on at least one classification (e.g., whether or not an ad was a "housing" ad).
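To make the stricter inclusion test concrete, here is a minimal sketch of the filtering step; nb_predict and rules_predict are hypothetical stand-ins for the two classifiers described above, and ad_creative_body is an assumed field name:

def strict_inclusion(ads, category, nb_predict, rules_predict):
    """Keep only ads that both models place in `category`.

    nb_predict(text) returns the Naive Bayes class for one ad body;
    rules_predict(text) returns the set of classes whose rule fired.
    Filtering Naive Bayes output this way can raise precision for the
    final analysis, at the possible cost of recall."""
    return [
        ad for ad in ads
        if nb_predict(ad["ad_creative_body"]) == category
        and category in rules_predict(ad["ad_creative_body"])
    ]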
Ethics

To our knowledge, we have complied with Facebook's policies regarding the use of data from its Ad Library and, more generally, with the platform's Terms of Service (TOS). Facebook permits authorized users of the Ad Library API to publish research about Facebook advertising. We did not intentionally collect any Facebook user Personally Identifiable Information (PII). However, if a Facebook user published ads on Facebook-owned platforms using a Facebook page named after their real name, that name data might be included in the Ad Library, and hence in our database.

Findings

Our research database contains 141,063 Facebook ad campaigns for the period under study, with 1,722,559 observations of more than 80 variables for those ads. These advertisements were collected between October 2019 and May 2020, and ran on Facebook between 2016 and May 2020 (see Table 4).
Systemic Gender Discrimination across Facebook Advertisements.
From our investigation, we discovered that the design of the Facebook advertising portal could bake discrimination into advertising against persons who do not identify their gender to Facebook, who identify as non-binary, and/or who identify a custom gender that is neither male nor female. The Facebook advertising portal only allows advertisers to target men, women, or all. On the ad portal, if an advertiser specifies that their advertisement is for housing, employment or credit, the only audience selection possible for gender is "all." Facebook defines "all" as men and women. We do not know whether "all" includes the gender group that Facebook calls "unknown" (in our paper we refer to this group as "custom", since it includes persons who identify as non-binary and/or specify a custom gender). Across every type of advertisement, persons whom Facebook classified as having an "unknown" or custom gender receive few if any advertisements, and when shown ads, are on average between 0% and 1% of the total demographic.

Credit Advertisements
In our database, for the time period under study, there are 67,181 observations of 6,385 ad campaigns classified as credit by the Naive Bayes model. A total of 274 advertisers published these ads on Facebook. Despite the number of ad campaigns (6,385), only 146 of the advertisements displayed unique text in the main body of the advertisement. This means that a few advertisers published most of the ads. Many of the published ads were advertisements for the same thing, even though the ads were delivered as distinct advertisements on Facebook (the ads also had individual budgets). Not every advertisement with the same ad text was delivered to identical proportions of demographic groups.

Among the credit ads, there were only 208 unique website links embedded in the ads; many ads had identical embedded URL links. An embedded website link, if clicked on by a user, opens a new browser tab that navigates to the linked website. Among the ads labeled as credit, the embedded website links might direct the user to a website where they may apply for financing or loans, or for other credit opportunities.
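The per-class counts reported in these sections (campaigns, advertisers, unique ad texts, unique embedded links) reduce to set cardinalities over the collected records. A minimal sketch, assuming the API field names page_name and ad_creative_body:

import re

URL_RE = re.compile(r"https?://\S+")

def campaign_summary(ads):
    """Counts behind the per-class statistics reported in this section."""
    unique_texts = {ad.get("ad_creative_body", "") for ad in ads}
    advertisers = {ad.get("page_name") for ad in ads}
    links = {m for ad in ads for m in URL_RE.findall(ad.get("ad_creative_body", ""))}
    return {
        "campaigns": len(ads),
        "advertisers": len(advertisers),
        "unique_ad_texts": len(unique_texts),
        "unique_embedded_links": len(links),
    }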
Gender Distribution. An estimated 57.9 percent of credit ads were sent to a greater percentage of men, whereas 42.1 percent of credit ads were displayed to a greater percentage of women. No credit ads were shown to a greater percentage of persons labeled as having a custom gender identity.

The distribution of credit ads among gender identity groups is notable. First, more women are shown ads on average, as a percent of the total demographic, in every ad class except credit. Second, more women than men use Facebook in the United States. Researchers have suggested that the size of a demographic on Facebook could explain the distribution of advertisements among demographic groups. If this were true, since more women use Facebook in the United States, more women than men would receive advertisements. However, for the credit ads in our database, the Facebook user population of women in the United States does not explain the ad distribution by gender demographic. Facebook would need to show more credit ads to women for the size of the user demographic to explain the distribution of the ads, and this is not the case. Facebook, therefore, is not likely distributing credit ads based on the representation of a demographic on the platform. Instead, the platform and/or advertisers seem to target a specific demographic (men) with credit advertisements.

What is remarkable, and worth noting again, is that for nearly every class but credit, more ads have distributions skewed toward women, meaning women are a greater percentage of the total demographic. In stark contrast, across advertisers and ad campaigns, a greater proportion of credit ads were distributed to a greater percentage of men. As observed for every ad class, the distribution of credit ads is never skewed toward persons identified as having a custom gender identity (in other words, persons who do not identify as male or female on Facebook).

Fig. 3. Average Percentage Point of Total Demographic – Credit Ads by Gender and Age

Table 8. Credit Ads Only Shown to One Gender
                    | Men | Women | Custom gender
Ad Campaigns        | 209 |   210 | 2
Advertisers         |  29 |    32 | 2
Funding entities    |  23 |    22 | 2
Embedded Websites   |  24 |    34 | 2

Table 9. Above are the numbers of campaigns that were only sent to men, women, or non-binary users (Facebook labels non-binary users as "unknown" gender). The numbers of advertisers and of entities that funded the ads are shown below the campaign count. Finally, the number of embedded website links in the ads is given. Ads contain links to off-Facebook websites, such as loan applications; these links constitute embedded websites.
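The gender-skew proportions reported in this section can be computed from the Ad Library's demographic_distribution field. A minimal sketch, assuming entries of the form {"percentage": "0.12", "age": "25-34", "gender": "female"}, as the API returned them in 2020:

from collections import Counter

def gender_shares(demographic_distribution):
    """Sum impression shares by gender for one ad."""
    shares = Counter()
    for entry in demographic_distribution:
        shares[entry["gender"]] += float(entry["percentage"])
    return shares

def skew_label(demographic_distribution):
    """Label an ad by the gender receiving the greatest share of impressions."""
    shares = gender_shares(demographic_distribution)
    return max(shares, key=shares.get) if shares else None

# Tallying skew_label over all ads in a class yields the proportions
# reported above (e.g., 57.9% of credit ads skewed toward men).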
Employment Advertisements

In our database, for the time period under study, there were 165,853 observations of 13,250 ad campaigns classified as employment ads by the Naive Bayes model. A total of 2,926 advertisers published the ads classified as employment in our database. Despite the number of ad campaigns, only 2,181 of the advertisements displayed unique text in the main body of the advertisement. There were 2,254 unique website links embedded in the ads. An embedded website link, if clicked on by a user, opens a new browser tab that navigates to the linked website. Among employment ads, the embedded website links might direct the user to a website where they may apply for the jobs advertised on Facebook, or for other job opportunities.

Table 10. Proportion of Credit Ads by Age Skew
Age Skew | Percent
13 - 17 | <1%
18 - 24 | …
25 - 34 | …
35 - 44 | …

Fig. 4. Average Percentage Point of Total Demographic – Employment Ads by Gender and Age
Gender Distribution.
An estimated 64.8 percent of employment ads were shown to a greater proportion of women (of the total demographic by gender group), while 35.2 percent were shown to a greater proportion of men. None of the employment ads were shown to a greater proportion of persons who do not identify their gender to Facebook and/or who do not identify as male or female.

Fig. 5. Employment advertisement only sent to men on Facebook

Table 11. Proportion of Employment Ads by Age Skew
Age Skew | Percent
13 - 17 | …
18 - 24 | …
25 - 34 | …
35 - 44 | …

Only 2 employment ads were displayed solely to persons identified as having a custom gender identity. For the reported numbers about employment ads sent to only one gender, we based our estimates on instances that were classified as "employment" by both the Naive Bayes and Rules models. We did this because, as reported in our Naive Bayes model performance metrics, there is some amount of classification error. By using the agreement between both models, we capture a minimum estimate of the employment ads that were shown to only one gender, and gain some precision. In future work, we are fine-tuning our algorithms further. Of note, using the Naive Bayes model alone, more than 400 employment ads were shown only to men, more than 500 only to women, and only 3 to persons identified as having a custom gender.

Housing Advertisements
In our database, for the time period under study, there were 22,140 observations of 1,820 ad campaigns classified as housing ads by the Naive Bayes model. A total of 518 advertisers published the ads classified as housing in our database. Despite the number of ad campaigns, only 210 of the advertisements displayed unique text in the main body of the advertisement. There were 365 unique website links embedded in the ads. An embedded website link, if clicked on by a user, opens a new browser tab that navigates to the linked website. Among housing ads, the embedded website links might direct the user to a website where they may apply for an apartment or another housing opportunity advertised on Facebook, or for other opportunities.

An estimated 73.5 percent of housing ads were shown to a greater proportion of women (compared to men or persons who identify as having no gender or a custom gender). Only around 26.5 percent of housing ads were shown to a greater proportion of men, out of the total percentage of the demographic by gender, across all age groups. None of the housing ads were shown to more persons that Facebook identifies as unknown or custom gender (which includes non-binary and/or trans persons). More housing ads (35.9%) were shown to a larger proportion of persons ages 25 to 34 than to any other age group. An estimated 18.6% of housing ads were shown to a greater percentage of persons between the ages of 35 and 44 than to other age groups, followed by persons ages 55 to 64, whom 14.5% of ads favored. An estimated 9.6 percent of housing ads were sent to a greater percentage of persons aged 65 and older.

Table 12. Proportion of Housing Ads by Age Skew
Age Skew | Percent
13 - 17 | …
18 - 24 | …
25 - 34 | 35.9%
35 - 44 | 18.6%
45 - 54 | …
55 - 64 | 14.5%
65 +    | 9.6%
Limitations
Our findings could be limited by our data collection method and by the use of algorithms to classify ads into their primary and secondary advertising categories. In addition, we have not yet analyzed the data along every dimension available to us. For this paper, our audit did not explore variance in the distribution of ads by geographic region or time period. We also did not analyze whether the image or video media embedded in advertisements affects the distribution of ads.

There is much potential for future work in this field. In the coming months, we are planning to study whether Facebook's inferences about the kind of ad content users prefer to be shown match users' actual preferences. This is important to study, as it is possible that Facebook's inferences about user preferences are statistically discriminatory and do not accurately represent the true preferences of the user in question. Facebook, in a lawsuit, suggested that it makes more sense from a business standpoint to show women ads for items such as cosmetics or clothing, while showing men ads for professional sports, as these are seen to be the general trend in preferences by gender [41, 43]. However, this is clearly a stereotype, and it makes it harder for users to receive ads in their actual areas of interest.

Furthermore, such comparisons of preferences may not be valid, as it might not be accurate to rank all forms of content in a single ranking. We could study whether it would be more representative of user preferences to rank preferences for content that informs users about economic opportunities separately from a ranking of consumer opportunities. Job opportunities, for example, would be ranked separately from consumer opportunities like buying clothes, cosmetics, or sporting equipment.

Another possible avenue of research could involve studying advertising trends and distributions in other countries, to observe whether the labor market distribution is influenced by the distribution of ads about economic opportunities. For example, we could study advertisements in India, and observe whether the distribution of economic opportunity ads reflects facts about the labor market, such as the female labor force participation rate of 27% [36]. To check this, we could compare this statistic to the distribution of job opportunity ads on the platform in India.

Discussion

Advertisements for economic opportunities are not distributed proportionally or equally on Facebook by gender and age among persons living in the United States.
Gender. Women are a larger percent of the total demographic shown advertisements. However, in evaluating the distribution of advertisements by class and sub-class, men are a larger percent of the demographic that receives ads for credit, particularly for new lines of credit and financing. Meanwhile, ads for debt relief are disproportionately shown to women. Discrimination against women is endemic in credit markets. Research evidences that women are denied credit opportunities more often, and are offered worse terms for credit, if offered any credit at all. Importantly, in the United States, credit discrimination disproportionately impacts Black and African American women.

Persons who do not identify their gender to Facebook, or who identify as neither male nor female, are rarely, if ever, shown credit ads of any type. When users either do not identify a gender on their Facebook profile, or identify as non-binary and/or trans, Facebook counts them as having an "unknown" gender identity. Across every class of advertisement, people identified as having an "unknown" gender are, on average, less than 1 percent of the total demographic shown any ad, if shown ads at all.

Discrimination against LGBPQ, transgender and/or non-binary persons is systematic and pervasive, but hard to measure, in the United States [9, 27]. Based on our audit findings, Facebook needs to: (a) publicly explain whether the HEC advertising portals are designed to send ads to every gender by default, and (b) specifically, whether the default options on the portal disproportionately exclude non-binary persons and/or persons who are transgender from receiving ads for economic opportunities. Facebook should consult with LGBTQ+ advocacy, policy and community organizations to discuss ways to measure and account for how ads for economic opportunities are distributed among persons who identify their gender as non-binary and/or transgender, persons with user-specified gender identities, and those who choose not to identify. We recommend that Facebook consult the HCI Guidelines for Gender Equity and Inclusivity developed by Morgan Klaus Scheuerman and coauthors [26].

Race and Ethnicity. Facebook claims that its platform does not allow advertisers to target users by race or ethnicity, including when advertisements are for economic opportunities. At the same time, Facebook provides tools to advertisers that help them learn which user attributes are proxies for demographic traits, including race and ethnicity. Meanwhile, the Facebook Ad Library API does not provide researchers any information about the distribution of ads by race or ethnicity. Facebook claims the reason the Ad Library does not provide this data is the privacy of users. However, the Ad Library API only provides aggregate and anonymous demographic data at the state, not town, city or county, level. Therefore, it would be nearly impossible to identify any individual Facebook user from statistics describing how ads were distributed by race and ethnicity. If Facebook cares about the privacy of users, and specifically about the privacy of data about their race and ethnicity, Facebook should not provide advertisers tools that help them target users by race and ethnicity.
Given Facebook's track record on civil rights, until then, we recommend that Facebook make aggregate and anonymous data about the distribution of advertisements by race and ethnicity at the state level publicly available via the Ad Library API. Facebook allows advertisers to learn about and use proxy attributes to target users by race, but provides no information about the distribution of ads by race and ethnicity. Auditors, therefore, and the public, have no way to analyze whether patterns of racial discrimination in credit markets are reflected in the distribution of economic opportunities on Facebook platforms.

Responsibility for Harm. In civil rights lawsuits, Facebook has argued that advertisers are to blame for discrimination on its platform, and that it merely provides "neutral" tools to advertisers. However, in a settlement of legal charges, Facebook promised to change, and subsequently changed, the design of the advertising portal. Advertisers must now use the HEC advertising portal when publishing ads for housing, employment, and credit (so-called HEC ads). The HEC portal does not allow advertisers to target users by gender and age. The HEC portal also limits the proxy attributes that advertisers can use to target users. If changes to the advertising portal indeed prevented advertisers from using protected demographic or proxy attributes to target users, then we would expect that no HEC ads would be distributed to only one gender or age demographic group.

Our audit reveals that ads are not only disproportionately sent to one gender identity, but that the distribution of ads is skewed by age, and, importantly, some ads are still sent to only one gender. That ads for economic opportunities are still sent to only one demographic, while others are excluded, suggests that the changes to the Facebook advertising portal do not prevent discrimination in the distribution of HEC ads.

One possible explanation is that advertisers did not disclose that their ads were HEC-related, and so were not subject to targeting limitations. Second, it is possible the ad-distribution algorithm improperly optimized who should see advertisements, leading to systematic biases. Finally, it is possible that advertisers still managed to find proxy variables that allowed discrimination (see below). Regardless of the specific cause, our research suggests that much of this bias can be reduced. We discuss some potential mitigation approaches below.

Proxy Targeting.
Proxy Targeting. Even if advertisers cannot target specific genders, races and ethnicities, the Facebook Audience Insights page allows advertisers to find proxies for these attributes by displaying data such as device activity, work, and education that can be filtered by gender, age, location, and multicultural affinity. While this may be a useful tool for analyzing and predicting the impact of a given ad, it also introduces the possibility of targeting not constrained by law. Facebook thus allows advertisers to reinforce current biases and inequalities by identifying proxies for various groups of people through Audience Insights data.

The quantitative impact of proxy targeting on the demographic distribution of ad viewers should be investigated more deeply in future work, to determine whether these proxies are a workaround for the regulations on discrimination and targeting. Ways to restrict proxy targeting should also be investigated.
New questions and opportunities
Our work also suggests new questions about fairness and user rights. We hope, by writing this paper, to start a discussion about when disproportionate distribution is irrelevant, and when it is harmful. For instance, is a 1 percent difference between the percentages of demographic groups shown an ad reasonable? What about a 15 to 20 percent difference? In the remaining sections, we discuss considerations of harm, fairness and (in)equity in allocating resources.
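One way to ground this discussion is to measure, for every ad, the gap in percentage points between the most- and least-shown gender, and ask what fraction of ads exceed a candidate threshold. A sketch, under the same hypothetical export and column names as above:

import pandas as pd

ads = pd.read_csv("ads.csv")  # hypothetical export; columns as in Table 13

# Per-ad audience share by gender, in percentage points.
shares = (
    ads.groupby(["archiveID", "gender"])["percentage_demographic"].sum()
       .unstack("gender")
       .fillna(0)
)

# Gap between the most- and least-shown gender for each ad.
gap = shares.max(axis=1) - shares.min(axis=1)

# Fraction of ads whose gap exceeds each candidate harm threshold.
for threshold in (1, 15, 20):
    print(f"> {threshold} points: {(gap > threshold).mean():.1%} of ads")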
Stereotypes in Algorithmic decision-making. Algorithms are used in decision-making in many scenarios and contexts, including, but not limited to, providing credit, hiring, housing loans, and, more generally, advertisement distribution. Algorithms predict the future behavior of individuals using imperfect information/data about the past behavior of other individuals who belong to the same socio-cultural group [25]. Taking credit advertisements as an example, this statistical discrimination perpetuates a reinforcing cycle over iterations. If the qualification of a certain group, for example women, was historically lower than that of a dominant group, then the algorithm will not prefer individuals from the disadvantaged group when distributing credit ads among demographics. Further, as algorithms are written, developed and tested by humans, it is possible for personal biases and personally held stereotypes to seep in.

The distribution of advertisements for economic opportunities on Facebook reflects social stereotypes perpetuated in the United States. For example, advertisements for new credit lines are shown more to men, while advertisements for debt relief are shown more to women. Different types of job ads, which we briefly explored in this paper, are also distributed by an algorithm that shows signs of stereotyping. While women on the whole received more job ads than men, the positions advertised were traditionally roles filled by women, such as "secretary" or "nurse", whereas men were more likely to receive ads for traditionally masculine work such as construction. The division of labor by gender thus reinforces the general stereotypes of each gender.
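A toy simulation (ours, with illustrative numbers, not audit data) makes the cycle concrete: a group-blind delivery rule that allocates credit ads in proportion to past qualification preserves an initial disparity indefinitely and widens it in absolute terms, because only the people shown ads get the chance to build further credit history.

# Toy model of the reinforcing cycle; the starting values and growth rate
# are illustrative assumptions, not measurements from the audit.
qual = {"dominant": 0.55, "disadvantaged": 0.45}
for step in range(10):
    total = sum(qual.values())
    delivery = {g: q / total for g, q in qual.items()}  # ads shown per group
    # Qualification accrues only in proportion to ads seen, so the absolute
    # gap widens every step and the disadvantaged group's delivery share
    # never recovers, even though the rule never mentions group identity.
    qual = {g: qual[g] + 0.10 * delivery[g] for g in qual}
print({g: round(q, 3) for g, q in qual.items()})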
Legal and Ethical responsibilities of Platforms. In accordance with Section 230 of the Communications Decency Act, "No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider". In other words, a platform cannot, in theory, be held accountable for the content published on it, be it individual user posts or large-scale advertisement campaigns. However, since these platforms provide the tools through which advertisements are distributed, it is our belief that they should be responsible for designing these tools so that they cannot be misused.

Although Facebook limits the direct targeting options for HEC ads, it still relies on advertisers to self-disclose that their ad falls into one of these categories, which creates an opportunity for exploitation. Proxy targeting also offers a workaround for some of the targeting limitations on protected classes. It can be argued that, since such actions are expressly prohibited by the platform, Facebook should not be held accountable when advertisers choose to disobey its rules. However, our findings and the above discussion show that such misuses are not uncommon, and thus demonstrate structural flaws in the way Facebook handles advertisements on its platform. Indeed, the company has previously been involved in lawsuits over unfair advertisement distribution despite not being the entity that publishes the ads [24, 42].

Unfair distribution is important to consider, particularly when it comes to protected classes such as gender or race, because it may reinforce existing societal inequalities and biases. If one ad for a STEM position is disproportionately targeted toward men, it will likely result in a man being hired for that position; if a hundred different ads are, it may further entrench both the stereotype that women are not interested in STEM jobs and the reality that they do not get equal opportunities to apply for such positions. Because of this, we believe it is important to hold large platforms accountable for the way they allow their often-massive user bases to be reached by marketing campaigns. Advertisers may choose to post ads that are dishonestly classified, or targeted beyond the permitted categories using proxies, but these exploits are available solely because of the way Facebook functions as an advertising platform. This makes it Facebook's responsibility to maintain and update the tools it offers to advertisers if it wants to reduce the availability of such exploits.

Ultimately, our software and method for auditing Facebook advertising supports, and contributes new directions to, the algorithm and digital platform auditing literature. First, analyzing the demographic distribution of economic opportunity advertisements across advertisers, advertisement campaigns and over time is important. Preliminary findings from our Facebook audit indicate that evidence produced by advertising experiments does not necessarily generalize to overall trends in the distribution of advertisements on platforms. In other words, while experiments are useful for learning how features of digital platforms and ads could bias distributions on platforms, such studies do not necessarily tell us how ads are distributed in the wild among demographic groups. Collecting and analyzing data provided by digital platforms allows us to complement experimental methods with data about the demographic distribution of ads.
We applaud Facebook for making this data available, suggest that Facebook assess and improve the Ad Library API, and encourage digital platforms that distribute or impact the economic opportunities people have to create and release data APIs for transparency, accountability and auditing of their platforms.

Over the course of history, legalized and illegal discrimination has segregated markets, institutions and communities [16, 20, 37]. At the heart of our concern is that digital platforms and their advertising networks increasingly decide how to allocate economic opportunities among demographics, and that the historical nature of societal inequity is reflected in the outcomes. Disparity in the distribution of economic opportunity in the United States is a historic problem, disproportionately impacting persons who identify as Black and African American, Latino/a/x, LGBQ+, and disabled. Gender and age discrimination compound the impact of historic and present-day discrimination. Digital platforms that also discriminate compound the impact of prior societal biases over time, and have also called into question the legitimacy of judicial and legal systems, the possibility of criminal and civil justice, and the democratic process itself.

Remarkably, (digital) discrimination is not always overt or hostile. It sometimes manifests as an (algorithmic) preference for a familiar or favored group. At the same time, discrimination is also sometimes overt. For example, our audit uncovered systemic bias in the allocation of opportunities on a major advertising platform, Facebook. In particular, persons classified as having a custom gender, meaning they either did not wish to disclose their gender identity to Facebook and/or do not identify as male or female, were shown few if any advertisements for housing, employment or credit (so-called HEC ads). Women were also disadvantaged: most credit ads were distributed to a greater percentage of men. Finally, the distribution of most HEC ads evidenced an age bias. These audit findings are problematic and indicate that digital platforms and advertising networks reproduce historical and present-day societal biases that marginalize demographic groups. Future research should investigate the extent to which our audit findings generalize to other platforms and advertising networks where regulated economic opportunities are advertised.

Measuring harm in the distribution of economic opportunities advertised on digital platforms is hard. First, access to platform data, and particularly to demographic distribution data, is limited. Second, while legal rules establish causes of action under the law, U.S. federal rules tend toward a minimum standard and ignore undesirable bias on digital platforms. Third, beyond legal minimums, we should also care about equity in the allocation of opportunities. However, it is challenging to measure and determine whether a distribution is acceptable. People, communities and societies have different definitions of what is a permissible allocation of economic opportunities. For example, research shows that while "most White Americans accept basic principles of equal opportunity, at the same time, [they] resist the implementation of policies that would increase equality directly" [20].

Bias in advertising on digital platforms could be addressed by removing interface features that allow advertisers to target users by demographic traits.
However, it is debatable whether removing such features would suffice to redistribute the balance of economic opportunity ads. Instead, digital platforms and advertising networks might need to take a proactive, human-centered approach. For example, digital platforms could add features that allow users to instruct the platform to send them advertisements for economic opportunities only, or most often, when opportunities are more urgently needed, such as when they are unemployed or searching for work. Compare this to the status quo: at present, digital platforms permit advertisers to send job opportunities only to the already employed.

In researching whether the allocation of advertising is a social problem, we adopted and emphasize the sentiment that "roles for computing in social change" include "diagnostic" work [1]. That is, research which uncovers undesired outcomes but forgoes technological solutionism. In this role, the aim of computing work is to produce evidence for broader efforts that seek to create socially just systems. In this paper, we demonstrated a methodology to document and monitor how a digital platform allocates economic opportunities. Our findings indicated that digital platforms cannot simply, as they have done, tell advertisers not to use demographic targeting when their ads are for housing, employment or credit. Instead, advertising must be actively monitored. In addition, platform operators must implement mechanisms that actually prevent advertisers from violating norms and policies in the first place. Governments also have a role in improving the status quo. Incentives for making platform and advertising network data available to third-party auditors are needed. Penalties under the law that discourage audits by criminalizing certain digital audit methods (e.g., web scraping) need to be removed. Finally, government regulators could ease business concerns by issuing regulatory and legal guidance, and by providing technical assistance to help platform operators comply with established rules.

Devah Pager and Diana Karafin. January 2009. "Bayesian Bigot? Statistical Discrimination, Stereotypes, and Employer Decision Making." Annals of the American Academy, AAPSS, 621.

REFERENCES
[1] Rediet Abebe, Solon Barocas, Jon Kleinberg, Karen Levy, Manish Raghavan, and David G. Robinson. 2020. Roles for Computing in Social Change. In FAccT (previously called FAT*). arXiv.
[2] ACLU. March 2019. Summary of Settlements Between Civil Rights Advocates and Facebook.
[3] Berkeley Journal of Employment and Labor Law, Vol. 40.
[4] Muhammad Ali, Piotr Sapiezynski, Miranda Bogen, Aleksandra Korolova, Alan Mislove, and Aaron Rieke. 2019. Discrimination through Optimization: How Facebook's Ad Delivery Can Lead to Skewed Outcomes. arXiv (April 2019).
[5] Athanasios Andreou, Giridhari Venkatadri, Oana Goga, Krishna Gummadi, Patrick Loiseau, and Alan Mislove. 2018. Investigating Ad Transparency Mechanisms in Social Media: A Case Study of Facebook's Explanations. HAL archives-ouvertes.
Proceedings of the Fourteenth International AAAI Conference on Web and Social Media (ICWSM 2020).
[8] Stijn Baert. 2017. Hiring Discrimination: An Overview of (Almost) All Correspondence Experiments Since 2005. IZA Discussion Papers No. 10738, Institute of Labor Economics (IZA) (2017).
[9] Amanda K. Baumle, Lee M.V. Badgett, and Steven Boutcher. 2020. New Research on Sexual Orientation and Gender Identity Discrimination: Effect of State Policy on Charges Filed at the EEOC. Journal of Homosexuality (2020).
[10] Joy Buolamwini and Timnit Gebru. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of Machine Learning Research.
[11] https://www.consumerfinance.gov/about-us/blog/what-you-need-know-about-equal-credit-opportunity-act-and-how-it-can-help-you-why-it-was-passed-and-what-it/
[12] Patricia G. Devine, Patrick S. Forscher, Anthony J. Austin, and William T.L. Cox. 2012. Long-term Reduction in Implicit Race Bias: A Prejudice Habit-breaking Intervention. Journal of Experimental Social Psychology.
PNAS.
CHI.
Proceedings of the 29th Conference on Neural Information Processing Systems. ACM, Barcelona, Spain.
[20] Anthony G. Greenwald and Thomas F. Pettigrew. 2014. With Malice Toward None and Charity for Some: Ingroup Favoritism Enables Discrimination. American Psychologist (October 2014), 669–684.
[21] Anja Lambrecht and Catherine Tucker. 2019. Algorithmic Bias? An Empirical Study of Apparent Gender-Based Discrimination in the Display of STEM Career Ads. Management Science 65, 7 (2019), 2966–2981.
[22] Karen Levy and Solon Barocas. 2018. Designing Against Discrimination in Online Markets. Berkeley Technology Law Journal 32, 3 (2018).
[23] Linfeng Li, Tawanna R. Dillahunt, and Tanya Rosenblat. 2019. Does Driving as a Form of 'Gig Work' Mitigate Low-Skilled Job Seekers' Negative Long-Term Unemployment Effects? In Proceedings of the ACM on Human-Computer Interaction (CSCW '19, Vol. 3). ACM, Article 156.
[24] Jenner & Block LLP. 2019. HUD Brings Housing Discrimination Charge Against Facebook. Lexology (April 29, 2019).
[25] Daniel McNamara. 2019. Algorithmic Stereotypes: Implications for Fairness of Generalizing from Past Data. In AIES '19: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. https://dl.acm.org/doi/pdf/10.1145/3306618.3314312
[26] Morgan Klaus Scheuerman, Katta Spiel, Oliver L. Haimson, Foad Hamidi, and Stacy M. Branham. HCI Guidelines for Gender Equity and Inclusivity.
The Economic Case for LGBT Equality: Why Fair and Equal Treatment Benefits Us All.
Social Science One (April 2019).
[32] Komal S. Patel. 2020. Testing the Limits of the First Amendment: How Online Civil Rights Testing is Protected Speech Activity. Columbia Law Review.
[33] Iowa Law Review (2020). https://ssrn.com/abstract=3347959
[34] Inioluwa Deborah Raji and Joy Buolamwini. 2019. Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance of Commercial AI Products. In Proceedings of the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES). AAAI and ACM, Honolulu, Hawaii.
[35] Inioluwa Deborah Raji, Andrew Smart, Rebecca N. White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, and Parker Barnes. 2020. Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing. In Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT). ACM, Barcelona, Spain.
[36] Gaurang Rami. 2018. Trends and Factors Affecting Female Labour Force Participation Rate in India. Journal of Emerging Technologies and Innovative Research 5, 11 (2018).
[37] Peter Romer-Friedman. February 2020. Testimony of Peter Romer-Friedman Before the U.S. House of Representatives Committee on Education and Labor, Subcommittee on Civil Rights and Human Services. The Future of Work: Protecting Workers' Civil Rights in the Digital Age.
[38] Christian Sandvig, Kevin Hamilton, Karrie Karahalios, and Cédric Langbort. 2014. Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms.
[39] Piotr Sapiezynski, Avijit Ghosh, Levi Kaplan, Alan Mislove, and Aaron Rieke. 2019. Algorithms That "Do Not See Color": Comparing Biases in Lookalike and Special Ad Audiences. arXiv (December 2019).
[40] Piotr Sapiezynski, Valentin Kassarnig, Christo Wilson, Sune Lehmann, and Alan Mislove. 2017. Academic Performance Prediction in a Gender-Imbalanced Environment. In FATREC. Como, Italy.
[41] Opiotennione v. Facebook. June 12, 2020. Plaintiff's Opposition to Defendant's Motion to Dismiss.
[42] Opiotennione v. Facebook. March 31, 2020. Amended Complaint for Violations of State Laws: Class Action Demand for Jury Trial.
[43] Opiotennione v. Facebook. May 8, 2020. Defendant Facebook, Inc.'s Notice of Motion to Dismiss Plaintiff's First Amended Complaint.

A APPENDIX
A.1 Description of Advertisement Database
Table 13, below, lists the variables or features that are available for advertisements in our database.

Table 13. Variables in Ad Database
Variable                  Description
archiveID                 ID that locates the ad in the Ad Library
ad_creation_time          Time and date (UTC) that the ad was created
text                      Text displayed in the main body of the ad
url_caption               If an ad has an embedded website link, the text that appears in the link
url_description           If an ad has an embedded website link, the text that appears with the link
url_title                 If an ad has an embedded website link, the text that appears as the link title
ad_delivery_start_time    Time and date (UTC) an ad started running
ad_delivery_stop_time     Time and date (UTC) an ad stopped running
embedded_url              Permanent URL link to the location of the ad in the Ad Library
currency                  Currency used to pay for the ad
funding_entity            Name of the person or organization that paid for the ad
impressions               Minimum and maximum impressions
potential_reach           Minimum and maximum potential audience size
page_id                   ID for the Facebook page that ran the ad
page_name                 Name of the Facebook page that ran the ad
publisher_platforms       List of platforms that the ad was displayed on
region_distribution       Geographic distribution by U.S. state, given as a percentage
spend                     Minimum and maximum money spent on an ad
age                       Age range for the observations in the database row
gender                    Gender identity for the observations in the database row
percentage_demographic    Percentage of the total demographic shown an ad
predicted_label           Predicted marketing class of the ad
ad_subClass               Predicted subclass of the ad marketing class
max_percentage            Maximum percentage of the demographic sent an ad
max_genderDemographic     Gender that is the greatest percentage of the demographic
max_ageDemographic        Age group that is the greatest percentage of the demographic
startTimeDayOfWeek        Day of the week that the ad started running
stopTimeDayOfWeek         Day of the week that the ad stopped running
adStartWeek               Week of the year that the ad started running
adStopWeek                Week of the year that the ad stopped running
adDuration_Seconds        Number of seconds that the ad ran on the platform
adDuration_Minutes        Number of minutes that the ad ran on the platform
adDuration_Hours          Number of hours that the ad ran on the platform
adDuration_Days           Number of days that the ad ran on the platform
adDuration_Weeks          Number of weeks that the ad ran on the platform
adDuration_Months         Number of months that the ad ran on the platform
adStart_Semester          Whether the ad started running in the 1st or 2nd half of the year
adStop_Semester           Whether the ad stopped running in the 1st or 2nd half of the year
startQuarter              Financial quarter that the ad started running on the platform
stopQuarter               Financial quarter that the ad stopped running on the platform
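Many of the raw fields above correspond to fields returned by the Facebook Ad Library API, from which our database was built. The sketch below shows roughly how such rows are collected; the field names reflect the API as we used it in 2019-2020 and may since have changed, and the API version, access token and search term are placeholders.

import requests

ACCESS_TOKEN = "YOUR_TOKEN"  # placeholder; requires Ad Library API access
URL = "https://graph.facebook.com/v7.0/ads_archive"  # version current as of 2020

params = {
    "access_token": ACCESS_TOKEN,
    "ad_reached_countries": "US",
    "search_terms": "jobs",       # placeholder query term
    "ad_active_status": "ALL",
    "fields": ",".join([
        "id", "ad_creation_time", "ad_creative_body",
        "ad_creative_link_caption", "ad_creative_link_description",
        "ad_creative_link_title", "ad_delivery_start_time",
        "ad_delivery_stop_time", "currency", "funding_entity",
        "impressions", "potential_reach", "page_id", "page_name",
        "publisher_platforms", "region_distribution",
        "demographic_distribution", "spend",
    ]),
    "limit": 100,
}

resp = requests.get(URL, params=params).json()
for ad in resp.get("data", []):
    # demographic_distribution holds one entry per (age, gender) cell of the
    # audience, which we flatten into the per-row schema of Table 13.
    for cell in ad.get("demographic_distribution", []):
        print(ad["id"], cell.get("age"), cell.get("gender"), cell.get("percentage"))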