Toward Fairness in AI for People with Disabilities: A Research Roadmap
Anhong Guo, Ece Kamar, Jennifer Wortman Vaughan, Hanna Wallach, Meredith Ringel Morris
Microsoft Research, Redmond, WA & New York, NY, USA
Human-Computer Interaction Institute, Carnegie Mellon University, Pittsburgh, PA
[email protected], {eckamar, jenn, wallach, merrie}@microsoft.com
ABSTRACT
AI technologies have the potential to dramatically impact the lives of people with disabilities (PWD). Indeed, improving the lives of PWD is a motivator for many state-of-the-art AI systems, such as automated speech recognition tools that can caption videos for people who are deaf and hard of hearing, or language prediction algorithms that can augment communication for people with speech or cognitive disabilities. However, widely deployed AI systems may not work properly for PWD, or worse, may actively discriminate against them. These considerations regarding fairness in AI for PWD have thus far received little attention. In this position paper, we identify potential areas of concern regarding how several AI technology categories may impact particular disability constituencies if care is not taken in their design, development, and testing. We intend for this risk assessment of how various classes of AI might interact with various classes of disability to provide a roadmap for future research that is needed to gather data, test these hypotheses, and build more inclusive algorithms.
Author Keywords
Artificial intelligence; machine learning; data; disability; accessibility; inclusion; AI fairness; AI bias; ethical AI.
CCS Concepts
• Computing methodologies → Artificial intelligence; • Human-centered computing → Accessibility; • Social and professional topics → Codes of ethics; People with disabilities;
INTRODUCTION
As AI systems increasingly pervade modern life, ensuring that they work fairly for all is an important challenge. Researchers have identified unfair gender and racial bias in existing AI systems [2, 7, 9]. To understand how AI systems work across different groups of people, it is necessary to develop inclusive tools and practices for evaluation and to identify cases in which homogeneous, non-inclusive data [9] or data reflecting negative historical biases [2, 7] is used for system training.
ACM ASSETS 2019 Workshop on AI Fairness for People with Disabilities
Although improving the lives of people with disabilities (PWD) is a motivator for many state-of-the-art AI systems, and although such systems have the potential to mitigate many disabling conditions [6], considerations regarding fairness in AI for PWD have thus far received little attention [73]. Fairness issues for PWD may be more difficult to remedy than fairness issues for other groups, particularly where people with particular classes of disability may represent a relatively small proportion of a population. Even if included in training and evaluation data, they may be overlooked as outliers by current AI techniques [73]. Such issues threaten to lock PWD out of access to key technologies (e.g., if voice-activated smart speakers do not recognize input from people with speech disabilities), inadvertently amplify existing stereotypes against them (e.g., if a chatbot learns to mimic someone with a disability), or even actively endanger their safety (e.g., if self-driving cars are not trained to recognize pedestrians using wheelchairs).

We propose the following research agenda to identify and remedy shortcomings of AI systems for PWD: (1) Identify ways in which inclusion issues for PWD may impact AI systems; (2) Test inclusion hypotheses to understand failure scenarios and the extent to which existing bias mitigation techniques (e.g., [18, 33, 37]) work; (3) Create benchmark datasets to support replication and inclusion (and handle the complex ethical issues that creating such datasets for vulnerable groups might involve); and (4) Innovate new modeling, bias mitigation, and error measurement techniques in order to address any shortcomings of status quo methods with respect to PWD.

In this position paper, we take a step toward the first of these goals by reflecting on ways in which current key classes of AI systems may necessitate particular consideration with respect to different classes of disability.
Systematically studying the extent to which these interactions exist in practice, or demonstrating that they definitely do not, is an important next step toward creating AI inclusive of PWD; however, articulating the extent of a problem is a necessary precursor to remediation.

Throughout this paper, we use people-first language as suggested by the ACM SIGACCESS guidelines [32], but we recognize that some people may choose identity-first language or other terminology. Note that we use the term “disability” in accordance with the social model of disability [62], which emphasizes that an impairment (i.e., due to a health condition or even a particular situational context) results in disability due to non-accommodating social or environmental conditions; under this model, AI systems could either mitigate or amplify disability depending on how they are designed.

Furthermore, we note that the question of whether it is even ethical to build certain categories of AI is an important one (and may be dependent on use context). Our mention of various classes of AI is not an endorsement of whether we think such systems should be built, but is simply describing how they may interact with disability. Indeed, there is a larger ethical discussion to be had on how limiting some types of AI with negative associations (like synthetic voices that could be used for deepfakes [11]) might disenfranchise PWD who could benefit from such tech (i.e., by limiting the opportunity to realistically reproduce the voice of someone who can no longer speak).

RISK ASSESSMENT OF EXISTING AI SYSTEMS FOR PWD
Here, we group existing classes of AI systems by related functionalities, and identify disability constituencies for whom these systems may be problematic. This risk assessment is meant as a starting point to spark further research, and may not be exhaustive. For example, as new AI technologies are developed, they will require consideration with respect to disability. Additionally, while we strove to anticipate ways in which classes of AI may fail for some disability groups, we may not have exhaustively identified all such groups; indeed, the “long tail” of disability and the potential co-occurrence of multiple disabilities are two of many reasons that ensuring AI inclusion for PWD is particularly challenging [73].
Computer Vision
Computer vision systems analyze still or video camera inputs to identify patterns, such as the presence and attributes of faces, bodies, or objects. Disabilities that may impact a person’s physical appearance (facial features, facial expressions, body size or proportions, presence of assistive equipment, atypical motion properties) are important to consider when designing and testing the fairness of computer vision algorithms.
Face Recognition
Face recognition systems include capabilities for identifying the presence of a face and/or making inferences about its properties, including face detection, identification (i.e., to guess the identity of a specific person), verification (i.e., to validate a claimed identity), and analysis (e.g., gender classification, emotion analysis). Face recognition systems are already used in a wide variety of scenarios, including biometric authentication [3, 52], security systems [21], criminal justice [61], interview support software [34], and social/entertainment applications [23], many of which are controversial.

We hypothesize that such techniques may not work well for people with differences in facial features and expressions if they were not considered when gathering training data and evaluating models. For instance, various aspects of facial analysis software may not work well for people with conditions such as Down syndrome, achondroplasia, cleft lip/palate, or other conditions that result in characteristic facial differences. Such systems may also fail for people who are blind, since blindness may not only result in differences in eye anatomy, but may also result in a person wearing medical or cosmetic aids such as dark glasses, and may produce unanticipated behaviors, such as a person not holding their face toward a camera at the expected angle. Emotion processing algorithms may misinterpret the facial expressions of someone with autism or Williams syndrome, who may not emote in a conventional manner; expression interpretation may also be problematic for people who have experienced stroke, Parkinson’s disease, Bell’s Palsy, or other conditions that restrict facial movements.

Body Recognition
Body recognition systems include capabilities for identifying the presence of a body and/or making inferences about its properties, such as body detection, identification, verification, and analysis. Body recognition systems can power applications using gesture recognition (e.g., in VR and AR [4, 49] or gaming [47]), or gait analysis (e.g., for biometric authentication [78], sports biomechanics [54], and path predictions used by self-driving vehicles [74]).

Body recognition systems may not work well for PWD characterized by body shape, posture, or mobility differences. For example, gesture recognition systems are unlikely to work well for people with differences in morphology (e.g., a person with an amputated arm may be unable to perform bimanual gestures, or may grip a device differently than expected; a person with polydactyly may touch a screen in an unanticipated pattern). Failure of gesture recognition systems is also likely in cases where disability affects the nature of motion itself, such as for someone who experiences tremor or spastic motion [56, 57]. Fatigue may also impact gesture performance (and therefore recognition accuracy) over time, particularly for groups that may be more susceptible to fatigue, such as due to disability or advanced age. The scheduling of medications whose main or side effects mitigate or amplify motor symptoms such as tremor may also result in differential gesture performance within or across days.

People who are unable to move at all or who have severely restricted motion (e.g., people with ALS or quadriplegia) may be locked out of using certain technologies if body recognition is the only permitted interaction.
Further, body recognition systems may not work well for people with mobility or morphology differences. For example, if a self-driving car’s pedestrian-detection algorithm does not include examples of people with posture differences (e.g., due to cerebral palsy, Parkinson’s disease, or advanced age) or people who use wheelchairs during its training and evaluation, it may not correctly identify such people as objects to avoid, or may incorrectly estimate the speed and trajectory of those who move differently than expected, similar to Uber’s recent self-driving car accident that killed a pedestrian walking a bicycle [14].

Object, Scene, and Text Recognition
Object, scene, and optical character recognition (OCR) systems recognize common objects, logos, text, handwriting, etc., and output labels, captions, and/or properties (i.e., location, activity, relationship). Systems taking advantage of these capabilities have been widely adopted by PWD, particularly people who are blind or have low vision.

Many gesture systems use computer vision [40, 49], but some use other sensors, such as capacitive touchscreens [15], accelerometers within devices [30, 42], etc.; body and mobility differences may create problems regardless of sensor type, though different sensor classes may have pros and cons for particular populations.
Speech Systems
We use the term “speech systems” to refer to AI systems that recognize the content (i.e., words) and/or properties (i.e., prosody, speaker demographics) of speech, or that generate speech from symbolic inputs such as text, Speech Synthesis Markup Language (SSML), or other encodings. Disabilities that may impact the content or clarity of a user’s speech, as well as those impacting the ability to perceive sound, may reduce the accuracy and usability of speech systems.
Speech Recognition
Automatic Speech Recognition (ASR) systems take in speech and output text. ASR systems have the potential to be important accessibility tools for people who are deaf or hard of hearing (DHH), such as by producing captions that can be overlaid as subtitles on videos [24, 76], or possibly even using augmented reality to live-caption face-to-face speech [35]. Speech input is also useful for people who have difficulty using their hands to control traditional input devices [5].

ASR may not work correctly for people with atypical speech. ASR systems are known to have bias; for instance, many systems perform better for men than women [58, 66, 68]. Today, many ASR systems do not work well for some older adults, due to differences in pitch, pacing, and clarity of speech by people of very advanced ages, since they are not commonly represented in the training and evaluation of the systems [67]. People with accents, including accents due to disability (e.g., “deaf accent”), also face challenges using current ASR tools [20, 27, 68], though it is possible to train personalized models for such groups [16, 75]. Speech disabilities such as dysarthria, as well as the use of speech-generating augmentative and alternative communication (AAC) devices, can also negatively impact ASR functionality [38]. Further, people who are unable to speak at all (i.e., some people who are deaf, people with some forms of aphasia) may be locked out of using ASR technologies. Additionally, error metrics used to evaluate many ASR systems, such as Word Error Rate, may not be adequate to capture the end-user experience of such tools, particularly for users with disabilities that may prevent them from verifying the system’s output (i.e., someone who is profoundly deaf must trust the output of ASR captioning).
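To make the Word Error Rate concern concrete: WER is the word-level Levenshtein edit distance between a reference transcript and the ASR hypothesis, normalized by reference length, so it weights every word error equally regardless of its consequence for a user who cannot verify the output. A minimal sketch (the example sentences are invented for illustration):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("meet me at the north entrance",
                      "meet me at the north entrance"))  # 0.0
print(word_error_rate("meet me at the north entrance",
                      "meet me at the south entrance"))  # one substitution: ~0.167
```

The second transcript scores a seemingly mild WER of about 0.17, yet the single substituted word misdirects a reader who must trust the captions, illustrating how a standard aggregate error metric can understate end-user impact.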
Speech Generation
Speech generation technologies include text-to-speech (TTS) systems that aim to generate realistic audio from symbolic inputs such as text, SSML, or other markup, as well as emerging AI tools such as voice fonts [10, 48], which aim to realistically mimic the sound of a particular speaker. TTS systems have been widely deployed in voice assistants such as Cortana, Alexa, Siri, and the Google Assistant; TTS is also key to many assistive technologies, including screen readers used by people who are blind and AAC devices used by people with speech and motor disabilities. Voice banking to create personalized voice fonts may be particularly valued by people with degenerative conditions that result in progressive loss of speaking abilities (e.g., ALS) [19, 38].

System defaults for what constitutes comprehensible speaking rates may need adjustments for particular disability segments; development of error metrics related to comprehension may need to include such populations in order to account for diverse user needs. For instance, people with cognitive or intellectual disabilities may require slower speech rates, whereas people with visual impairments may find default rates too slow [77]. Text-based prediction techniques are often deeply intertwined with speech generation in the case of AAC technologies; the choice of training and evaluation corpora for prediction may need to be adapted to be relevant to the topical needs and desired speech attributes of AAC users, supporting expressivity and authentic self-representation [38].
Speaker Analysis
Speaker analysis systems include capabilities for speaker identification, speaker verification, and making inferences about the speaker’s attributes such as age, gender, and emotion. Speaker analysis systems have a wide range of applications including biometric authentication [59], enhancing speech transcription [72], and personalization [26]. Speaker analysis systems also have the potential to be important accessibility tools for people who are DHH, such as by supporting sound awareness through visualizations [36].

Speaker recognition and speech analysis tools that make inferences about a user’s personal characteristics (i.e., gender, age) may not work well for PWD with conditions that significantly impact the sound of speech (e.g., dysarthria). Analysis tools that attempt to infer emotional state from prosodic features are likely to fail for speakers with atypical prosody, such as people with autism or some types of dementia.
Text Processing
Text processing systems perform functions related to understanding the content of text data, including tasks such as text analysis and translation. Text processing systems are likely to have accuracy and fairness challenges for people with cognitive and/or intellectual disabilities; systems for minority languages used by disability subcommunities, such as American Sign Language, are also a concern [8].
Text Analysis
Text analysis systems take text as input, and may attempt to detect content properties (e.g., key phrases, named entities, language) and/or author properties (e.g., sentiment, personality, demographics). Text analysis is broadly applied in record management, information retrieval, and pattern mining. Text analysis systems have the potential to be helpful for PWD with conditions that impact reading and writing, such as dyslexia, dysgraphia, or other cognitive differences, such as through visual illustration and focused highlighting [50] or through intelligent spelling and grammar correction and word or phrase suggestions [28].

Cognitive and intellectual disabilities are likely to impact the efficacy and utility of many aspects of text analysis systems. For example, there is some evidence that spelling correction and query rewriting tools may not accurately handle dyslexic spelling [53, 65]. Further, people with autism may express emotion differently in writing than people who are neurotypical, resulting in incorrect classifications of their emotional state or personality. If these classifications are used as input to an automatic hiring system [71] or to automatic essay grading systems used with many standardized aptitude tests, text analysis systems can create accuracy and fairness challenges for people with cognitive and/or intellectual disabilities.
Integrative AI
In addition to the aforementioned classes of systems for vision, speech, and text processing, which were focused on single models, many complex AI systems are architectures integrating several models together to achieve more complex behavior. Here, we discuss two common examples of integrative AI: information retrieval and conversational agents.
Information Retrieval
Information retrieval (IR) tools, such as those that power web search engines, rely on AI for a variety of purposes, including query rewriting, autocompletion suggestions, spelling corrections, search result ranking, content summarization, and question answering. The input and output of IR systems can have many formats, e.g., image, video, sound, or text.

It is likely that many IR systems may inadvertently amplify existing biases against PWD, such as through returning stereotyped and/or over- and under-represented content in search results (a problem that has been documented with respect to gender in image search results [39] and word embeddings [7]). AI systems for advertising, both content-based (i.e., related to the current search query) and behavior-based (i.e., related to a user’s personal characteristics), are also a key component of many commercial IR systems, as well as other online ecosystems (e.g., social media). Advertising algorithms and other types of recommender systems may hold particular risk for PWD by actively propagating discriminatory behavior, such as through differential pricing for products and services and/or differential exposure to employment or other opportunities (an issue for which Facebook recently encountered legal trouble, by allowing housing ads that may have differentiated among protected demographics, including PWD [70]). IR systems may pose particular challenges for people with cognitive or intellectual disabilities if not trained and tested with these groups; for example, people with dyslexia have reported that status quo query completion and result ranking techniques may not match their abilities [53].
Conversational Agents
Conversational agents provide conversational experiences to end users for various practical applications, including customer service [69], education [13], and health support [22]. They are powered by a variety of models, e.g., ASR, text analysis, TTS, and/or speaker analysis. Conversational agents have the potential to reduce users’ workload when completing unfamiliar tasks [29], and could potentially provide cognitive assistance to people with dementia or intellectual disabilities that impact memory or executive functioning [43].

If not carefully built, conversational agents could amplify existing biases against PWD, such as through returning stereotyped content in conversations (e.g., Microsoft shut down the chatbot Tay because it started generating hate speech learned from coordinated malicious users [46]). Further, conversational agents may not work well for people with cognitive and/or intellectual disabilities, resulting in poor user experience. Training conversational agents on corpora that include data from people with a variety of cognitive and intellectual capabilities, as well as testing with similarly diverse audiences, is particularly important. For example, conversational agents may need to correctly interpret atypical spelling or phrasing from users with dyslexia, or may need to adjust their vocabulary level to be understood by someone with dementia. Further, conversational agents may need to support conversation in a user’s preferred expressive medium, which may not be written language for some disability segments; i.e., it may be important to support communication via sign languages (for people who are deaf) or via pictures and/or icons (for people with aphasia or autism).
Other AI Techniques
In addition to assessing risk factors for particular classes of AI applications, it is also worth considering that many AI techniques and practices that comprise the building blocks of such systems may lead to biases against PWD, such as techniques for outlier detection, practices of evaluating systems through aggregate metrics, definitions of objective functions, and the use of training data that do not capture the true use cases or the true complexity of the real world.

Outlier detection algorithms flag outlier input, typically for punitive action, such as fraud detection. Lack of or low representation in training and evaluation data may erroneously result in people with a variety of disabilities being inadvertently flagged by anomaly detection tools, even when their actions should constitute legitimate system inputs. For example, many systems use task completion time as a signal for automatically determining input legitimacy, ranging from CAPTCHAs that aim to distinguish humans from bots to online crowd labor markets that aim to distinguish legitimate workers from spammers [79]. However, many types of disability might manifest in atypical task performance timing, including the use of screen reader or magnifier tools by people with vision impairments, difficulty performing quick and accurate motions by people with a variety of motor-limiting conditions, access to devices through switch inputs due to motor limitations, slow reading times due to cognitive disabilities such as dyslexia, etc.

A common approach in evaluating AI systems is measuring performance with aggregate metrics such as accuracy, area under the curve (AUC), or mean square error (MSE). Aggregate metrics hide how performance varies across groups, in particular performance drops for small classes such as PWD [60]. Objective functions that aim to maximize aggregate metrics will likely fail to prioritize performance for PWD.
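The masking effect of aggregate metrics is easy to demonstrate. In the following sketch, all group sizes and accuracy rates are invented for illustration: a small disabled-user subgroup with much lower accuracy barely moves the overall number.

```python
import random

random.seed(0)

# Hypothetical evaluation set: 950 majority-group users whose inputs the
# model handles correctly ~92% of the time, and 50 disabled users for
# whom accuracy is only ~55%.
results = [("majority", random.random() < 0.92) for _ in range(950)] + \
          [("minority", random.random() < 0.55) for _ in range(50)]

overall = sum(ok for _, ok in results) / len(results)
per_group = {
    g: sum(ok for grp, ok in results if grp == g) /
       sum(1 for grp, _ in results if grp == g)
    for g in ("majority", "minority")
}

print(f"aggregate accuracy: {overall:.3f}")   # looks healthy (~0.90)
print(f"per-group accuracy: {per_group}")     # reveals the ~0.55 subgroup
```

Reporting the per-group breakdown alongside the aggregate makes the performance drop visible instead of averaging it away.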
Recent work has introduced techniques that expand the objective functions for model training with terms that penalize performance discrepancies between subgroups [1].

Most AI systems are trained with existing datasets (i.e., data scraped from public corpora such as Flickr images [12]). In some cases, existing datasets may fail to capture the complexity of the real world and may lack representation of diverse groups, such as PWD. This may lead to blind spots in AI models [41]. Actively curating inclusive datasets may be particularly important not only for training, but also for testing AI systems against known benchmarks.
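Returning to the timing-based legitimacy checks discussed above, a naive outlier rule can misflag a legitimate slow user. In this sketch, all timings and the two-standard-deviation threshold are hypothetical:

```python
import statistics

# Hypothetical task-completion times in seconds. Most users cluster
# around 10s; one legitimate screen-reader user takes 48s.
times = [9.1, 10.4, 8.7, 11.2, 9.8, 10.1, 9.5, 10.9, 8.9, 48.0]

mean = statistics.mean(times)
stdev = statistics.stdev(times)

# Naive rule: flag anything more than 2 standard deviations from the mean.
flagged = [t for t in times if abs(t - mean) > 2 * stdev]
print(flagged)  # the 48s screen-reader user is flagged as an anomaly
```

The rule cannot distinguish a spammer from a screen-reader user; whether such thresholds are fair depends on whether data from disabled users informed their calibration.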
DISCUSSION
Our research roadmap for increasing fairness in AI for PWD included four proposed steps; this position paper mostly focused on the first: identifying ways in which (lack of) inclusion in training and evaluation of AI systems may negatively impact such systems’ fairness for PWD. To address this, we discussed ways in which common categories of AI may need to account for various types of disabilities.

Regarding the types of potential harm caused by unfair AI, most of our examples are related to quality of service [9], like voice-activated smart speakers that may not recognize input from people with speech disabilities. Others are related to harms of allocation [2], like using an incorrect prediction of the emotional state or personality of someone with autism as input into an automatic hiring system, or denigration [46], like erroneously flagging inputs from PWD as invalid outliers. Additional potential harms include stereotyping [7] and over- or under-representation [39]; IR systems may inadvertently amplify existing biases against PWD by returning stereotyped and/or poorly represented content in search results. For issues related to allocation, quality of service, and representation, measuring objective fairness metrics through benchmarking could be sufficient to reveal bias, while issues related to stereotyping and denigration might require additional qualitative investigations. More thorough consideration of all types of harms with regard to PWD is important for future work.

In some cases, as indicated by the referenced citations, evidence already exists of problems for certain classes of AI for certain disability groups. For others, we have proposed hypotheses based on our knowledge of the domain space and analogous error cases for other minority user groups; our use of cautionary language such as “may cause” or “is likely” reflects this uncertainty.
CONCLUSION
In this position paper, we have reflected on the ways in which current classes of AI systems, as well as several techniques that are the building blocks of AI, may limit the efficacy and fairness of these systems for people with disabilities. Ultimately, our goal is the creation of new design guidelines, datasets, algorithmic techniques, and error metrics that can help AI systems realize their enormous potential to benefit PWD, while avoiding the possible pitfalls we have outlined here. We hope this paper provides a research roadmap that can guide AI researchers and practitioners in creating systems that are fair to and effective for PWD.
EFERENCES
1. Alekh Agarwal, Alina Beygelzimer, Miroslav Dudík,John Langford, and Hanna Wallach. 2018. A reductionsapproach to fair classification. arXiv preprintarXiv:1803.02453 (2018).2. Julia Angwin, Jeff Larson, Surya Mattu, Lauren Kirchner,and ProPublica. 2016. Machine Bias: There’s softwareused across the country to predict future criminals. Andit’s biased against blacks. (2016). Retrieved July 3, 2019from
3. Apple Inc. 2019a. About Face ID advanced technology.(2019). Retrieved July 3, 2019 from https://support.apple.com/en-us/HT208108
4. Apple Inc. 2019b. Augmented Reality – ARKit 3 – AppleDeveloper. (2019). Retrieved July 3, 2019 from https://developer.apple.com/augmented-reality/arkit/
5. Apple Inc. 2019c. macOS Catalina – Introducing VoiceControl. Your all-access to all devices. (2019). RetrievedJuly 3, 2019 from
6. Jeffrey P. Bigham and Patrick Carrington. 2018. Learningfrom the Front: People with Disabilities as EarlyAdopters of AI. In
HCIC 2018 .7. Tolga Bolukbasi, Kai-Wei Chang, James Y Zou,Venkatesh Saligrama, and Adam T Kalai. 2016. Man is tocomputer programmer as woman is to homemaker?debiasing word embeddings. In
Advances in neuralinformation processing systems . 4349–4357.8. Danielle Bragg, Oscar Koller, Mary Bellard, LarwanBerke, Patrick Boudreault, Annelies Braffort, NaomiCaselli, Matt Huenerfauth, Hernisa Kacorrim, TessaVerhoef, Christian Vogler, and Meredith Ringel Morris.2019. Sign Language Recognition, Generation, andTranslation: An Interdisciplinary Perspective. In
Proceedings of the 21st International ACM SIGACCESSConference on Computers and Accessibility . ACM.9. Joy Buolamwini and Timnit Gebru. 2018. Gender shades:Intersectional accuracy disparities in commercial genderclassification. In
Conference on Fairness, Accountabilityand Transparency . 77–91.10. Min Chu, Yong Zhao, and Sheng Zhao. 2010. Providingpersonalized voice font for text-to-speech applications.(April 6 2010). US Patent 7,693,719.11. CNN Business. 2019. Deepfake videos: Inside thePentagon’s race against disinformation. (2019). RetrievedJuly 3, 2019 from
12. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li,and Li Fei-Fei. 2009. Imagenet: A large-scalehierarchical image database. In . Ieee, 248–255.13. Duolingo. 2019. Say hello to the Bots. The mostadvanced way to learn a language. (2019). Retrieved July3, 2019 from http://bots.duolingo.com
14. The Economist. 2018. Why Uber’s self-driving car killeda pedestrian. (2018). Retrieved July 3, 2019 from
15. John Greer Elias, Wayne Carl Westerman, andMyra Mary Haggerty. 2010. Multi-touch gesturedictionary. (Nov. 23 2010). US Patent 7,840,912.16. Engadget. 2019. Google trains its AI to accommodatespeech impairments. (2019). Retrieved July 3, 2019 from
17. Facebook Artificial Intelligence. 2019. Does objectrecognition work for everyone? A new method to assessbias in CV systems. (2019). Retrieved July 3, 2019 from https://ai.facebook.com/blog/new-way-to-assess-ai-bias-in-object-recognition-systems/
18. Michael Feldman, Sorelle A Friedler, John Moeller,Carlos Scheidegger, and Suresh Venkatasubramanian.2015. Certifying and removing disparate impact. In
Proceedings of the 21th ACM SIGKDD InternationalConference on Knowledge Discovery and Data Mining .ACM, 259–268.19. Alexander J. Fiannaca, Ann Paradiso, Jon Campbell, andMeredith Ringel Morris. 2018. Voicesetting: VoiceAuthoring UIs for Improved Expressivity inAugmentative Communication. In
Proceedings of the2018 CHI Conference on Human Factors in ComputingSystems (CHI ’18) . ACM, New York, NY, USA, Article283, 12 pages.
DOI: http://dx.doi.org/10.1145/3173574.3173857
20. Raymond Fok, Harmanpreet Kaur, Skanda Palani,Martez E. Mott, and Walter S. Lasecki. 2018. TowardsMore Robust Speech Interactions for Deaf and Hard ofHearing Users. In
Proceedings of the 20th InternationalACM SIGACCESS Conference on Computers andAccessibility (ASSETS ’18) . ACM, New York, NY, USA,57–67.
DOI: http://dx.doi.org/10.1145/3234695.3236343
21. Australian Border Force. 2019. SmartGates. (2019).Retrieved July 3, 2019 from
22. Russell Fulmer, Angela Joerin, Breanna Gentile, Lysanne Lakerink, and Michiel Rauws. 2018. Using Psychological Artificial Intelligence (Tess) to Relieve Symptoms of Depression and Anxiety: Randomized Controlled Trial. JMIR Mental Health 5, 4 (2018), e64.
23. Google. 2019a. Google Photos Help – Search by people, things, & places in your photos. (2019). Retrieved July 3, 2019 from https://support.google.com/photos/answer/6128838?co=GENIE.Platform%3DAndroid&hl=en
24. Google. 2019b. YouTube Help – Use automatic captioning. (2019). Retrieved July 3, 2019 from https://support.google.com/youtube/answer/6373554?hl=en
25. Google. 2019c. With Lookout, discover your surroundings with the help of AI. (2019). Retrieved July 3, 2019 from
26. Google Assistant. 2017. Tomato, tomahto. Google Home now supports multiple users. (2017). Retrieved July 3, 2019 from https://blog.google/products/assistant/tomato-tomahto-google-home-now-supports-multiple-users/
27. Linda G. Gottermeier and Raja S. Kushalnagar. 2016. User Evaluation of Automatic Speech Recognition Systems for Deaf-Hearing Interactions at School and Work. Audiology Today 28, 2 (2016), 20–34.
28. Grammarly Inc. 2019. Great Writing, Simplified. (2019). Retrieved July 3, 2019 from
29. Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, and Jeffrey P. Bigham. 2019. StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible. In Proceedings of the 32nd Annual Symposium on User Interface Software and Technology (UIST ’19). ACM, New York, NY, USA. DOI: http://dx.doi.org/10.1145/3332165.3347873
30. Anhong Guo and Tim Paek. 2016. Exploring Tilt for No-touch, Wrist-only Interactions on Smartwatches. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI ’16). ACM, New York, NY, USA, 17–28. DOI: http://dx.doi.org/10.1145/2935334.2935345
31. Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, and Jeffrey P. Bigham. 2018. VizWiz Grand Challenge: Answering Visual Questions from Blind People. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3608–3617.
32. Vicki L. Hanson, Anna Cavender, and Shari Trewin. 2015. Writing About Accessibility. Interactions 22, 6 (Oct. 2015), 62–65. DOI: http://dx.doi.org/10.1145/2828432
33. Moritz Hardt, Eric Price, Nati Srebro, and others. 2016. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems. 3315–3323.
34. HireVue. 2019. HireVue – Hiring Intelligence | Assessment & Video Interview Software. (2019). Retrieved July 3, 2019 from
35. Dhruv Jain, Bonnie Chinh, Leah Findlater, Raja Kushalnagar, and Jon Froehlich. 2018. Exploring Augmented Reality Approaches to Real-Time Captioning: A Preliminary Autoethnographic Study. In Proceedings of the 2018 ACM Conference Companion Publication on Designing Interactive Systems (DIS ’18 Companion). ACM, New York, NY, USA, 7–11. DOI: http://dx.doi.org/10.1145/3197391.3205404
36. Dhruv Jain, Leah Findlater, Jamie Gilkeson, Benjamin Holland, Ramani Duraiswami, Dmitry Zotkin, Christian Vogler, and Jon E. Froehlich. 2015. Head-Mounted Display Visualizations to Support Sound Awareness for the Deaf and Hard of Hearing. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15). ACM, New York, NY, USA, 241–250. DOI: http://dx.doi.org/10.1145/2702123.2702393
37. Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. 2012. Fairness-aware classifier with prejudice remover regularizer. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 35–50.
38. Shaun K. Kane, Meredith Ringel Morris, Ann Paradiso, and Jon Campbell. 2017. “At Times Avuncular and Cantankerous, with the Reflexes of a Mongoose”: Understanding Self-Expression Through Augmentative and Alternative Communication Devices. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW ’17). ACM, New York, NY, USA, 1166–1179. DOI: http://dx.doi.org/10.1145/2998181.2998284
39. Matthew Kay, Cynthia Matuszek, and Sean A. Munson. 2015. Unequal Representation and Gender Stereotypes in Image Search Results for Occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15). ACM, New York, NY, USA, 3819–3828. DOI: http://dx.doi.org/10.1145/2702123.2702520
40. David Kim, Otmar Hilliges, Shahram Izadi, Alex D. Butler, Jiawen Chen, Iason Oikonomidis, and Patrick Olivier. 2012. Digits: Freehand 3D Interactions Anywhere Using a Wrist-worn Gloveless Sensor. In Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology (UIST ’12). ACM, New York, NY, USA, 167–176. DOI: http://dx.doi.org/10.1145/2380116.2380139
41. Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Eric Horvitz. 2017. Identifying unknown unknowns in the open world: Representations and policies for guided exploration. In Thirty-First AAAI Conference on Artificial Intelligence.
42. Gierad Laput, Robert Xiao, and Chris Harrison. 2016. ViBand: High-Fidelity Bio-Acoustic Sensing Using Commodity Smartwatch Accelerometers. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (UIST ’16). ACM, New York, NY, USA, 321–333. DOI: http://dx.doi.org/10.1145/2984511.2984582
43. Clayton Lewis. 2005. HCI for people with cognitive disabilities. ACM SIGACCESS Accessibility and Computing 83 (2005), 12–17.
44. LookTel. 2019. LookTel Money Reader. (2019). Retrieved July 3, 2019 from
45. Haley MacLeod, Cynthia L. Bennett, Meredith Ringel Morris, and Edward Cutrell. 2017. Understanding Blind People’s Experiences with Computer-Generated Captions of Social Media Images. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New York, NY, USA, 5988–5999. DOI: http://dx.doi.org/10.1145/3025453.3025814
46. Microsoft. 2016. Learning from Tay’s introduction. (2016). Retrieved July 3, 2019 from https://blogs.microsoft.com/blog/2016/03/25/learning-tays-introduction/
47. Microsoft. 2019a. Azure Kinect DK – Body Tracking SDK. (2019). Retrieved July 3, 2019 from https://azure.microsoft.com/en-us/services/kinect-dk/
48. Microsoft. 2019b. Custom Voice. (2019). Retrieved July 3, 2019 from https://speech.microsoft.com/customvoice
49. Microsoft. 2019c. Gestures – Mixed Reality. (2019). Retrieved July 3, 2019 from https://docs.microsoft.com/en-us/windows/mixed-reality/gestures
50. Microsoft. 2019d. Immersive Reader – An AI Service that helps users read and comprehend text. (2019). Retrieved July 3, 2019 from https://azure.microsoft.com/en-us/services/cognitive-services/immersive-reader/
51. Microsoft. 2019e. Seeing AI. (2019). Retrieved July 3, 2019 from
52. Microsoft. 2019f. Windows Hello: Discover facial recognition on Windows 10. (2019). Retrieved July 3, 2019 from
53. Meredith Ringel Morris, Adam Fourney, Abdullah Ali, and Laura Vonessen. 2018. Understanding the Needs of Searchers with Dyslexia. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI ’18). ACM, New York, NY, USA, Article 35, 12 pages. DOI: http://dx.doi.org/10.1145/3173574.3173609
54. Motion Analysis. 2019. Sports Biomechanics. (2019). Retrieved July 3, 2019 from https://motionanalysis.com/industry/sports-biomechanics/
55. Martez E. Mott, Jane E., Cynthia L. Bennett, Edward Cutrell, and Meredith Ringel Morris. 2018. Understanding the Accessibility of Smartphone Photography for People with Motor Impairments. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI ’18). ACM, New York, NY, USA, Article 520, 12 pages. DOI: http://dx.doi.org/10.1145/3173574.3174094
56. Martez E. Mott, Radu-Daniel Vatavu, Shaun K. Kane, and Jacob O. Wobbrock. 2016. Smart Touch: Improving Touch Accuracy for People with Motor Impairments with Template Matching. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI ’16). ACM, New York, NY, USA, 1934–1946. DOI: http://dx.doi.org/10.1145/2858036.2858390
57. Martez E. Mott and Jacob O. Wobbrock. 2019. Cluster Touch: Improving Touch Accuracy on Smartphones for People with Motor and Situational Impairments. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). ACM, New York, NY, USA, Article 27, 14 pages. DOI: http://dx.doi.org/10.1145/3290605.3300257
58. Antony Nicol, Chris Casey, and Stuart MacFarlane. 2002. Children are ready for speech technology – but is the technology ready for them? Interaction Design and Children, Eindhoven, The Netherlands (2002).
59. Nuance Communications, Inc. 2019. Every voice matters: Our system knows who is talking and why. (2019). Retrieved July 3, 2019 from
60. Besmira Nushi, Ece Kamar, and Eric Horvitz. 2018. Towards accountable AI: Hybrid human-machine analyses for characterizing system failure. In Sixth AAAI Conference on Human Computation and Crowdsourcing.
61. Federal Bureau of Investigation. 2019. Next Generation Identification (NGI). (2019). Retrieved July 3, 2019 from
62. Mike Oliver. 2013. The social model of disability: Thirty years on. Disability & Society 28, 7 (2013), 1024–1026.
63. OrCam. 2019. OrCam MyEye 2 – For the Blind and Visually Impaired. (2019). Retrieved July 3, 2019 from
64. KNFB Reader. 2018. KNFB Reader gives you easy access to print and files, anytime, anywhere. (2018). Retrieved July 3, 2019 from https://knfbreader.com
65. Luz Rello, Miguel Ballesteros, and Jeffrey P. Bigham. 2015. A Spellchecker for Dyslexia. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS ’15). ACM, New York, NY, USA, 39–47. DOI: http://dx.doi.org/10.1145/2700648.2809850
66. James A. Rodger and Parag C. Pendharkar. 2004. A field study of the impact of gender and user’s technical experience on the performance of voice-activated medical tracking application. International Journal of Human-Computer Studies 60, 5-6 (2004), 529–544.
67. S. Schlögl, G. Chollet, M. Garschall, M. Tscheligi, and G. Legouverneur. 2013. Exploring Voice User Interfaces for Seniors. In Proceedings of the 6th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA ’13). ACM, New York, NY, USA, Article 52, 2 pages. DOI: http://dx.doi.org/10.1145/2504335.2504391
68. Rachael Tatman. 2017. Gender and dialect bias in YouTube’s automatic captions. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing. 53–59.
69. TechCrunch. 2016. Facebook launches Messenger platform with chatbots. (2016). Retrieved July 3, 2019 from https://techcrunch.com/2016/04/12/agents-on-messenger/
70. The New York Times. 2019. Facebook Engages in Housing Discrimination With Its Ad Practices, U.S. Says. (2019). Retrieved July 3, 2019 from
71. TopResume. 2019. Get to Know the 5 Most Popular Pre-Employment Personality Tests. (2019). Retrieved July 3, 2019 from
72. Sue E. Tranter and Douglas A. Reynolds. 2006. An overview of automatic speaker diarization systems. IEEE Transactions on Audio, Speech, and Language Processing 14, 5 (2006), 1557–1565.
73. Shari Trewin. 2018. AI Fairness for People with Disabilities: Point of View. arXiv preprint arXiv:1811.10670 (2018).
74. University of Michigan. 2019. Teaching self-driving cars to predict pedestrian movement. (2019). Retrieved July 3, 2019 from https://news.umich.edu/teaching-self-driving-cars-to-predict-pedestrian-movement/
75. VentureBeat. 2019. How Microsoft is using AI to improve accessibility. (2019). Retrieved July 3, 2019 from https://venturebeat.com/2019/05/06/how-microsoft-is-using-ai-to-improve-accessibility/
76. The Verge. 2019. Android Q’s Live Caption feature adds real-time subtitles to any audio or video playing on your phone. (2019). Retrieved July 3, 2019 from
77. Alexandra Vtyurina, Adam Fourney, Meredith Ringel Morris, Leah Findlater, and Ryen White. 2019. VERSE: Bridging Screen Readers and Voice Assistants for Enhanced Eyes-Free Web Search. In Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility. ACM.
78. Liang Wang, Tieniu Tan, Huazhong Ning, and Weiming Hu. 2003. Silhouette analysis-based gait recognition for human identification. IEEE Transactions on Pattern Analysis and Machine Intelligence 25, 12 (2003), 1505–1518.
79. Kathryn Zyskowski, Meredith Ringel Morris, Jeffrey P. Bigham, Mary L. Gray, and Shaun K. Kane. 2015. Accessible Crowdwork?: Understanding the Value in and Challenge of Microtask Employment for People with Disabilities. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW ’15). ACM, New York, NY, USA, 1682–1693. DOI: http://dx.doi.org/10.1145/2675133.2675158