AI Development for the Public Interest: From Abstraction Traps to Sociotechnical Risks
McKane Andrus∗, Sarah Dean†, Thomas Krendl Gilbert‡, Nathan Lambert† and Tom Zick§

Authors arranged alphabetically. ∗Partnership on AI, San Francisco, CA. †Department of Electrical Engineering and Computer Sciences, University of California, Berkeley. ‡Center for Human-Compatible AI, University of California, Berkeley. §Berkman Klein Center for Internet and Society, Harvard University. [email protected], {dean_sarah, tg340, nol}@berkeley.edu, [email protected]

Abstract—Despite interest in communicating ethical problems and social contexts within the undergraduate curriculum to advance Public Interest Technology (PIT) goals, interventions at the graduate level remain largely unexplored. This may be due to the conflicting ways through which distinct Artificial Intelligence (AI) research tracks conceive of their interface with social contexts. In this paper we track the historical emergence of sociotechnical inquiry in three distinct subfields of AI research: AI Safety, Fair Machine Learning (Fair ML), and Human-in-the-Loop (HIL) Autonomy. We show that for each subfield, perceptions of PIT stem from the particular dangers faced by past integration of technical systems within a normative social order. We further interrogate how these histories dictate the response of each subfield to conceptual traps, as defined in the Science and Technology Studies literature. Finally, through a comparative analysis of these currently siloed fields, we present a roadmap for a unified approach to sociotechnical graduate pedagogy in AI.
I. INTRODUCTION
Recent years have seen an increasing public awareness of the profound implications of widespread artificial intelligence (AI) and large-scale data collection. It is now common for both large tech companies and academic researchers to motivate their work on AI as interfacing with the "public interest," matching external scrutiny with new technical approaches to making systems fair, secure, or provably beneficial. However, developing systems in the public interest requires researchers and designers to confront what has been elsewhere referred to as the "sociotechnical gap," or the divide between the intended social outcomes of a system and what is actually achieved through technical methods [1].

Interventions in Computer Science (CS) education have made strides towards providing students with frameworks within which to evaluate technical systems in social contexts [2], [3]. These curricular modifications have drawn on fields like Law, Philosophy, and Science and Technology Studies (STS) to create both dedicated and integrated coursework promoting human contexts and ethics in CS [4]. However, as the majority of these courses are currently offered at the undergraduate level, graduate students may not reap the benefits of such reforms [4]. Given the role of graduate students as not only teachers, but drivers of cutting-edge research and future decision makers in industry and academia, interventions aimed at them may play an outsized role in forwarding PIT goals.
Fig. 1. Three contemporary areas of artificial intelligence research whose overlapping forms of sociotechnical inquiry remain problematically defined: AI Safety, Fair Machine Learning, and Human-in-the-Loop Autonomy.
It is challenging to pin down what it would mean to train a graduate AI researcher to address the sociotechnical gap. A key source of tension is the place of the sociotechnical in AI development: while practitioners claim to be working on technical solutions to social problems, theoretical and methodological formulations of the sociotechnical are inconsistent across prominent AI subfields, making it unclear if current initiatives in pedagogy and research are advancing, undermining, or neglecting the public interest.

In this paper, we go beyond coursework to analyze the historical and technical shifts behind the current conception of sociotechnical risks in prominent AI subfields. We look to existing research domains that grapple with the social and technical spheres at distinct levels of abstraction, and examine how their limitations and insights reflect nascent, if problematic, forms of inquiry into the sociotechnical. We assess current research in the socially-oriented subfields of AI highlighted in Fig. 1, namely AI Safety; Fairness in Machine Learning (Fair ML); and Human-in-the-Loop (HIL) Autonomy. AI Safety focuses on value alignment of future systems and cautions against developing AI systems fully integrated with and in control of society. Fair ML works to reduce bias in algorithms with potentially deleterious effects on individuals or groups. HIL Autonomy is a term we use to encompass emerging work on both human-robot interaction and cyber-physical systems. These research areas explore how to optimize interactions of autonomous systems with human intentions in the loop.

By tracing the history of these subfields in a comparative fashion, we are able to characterize their distinct orientations towards sociotechnical challenges, highlighting both insights and blindspots. The goal of this analysis is not to capture each subfield's research agenda exhaustively, which is far beyond our scope. Instead, it is to highlight how these agendas claim relative access to some feature of the sociotechnical in the way they represent social problems as technically tractable. In doing so, they claim legitimacy and authority with respect to public problems.

We next refine this comparative history with lenses borrowed from the Science and Technology Studies (STS) literature, emphasizing the ways the subfields interface with sociotechnical risks. In particular, we portray how certain risks are deferred within each subfield's agenda. This deferral often takes the form of skillfully avoiding "abstraction traps" that have been recently highlighted by [5]. Such avoidance is important from a technical standpoint. However, true engagement with the sociotechnical requires reflexively revealing and resolving risks beyond the piecemeal formalisms that have defined each subfield's historical trajectory. We conclude with a brief sketch of pedagogical interventions towards this goal. Beyond classroom ethics curricula, we propose an agenda for clinical engagement with problems in the public interest as a part of graduate training in AI. This training would inculcate a direct appreciation of sociotechnical inquiry in parallel with the acquisition of specific technical skillsets. It would prepare practitioners to evaluate their own toolkits discursively, rather than just mathematically or computationally.

II. EMERGING SOCIOTECHNICAL SUBFIELDS OF AI: SAFETY, FAIRNESS, RESILIENCY
Recent technical work grappling with the societal implications of AI includes developing provably safe and beneficial artificial intelligence (AI Safety), mitigating classification harms for vulnerable populations through fair machine learning (Fair ML), and designing resilient autonomy in robotics and cyber-physical systems (HIL Autonomy). At present, these areas constitute heterogeneous technical subfields with substantive overlaps, but lack discursive engagement and cross-pollination. Below, we outline the history and motivating concerns of each subfield and identify key developments and convenings. We highlight how the technical research agendas stem from distinctly sociotechnical concerns and will require interdisciplinary engagement to fully map out their stakes. By placing these subfields' agendas in the context of the local sociotechnical risks (respectively extinction, inequity, and accident), we argue that the representative technical formations (forecasting and value alignment, fairness criteria and accountability, controllability and reachability) iterate on those risks without reflexively interrogating or normatively addressing them. Throughout this work we use the term AI system to refer to a technical system with significant automated components (e.g. automated decision making systems or self-driving cars). We note that AI is also a distinct field of academic study, further discussed in this section.

A. AI Safety
The field of artificial intelligence (AI) has often situated itself within a wider disciplinary context. Famously, at the foundational summits at Dartmouth and MIT, computer scientists, logicians, and psychologists came together to chart a course for artificial intelligence to arrive at human-like capabilities [6]. That course, which laid the groundwork for "good-old-fashioned" AI, had an incredible number of hiccups and was eventually overwritten. A similarly diverse group of disciplinary representatives moved the field away from symbolic and comprehensive logical reasoning to either more situated, interactionist understandings of cognition [7], [8] or to biologically-inspired, connectionist strategies of learning [9]. Following these interdisciplinary developments, there has been a growing concern about the capabilities of AI systems to endanger humans and society writ large [10], [11], [12]. Motivated both by longstanding concerns about the possibility of a "technological singularity" [13] as well as recent expansive applications of machine learning in critical infrastructure domains, many AI Safety promoters fear that AI researchers are approaching a level of capability that will expand beyond their control [13], [14].

Belief in the prospect of an arbitrarily capable intelligent agent beyond designer control has raised the prospect of extinction, whether of humanity or all natural life, as a clear and present danger for AI development, serving as this field's distinct sociotechnical risk scenario. Regardless of the likelihood of such a scenario, the nascent field of AI Safety has arisen to preemptively confront these dangers. AI Safety takes a rather radical approach to the type of systems-level thinking that we discuss, often viewing technical developments on a much longer and wider timescale: see, for example, the published work on AI arms races and potential research agendas to avert them [12], [15]. As such, a common feature of AI Safety research is forecasting future AI capabilities against various time horizons.

Despite recent high-profile endorsements from computer scientists and philosophers such as Stuart Russell [16] and Nick Bostrom [14], AI Safety is still a nascent research community. At present there is no independent conference for this field, although workshops and panels on AI Safety have become a regular fixture of larger AI venues such as NeurIPS, ICML, and AAAI, while specific AI-safety-oriented research labs (e.g. CHAI, OpenAI) host invited technical presentations on a semiweekly or monthly basis. The field has also attracted interest from research centers and philanthropic organizations dedicated to the study and mitigation of long-term existential risk, as well as industry leaders in AI.

Central to this work is a motivation to align AI systems with intended rather than specified rewards, as humans struggle to make explicit the rich normative context of their own goals and behaviors. Through this shift, AI Safety adjusts the framing of classical AI development towards "provably beneficial" rather than merely optimal systems. Under this framing, researchers focus largely on the problem of value alignment, i.e. whether or not an AI agent's programmed objective matches those of relevant humans or humanity as a whole [17].
For example, by understanding the problem of aligning an AI agent with a human collaborator as a problem of inverse reinforcement learning, researchers seek to solve this issue with a largely technical approach by borrowing core principles from economic game theory [18].

Considered as a whole, extended sociotechnical inquiry in AI Safety remains limited to catastrophic risk evaluation in cases where humanity's survival is at stake, a scale of concern that is not often found in engineering disciplines. Moreover, rigorous formal work often relies on intuition from mechanism design (e.g. an objectives-first approach, perfectly rational agents) whose assumptions inherit some of the formal limitations of and controversies surrounding prospect theory and social choice theory. Stemming from AI Safety, we see vigorous discussions surrounding AI Policy [19], ethics [20], and even reflexive interrogations as a practice in forecasting [21]. While lacking some qualities of sociotechnical inquiry, in particular a deeply reflexive methodology and historical orientation, we see potential to pivot these discussions away from narrowly-framed thought experiments about paperclip-maximizing robots [11] towards comparative investigations of the normative stakes of distinct AI-society interfaces.
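To make the value-alignment framing above concrete, the sketch below shows a minimal Bayesian reward-inference loop of the kind underlying inverse-reinforcement-learning approaches: the agent maintains a posterior over candidate reward parameters and updates it from observed human choices under a Boltzmann-rationality assumption. The two-option task, parameter names, and numbers are all hypothetical illustrations of the inference pattern, not the CIRL algorithm of [18].

```python
import numpy as np

# Hypothetical toy setup: the human chooses between two options whose true
# utility depends on an unknown preference weight theta in [0, 1].
thetas = np.linspace(0, 1, 101)                 # candidate reward parameters
posterior = np.ones_like(thetas) / len(thetas)  # uniform prior over theta

def option_utilities(theta):
    """Utility of options A and B under preference weight theta (toy model)."""
    return np.array([theta * 1.0, (1 - theta) * 0.8])

def choice_likelihood(choice, theta, beta=5.0):
    """Boltzmann-rational human: softmax over utilities with rationality beta."""
    u = beta * option_utilities(theta)
    p = np.exp(u - u.max())
    return (p / p.sum())[choice]

# Observe a few (hypothetical) human choices and update the posterior.
observed_choices = [0, 0, 1, 0]
for c in observed_choices:
    posterior *= np.array([choice_likelihood(c, t) for t in thetas])
    posterior /= posterior.sum()

# The agent acts on the inferred reward rather than a hand-specified one.
theta_map = thetas[np.argmax(posterior)]
print(f"MAP preference weight: {theta_map:.2f}")
print(f"Chosen option under inferred reward: {np.argmax(option_utilities(theta_map))}")
```

The design choice worth noticing is that the objective is treated as uncertain and learned from behavior, which is precisely the shift from "specified" to "intended" rewards described above.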
B. Fairness in Machine Learning
The field of machine learning (ML) emerged in the late 1950s with the design of a self-improving program for playing checkers [22] and quickly found success with static tasks in pattern classification, including applications like handwriting recognition [23]. ML techniques work by detecting and exploiting statistical correlations in data, towards increasing some measure of performance. A prominent early machine learning algorithm was the perceptron [24], an example of supervised classification, perhaps the most prevalent form of ML. In this setting, a classifier (or model) is trained with labelled examples, and its performance is measured by its accuracy in labelling new instances. The perceptron spurred the development of deep learning techniques mid-century [25]; however, they soon fell out of favor, only having great success in recent decades in the form of neural networks via the increasing availability of computation and data. Many ML algorithms require large datasets for good performance, tying the field closely with "big data." However, optimizing predictive accuracy does not generally ensure beneficial outcomes when predictions are used to make decisions, a problem that becomes stark when individuals are harmed by the classification of an ML system.

The inequality resulting from system classifications is the central sociotechnical risk of concern to practitioners in the subfield of Fair ML. A growing awareness of the possibility for bias in data-driven systems developed over the past fifteen years, starting in the data mining community [26] and echoing older concerns of bias in computer systems [27]. The resulting interest in ensuring "fairness" was further catalyzed by high-profile civil society investigation (e.g. ProPublica's Machine Bias study, which highlighted racial inequalities in the use of ML in pretrial detention) and legal arguments that such systems could violate anti-discrimination law [28]. At the same time, researchers began to investigate model "explainability" in light of procedural concerns around the black-box nature of deep neural networks. The research community around Fairness in ML began to crystallize with the ICML workshop on Fairness, Accountability, and Transparency in ML (FAT/ML), and has since grown into the ACM conference on Fairness, Accountability, and Transparency (FAccT), established in 2017.

By shifting the focus to fairness properties of learned models, Fair ML adjusts the framing of the ML pipeline away from a single metric of performance. There are broadly two approaches: individual fairness, which is concerned with similar people receiving similar treatment [29], and group fairness, which focuses on group parity in acceptance or error rates [30]. The details of defining and choosing among these fairness criteria amount to normative judgements about which biases must be mitigated, with some criteria being impossible to satisfy simultaneously. Much technical work in this area focuses on algorithmic methods for achieving fairness criteria through either pre-processing on the input data [31], in-processing on the model parameters during training [32], or post-processing on model outputs [33]. A minimal sketch of two common group-fairness criteria appears at the end of this subsection.

The Fair ML community is oriented towards the sociotechnical, engaging actively with critiques from STS perspectives. FAccT is a strong locus of interdisciplinary thought within computer science, and the addition of transparency and accountability to the title opens the door to a wider range of interventions.
Building upon model-focused concepts like explainability, blendings of technical and legal concepts of recourse [34] and contestability [35] widen the frame to explicitly consider the reaction of individuals to their classification. Similarly, there have been multiple calls to re-center stakeholders to understand how explanations are interpreted and if they are even serving their intended purpose [36], [37]. The community is increasingly open to discussing scenarios in which technical intervention, like the police use of facial recognition, is not desired. This encompasses both technical resistance [38] and procedural approaches to delineating the valid uses of data [39] and models [40].
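As a concrete illustration of the group-fairness criteria referenced above, the sketch below computes a demographic parity gap (difference in acceptance rates) and one component of an equalized-odds-style check (difference in false positive rates) in the spirit of [30], [33]. The predictions, labels, and group attribute are randomly generated stand-ins; a real audit would use domain data and a maintained fairness library.

```python
import numpy as np

# Hypothetical binary predictions, true labels, and a binary group attribute.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)   # protected attribute (0 or 1)
y_true = rng.integers(0, 2, size=1000)  # ground-truth outcomes
y_pred = rng.integers(0, 2, size=1000)  # classifier decisions

def acceptance_rate(y_pred, mask):
    """Fraction of positive decisions within a group."""
    return y_pred[mask].mean()

def false_positive_rate(y_true, y_pred, mask):
    """FPR within a group: positive decisions issued to true negatives."""
    negatives = mask & (y_true == 0)
    return y_pred[negatives].mean()

g0, g1 = group == 0, group == 1

# Demographic parity: acceptance rates should match across groups.
dp_gap = abs(acceptance_rate(y_pred, g0) - acceptance_rate(y_pred, g1))

# One component of equalized odds: false positive rates should match.
fpr_gap = abs(false_positive_rate(y_true, y_pred, g0)
              - false_positive_rate(y_true, y_pred, g1))

print(f"Demographic parity gap: {dp_gap:.3f}")
print(f"False positive rate gap: {fpr_gap:.3f}")
```

Note that even this tiny sketch embeds the normative judgements discussed above: which gap to measure, and how small it must be to count as "fair," are choices the mathematics cannot make.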
C. Human-in-the-Loop Autonomy
As many of the earliest robotic systems were remotely operated by technicians, the field of robotics has always had problems of human-robot interaction (HRI) at its core [41]. Early work was closely related to the study of human factors, an interdisciplinary endeavor drawing on engineering psychology, ergonomics, and accident analysis [42]. With advancements in robotic capabilities and increasing autonomy, the interaction paradigm grew beyond just teleoperation to supervisory control. HRI emerged as a distinct multidisciplinary field in the 1990s with the establishment of the IEEE International Symposium on Robot & Human Interactive Communication. Modern work in this area includes modeling interaction from the perspective of the autonomous agent (i.e. robot) rather than just the human overseer. By incorporating principles from the social sciences and cognitive psychology, HRI uses predictions and models of human behavior to optimize and plan. This work mitigates the sociotechnical risk of accidents, defined specifically as states in which physical difficulties or mishaps occur. Such physical risks are mitigated by making models robust to these potentially-dangerous conditions.

Digital technology has advanced to the point that many systems are endowed with autonomy beyond the traditional notion of a robotic agent, including traffic signal networks and the power grid. We thus consider the subfield of HIL Autonomy to be the cutting-edge research that incorporates human behaviors into robotics and cyber-physical systems. This subfield proceeds in two directions: 1) innovations in physical interactions via sensing and behavior prediction; 2) designing for system resiliency in the context of complicated or unstable environments. These boundaries are blurring in the face of increasingly computational methods and the prospective market penetration of new technologies. For example, the design of automated vehicles (AVs) poses challenges along many fronts. For more fluent and adaptable behaviors like merging, algorithmic HRI attempts to formalize models for one-on-one interactions. At the same time, AVs pose the risk of physical harm, so further lines of work integrate these human models to ensure safety despite the possibility of difficult-to-predict actions. Finally, population-level effects (e.g. AV routing on traffic throughput and induced demand) require deeper investigation into interaction with the social layer.

The emerging subfield of HIL Autonomy uses ideas from classical control theory while trying to quantify and capture the risk and uncertainty of working with humans [43], [44]. It thus inherits some of the culture around verifying safety and robustness through a combination of mathematical tools and physical redundancy, due to a history of safety-critical applications in domains like aerospace. Technical work in this area typically entails including the human as part of an under-actuated dynamical system [45], [46], such as an un-modeled disturbance. Through this lens, human-induced uncertainty is mitigated by predicting behavior in a structured manner, maintaining the safety of the system through mathematical robustness guarantees [47]. To make this concrete, a lane-change maneuver in an AV might include both an aggressive driving plan that takes likely human behaviors into account as well as a reachability safety criterion which could be activated via feedback if observed human behavior falls outside of the expected distribution; a minimal sketch of such a fallback monitor follows below. At a higher level of planning, the lane change maneuver may only be directed if it is expected to be advantageous for global traffic patterns.

The extent to which HIL Autonomy engages with the sociotechnical is thus far limited. Human-centered research focuses on localized one-to-one interactions, while research considering more global interactions remains largely in the realm of the technical. However, the critical "alt.HRI" track at the ACM/IEEE International Conference on Human-Robot Interaction indicates an emerging interest in how robotic systems interact with society more broadly. In such venues, questions are raised surrounding how robots interact with social constructions of race [48], [49] and issues of robot-community integration are being studied in settings ranging from healthcare [50] to gardening [51]. There is also work which considers the incorporation of social values into cyber-physical systems, e.g. fair electricity pricing for smart grids [52]. While our identification of this emerging subfield is perhaps more speculative than the previous two, the physical realization of AI technologies will remain a crucial site of sociotechnical inquiry.
III. SOCIOTECHNICAL INTEGRATION
While the subfields of AI Safety, Fair ML, and HIL Autonomy each consider problems at the interface of technology and human or social factors, there are differences which arise in part from their disparate histories. One difference is in time-scales. AI Safety is primarily concerned with long-term outcomes of mis-aligned AI development, while Fair ML focuses on practical implementations of individual models and algorithms with imperfect datasets. HIL Autonomy bisects the two, with both longer-term considerations of how numerous autonomous agents will re-define how humans interact in the environment and a short-term focus on maintaining safety, e.g. in the presence of unexpected adverse road conditions. Another difference arises from how the subfields position themselves at different levels of abstraction. HIL Autonomy is physically grounded, with a history closely tied with embodied interaction with humans and the social layer, while Fair ML is socially grounded, and has strong instincts for sociotechnical dialog and historical situatedness. On the other hand, AI Safety positions itself at the highest level of generality, relegating machine learning to the status of a tool and interpreting robotics as an application of formal guarantees.

For these subfields to place their sociotechnical inquiry on firmer foundations, it will be necessary to establish more reflexive relationships with their inherited assumptions about risk. Each subfield interprets itself as filling well-defined sociotechnical gaps, i.e. that there is a discernible divide between social problems and technical agendas. But in fact, the way these subfields have defined and worked on those gaps is itself problematic, piecemeal, and lacking in definition, i.e. it is normatively indeterminate. Reflexive inquiry is needed not to fill those gaps, but to define and interpret them more richly, so that their salience and urgency can be evaluated.

At a minimum, researchers and practitioners must learn to see behind their own technical abstractions to the social reality they assume, recognize that this reality may have been problematically defined, and learn to inquire into these definitions directly, perhaps with the aid of new transdisciplinary tools. We now provide a high-level summary of this agenda, moving from a comparison of common technical traps to more indeterminate conceptions of sociotechnical risk.
A. Grappling with Shortcomings in Framing
Each of the subfields discussed in the previous section seeks to expand the technical framing of their parent field to include human and social factors. In [5], the framing trap is introduced as the failure to model the full system of interest (e.g. with respect to a notion of fairness or safety). Technical researchers are at risk of falling into this trap whenever they draw a bounding box around the system that they study. Often, the consequences of this trap manifest as the portability trap [5], which occurs when technical solutions designed for one domain or environment are misapplied to another context. Technical researchers are at risk of falling into this trap whenever they mistakenly view a bounding box as appropriate to a new context.

The subfields of AI Safety, Fair ML, and HIL Autonomy can be viewed as attempts to avoid the framing trap. In the fields of AI, ML, and robotics, the workflow often entails featurization by defining data or inputs/outputs, optimization by fitting a model or designing a control policy, and then integration into the larger system. Researchers in the emerging subfields are beginning to understand the downsides of this unidirectional workflow, and the necessity of interrogating the modelling choices made at each step. For example, AI Safety questions the way that features are used to define optimization objectives in light of potentially catastrophic effects of integration, while Fair ML questions the inequalities arising from model optimization.

Still, sometimes the frame is not opened wide enough. For example, by failing to account for the larger system in which risk assessments are used, approaches to Fair ML may mistakenly treat loaning decisions the same way they treat pretrial detention, despite salient differences between the financial and criminal justice systems. By adopting more rigorously a heterogeneous engineering approach [5], researchers and practitioners can explicitly determine which properties are not tied to the technical objects under design but to their social contexts. For example, the aerospace industry is an engineering domain with considerable heterogeneity: an awareness of the regulatory context, from flight deck procedures to air traffic control, is necessary for the development of flight technologies.
B. Abstraction Traps in AI Research
To motivate a stronger cross-disciplinary discourse among and outside of these subfields, we now make further use of the framework of abstraction traps provided by [5] to point systematically to shortcomings and highlight potential new areas of inquiry. Alongside the framing and portability traps, we discuss the formalism trap, the ripple effect trap, and the solutionism trap.

The formalism trap occurs when mathematical formalisms fail to capture important parts of the human context. For example, the fairness of a system is often judged by procedural rather than technical elements, and perceived reliability may depend more on predictability than on formally verified safety. All of the discussed subfields are poised to fall into the formalism trap, which requires a deeper engagement with sociotechnical complexities to avoid. Ultimately, the validity and desirability of specific metrics arising from mathematical abstractions will be determined through intimate reference to social context rather than technical parsimony. If systems are not flexible enough to allow for public input, that validity can be compromised.

The ripple effect trap occurs when there is a failure to understand how technology affects the social system into which it is inserted. AI Safety considers ripple effects to some extent, but in a narrowly formal manner. For example, while automated vehicles are known to affect traffic, road, and even infrastructure design, most technical research has focused on incorporating these as features to be modeled rather than questioning the status of AVs as the dominant form of future mobility. Engagement across the entire sociotechnical stack requires understanding social phenomena like the "reinforcement politics" of dominant groups using technology to remain in power and "reactivity" like gaming and adversarial behavior. If a system encourages people to behave in an adversarial manner, it may call for utilizing richer design principles to promote cooperation, rather than merely throwing more advanced AI methods at the assumed dynamics.

Finally, the solutionism trap occurs when designers mistakenly believe that technical solutions alone can solve complex sociological and political problems. For example, while the legal community has encouraged technical fields to build systems that are reliably safe and fair, these interventions must be specified in terms of norms that can be appropriately internalized by practitioners. The General Data Protection Regulation has had a mixed reception: while it did articulate normative landmarks for subfields to pay attention to, some of its requirements (e.g. consent as a legal basis for data processing) were highly underspecified. This specification vacuum empowered prominent private actors to advance their own standards in a way that is ethically questionable but politically effective, achieving market buy-in from enough other actors before the law can catch up [53], [54]. Technical practitioners will need the ability to stand up and contest would-be standards publicly, rather than relying on the law to interpret systems before their sociotechnical scope has been appropriately modeled. To avoid the solutionism trap, it is important to maintain a robust culture of questioning which problems should be addressed, and why these problems and not others: in the form of humility or a "first, do no harm" perspective.

C. From Avoiding Traps to Anticipating Risks
An important initial step for grappling with abstraction traps is for technical practitioners in the fields of AI Safety, Fair ML, and HIL Autonomy to consider them explicitly when attempting to formulate and solve problems. In following with [5], we find it most helpful to consider the traps in reverse order: is it worth designing a technical solution? Can we adequately reason about how the technology will affect its social context? Can the desired properties of the system be captured by mathematical abstractions? Are the technical tools appropriate to the context? And are all relevant actors included in the framing? By considering these questions, researchers and practitioners will be encouraged to grapple with the plural temporalities defined by ongoing sociotechnical engagement through the validation of assumptions behind featurization, optimization, and integration.

While researchers in AI Safety, Fair ML, and HIL Autonomy are well positioned to begin asking these questions, it is only a first step. There is an inherent vulnerability in applying computational decision heuristics to vital social domains. Autonomous AI systems introduce possibilities of catastrophic failure and normative incommensurability to contexts that were previously accessible only to human judgment and which we may never be able to exhaustively specify or completely understand. Beyond the mere avoidance of conceptual traps, practitioners must learn to anticipate sociotechnical risks as integral to the endeavor of building AI systems that interface with social reality.

The distinct intuitive approaches to risk taken by each of the examined subfields (extinction, inequality, and accident) stem from alternative histories of the sorts of dangers faced when integrating systems within a normative social order. In other words, while these research communities have adopted tools and mathematical formalisms that purport to represent and work on discrete social phenomena, in fact the tools themselves are sociotechnical interventions, and their elaboration is justified according to historically-sedimented perceptions of risk. Rather than systems that represent and affect specific social objects (e.g. people, institutions), we advocate for the concept of AI as a process of elaborating normative commitments whose technical refinement generates unprecedented positions [55]. From these positions, novel sociotechnical questions can be revealed, resolved, or deferred.

IV. TOWARDS CLINICAL TRAINING FOR GRADUATE PEDAGOGY
How can researchers and practitioners learn to anticipate sociotechnical risks? Awareness of abstraction traps may corroborate an appreciation of risks, but it does not provide the tools with which to anticipate or understand them. For example, pedagogical reforms based on coursework drawn from Science and Technology Studies, Philosophy, and Law may inspire a requisite caution in technical practitioners. However, this caution remains insufficient to define the problem space of appropriate uses of AI. Instead, it will be necessary to encourage the coordination of technical and social scientists on these matters. In what follows, we interrogate this by evaluating possible reforms in graduate pedagogy.

Graduate students are a fruitful site for intervention for three primary reasons: 1) their educational role in shaping the next generation of engineers, 2) their role in pushing forward emerging areas of research, and 3) their future as management and decision-makers at technology companies. Students should have the ability to recognize when a single development pipeline is trying to engage in multiple abstractions simultaneously because its metaphors are confused (e.g. the fact that certain AI Safety formalisms [18], understood in terms of a principal-agent game, can function both as a form of mechanism design and as a kind of interface between user and robot). It is further important that they have the ability to contest, merge, or even dissolve these frames if necessary. This will entail a major cultural transition in how the goals of graduate training are defined, moving away from failure-avoidance engineering in controlled environments to the responsible integration of technology in human contexts.

While there are efforts to widen the scope of a technical education and augment it with political and ethical training [56], [57], a truly sociotechnical graduate education would teach the skills of how to draw a technical bounding box as well as how to communicate those decisions to the publics that will have to reckon with the potential benefits and harms of new technology. Education cannot carve up the world into specific problem domains, but it could help coordinate concerns in a constructive manner that enables the development of context-appropriate validation metrics, as others have begun to do by synthesizing common technical pitfalls [5].

While coursework lays the foundations for research, it cannot provide a descriptive ontology that would exhaustively capture sociotechnical risks in advance of active inquiry. Anticipating and mitigating such risks requires an immersion in the relevant social context, becoming richly familiar with its phenomenology from the human standpoint. Only by doing this is it possible to register the system specification in terms of the concrete normative stakes rather than abstract approximations of optimal behavior. This entails an ontological shift away from a purely mechanistic description of the domain in favor of a clinician's perspective, comparable in scope and significance to the emergence of modern medical and legal clinics [58], [59], [60]. We believe a distinctly clinical approach to social problems, one of engaged and prolonged consultation, direct provision of service, relationships with clients, and hands-on education overseen by professors, is the best approach.

Technical work will always rely on abstraction and framing to describe the environment in which a system is designed to function.
It falls on technical researchers and practitioners to understand how to specify such a bounding box, decide which frames and abstractions are valid and tractable as well as commensurate with stakeholder concerns, and articulate their choices to relevant communities with varying technical backgrounds. We see this "clinician's eye," entailing effective framing and communication, as the most promising potential outcome of reforming AI pedagogy at the graduate level, and defer further investigation of clinical approaches in the context of CS education to future work.

V. CONCLUSION
The work of defining "sociotechnical" problems in AI development is ongoing. Systems themselves often make symbolic reference to situations, environments, or objects that are assumed to lie behind their representations unreflectively, allowing the same mathematical structures to propagate without interrogating key metaphorical frames. This norm results in a practice incommensurate with other expert professions' standards of liability. Along with the inconsistency between subfields, this makes it hard to define what constitutes an AI expert and how responsibility should be assigned when systems fail. Looking from the outside in, the legal and philosophical communities cannot enforce standards that are neither backed up by established forms of expertise nor immediately translatable outside the context of technical-mathematical formalism, meaning case law and abstract ethics cannot fully determine or guide sociotechnical regulation.

Given this normative indeterminacy, we argue there is no ready-made delineation of which technical tools are suited to which social problems, and instead look to prospective interventions nurturing new forms of inquiry into inherited notions of risk. On this view, interventions would embrace the notion that elaborating on sociotechnical problems and procedures is essential to the task itself, and practitioners would understand the sociotechnical simply as part of what they do. We argue this is the surer path to effective norms for distinct subfields of AI development, and thus to the aims of Public Interest Technology.
REFERENCES

[1] M. S. Ackerman, "The intellectual challenge of CSCW: The gap between social requirements and technical feasibility," Human–Computer Interaction, vol. 15, no. 2-3, pp. 179–203, 2000.
[2] J. Saltz et al., "Integrating ethics within machine learning courses," ACM Trans. Comput. Educ., vol. 19, no. 4, Aug. 2019. [Online]. Available: https://doi.org/10.1145/3341164
[3] M. Skirpan, N. Beard, S. Bhaduri, C. Fiesler, and T. Yeh, "Ethics education in context: A case study of novel ethics activities for the CS classroom," in Proceedings of the 49th ACM Technical Symposium on Computer Science Education, ser. SIGCSE '18. New York, NY, USA: Association for Computing Machinery, 2018, pp. 940–945. [Online]. Available: https://doi.org/10.1145/3159450.3159573
[4] C. Fiesler, N. Garrett, and N. Beard, "What do we teach when we teach tech ethics? A syllabi analysis," in Proceedings of the 51st ACM Technical Symposium on Computer Science Education, ser. SIGCSE '20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 289–295. [Online]. Available: https://doi.org/10.1145/3328778.3366825
[5] A. D. Selbst, D. Boyd, S. A. Friedler, S. Venkatasubramanian, and J. Vertesi, "Fairness and abstraction in sociotechnical systems," in Proceedings of the Conference on Fairness, Accountability, and Transparency, 2019, pp. 59–68.
[6] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 2002.
[7] H. L. Dreyfus, What Computers Still Can't Do: A Critique of Artificial Reason. MIT Press, 1992.
[8] P. E. Agre, Computation and Human Experience. Cambridge University Press, 1997.
[9] R. Sun, "Connectionism and neural networks," The Cambridge Handbook of Artificial Intelligence, p. 108, 2014.
[10] N. Bostrom, "Ethical issues in advanced artificial intelligence," Science Fiction and Philosophy: From Time Travel to Superintelligence, pp. 277–284, 2003.
[11] E. Yudkowsky, "Artificial intelligence as a positive and negative factor in global risk," Global Catastrophic Risks, vol. 1, no. 303, p. 184, 2008.
[12] S. Armstrong, N. Bostrom, and C. Shulman, "Racing to the precipice: A model of artificial intelligence development," AI & Society, vol. 31, no. 2, pp. 201–206, 2016.
[13] R. Kurzweil, The Singularity Is Near: When Humans Transcend Biology. Penguin, 2005.
[14] N. Bostrom, Superintelligence. Dunod, 2017.
[15] A. Ramamoorthy and R. Yampolskiy, "Beyond MAD? The race for artificial general intelligence."
[16] S. Russell, Human Compatible: Artificial Intelligence and the Problem of Control. Penguin, 2019.
[17] N. Soares and B. Fallenstein, "Agent foundations for aligning machine intelligence with human interests: A technical research agenda," in The Technological Singularity. Springer, 2017, pp. 103–125.
[18] D. Hadfield-Menell, S. J. Russell, P. Abbeel, and A. Dragan, "Cooperative inverse reinforcement learning," in Advances in Neural Information Processing Systems, 2016, pp. 3909–3917.
[19] M. Brundage, S. Avin, J. Wang, H. Belfield, G. Krueger, G. Hadfield, H. Khlaaf, J. Yang, H. Toner, R. Fong et al., "Toward trustworthy AI development: Mechanisms for supporting verifiable claims," arXiv preprint arXiv:2004.07213, 2020.
[20] I. Gabriel, "Artificial intelligence, values and alignment," arXiv preprint arXiv:2001.09768, 2020.
[21] K. Grace, J. Salvatier, A. Dafoe, B. Zhang, and O. Evans, "When will AI exceed human performance? Evidence from AI experts," Journal of Artificial Intelligence Research, vol. 62, pp. 729–754, 2018.
[22] A. L. Samuel, "Some studies in machine learning using the game of checkers," IBM Journal of Research and Development, vol. 3, no. 3, pp. 210–229, 1959.
[23] N. J. Nilsson, "Learning machines: Foundations of trainable pattern-classifying systems," 1965.
[24] F. Rosenblatt, The Perceptron, a Perceiving and Recognizing Automaton (Project Para). Cornell Aeronautical Laboratory, 1957.
[25] M. Olazaran, "A sociological study of the official history of the perceptrons controversy," Social Studies of Science, vol. 26, no. 3, pp. 611–659, 1996.
[26] D. Pedreshi, S. Ruggieri, and F. Turini, "Discrimination-aware data mining," in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 560–568.
[27] B. Friedman and H. Nissenbaum, "Bias in computer systems," ACM Transactions on Information Systems (TOIS), vol. 14, no. 3, pp. 330–347, 1996.
[28] S. Barocas and A. D. Selbst, "Big data's disparate impact," Calif. L. Rev., vol. 104, p. 671, 2016.
[29] C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel, "Fairness through awareness," in Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. ACM, 2012, pp. 214–226.
[30] S. Barocas, M. Hardt, and A. Narayanan, Fairness and Machine Learning. fairmlbook.org, 2019.
[31] F. Calmon, D. Wei, B. Vinzamuri, K. Natesan Ramamurthy, and K. R. Varshney, "Optimized pre-processing for discrimination prevention," in Advances in Neural Information Processing Systems, 2017, pp. 3992–4001.
[32] M. B. Zafar, I. Valera, M. Gomez-Rodriguez, and K. P. Gummadi, "Fairness constraints: A flexible approach for fair classification," J. Mach. Learn. Res., vol. 20, no. 75, pp. 1–42, 2019.
[33] M. Hardt, E. Price, and N. Srebro, "Equality of opportunity in supervised learning," in Advances in Neural Information Processing Systems, 2016, pp. 3315–3323.
[34] B. Ustun, A. Spangher, and Y. Liu, "Actionable recourse in linear classification," in Proceedings of the Conference on Fairness, Accountability, and Transparency, 2019, pp. 10–19.
[35] D. K. Mulligan, D. Kluttz, and N. Kohli, "Shaping our tools: Contestability as a means to promote responsible algorithmic decision making in the professions," Available at SSRN 3311894, 2019.
[36] T. Miller, "Explanation in artificial intelligence: Insights from the social sciences," Artificial Intelligence, vol. 267, pp. 1–38, 2019.
[37] U. Bhatt, M. Andrus, A. Weller, and A. Xiang, "Machine learning explainability for external stakeholders," arXiv preprint arXiv:2007.05408, 2020.
[38] B. Kulynych, R. Overdorf, C. Troncoso, and S. Gürses, "POTs: Protective optimization technologies," in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 177–188.
[39] T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. Daumé III, and K. Crawford, "Datasheets for datasets," arXiv preprint arXiv:1803.09010, 2018.
[40] M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman, B. Hutchinson, E. Spitzer, I. D. Raji, and T. Gebru, "Model cards for model reporting," in Proceedings of the Conference on Fairness, Accountability, and Transparency, 2019, pp. 220–229.
[41] M. A. Goodrich and A. C. Schultz, Human-Robot Interaction: A Survey. Now Publishers Inc, 2008.
[42] L. Bainbridge, "Ironies of automation," in Analysis, Design and Evaluation of Man–Machine Systems. Elsevier, 1983, pp. 129–135.
[43] R. Baheti and H. Gill, "Cyber-physical systems," The Impact of Control Technology, vol. 12, no. 1, pp. 161–166, 2011.
[44] A. Banerjee, K. K. Venkatasubramanian, T. Mukherjee, and S. K. S. Gupta, "Ensuring safety, security, and sustainability of mission-critical cyber-physical systems," Proceedings of the IEEE, vol. 100, no. 1, pp. 283–299, 2011.
[45] D. Sadigh, A. D. Dragan, S. Sastry, and S. A. Seshia, "Active preference-based learning of reward functions," in Robotics: Science and Systems, 2017.
[46] C. Wu, A. M. Bayen, and A. Mehta, "Stabilizing traffic with autonomous vehicles," in International Conference on Robotics and Automation. IEEE, 2018, pp. 1–7.
[47] A. Bajcsy, S. L. Herbert, D. Fridovich-Keil, J. F. Fisac, S. Deglurkar, A. D. Dragan, and C. J. Tomlin, "A scalable framework for real-time multi-robot, multi-human collision avoidance," in International Conference on Robotics and Automation. IEEE, 2019, pp. 936–943.
[48] C. Bartneck, K. Yogeeswaran, Q. M. Ser, G. Woodward, R. Sparrow, S. Wang, and F. Eyssel, "Robots and racism," in Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, 2018, pp. 196–204.
[49] R. Sparrow, "Robotics has a race problem," Science, Technology, & Human Values, vol. 45, no. 3, pp. 538–560, 2020.
[50] D. Herath, J. McFarlane, E. A. Jochum, J. B. Grant, and P. Tresset, "Arts + health: New approaches to arts and robots in health care," in Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 2020, pp. 1–7.
[51] G. B. Verne, "Adapting to a robot: Adapting gardening and the garden to fit a robot lawn mower," in Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 2020, pp. 34–42.
[52] H. T. Javed, M. O. Beg, H. Mujtaba, H. Majeed, and M. Asim, "Fairness in real-time energy pricing for smart grid using unsupervised learning," The Computer Journal, vol. 62, no. 3, pp. 414–429, 2019.
[53] C. Utz, M. Degeling, S. Fahl, F. Schaub, and T. Holz, "(Un)informed consent: Studying GDPR consent notices in the field," in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. London, United Kingdom: ACM, Nov. 2019, pp. 973–990. [Online]. Available: https://dl.acm.org/doi/10.1145/3319535.3354212
[54] M. Nouwens, I. Liccardi, M. Veale, D. Karger, and L. Kagal, "Dark patterns post-GDPR: Scraping consent interface designs and demonstrating their influence," 2020.
[55] R. Dobbe, T. K. Gilbert, and Y. Mintz, "Hard choices in artificial intelligence: Addressing normative uncertainty through sociotechnical commitments," arXiv preprint arXiv:1911.09005, 2019.
[56] C. Barabas, C. Doyle, J. Rubinovitz, and K. Dinakar, "Studying up: Reorienting the study of algorithmic fairness around issues of power," in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 167–176.
[57] J. Moore, "Towards a more representative politics in the ethics of computer science," in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 414–424.
[58] T. N. Bonner, Becoming a Physician: Medical Education in Britain, France, Germany, and the United States, 1750–1945. JHU Press, 2000.
[59] M. C. Romano, "The history of legal clinics in the US, Europe and around the world," Diritto & Questioni Pubbliche, vol. 16, p. 27, 2016.
[60] R. S. Haydock, "Clinical legal education: The history and development of a law clinic."