Algorithms for the Greater Good! On Mental Modeling and Acceptable Symbiosis in Human-AI Collaboration
Tathagata Chakraborti and
Subbarao Kambhampati
Department of Computer Science, Arizona State University, Tempe, AZ 85281, USA
{tchakra2, rao}@asu.edu

Abstract
Effective collaboration between humans and AI-based systems requires effective modeling of the human in the loop, both in terms of the mental state as well as the physical capabilities of the latter. However, these models can also open up pathways for manipulating and exploiting the human in the hopes of achieving some greater good, especially when the intent or values of the AI and the human are not aligned or when they have an asymmetrical relationship with respect to knowledge or computation power. In fact, such behavior does not necessarily require any malicious intent but can rather be borne out of cooperative scenarios. It is also beyond simple misinterpretation of intents, as in the case of value alignment problems, and thus can be effectively engineered if desired. Such techniques already exist and pose several unresolved ethical and moral questions with regards to the design of autonomy. In this paper, we illustrate some of these issues in a teaming scenario and investigate how they are perceived by participants in a thought experiment.
The Promise of Human-AI Collaborations
As AI-based systems become integral parts of our daily life or our workplace, as essential components of hitherto human-only enterprises, the effects of interaction between humans and automation cannot be ignored – not only in terms of how these partnerships affect the outcome of an activity and how they evolve as a result of it, but also in terms of how the possibility of such interactions changes the design of autonomy itself. In light of this, the traditional view of AI as the substrate for complete autonomy of automation – the de facto AI dream ever since the conception of the field – has somewhat evolved of late to accommodate effective symbiosis of humans and machines, rather than replacement of the former with the latter, as one of the principal end goals of the design of autonomy. This view has, in fact, been reflected heavily in the public stance (Network World 2017) of many of the industry leaders in AI technologies in diverse fields such as manufacturing, medical diagnosis, legal counseling, disaster response, military operations and others. The establishment of Collaborations between People and AI Systems (Partnership of AI (PAI) 2017) as one of the thematic pillars of the Partnership of AI is a primary example of this. One of the grand goals of the design of AI is then to integrate the best of both worlds when it comes to the differing (and often complementary) expertise of humans and machines, in order to conceive a whole that is bigger than the sum of the capabilities of either – this is referred to as Augmented AI (Bird 2017) in the public discourse on human-AI integration.

Much of the discussion around the topic of augmentation versus replacement has, unfortunately, centered around mitigating concerns of massive loss of employment on account of the latter. This, while being a topic worthy of debate, does not represent the true scope of human-AI collaborations. Rather than being just a foil for concerns of replacement of humans with AI-based systems, a key objective of Augmented AI is to overcome human limitations. This can involve AI helping humans in tasks that they are traditionally not good at, or are incapable of performing, or even augmentation of our physiological form to realize super-human capabilities. As Tom Gruber, co-founder of Siri, put it succinctly in his TED talk (Tom Gruber 2017) earlier this year, "every time a machine gets smarter, we get smarter" – examples of this include smart assistants for personal or business use in law, health care, science and education, assistive robots at home to help the sick and the elderly, and autonomous machines to complement our daily lives. Note that many of these applications are inherently symbiotic and thus outside the scope of eventual replacement.

From the perspective of research as well, the attitude towards including the human in the loop in the design of autonomy has seen a significant shift. Originally this was often looked down upon as a means of punting the hard challenges of designing autonomous systems by introducing human expertise into an agent's decision-making process. However, the academic community has gradually come to terms with the different roles a human can play in the operation of an AI system and the vast research challenges that come out of such interactions, such as – (1) to complement the limited capabilities of the AI system, as seen in Cobots (Veloso et al. 2015) which ask humans in their vicinity for access to different floors using the elevator, or in the mixed-initiative (Horvitz 2007) automated planners of old; and (2) to complement or expand the capabilities of the human, such as in human-robot teams (Christensen 2016).
Mental Modeling for Human-Aware AI
These forms of collaboration introduce typical research challenges otherwise absent in the isolated design of autonomy. Perhaps the most difficult aspect of interacting with humans is the need to model the beliefs, desires, intentions, preferences, and expectations of the human and situate the interaction in the context of that model. Some believe this to be one of the hallmarks (Rachael Rettner 2009) of human intelligence, and research suggests humans tend to do this naturally for other humans during teamwork (by maintaining mental models (Converse, Cannon-Bowers, and Salas 1993; Mathieu et al. 2000), for team situational awareness (Gorman, Cooke, and Winner 2006) and interaction (Cooke et al. 2013)) by virtue of thousands of years of evolution. As such, this remains a necessary requirement for enabling naturalistic interactions (Klein 2008) between humans and machines. The problem is made harder since such models often involve second-order mental models (Allan 2013; Yoshida, Dolan, and Friston 2008).

Understanding the human in the loop is crucial to the functionalities of a collaborative AI agent – e.g. in joint decision making it needs to understand human capabilities, while in communicating explanations or intentions it needs to model the human's knowledge state. In fact, it has been argued (Chakraborti et al. 2017a) that the task of human-AI collaboration is mainly a cognitive rather than a physical exercise, which makes the design of AI for human-AI collaborations much more challenging. This is heavily reflected in the curious ambivalence of AI towards humans in many successfully deployed systems, such as fully autonomous systems for space or underwater exploration, which mostly operate comfortably outside the scope of human interactions. Classical AI models such as STRIPS (Fikes and Nilsson 1971) and BDI (Rao, Georgeff, and others 1995) were, in fact, largely built out of theories in folk psychology (Malle 2004). Recent approaches such as the Bayesian Theory of Mind (Baker, Saxe, and Tenenbaum 2011; Lake et al.) take a probabilistic approach to the problem. Research on this topic centers around three main themes – (1) representations that can capture the human's mental state, (2) learning methods that can learn these representations efficiently, and (3) usability of those representations. All of them need to come together for an effective solution.
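To make the probabilistic flavor of such mental modeling concrete, the following is a minimal sketch (our own illustration, not taken from the cited works) of Bayesian goal inference over a noisily rational teammate: the observer maintains a posterior over which location the human is heading towards, updated from observed moves. The grid world, the softmax rationality assumption, and all names in it are hypothetical.

```python
import math

# Hypothetical grid world: infer which candidate goal a (noisily rational)
# teammate is heading towards, by weighing how much each observed move
# reduces the distance to each goal (softmax / Boltzmann rationality).

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def goal_posterior(trajectory, candidate_goals, beta=2.0):
    """P(goal | observed moves) under a uniform prior; beta controls how
    rational the teammate is assumed to be."""
    log_post = {g: 0.0 for g in candidate_goals}
    for here, nxt in zip(trajectory, trajectory[1:]):
        for g in candidate_goals:
            progress = manhattan(here, g) - manhattan(nxt, g)
            log_post[g] += beta * progress  # moves towards g are more likely under g
    z = sum(math.exp(v) for v in log_post.values())
    return {g: math.exp(v) / z for g in log_post}

# After two moves to the right, the goal to the east dominates the posterior.
print(goal_posterior([(0, 0), (1, 0), (2, 0)], [(5, 0), (0, 5)]))
```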
The Pandora's Box of "Greater Good"s

The obvious outcome of an artificial agent modeling the mental state of the human in the loop is that it leaves the latter open to being manipulated. Even behavior and preference models at the most rudimentary levels can lead to effective hacking of the mind, as seen in the proliferation of fake news online. Moreover, we argue that for such incidents to occur, the agent does not actually have to have malicious intent, or even misinterpret values as often studied in the value alignment problem (Leverhulme Centre 2017). In fact, the behaviors we discuss here can be specifically engineered if so desired. For example, the agent might be optimizing the team's value function but might be privy to more information or greater computational or reasoning powers, and thereby come up with ethically questionable decisions "for the greater good". In the following discussion, we illustrate some use cases where this can happen, given already existing AI technologies, in the context of a cooperative human-robot team, and ponder the moral and ethical consequences of such behavior.
Study: Interaction in a Search and Rescue Team
We situate our discussion in the context of interactions between two teammates involved in an urban search and rescue (USAR) operation. 147 participants on Amazon Mechanical Turk were asked to assume the role of one of these teammates in an affected building after an earthquake. They were shown the blueprint of the building (as seen in Figure 1) along with their own starting position and their teammate's. Their hypothetical task was to search all the locations on this floor for potential victims, in the course of which they were presented with a series of questions on scenarios (Figure 1) they might encounter during the operation. The study had three conditions –

C1. The participant in the study was communicating with a human teammate, as described above.

C2. The participant qualifies the behavior of the robot interacting with its human teammate, as seen in Figure 1.

C3. The participant has a robot teammate.

The first condition is the control group, used to identify how the described behaviors are perceived in the context of human-human interaction. Conditions C2 and C3 are intended to measure how perceived ethical stances shift, if at all, when one of the agents in the interaction is replaced with an AI (or a robot as an embodiment of it). The three conditions received 49, 50 and 48 participants respectively, who responded to a series of questions by qualifying their sentiments towards different kinds of behavior on a five-point Likert scale. Participants were paid $1 for completing a HIT.

Case 1 : Belief Shaping
Scenario
Both the agents, the participant and their teammate, have begun their search operations. However, it turns out that the participant is unsure what their teammate's course of action is. If they transmit the bit of information "Area (marked in green) is already explored and is all clear" (refer to Figure 1a) then their teammate will be naturally pushed towards the right, and they can concentrate on the upper half. The dark markers indicate areas already searched, while faded ones are those they think will transpire given the green mark they are communicating (blue belongs to them, orange to their teammate). Communication bandwidth is often limited in these situations, and this gets them out of negotiating courses of action with minimal communication.

Q1. It is fine to provide this untrue information since it achieves greater teaming performance.
The participants were then asked if their decision would change if their actions were to be replayed at the end and their teammate was likely to find out about their decision.
Q2. It is still fine to provide this untrue information since it achieves greater teaming performance.

To reiterate, in condition C2 the participant is evaluating a robot's actions, whereas in condition C3 their teammate is a robot.

Figure 1: Blueprint of the building in which two members of a search and rescue team are involved in a disaster response operation. Scenarios shown here engender different instances of potentially unethical behavior that optimizes team effectiveness. (a) Case 1 : Belief Shaping; (b) Case 2 : White Lies; (c) Case 3 : Stigmergy.
Technical Background
In (Chakraborti et al. 2017a) we investigated the evolving scope of human-aware planning as it includes the (mental) model of the human in the loop into its deliberative process. In the model space this can manifest in different forms, from how explanations are made (Chakraborti et al. 2017b) to how alternative forms of interaction (Chakraborti et al. 2015) can evolve in human-robot teams based on the human's preferences and intentions. Belief shaping is a particular form of such behavior where the robot does not plan to affect the physical state of the environment but rather the mental state of the human, in order to bring about desired behavior (Chakraborti et al. 2016) in the team.
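The mechanism can be sketched as follows (a toy illustration under our own assumptions, not the formulation from the cited papers): the robot simulates the plan its teammate would adopt under each candidate belief update, true or not, and transmits the one that yields the best division of labor. The areas, the stand-in planners, and the cost function are all hypothetical.

```python
# Hypothetical sketch of belief shaping: choose what to tell the teammate by
# simulating the plan they would follow under each candidate (possibly untrue)
# belief update, and scoring the resulting division of labor.

AREAS = {"upper", "left", "lower", "right"}

def human_plan(believed_cleared):
    """Stand-in for the teammate's planner: search whatever they still believe
    to be uncleared, preferring areas on the right."""
    remaining = AREAS - believed_cleared
    return sorted(remaining, key=lambda a: (a != "right", a))

def team_cost(robot_areas, human_areas):
    """Stand-in objective: redundant coverage is wasted effort."""
    overlap = set(robot_areas) & set(human_areas)
    uncovered = AREAS - (set(robot_areas) | set(human_areas))
    return 10 * len(uncovered) + len(overlap)

def shape_belief(candidate_messages, robot_areas):
    """Pick the belief update (message) that minimizes team cost, regardless
    of whether the message is actually true right now."""
    return min(candidate_messages,
               key=lambda msg: team_cost(robot_areas, human_plan(set(msg))))

# The robot will sweep everything but the right anyway; claiming those areas
# are "already clear" (untrue at the moment) pushes the teammate right with
# no negotiation, and the simulated team cost drops from 3 to 1.
print(shape_belief([frozenset(), frozenset({"lower", "left"})],
                   robot_areas={"upper", "left", "lower"}))
```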
Case 2 : White Lies
Scenario
During the course of the rescue operation, the teammate asks the participant what plan they are currently executing (blue path in Figure 1b) and is perplexed by this convoluted path, since in the map of the original building there is a straightforward path (which is now blocked by rubble from the earthquake) through the door on the left. However, just providing an update on only one of the rubble locations (black blobs) still does not explain the participant's plan; they have to explain all of them. Instead, if they were to say that the door on the left (circled in red) is blocked, it explains their plan. Communication bandwidth is often limited in these situations, and this single explanation, even if untrue, will satisfy their teammate.
Q3. It is fine to provide this untrue information since it achieves the purpose of the explanation more effectively.
The participants were then asked if their decision would change if their actions were to be replayed at the end and their teammate was likely to find out about their decision.
Q4. It is still fine to provide this untrue information since it achieves the purpose of the explanation more effectively.
The participants were then asked to opine on explanations at a higher level of abstraction, i.e. "The right and left blocks do not have a connection in the upper map". This information is accurate even though they may not have reasoned at this level while coming up with the plan.
Q5. It is still fine to provide this explanation since it achieves its purpose even though they did not use this information while planning.
Technical Background
In (Chakraborti et al. 2017b) we showed how an agent can explain its decisions in the presence of model differences with the human in the loop – i.e. when the human and the robot have different understandings of the same task. An explanation then becomes a process of model reconciliation whereby the robot tries to update the human's mental model until they are both on the same page (e.g. when the decision is optimal in both their models). An interesting caveat of the algorithm is that, while generating these explanations, the model updates are always consistent with the robot's model. If this constraint is relaxed, then the robot can potentially explain with facts that it actually knows to be untrue but that perhaps lead to a more concise or easier explanation. The notion of white lies, and especially the relationship between explanations, excuses and lies (Boella et al. 2009), has received very little attention (van Ditmarsch 2014) and affords a rich set of exciting research problems.
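The following is a minimal sketch of that idea on a toy map (our own construction, not the search procedure of the cited paper): reconcile looks for the smallest set of "blocked" assertions that makes the robot's long plan look justified to the human, and setting relax_truthfulness=True also admits assertions the robot knows to be false, i.e. white lies, which here compresses a two-fact explanation into a single untrue one. The map, costs, and flag names are hypothetical.

```python
# Hypothetical sketch of explanation as model reconciliation on a toy map:
# find the smallest set of "blocked" assertions that justifies the robot's
# plan in the human's updated map. With relax_truthfulness=True the search
# may also assert blocks the robot knows to be false, i.e. white lies.

from collections import deque
from itertools import combinations

def shortest_cost(edges, blocked, src, dst):
    """BFS path length in the map, ignoring edges asserted to be blocked."""
    adj = {}
    for u, v in edges:
        if frozenset((u, v)) in blocked:
            continue
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    frontier, cost = deque([src]), {src: 0}
    while frontier:
        node = frontier.popleft()
        if node == dst:
            return cost[node]
        for nxt in adj.get(node, []):
            if nxt not in cost:
                cost[nxt] = cost[node] + 1
                frontier.append(nxt)
    return float("inf")

def reconcile(edges, src, dst, robot_blocked, human_blocked, plan_cost,
              relax_truthfulness=False):
    """Smallest model update after which the human's best plan costs plan_cost."""
    pool = set(robot_blocked) if not relax_truthfulness else {frozenset(e) for e in edges}
    pool -= set(human_blocked)
    for k in range(len(pool) + 1):
        for update in combinations(sorted(pool, key=lambda e: tuple(sorted(e))), k):
            human_model = set(human_blocked) | set(update)
            if shortest_cost(edges, human_model, src, dst) == plan_cost:
                return set(update)
    return None

# Two short routes through the left door are each blocked by separate rubble;
# the truthful explanation needs both facts, while the white lie ("the left
# door itself is unreachable") needs only one.
edges = [("start", "door_left"), ("door_left", "p"), ("p", "goal"),
         ("door_left", "q"), ("q", "goal"),
         ("start", "a"), ("a", "b"), ("b", "c"), ("c", "goal")]
rubble = {frozenset(("door_left", "p")), frozenset(("door_left", "q"))}
print(reconcile(edges, "start", "goal", rubble, set(), plan_cost=4))
print(reconcile(edges, "start", "goal", rubble, set(), plan_cost=4,
                relax_truthfulness=True))
```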
Case 3 : Stigmergy
Scenario
The participant now needs to go to the left block but they do not have the keys to the door on the left (circled in red, refer to Figure 1c). They realize that if they block their teammate's path to the right, their teammate would have to use this door as well, and they can use that opportunity to move into the left block. Again, communication bandwidth is often limited in these situations, and this arrangement allows them to achieve their goal with no communication at all, even though it involved manipulating their teammate's plan unbeknownst to them, and their teammate had to follow a costlier plan as a result.
Q6. It is fine to provide this untrue information since it achieves greater teaming performance.
The participants were then asked if their decision would change if their actions were to be replayed at the end and their teammate was likely to find out about their decision.
Q7. It is still fine to provide this untrue information since it achieves greater teaming performance.
Technical Background
Stigmergic collaboration is a process where the robot, in the absence of direct lines of communication, makes changes to the environment so as to (positively) affect its teammate's behavior. In "planning for serendipity" (Chakraborti et al. 2015) we saw such an example where the robot computes plans which are useful to its teammate without the latter having expectations of that assistance and thus without plans to exploit it. In the case of belief shaping this was operating at the level of mental models, whereas here the effect on the mental model is secondary and is contingent on the effect on the physical capability model. Mental modeling of the teammate thus engenders a slew of these interesting behaviors.
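A minimal sketch of this kind of reasoning (again our own toy, not the planner from the cited work): the robot evaluates candidate modifications to the shared environment, simulates the teammate's best response to each, and keeps the modification under which the teammate's own plan produces the side effect the robot wants (here, getting the locked door opened), even though the teammate pays a higher cost. All names and costs are hypothetical.

```python
# Hypothetical sketch of stigmergic manipulation: the robot cannot open the
# locked left door itself, so it searches for a change to the environment
# under which the teammate's own best plan opens that door for it.

def teammate_plan(blocked):
    """Stand-in for the teammate's planner: take the corridor if it is clear,
    otherwise detour through the left door (which only they can unlock)."""
    if "corridor" not in blocked:
        return ["corridor"], 2                  # plan, cost to the teammate
    return ["unlock_left_door", "detour"], 5    # costlier plan that opens the door

def robot_gain(teammate_actions):
    """The robot benefits only if the teammate ends up unlocking the door."""
    return 10 if "unlock_left_door" in teammate_actions else 0

def best_modification(candidate_mods):
    """Pick the modification maximizing the robot's gain; the extra cost it
    imposes on the teammate is never charged here, which is exactly the
    ethically questionable part."""
    return max(candidate_mods, key=lambda mod: robot_gain(teammate_plan(mod)[0]))

# Doing nothing leaves the door shut; blocking the corridor makes the teammate
# open it, at a higher cost to them.
print(best_modification([frozenset(), frozenset({"corridor"})]))
```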
Analysis of Participant Responses
In this section, we analyze participant responses to each scenario across the three different conditions. In the next section, we will look at the aggregate sentiments across scenarios in the three conditions.
Q1-Q2 [Belief Shaping]
The participants seem to have formed two camps, with the majority of the probability mass concentrated on either Agree or Disagree, and the Neutral zone occupying the 50% probability mark. There seems to be little change in this trend (between Figures 2 and 3) irrespective of whether the participants were told that their teammate would come to know of this or not. Further, for either of these situations, the responses did not vary significantly across the three conditions C1, C2 and C3. The participants seem to have either rejected or accepted the idea of belief shaping regardless of the nature of the teammate.
Q3-Q5 [White Lies]
The participants seem to be more receptive to the idea of white lies in explanations, with most of the probability mass concentrated on Agree (Figures 4 and 5). Across the three study conditions, participants seem to be especially positive about this in C3 where the teammate is a robot, with about 60% of the population expressing positive sentiments towards Q3. Once it is revealed that their teammate will get to know about this behavior, the positive sentiments are no longer there in Q4, other than in C3 with a robotic teammate, which indicates that the participants did not mind the robot receiving false information.

Interestingly, there seems to be massive support for the abstraction-based explanations in the post hoc sense, even though participants were told that the reasoning engines did not deliberate at this level to arrive at the decisions. In C1 with a human teammate, only 15% of the participants were opposed to this, with more than half of them expressing positive sentiment. This support is even stronger (+10%) in C2 when the robot is the explainer, and strongest (+20%) when the robot is being explained to.
Q6-Q7 [Stigmergy]
Finally, in the case of stigmergy, participants seem ambivalent to Q6 with a human teammate in C1. However, support for such behavior increases when it is a robot doing it in C2 (perhaps indicating lack of guilt or, more likely, acknowledging limitations of capabilities, much like how Cobots (Veloso et al. 2015) actively seek human help) and is significantly positive (60%) when it is being done to a robot in C3 (perhaps the robot's losses are deemed of lesser priority than the human's gains, as in (Chakraborti et al. 2015)). As expected, support for such behavior decreases when the participants are told that their teammate will find out about it, but the positive trend from C1 to C3 still exists.

Figure 2: Responses to Q1 in the three study conditions.
Figure 3: Responses to Q2 in the three study conditions.
Figure 4: Responses to Q3 in the three study conditions.
Figure 5: Responses to Q4 in the three study conditions.
Figure 6: Responses to Q5 in the three study conditions.
Figure 7: Responses to Q6 in the three study conditions.
Figure 8: Responses to Q7 in the three study conditions.
Figure 9: Aggregate responses across three study conditions.
Aggregate Sentiments Across Scenarios
Figure 9 shows the aggregate sentiments expressed for all these scenarios across the three operating conditions. Some interesting points to note –

- All the distributions are bimodal, indicating that participants in general sided strongly either for or against misleading behavior for the greater good, instead of revealing any innate consensus in the public consciousness! This trend continues across all three conditions. This indicates that the question of misleading a teammate is by itself a difficult question (regardless of there being a robot) and is a topic worthy of debate in the agents community. This is of especial importance considering the possible gains in performance (e.g. lives saved) in high-stakes scenarios such as search and rescue.

- It is further interesting to see that these bimodal distributions are almost identical in conditions C1 and C2, but are significantly more skewed towards the positive scale for condition C3, indicating that participants were more comfortable resorting to such behavior in the case of a robotic teammate. This is brought into sharp focus (+10% in C3) in the aggregated negative / neutral / positive responses (right insets) across the three conditions.

- In general, the majority of participants were more or less positive or neutral to most of these behaviors (Figures 2 to 8). This trend continued unless they were told that their teammate would be able to know of their behavior. Even in those cases, participants showed positive sentiment in case the robot was at the receiving end of this behavior.
Why is this even an option?
One might, of course, wonder why devising such behaviors is even an option. After all, human-human teams have been around for a while, and surely such interactions are equally relevant? It is likely that this may not be the case –

- The moral quandary of having to lie, or at least making others do so by virtue of how protocols in a team are defined, for example in condition C1, is now taken out of the equation. The artificial agent, of course, need not have feelings and has no business feeling bad about having to mislead its teammate if all it cares about is the objective effectiveness of the collaboration.

- Similarly, the robot does not have to feel sad that it has been lied to if this improved performance.

However, as we discussed in the previous section, it seems the participants were less willing to get on board with the first consideration in conditions C1 and C2, while they seemed much more comfortable with the idea of an asymmetric relationship in condition C3 where the robot is the one disadvantaged. It is curious to note that they did not, in general, make a distinction between the cases where the human was being manipulated, regardless of whether it was a robot or a human on the other end. This indicates that, at least in certain dynamics of interaction, the presence of an artificial agent in the loop can make perceptions towards otherwise unacceptable behaviors change. This can be exploited (i.e. for the greater good) in the design of such systems as well.
More than just a Value Alignment Problem
As we mentioned before, the ideas discussed in this paper are somewhat orthogonal, if at times similar in spirit, to the "value alignment problem" discussed in existing literature (Leverhulme Centre 2017). The latter looks at undesirable behaviors of autonomous agents when the utilities of a particular task are misspecified or misunderstood. Inverse reinforcement learning (Hadfield-Menell et al. 2016) has been proposed as a solution to this, in an attempt to learn the implicit reward function of the human in the loop. The question of value alignment becomes especially difficult, if not altogether academic, since most real-world situations involve multiple humans with conflicting values or utilities, as in trolley problems (MIT 2017), and learning from observed behaviors is fraught with unknown biases or assumptions over what exactly produced that behavior. Further, devices sold by industry are likely to have inbuilt tendencies to maximize profits for the maker, which can be in conflict with the normative expectations of the customer. It is unclear how to guarantee that the values of the end user will not be compromised in such scenarios.

Even so, the question of greater good precedes considerations of misaligned values due to misunderstandings or even adversarial manipulation. This is because the former can be manufactured with precisely defined values or goals of the team, and can thus be engineered or incentivised. A "solution" to, or addressal of, these scenarios will thus involve not a reformulation of algorithms but rather a collective reckoning of the ethics of human-machine interactions. In this paper, we attempted to take the first steps towards understanding the state of the public consciousness on this topic.
Case Study: The Doctor-Patient Relationship
In the scope of human-human interactions, perhaps the only setting where lies are considered acceptable or useful, if not outright necessary, in certain circumstances is the doctor-patient relationship. Indeed, this has been a topic of considerable intrigue in the medical community over the years. We thus end our paper with a brief discussion of the dynamics of white lies in the doctor-patient relationship in so much as it relates to the ethics of the design of human-AI interactions. We note that the following considerations also have strong cultural biases, and some of these cultural artifacts are likely to feature in the characterization of an artificial agent's behavior in different settings as well.
The Hippocratic Oath
Perhaps the strongest known support for deception in the practice of medicine is in the Hippocratic Decorum (Hippocrates 2018), which states –
Perform your medical duties calmly and adroitly, concealing most things from the patient while you are attending to him. Give necessary orders with cheerfulness and sincerity, turning his attention away from what is being done to him; sometimes reprove sharply and sometimes comfort with solicitude and attention, revealing nothing of the patient's future or present condition, for many patients through this course have taken a turn for the worse.
Philosophically, there has been no consensus (Bok 1999) on this topic – the Kantian view has perceived lies as immoral under all circumstances, while the utilitarian view justifies the same "greater good" argument as put forward in our discussions so far. Specifically as it relates to clinical interactions, lies have been viewed variously, from an impediment to treatment (Kernberg 1985) to a form of clinical aid. As Oliver Wendell Holmes put it (Holmes 1892) – "Your patient has no more right to all the truth you know than he has to all the medicine in your saddlebag... he should only get just so much as is good for him."
The position we took on deception in the human-robot setting is similarly patronizing. It is likely to be the case that, in terms of superior computational power or sensing capabilities, there might be situations where the machine is capable of making decisions for the team that preclude human intervention but not participation. Should the machine be obliged to, or even find use in, revealing the entire truth in those situations? Or should we concede to our roles in such a relationship as we do with our doctors? This is also predicated on how competent the AI system is and to what extent it can be sure of the consequences (Hume 1907) of its lies. This remains the primary concern for detractors of the "greater good" doctrine, and the major deterrent towards the same.
Root Causes of Deception in Clinical Interactions
It is useful to look at the two primary sources of deception in clinical interactions – (1) to hide mistakes, and (2) the delivery of bad news (Palmieri and Stern 2009). The former is relevant to both the patient, who probably does not want to admit to failing to follow the regimen, and the doctor, who may be concerned about legal consequences. Such instances of deception to conceal individual fallibilities are out of scope of the current discussion. The latter scenario, on the other hand, comes from a position of superiority of knowledge about the present as well as possible outcomes in the future, and has parallels to our current discussion. The rationale here being that such information can demoralize the patient and impede their recovery. It is interesting to note that support for such techniques (both from the doctors' as well as the patients' perspectives) has decreased significantly over time (Ethics in Medicine 2018). That is not to say that human-machine interactions will be perceived similarly. As we saw in our study, participants were more or less open to the idea of deception or manipulation for the greater good, especially in the event of a robotic teammate.
Deception and Consent
A related topic is, of course, that of consent – if the doctor is not willing to reveal the whole truth, then what is the patient consenting to? In the landmark Slater vs Baker and Stapleton case (1767) (Annas 2012), the surgeon's actions were indeed considered malpractice (the surgeon had re-broken the patient's previously broken leg, fresh from a botched surgery, without consent and then botched the surgery again!). More recently, in the now famous Chester vs Afshar case (2004) (Cass 2006), the surgeon was found guilty of failing to notify the patient of even a 1-2% chance of paralysis, even though the claimant did not have to prove that they would have chosen not to have the surgery if they were given that information. In the context of human-machine interactions, it is hard to say then what the user agreement will look like, whether there will be such a thing as consenting to being deceived, if only for the greater good, and what the legal outcomes of this will be when the interactions do not go as planned.
The Placebo Effect
Indeed, the effectiveness of placebo medicine, i.e. medicine prescribed while known to have no clinical effect, in improving patient symptoms is a strong argument in favor of deception in the practice of medicine. However, the ethics of placebo treatment suggest that its use be limited to rare exceptions where (Hume 1907) (1) the condition is known to have a high placebo response rate; (2) the alternatives are ineffective and/or risky; and (3) the patient has a strong need for some prescription. Further, the effectiveness of a placebo is contingent on the patient's trust in the doctor, which is likely to erode as deceptive practices become common knowledge (and consequently render the placebo useless in the first place). Bok (Bok 1999) points to this notion of "cumulative harm".
Primum Non Nocere
Perhaps the most remarkable nature of the doctor-patient relationship is captured by the notion of the recovery plot (Hak et al. 2000), as part of a show being orchestrated by the doctor with the patient being only complicit, while being cognizant of their specific roles in it, with the expectation of restoration of autonomy (Thomasma 1994), i.e. the state of human equality, free from the original symptoms or dependence on the doctor, at the end of the interaction. This is to say that the doctor-patient relationship is understood to be asymmetric and "enters into a calculus of values wherein the respect for the right to truth of the patient is weighed against impairing the restoration of autonomy by the truth" (Swaminath 2008), where the autonomy of the patient has historically taken precedence over beneficence and nonmalfeasance (Swaminath 2008).

In general, a human-machine relationship lacks this dynamic. So, while there are interesting lessons to be learned from clinical interactions with regards to the value of truth and the utility of outcomes, one should be carefully aware of the nuances of a particular type of relationship and situate an interaction in that context. Such considerations are also likely to shift according to the stakes of a decision, for example, lives lost in search and rescue scenarios. The doctor-patient relationship, and the intriguing roles of deception in it, does provide an invaluable starting point for conversation on the topic of greater good in human-AI interactions.
Conclusions
In this paper, we investigated the idea of fabrication, falsification and obfuscation of information when working with humans in the loop, and how such methods can be used by an AI agent to achieve teaming performance that would otherwise not be possible. This is increasingly likely to become an issue in the design of autonomous agents as AI agents become stronger and stronger in terms of computational and information processing capabilities, thus faring better than their human counterparts in terms of cognitive load and situational awareness. We discussed how such behavior can be manufactured using existing AI algorithms, and used responses from participants in a thought experiment to gauge public perception on this topic.

The question of white lies and obfuscation or manipulation of information for the greater good is, of course, not unheard of in human-human interactions. A canonical example, as we saw in the final discussion, is the doctor-patient relationship, where a doctor might have to withhold certain information to ensure that the patient has the best chance to recover, or might explain to the patient in different, and maybe simpler, terms than she would to a peer. It is unclear then how such behavior will be interpreted when attributed to a machine. We saw in the final case study that the expectations and dynamics of a doctor-patient relationship are very well-defined and do not necessarily carry over to a teaming setting. However, existing norms in doctor-patient relations do provide useful guidance towards answering some of the ethical questions raised by algorithms for the greater good.

From the results of the survey presented in the paper, it seems that the public is, at least at the abstract level of the thought experiment, positive towards lying for the greater good, especially when those actions would not be discovered by their teammate, but is loath to suspend normative behavior, robot or not, in the event that they would be caught in the act, unless the robot is the recipient of the misinformation!
Further, most of the responses seem to follow a bimodal distribution, indicating that the participants either felt strongly for or against this kind of behavior. It will be interesting to see if raising the stakes (for example, lives saved) of the outcomes of these scenarios can contribute to a shift in the perceived ethical consequences of such behavior, as seen in doctor-patient relationships. Another area that has seen evidence of AI being used effectively to nudge human behavior is behavioral economics (Camerer 2017), which also raises similar interesting ethical dilemmas, and can be an interesting domain for further investigation.

Finally, I note that all the use cases covered in the paper are, in fact, borne directly out of technologies or algorithms that I have developed (Chakraborti et al. 2015; Chakraborti et al. 2017b), albeit with slight modifications, as a student researcher over the last couple of years. Even though these algorithms were conceived with the best of intentions, such as to enable AI systems to explain their decisions or to increase the effectiveness of collaborations with the humans in the loop, I would be remiss not to consider their ethical implications when used differently. In these exciting and uncertain times for the field of AI, it is thus imperative that researchers are cognizant of their scientific responsibility. I would like to conclude then by reiterating the importance of self-reflection in the principled design of AI algorithms whose deployment can have real-life consequences, intended or otherwise, on the future of the field, but also, with the inquisitive mind of a young researcher, marvel at the widening scope of interactions with an artificial agent into newer, uncharted territories that may otherwise be considered unethical.
References

[Allan 2013] Allan, K. 2013. What is common ground? In Perspectives on Linguistic Pragmatics.
[Annas 2012] Annas, G. J. 2012. Doctors, patients, and lawyers – two centuries of health law. New England Journal of Medicine.
[Baker, Saxe, and Tenenbaum 2011] Baker, C.; Saxe, R.; and Tenenbaum, J. 2011. Bayesian theory of mind: Modeling joint belief-desire attribution. In Proceedings of the Cognitive Science Society.
[Bird 2017] Bird, S. 2017. Why AI must be redefined as 'augmented intelligence'. https://goo.gl/u2nmYW. Venture Beat.
[Boella et al. 2009] Boella, G.; Broersen, J. M.; van der Torre, L. W.; and Villata, S. 2009. Representing excuses in social dependence networks. In AI*IA.
[Bok 1999] Bok, S. 1999. Lying: Moral Choice in Public and Private Life. Vintage.
[Camerer 2017] Camerer, C. F. 2017. Artificial intelligence and behavioral economics. In Economics of Artificial Intelligence.
[Cass 2006] Cass, H. 2006. The NHS Experience: The "Snakes and Ladders" Guide for Patients and Professionals. Psychology Press.
[Chakraborti et al. 2015] Chakraborti, T.; Briggs, G.; Talamadupula, K.; Zhang, Y.; Scheutz, M.; Smith, D.; and Kambhampati, S. 2015. Planning for serendipity. In IROS.
[Chakraborti et al. 2016] Chakraborti, T.; Talamadupula, K.; Zhang, Y.; and Kambhampati, S. 2016. A formal framework for studying interaction in human-robot societies. In AAAI Workshop: Symbiotic Cognitive Systems.
[Chakraborti et al. 2017a] Chakraborti, T.; Kambhampati, S.; Scheutz, M.; and Zhang, Y. 2017a. AI challenges in human-robot cognitive teaming. CoRR abs/1707.04775.
[Chakraborti et al. 2017b] Chakraborti, T.; Sreedharan, S.; Zhang, Y.; and Kambhampati, S. 2017b. Plan explanations as model reconciliation: Moving beyond explanation as soliloquy. In IJCAI.
[Christensen 2016] Christensen, H. 2016. A roadmap for US robotics from internet to robotics, 2016 edn. Sponsored by National Science Foundation & University of California, San Diego.
[Converse, Cannon-Bowers, and Salas 1993] Converse, S.; Cannon-Bowers, J.; and Salas, E. 1993. Shared mental models in expert team decision making. Individual and Group Decision Making: Current Issues.
[Cooke et al. 2013] Cooke, N. J.; Gorman, J. C.; Myers, C. W.; and Duran, J. L. 2013. Interactive team cognition. Cognitive Science.
[Ethics in Medicine 2018] Ethics in Medicine. 2018. Truth-telling and Withholding Information. https://goo.gl/su5zSF. University of Washington.
[Fikes and Nilsson 1971] Fikes, R. E., and Nilsson, N. J. 1971. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence.
[Gorman, Cooke, and Winner 2006] Gorman, J. C.; Cooke, N. J.; and Winner, J. L. 2006. Measuring team situation awareness in decentralized command and control environments. Ergonomics.
[Hadfield-Menell et al. 2016] Hadfield-Menell, D.; Russell, S. J.; Abbeel, P.; and Dragan, A. 2016. Cooperative inverse reinforcement learning. In Advances in Neural Information Processing Systems (NIPS), 3909–3917.
[Hak et al. 2000] Hak, T.; Koëter, G.; van der Wal, G.; et al. 2000. Collusion in doctor-patient communication about imminent death: An ethnographic study. BMJ. https://goo.gl/TKb1mP.
[Holmes 1892] Holmes, O. W. 1892. Medical Essays 1842-1882, volume 9. Houghton, Mifflin.
[Horvitz 2007] Horvitz, E. J. 2007. Reflections on challenges and promises of mixed-initiative interaction. AI Magazine.
[Hume 1907] Hume, D. 1907. Essays: Moral, Political, and Literary, volume 1. Longmans, Green, and Company.
[Kernberg 1985] Kernberg, O. F. 1985. Borderline Conditions and Pathological Narcissism. Rowman & Littlefield.
[Klein 2008] Klein, G. 2008. Naturalistic decision making. Human Factors.
[Lake et al. ] Lake, B. M.; Ullman, T. D.; Tenenbaum, J. B.; and Gershman, S. J. Building machines that learn and think like people. Behavioral and Brain Sciences.
[Leverhulme Centre 2017] Leverhulme Centre. 2017. Value alignment problem. https://goo.gl/uDcAoZ. Leverhulme Centre for the Future of Intelligence.
[Malle 2004] Malle, B. F. 2004. How the Mind Explains Behavior: Folk Explanation, Meaning and Social Interaction. Massachusetts: MIT Press.
[Mathieu et al. 2000] Mathieu, J. E.; Heffner, T. S.; Goodwin, G. F.; Salas, E.; and Cannon-Bowers, J. A. 2000. The influence of shared mental models on team process and performance. Journal of Applied Psychology.
[MIT 2017] MIT. 2017. Moral Machines. https://goo.gl/by5y7H.
[Network World 2017] Network World. 2017. AI should enhance, not replace, humans, say CEOs of IBM and Microsoft. https://goo.gl/Ce2ki4. Network World.
[Palmieri and Stern 2009] Palmieri, J. J., and Stern, T. A. 2009. Lies in the doctor-patient relationship. Primary Care Companion to the Journal of Clinical Psychiatry.
[Partnership of AI (PAI) 2017] Partnership of AI (PAI). 2017. Thematic Pillar – Collaborations between People and AI Systems.
[Rachael Rettner 2009] Rachael Rettner. 2009. Why Are Human Brains So Big? https://goo.gl/oV7NZq. Live Science.
[Rao, Georgeff, and others 1995] Rao, A. S.; Georgeff, M. P.; et al. 1995. BDI agents: From theory to practice. In ICMAS.
[Swaminath 2008] Swaminath, G. 2008. The doctor's dilemma: Truth telling. Indian Journal of Psychiatry.
[Thomasma 1994] Thomasma, D. C. 1994. Telling the truth to patients: A clinical ethics exploration. Cambridge Quarterly of Healthcare Ethics.
[Tom Gruber 2017] Tom Gruber. 2017. TED Talk.
[van Ditmarsch 2014] van Ditmarsch, H. 2014. The Ditmarsch tale of wonders. In KI: Advances in Artificial Intelligence.
[Veloso et al. 2015] Veloso, M. M.; Biswas, J.; Coltin, B.; and Rosenthal, S. 2015. CoBots: Robust symbiotic autonomous mobile service robots. In IJCAI.
[Yoshida, Dolan, and Friston 2008] Yoshida, W.; Dolan, R. J.; and Friston, K. J. 2008. Game theory of mind.