A Computational Model of Commonsense Moral Decision Making
Richard Kim, Max Kleiman-Weiner, Andrés Abeliuk, Edmond Awad, Sohan Dsouza, Josh Tenenbaum, Iyad Rahwan
Massachusetts Institute of Technology, Cambridge MA, USA
Abstract
We introduce a new computational model of moral decision making, drawing on a recent theory of commonsense moral learning via social dynamics. Our model describes moral dilemmas as a utility function that computes trade-offs in values over abstract moral dimensions, which provide interpretable parameter values when implemented in machine-led ethical decision making. Moreover, by characterizing the social structure of individuals and groups as a hierarchical Bayesian model, we show that a useful description of an individual's moral values, as well as a group's shared values, can be inferred from a limited amount of observed data. Finally, we apply and evaluate our approach on data from the Moral Machine, a web application that collects human judgments on moral dilemmas involving autonomous vehicles.
Recent advances in machine learning, notably Deep Learning, have demonstrated impressive results in various domains of human intelligence, such as computer vision (Szegedy et al., 2015), machine translation (Wu et al., 2016), and speech generation (Oord et al., 2016). In domains as abstract as human emotion, Deep Learning has shown a proficient capacity to detect human emotions in natural language text (Felbo et al., 2017). These achievements suggest that Deep Learning may pave the way for AI in ethical decision making.

However, training Deep Learning models often requires human-labeled data numbering in the millions, and despite recent advances that enable a model to be trained from a small number of examples (Vinyals et al., 2016; Santoro et al., 2016), this constraint remains a key challenge in Deep Learning. In addition, Deep Learning models have been criticized as "black box" algorithms that defy attempts at interpretation (Lei, Barzilay, and Jaakkola, 2016). The viability of many Deep Learning algorithms for real-world applications in business and government has come into question, as recent legislation in the EU, slated to take effect in 2018, will ban automated decisions, including those derived from machine learning, if they cause an "adverse legal effect" on the persons concerned (Goodman and Flaxman, 2016).

In contrast to Deep Learning algorithms, evidence from studies in human cognition suggests that humans are able to learn and make predictions from a much smaller number of noisy and sparse examples (Tenenbaum et al., 2011). Moreover, studies have shown that humans are able to internally rationalize their moral decisions and articulate reasons for them (Haidt, 2001). Given this stark difference between the current state of machine learning and human cognition, how can we draw on the latest theories in cognitive science to design AI with the capacity to learn moral values from limited interactions with humans and to make decisions through explicable processes?

A recent theory from the field of cognitive science postulates that humans learn to make ethical decisions by acquiring abstract moral principles through observation and interaction with other humans in their environment (Kleiman-Weiner, Saxe, and Tenenbaum, 2017). This theory characterizes an ethical decision as a utility-maximizing choice over a set of outcomes whose values are computed from the weights people place on abstract moral concepts such as "kin" or "reciprocal relationship." In addition, given the dynamics of individuals and their memberships in groups, the framework explains how an individual's moral preferences, and the actions resulting from them, lead to the development of the group's shared moral principles (i.e., group norms).

In this work we extend the framework introduced by Kleiman-Weiner, Saxe, and Tenenbaum (2017) to explore a computational model of the human mind in moral dilemmas with binary decisions. We characterize decision making in moral dilemmas as a utility function that computes the trade-offs of values perceived by humans in the choices of the dilemma. These values are the weights that humans put on abstract dimensions of the dilemma; we call these weights moral principles. Furthermore, we represent an individual agent as a member of a group with many other agents that share similar moral principles; these shared moral principles of the group in aggregate give rise to the group norm.
Exploiting the hierarchical structure of individuals and groups, we show how hierarchical Bayesian inference (Gelman et al., 2013) can provide a powerful mechanism to rapidly infer individual moral principles as well as the group norm from sparse and noisy data.

We apply our model to the domain of autonomous vehicles (AV) through a data set from the Moral Machine (http://moralmachine.mit.edu/), a web application that collects human judgments in ethical dilemmas involving AVs. A recent study on public sentiment toward AVs reveals that endowing AI with human moral values is an important step before AVs can undergo widespread market adoption (Bonnefon, Shariff, and Rahwan, 2016). In light of this study, we view the application of our model to understanding how the human mind perceives and resolves moral dilemmas on the road as an important step towards building an AV with human moral values.

This paper makes the following distinct contributions towards building an ethical AI:

• Introducing a novel computational model of moral decision making that characterizes a moral dilemma as a trade-off of values along abstract moral dimensions. We show that this model describes well how the human mind processes moral dilemmas and provides an interpretable process by which an AI agent can arrive at a decision in a moral dilemma.

• Characterizing the social structure of individuals and groups as a hierarchical Bayesian model, we show that the model can rapidly infer the moral principles of individuals from a limited amount of observational data. Rapidly inferring other agents' unique moral values will be crucial as AI agents interact with other agents, including humans.

• Demonstrating the model's capacity to rapidly infer a group's norms, characterized as a prior over individual moral preferences. Inferring the shared moral values of a group is an important step towards designing an AI agent that makes socially optimal choices.
Moral Machine Data
Moral Machine is a web application built to collect and analyze human perceptions of moral dilemmas involving autonomous vehicles. As of October 2017, the application has collected millions of responses from unique respondents in countries around the world. Here, we briefly describe the design of the moral dilemmas and the data structure in Moral Machine.

Figure 1: Moral Machine interface. An example of a moral dilemma that features an AV with sudden brake failure, facing a choice between either not changing course, resulting in the death of three elderly pedestrians crossing on a "do not cross" signal, or deliberately swerving, resulting in the death of three passengers: a child and two adults.

In a typical Moral Machine session, a respondent is shown scenarios such as the example in Figure 1. In each scenario, the respondent is asked to choose one of two outcomes that have different ethical consequences with different trade-offs. A scenario can contain any random combination of twenty characters (see Figure 2) that represent various demographic attributes found in a general population. In addition to the demographic factors, a Moral Machine scenario also includes each character's status as a passenger or a pedestrian and, for pedestrians, whether they are crossing on a green or red light.

Figure 2: Twenty characters in Moral Machine represent various demographic attributes such as gender, age, social status, fitness level, and species.

In addition to the respondents' decisions, data about their response duration (in seconds) for each scenario and their approximate geo-location are also collected. This allows us to infer the country or region of access.

Every scenario has two choices, which we represent as a random variable $Y$ with two realizable values $\{0, 1\}$. A respondent's choice to swerve (i.e., intervene) is represented as $Y = 1$, and likewise, their choice to stay (i.e., not intervene) is represented as $Y = 0$. The respondent's choice yields a state in which a certain set of characters is saved over others. The resultant state of choice $y$ is represented by the character vector $\Theta_y \in \mathbb{N}^K$.

Figure 3: An example of vector representation of a state in the Moral Machine character space.

As an illustration, we show a vector representation of the resultant state of swerving in Figure 3. The vector element for the old man character has a value of 2, representing the two old man characters that will be saved by the choice to swerve ($Y = 1$). In addition, the vector element for the red light feature has a value of 3, representing the three pedestrians who are crossing on the red light.

Moral Dilemma as Utility Function

Jeremy Bentham, the founder of modern utilitarian ethics, described an ethical decision in a moral dilemma as a utility-maximizing decision over the sum of trade-offs over values in the dilemma (Bentham, 1789). More recently, cognitive psychologists have formalized the idea of analyzing a moral dilemma using a utility function that computes the various trade-offs in the dilemma (Mikhail, 2007, 2011). Evidence of moral decision making in young children suggests that children base their moral judgments on computing trade-offs of values over abstract concepts (Kohlberg, 1981).

Using this framework, we can analyze how a respondent arrives at his or her decision based on the values that he or she places on abstract dimensions of the moral dilemma, which we label moral principles.
For instance, when a respondent chooses to save a female doctor character in a scenario over an adult male character, this decision is in part due to the value that the respondent places on the abstract concept of doctor, a rare and valuable member of society who contributes to the improvement of social welfare. The abstract concept of female gender would also be a factor in his or her decision.

In Moral Machine, the twenty characters share many abstract features such as female, elderly, non-human, etc. Hence, the original character vector $\Theta_y$ can be decomposed into a new vector in the abstract feature space $\Lambda_y \in \mathbb{N}^D$, where $D \leq K$, via a feature mapping $F: \Theta \rightarrow \Lambda$. In this work, we use a linear mapping $F(\Theta) = A\Theta$, where $A$ is a $D \times K$ binary matrix such as the one shown in Figure 4.

Figure 4: An example of a binary matrix $A$ that decomposes the characters in Moral Machine into abstract features. Black squares indicate the presence of abstract features in the characters.

As shown in Figure 5, the original state vector in the Moral Machine character space $\Theta$ is mapped into a new state vector in the abstract feature space $\Lambda$. We note that the vector element for old has a value of 3, representing the three characters with this feature.

Figure 5: Vector representation of abstract features of a scenario choice.

We define moral principles as weights $w \in \mathbb{R}^D$ that a respondent places along the $D$ abstract dimensions $\Lambda$. These weights represent how the respondent values abstract features such as young, old, or doctor in computing the utility value of their choices. For simplicity, we model the utility value of a state as a linear combination of the features in the abstract dimensions:

$$u(\Theta_i) = w^\top F(\Theta_i) \quad (1)$$

With utility values for the choice to not intervene, $u(\Theta_0)$, and to intervene, $u(\Theta_1)$, the respondent's decision to intervene ($Y = 1$) is seen as a probabilistic outcome based on a sigmoid function of the net utility of the two choices:

$$P(Y = 1 \mid \Theta) = \frac{1}{1 + e^{-U(\Theta)}} \quad (2)$$

where

$$U(\Theta) = u(\Theta_1) - u(\Theta_0). \quad (3)$$
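As an illustration of Equations 1-3, the following minimal sketch computes the swerve probability for a toy scenario. The characters, features, matrix $A$, and weights below are invented for the example and do not reproduce the paper's actual encoding.

```python
import numpy as np

K, D = 3, 2  # toy sizes: 3 characters, 2 abstract features

# Binary feature matrix A (D x K): rows are abstract features,
# columns are characters (cf. Figure 4).
#             man  old_man  old_woman
A = np.array([[0,   1,       1],      # feature "old"
              [0,   0,       1]])     # feature "female"

def feature_map(theta):
    """F(Theta) = A Theta: map character counts to abstract feature counts."""
    return A @ theta

def utility(w, theta):
    """u(Theta) = w^T F(Theta)  (Eq. 1)."""
    return w @ feature_map(theta)

def p_swerve(w, theta_stay, theta_swerve):
    """P(Y = 1 | Theta) = sigmoid(u(Theta_1) - u(Theta_0))  (Eqs. 2-3)."""
    net = utility(w, theta_swerve) - utility(w, theta_stay)
    return 1.0 / (1.0 + np.exp(-net))

# Example: swerving saves two old men; staying saves one old woman.
theta_swerve = np.array([0, 2, 0])
theta_stay = np.array([0, 0, 1])
w = np.array([0.5, 1.2])  # hypothetical moral principles over (old, female)
print(p_swerve(w, theta_stay, theta_swerve))  # ~0.33 for these toy weights
```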
We now turn our attention to inferring the individual moral principles of respondents from sparse and noisy observations of their decisions in moral dilemmas.

Hierarchical Moral Principles

Studies by anthropologists have shown that societies across different regions and time periods hold widely divergent views about which actions are ethical (Henrich et al., 2001; House et al., 2013; Blake et al., 2015). For example, certain societies strongly emphasize respect for the elderly, while others focus on protecting the young. These views in a society are what we refer to as the society's group norms.

Nevertheless, even in a society with a homogeneous cultural and ethnic make-up, individual members of the group can hold unique and different moral standards (Graham, Haidt, and Nosek, 2009). How can we model the complex relationship between the group norm and individual moral principles?

We introduce the hierarchical moral principles model, an instance of a hierarchical Bayesian model (Gelman et al., 2013). Returning to the Moral Machine data, consider $N$ respondents who belong to a group $g \in G$. This group can be a country, a culture, or a region within which customs and norms are shared.

The moral principles of respondent $i$ are drawn from a multivariate normal distribution parameterized by the mean values of the group $w_g$ on the $D$ dimensions:

$$w_i \sim \mathrm{Normal}_D(w_g, \Sigma_g), \quad (4)$$

where the diagonal of the covariance matrix $\Sigma_g$ represents the in-group variance, or differences between the members of the group along the abstract dimensions. A higher variance value describes a broader diversity of opinions along the corresponding abstract dimension. In addition, the covariance (off-diagonal) values capture the strength of the relationships between the values placed on the abstract dimensions. As an example, a culture that highly values infancy should also highly value pregnancy, as they are intuitively closely related concepts. The covariance matrix allows the Bayesian learner to understand related concepts and use the relationship to rapidly approximate the values of one dimension after inferring those of a highly correlated dimension.

Let $w = \{w_1, \ldots, w_i, \ldots, w_N\}$ be the set of unique moral principles of the $N$ respondents. Each respondent $i$ makes judgments on $T$ scenarios $\Theta = \{\Theta_1^1, \ldots, \Theta_i^t, \ldots, \Theta_N^T\}$. The judgment by respondent $i$ on scenario $t$ is an instance of a random variable $Y_i^t$. Given the observation of the set of states $\Theta$ and the decisions $Y$, the posterior distribution over the set of moral principles follows:

$$P(w, w_g, \Sigma_g \mid \Theta, Y) \propto P(\Theta, Y \mid w)\, P(w \mid w_g, \Sigma_g)\, P(w_g)\, P(\Sigma_g) \quad (5)$$

where the likelihood is

$$P(\Theta, Y \mid w) = \prod_{i=1}^{N} \prod_{t=1}^{T} p_{ti}^{y_{ti}} (1 - p_{ti})^{(1 - y_{ti})} \quad (6)$$

and $p_{ti} = P(Y_{ti} = 1 \mid \Theta_{ti})$ is the probability that respondent $i$ chooses to swerve in scenario $t$, as shown in Equation 2. A graphical representation of the model is presented in Figure 6.

Figure 6: Graphical representation of the hierarchical Bayesian model of moral principles.

As an illustration, we randomly sampled 99 respondents from Denmark, which equates to 1,287 responses. We specified the prior over the covariance matrix $P(\Sigma_g)$ as an LKJ distribution (Lewandowski, Kurowicka, and Joe, 2009) with parameter $\eta = 2$:

$$\Sigma_g \sim \mathrm{LKJ}(\eta) \quad (7)$$

and the prior over group weights $P(w_g)$ as

$$w_g \sim \mathrm{Normal}_D(\mu, \Sigma_g) \quad (8)$$

where $\mu = 0$. We inferred the individual moral principles as well as the group values $w_g$ and the covariance matrix $\Sigma_g$. These results are shown in Figure 7. We note the variations in the inferred moral principles of three representative sub-samples of Danish respondents.

Figure 7: (a) Inferred group norm of the sampled Danish respondents; (b) inferred covariance matrix of the Danish respondents; (c-e) individual moral principle values of three representative sub-samples of Danish respondents.
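To make the generative story concrete, the following sketch expresses Equations 4-8 in PyMC3; this is our choice of tool, as the paper does not specify an implementation, and the shapes, variable names, and synthetic stand-in data are ours.

```python
import numpy as np
import pymc3 as pm

# Hypothetical shapes and stand-in data: N respondents, T scenarios each,
# D abstract dimensions. X[i, t] holds Lambda_1 - Lambda_0, the net
# abstract-feature difference between swerving and staying.
N, T, D = 99, 13, 10
X = np.random.randn(N, T, D)
Y = np.random.randint(0, 2, size=(N, T))

with pm.Model() as hierarchical_moral_principles:
    # LKJ prior over the group covariance Sigma_g (Eq. 7), eta = 2.
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol_cov", n=D, eta=2.0,
        sd_dist=pm.HalfNormal.dist(1.0), compute_corr=True)
    # Group norm: w_g ~ Normal_D(0, Sigma_g)  (Eq. 8).
    w_g = pm.MvNormal("w_g", mu=np.zeros(D), chol=chol, shape=D)
    # Individual moral principles: w_i ~ Normal_D(w_g, Sigma_g)  (Eq. 4).
    w = pm.MvNormal("w", mu=w_g, chol=chol, shape=(N, D))
    # Net utility U(Theta) = w_i . (Lambda_1 - Lambda_0) per scenario (Eq. 3).
    U = (w[:, None, :] * X).sum(axis=-1)
    # Bernoulli likelihood with sigmoid link (Eqs. 2 and 6).
    pm.Bernoulli("y", logit_p=U, observed=Y)
    trace = pm.sample(1000, tune=1000, chains=2)
```

Posterior means of `w` recover per-respondent principles, while `w_g` and the inferred correlation matrix summarize the group norm.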
Predicting Individual Judgments

As an evaluation of our model, we performed an out-of-sample prediction test. We randomly selected ten thousand respondents from the Moral Machine website who completed at least one session, which contains thirteen scenarios. We kept only each respondent's first 13 scenarios to compile a data set consisting of 130,000 decisions.

We compared the predictive accuracy of the model against three benchmarks. Benchmark 1 models the collective values of the characters in Moral Machine such that the utility of a state is computed as

$$u(\Theta) = w_c^\top \Theta \quad (9)$$

where $w_c \in \mathbb{R}^K$. Benchmark 1 models the weights as $w_c \sim \mathrm{Normal}_K(\mu, \sigma I)$ and does not include the group hierarchy or the covariance between the weights over the characters and factors (e.g., traffic light, passenger, etc.).

Benchmark 2, which builds upon Benchmark 1, models the values along the abstract moral dimensions $\Lambda$ as $w_f \sim \mathrm{Normal}_D(\mu, \sigma I)$. The group hierarchy and the covariance between weights are ignored.

Finally, Benchmark 3 models the individual moral principles of each respondent as $w_{li} \sim \mathrm{Normal}_D(\mu, \sigma I)$, but does not include the hierarchical structure. Therefore, each respondent is viewed as an independent agent, and inferring the values of one respondent provides no insight about the values of another.

To demonstrate the gains in accuracy, we tested the models across six training-set sizes by varying the number of sampled respondents, starting from $N = 4$. We used the first eight judgments from each respondent as training data and tested the accuracy of predictions on the remaining five responses per agent. For our model, we assumed that the sampled respondents of size $N$ belong to one group.

The results (Figure 8) show that as the number of respondents (i.e., the amount of training data) grows larger, the predictive accuracy of our model and of Benchmarks 1 and 2 improves. The accuracy of Benchmark 3 does not improve, as the number of respondents has no bearing on the inference of an individual respondent's values. The hierarchical moral principles model shows consistently improving accuracy along the increasing size of the training data.

Figure 8: Comparison of out-of-sample prediction accuracy rates of the hierarchical moral principles model and the three benchmark models.

We note that the margin of improvement between Benchmark 1 and Benchmark 2 reveals the gain achieved from abstraction and dimension reduction. The margin between Benchmark 2 and our model reveals the gain from including individual moral principles. Finally, the margin between Benchmark 3 and our model is indicative of the gain achieved by the group hierarchy.
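As a rough illustration of this evaluation protocol (not the authors' code), the sketch below computes held-out accuracy from point estimates of the weights. The arrays `w_hat`, `X`, and `Y` are hypothetical stand-ins in the shapes described above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def held_out_accuracy(w_hat, X, Y, n_train=8):
    """w_hat: (N, D) inferred weights; X: (N, 13, D) net feature
    differences; Y: (N, 13) decisions. Fits use the first n_train
    judgments; accuracy is measured on the remaining five."""
    X_test, Y_test = X[:, n_train:, :], Y[:, n_train:]
    p = sigmoid((w_hat[:, None, :] * X_test).sum(axis=-1))  # P(Y = 1)
    return np.mean((p > 0.5) == Y_test.astype(bool))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, D = 100, 10
    w_hat = rng.normal(size=(N, D))       # stand-in posterior means
    X = rng.normal(size=(N, 13, D))       # stand-in scenario features
    Y = rng.integers(0, 2, size=(N, 13))  # stand-in decisions
    print(held_out_accuracy(w_hat, X, Y))
```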
Response Time
Studies of human decision making find a strong relationship between the confidence level of a decision and its reaction time (Smith and Ratcliff, 2004; Cain and Shea-Brown, 2012; Baron and Gürçay, 2017). These studies show that human subjects in binary-decision tasks take a longer time to arrive at a decision when there is a lower level of evidence. In this section, we take this approach to show that our model accurately captures the relationship between reaction time and the difficulty of a moral dilemma.

We sampled 1,727 respondents who accessed Moral Machine from the US, which altogether corresponds to 22,451 judgments. In addition to the judgment decisions, we measured the response time (RT) in seconds that the respondents took to arrive at their decisions. Due to the unsupervised nature of the experiments, respondents are free to stop and re-engage at a later time; as such, we eliminated responses that took more than 120 seconds from our analysis. From the judgment data, after inferring the moral principles of individual respondents, we computed the estimated probability of the decision to swerve (i.e., $p_{ti} = P(Y_{ti} = 1 \mid \Theta_{ti})$) for each scenario, as defined in Equation 2. We computed a new metric, certainty of decision, as $|p_{ti} - 0.5|$.

Plotting the certainty of decision against the response times of the scenarios (see Figure 9) reveals an intuitive relationship between the two variables.

Figure 9: Reaction time in seconds per estimated certainty of decision, which is defined as the distance of the probability of judgment from the 0.5 probability of swerving.

Scenarios with higher certainty are those that have clear trade-offs, such that respondents on average respond more quickly to the dilemma. Likewise, scenarios with lower certainty are those that have ambiguous trade-offs, such that respondents have less confidence in their decisions. Intuitively, resolving the ambiguity of the trade-offs incurs greater cognitive cost, which is revealed as longer response times for the respondents.

We view the relationship between response time and the model's estimated certainty of decision as supporting evidence that the model is a robust representation of how people resolve moral dilemmas. In addition, the fact that the cognitive cost of the value-based decision process is revealed in reaction times is an extra bit of information that could be used in the inference. For instance, if we see a person making a quick decision, then we might also get information about the relative value difference between the two choices. In future work, we intend to integrate response time information into the learning process itself to allow the learner to infer even faster.
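The certainty metric and its comparison with response times can be sketched as follows; the arrays `p` and `rt` are synthetic stand-ins, and the correlation test is our illustrative choice rather than the paper's exact analysis.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
p = rng.uniform(0, 1, size=1000)       # stand-in P(Y = 1 | Theta) estimates
rt = rng.uniform(1, 150, size=1000)    # stand-in response times in seconds

certainty = np.abs(p - 0.5)            # distance from indifference
keep = rt <= 120.0                     # drop likely-interrupted sessions
r, pval = pearsonr(certainty[keep], rt[keep])
print(f"correlation between certainty and RT: r = {r:.3f} (p = {pval:.3g})")
```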
Discussion

Drawing on a recent framework for modeling human moral decisions, we proposed a computational model of how the human mind arrives at a decision in a moral dilemma. We demonstrated the application of this model in the domain of autonomous vehicles using data from Moral Machine. We showed that hierarchical Bayesian inference provides a powerful mechanism to accurately infer individual preferences as well as group norms along the abstract moral dimensions. We concluded with a demonstration of the model successfully capturing the cognitive cost of resolving the trade-offs in moral dilemmas: moral dilemmas that are unpredictable by the model are correlated with long response times, where response time is a proxy for how difficult the dilemma is for the respondent because the subject is indifferent between the two responses.

In this work, we have left out any discussion of methods to aggregate the individual moral principles and the group norms to design an AI agent that makes decisions that optimize the social utility of all other agents in the system. Recently, a paper by Noothigattu et al. (2017) introduced a novel method of aggregating individual preferences such that the decision reached after the aggregation ensures global utility maximization. We view this method as a natural complement to our work.

Another interesting extension of our work is to explore the mechanism that maps the observable data onto the abstract feature space. We formalized this process as the feature mapping $F: \Theta \rightarrow \Lambda$. Evidence from developmental psychology suggests that children grow to acquire abstract knowledge and form inductive constraints (Gopnik and Meltzoff, 1997; Carey, 2009). Non-parametric Bayesian processes such as the Indian Buffet Process (Griffiths and Ghahramani, 2005) and its variants (Rai and Daumé, 2009) are promising models for characterizing this learning mechanism in the moral domain.

We used response time as a proxy to measure cognitive cost and proposed that response time can be used as extra information for more accurate inference of a respondent's individual moral principles. Combining our current model with a drift diffusion model (Ratcliff and McKoon, 2008) could lead to a richer model that describes confidence and error in moral decision making. An AI agent needs to understand the moral basis of people's actions, including when those actions stem from socially inappropriate moral values and when they are mistakes. For instance, if an AI agent observes a person who spends a long time making an ultimately wrong decision, the AI agent should incorporate the person's confidence level and error rate to accurately infer that the person most likely made a mistake.

Finally, we have inferred abstract moral principles and tested the model's predictive power on the same source of data. However, the abstract dimensions of the characters and factors in Moral Machine are not confined to the AV domain. An interesting experiment would be to test the model across various moral dilemmas in different contexts. Hierarchical Bayesian models have been applied in the domain of transfer learning. Demonstrating the capacity to learn moral principles from one domain and apply these principles in other domains to make ethical decisions would show that the development of a human-like ethical AI system does not need to be domain specific.

References
Baron, J., and Gürçay, B. 2017. A meta-analysis of response-time tests of the sequential two-systems model of moral judgment. Memory & Cognition.

Bentham, J. 1789. An Introduction to the Principles of Morals and Legislation.

Blake, P. R.; McAuliffe, K.; Corbit, J.; Callaghan, T. C.; Barry, O.; Bowie, A.; Kleutsch, L.; Kramer, K. L.; Ross, E.; Vongsachang, H.; Wrangham, R.; and Warneken, F. 2015. The ontogeny of fairness in seven societies. Nature.

Bonnefon, J.-F.; Shariff, A.; and Rahwan, I. 2016. The social dilemma of autonomous vehicles. Science.

Carey, S. 2009. The Origin of Concepts. Oxford University Press.

Felbo, B.; Mislove, A.; Søgaard, A.; Rahwan, I.; and Lehmann, S. 2017. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. In Conference on Empirical Methods in Natural Language Processing (EMNLP).

Gelman, A.; Carlin, J. B.; Stern, H. S.; Dunson, D. B.; Vehtari, A.; and Rubin, D. B. 2013. Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.

Goodman, B., and Flaxman, S. 2016. European Union regulations on algorithmic decision-making and a "right to explanation".

Gopnik, A., and Meltzoff, A. N. 1997. Words, Thoughts, and Theories. Learning, Development, and Conceptual Change. Cambridge, MA: MIT Press.

Graham, J.; Haidt, J.; and Nosek, B. A. 2009. Liberals and conservatives rely on different sets of moral foundations. Journal of Personality and Social Psychology.

Griffiths, T. L., and Ghahramani, Z. 2005. Infinite latent feature models and the Indian buffet process. In Advances in Neural Information Processing Systems.

Haidt, J. 2001. The emotional dog and its rational tail: A social intuitionist approach to moral judgment. Psychological Review.

Henrich, J.; Boyd, R.; Bowles, S.; Camerer, C.; Fehr, E.; Gintis, H.; and McElreath, R. 2001. In search of Homo economicus: Behavioral experiments in 15 small-scale societies. The American Economic Review.

House, B. R., et al. 2013. Ontogeny of prosocial behavior across diverse societies. Proceedings of the National Academy of Sciences.

Kleiman-Weiner, M.; Saxe, R.; and Tenenbaum, J. B. 2017. Learning a commonsense moral theory. Cognition.

Kohlberg, L. 1981. The Philosophy of Moral Development. Harper & Row.

Lei, T.; Barzilay, R.; and Jaakkola, T. 2016. Rationalizing neural predictions. In Conference on Empirical Methods in Natural Language Processing (EMNLP).

Lewandowski, D.; Kurowicka, D.; and Joe, H. 2009. Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis.

Mikhail, J. 2007. Universal moral grammar: Theory, evidence and the future. Trends in Cognitive Sciences.

Mikhail, J. 2011. Elements of Moral Cognition. Cambridge: Cambridge University Press.

Noothigattu, R.; Gaikwad, S. N. S.; Awad, E.; Dsouza, S.; Rahwan, I.; Ravikumar, P.; and Procaccia, A. D. 2017. A voting-based system for ethical decision making.

Oord, A. v. d.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.; and Kavukcuoglu, K. 2016. WaveNet: A generative model for raw audio.

Rai, P., and Daumé, H. 2009. The infinite hierarchical factor regression model. In Advances in Neural Information Processing Systems 21.

Ratcliff, R., and McKoon, G. 2008. The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation.

Santoro, A.; Bartunov, S.; Botvinick, M.; Wierstra, D.; and Lillicrap, T. 2016. Meta-learning with memory-augmented neural networks. In International Conference on Machine Learning (ICML).

Smith, P. L., and Ratcliff, R. 2004. Psychology and neurobiology of simple decisions. Trends in Neurosciences.

Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; and Rabinovich, A. 2015. Going deeper with convolutions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Tenenbaum, J. B.; Kemp, C.; Griffiths, T. L.; and Goodman, N. D. 2011. How to grow a mind: Statistics, structure, and abstraction. Science.

Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; and Wierstra, D. 2016. Matching networks for one shot learning. In Advances in Neural Information Processing Systems.

Wu, Y.; Schuster, M.; Chen, Z.; et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation.