A Subjective Model of Human Decision Making Based on Quantum Decision Theory
Chenda Zhang    Hedvig Kjellström
Division of Robotics, Perception and Learning, KTH Royal Institute of Technology, Sweden
[email protected] [email protected]
Abstract
Computer modeling of human decision making is of large importance for, e.g., sustainable transport, urban development, and online recommendation systems. In this paper we present a model for predicting the behavior of an individual during a binary game under different amounts of risk, gain, and time pressure. The model is based on Quantum Decision Theory (QDT), which has been shown to enable modeling of the irrational and subjective aspects of decision making not accounted for by the classical Cumulative Prospect Theory (CPT). Experiments on two different datasets show that our QDT-based approach outperforms both a CPT-based approach and data-driven approaches such as feed-forward neural networks and random forests.
Computer modeling of human decision making has a long history, but is today of unprecedented importance, with applications in areas such as sustainable transport, urban development, and online recommendation systems.

The established methodology for modeling decision making under risk is Cumulative Prospect Theory (CPT) (Kahneman and Tversky 2013), in which value is a function of the gains and losses of available actions that the decision-maker could take. Stochastic versions of CPT, such as CPT with a logit choice function (logit-CPT) (Carbone 1997), account for the inherent randomness in human decision making. The theory is coherent and comprehensive and has been shown to model a large class of human decision-making problems.

However, there have been observations (Birnbaum 2004) of systematic deviations from the outcomes predicted by utility theories, e.g., interference between different variables in complex decision making. In other words, the observed decision-makers did not behave completely rationally.

In response to this, Quantum Decision Theory (QDT) (Yukalov and Sornette 2008) was developed. QDT is summarized in Section 4 below and in the Appendix, but can briefly be expressed as follows: each option available to the decision-maker corresponds to a prospect probability, which is the sum of two factors: the utility factor, representing the classical utility of the option, and the attraction factor (the core addition of QDT), representing irrational and subjective aspects that affect the prospect's attractiveness for the decision-maker.

Earlier QDT-based prediction methods (Favre et al. 2016; Yukalov and Sornette 2016a; Vincent et al. 2016) used attraction factor functions that only captured one aspect of the decision-maker's state of mind. The contribution of the present paper is a multi-modal attraction factor function that takes multiple emotional and cognitive aspects into account, such as the framing effect (Tversky and Kahneman 1981) or stress caused by a time constraint.
We propose four parameterized attraction factor components corresponding to different irrational effects in the decision-making process. This is further described in Section 5.

Experiments, presented in Section 6, confirm that the inclusion of the proposed attraction factor enables our QDT model to predict the behavior of individual players to a higher degree than a CPT model without an attraction factor. Our model also outperforms purely data-driven methods such as feed-forward neural networks and random forests.
Prospect Theory, with its development Cumulative Prospect Theory (CPT) (Tversky and Kahneman 1992), can be regarded as the classical approach to modeling human decision making under risk. It features several parameters that isolate and quantify psychological concepts such as loss aversion, subjective value functions for gains and losses, and probability weighting functions. These parameters can be estimated from data at the individual level (Harless and Camerer 1994; Harrison and Rutström 2009) if there are enough data for each subject, or in a hierarchical fashion (Nilsson, Rieskamp, and Wagenmakers 2011; Murphy and ten Brincke 2018), first at the aggregate level and then refined for each individual. Mixture models have been used (Harrison and Rutström 2009; Conte, Hey, and Moffatt 2011) to account for the considerable heterogeneity observed in people's risk-taking behavior.

Despite its successes in modeling and predicting human decisions under uncertainty, Birnbaum (2004) concludes that CPT, even with well-fitted parameters, is unable to account for a number of observed phenomena, such as interference between variables in complex decision making, the immediate regret caused by an unsuccessful decision, or stress caused by a time constraint. This implies that prospect theory is intrinsically incomplete.

In response to this, Yukalov et al. (2008; 2009) introduced Quantum Decision Theory (QDT) with the goal of providing a complete framework for modeling and predicting human decisions. Its use of Hilbert spaces constitutes a generalization of the probability theory axiomatized by Kolmogorov (2018) for real-valued probabilities to probabilities derived from algebraic complex number theory.
By its mathematical structure, QDT aims at encompassing the superposition processes occurring down to the neuronal level. Numerous behavioral patterns, including the behavioral phenomena unexplained by utility-theoretic approaches, are coherently explained by QDT (Yukalov and Sornette 2016b). A summary of QDT can be found in Section 4 of this paper. The authors demonstrated that QDT has the potential to explain the results of certain non-trivial decision problems (Shafir, Smith, and Osherson 1990), unaccounted for by traditional models.

Favre et al. (2016) applied QDT to a lottery game with choices between a certain and a risky lottery. The results were in good agreement with the QDT predictions at the level of groups. However, at the individual level, the authors pointed out the need to calibrate the attraction factors and characterize individual decision-makers.

Yukalov and Sornette (2016a) extended this work to games with both gains and losses, and refined the methodology for computing attraction and utility factors. The performance of QDT was compared to the famous CPT experiments by Kahneman and Tversky (Kahneman and Tversky 2013). The results show that the predictions were in good agreement with the empirical data at an aggregated level.

The first calibration and parameterization of QDT on an empirical dataset was conducted in (Vincent et al. 2016). Their model combines logit-CPT, to model the utility factor, with a constant absolute risk aversion (CARA) function to account for the attraction factor. The CARA function represents the decision-makers' will to avoid large risks. Experiments on a binary lottery dataset showed the proposed QDT model to outperform a pure logit-CPT model at both the group and the individual level. The difference was especially prominent for decision tasks with large risk, thanks to the CARA attraction factor function. Vincent et al.
pointed out the need for the inclusion of further factors characterizing the state of mind of the decision-maker, which would explain variations in the attraction factor not related to the structure of the decision problem. In the present paper, we proceed in this direction by including factors affecting the player's cognition, such as time pressure and memory effects.

We use data collected by Diederich, Wyszynski, and Traub (2020) from two experiments, both involving games with two choice options: a sure option (where there is a lower but guaranteed gain) and a gamble option (where there is a higher gain but with some uncertainty). The game was always fair, in that the sure option and the gamble option always had the same expected value. The experiments were also designed with time constraints and minimum scores.

In addition, the sure option was presented with either a positive or a negative connotation, introducing a framing effect (Tversky and Kahneman 1981). For example, in a game with an initial amount of 100 points and a winning probability of 0.4, the sure option would be presented as "Keep 40" in the positive framing and "Lose 60" in the negative. The data contain details on the setup of every individual game trial and the response from the participant.
Dataset 1.
19 participants were involved in the first experiment. Each participant took three game sessions, where each session contained four blocks of trials. The participants were reimbursed per point gained in the games. The games had four different point amounts (25, 50, 75, or 100) and four probabilities of winning (0.3, 0.4, 0.6, or 0.7).

For each game trial, a sure and a gamble option were created as described above, with equal expected gain in the sure and gamble options, and the sure option presented in either a gain or a loss context. This gives 4 ∗ 4 ∗ 2 = 32 different trials. The experiment designers also designed eight catch trials with non-equivalent sure and gamble options, to validate the systematicity of the decision making. Within each block, each game was presented twice in a random order, resulting in 80 trials per experimental block.

Moreover, two different response time constraints (1 s or 3 s) and three levels of need (0, 2500, or 3500) were induced, the need level defined as the minimum number of points needed in order to keep the earned points during the games in this block. The blocks of trials were repeated twice for each response time constraint and once for each need level, yielding 80 ∗ 12 = 960 game trials per participant and in total 19 ∗ 960 = 18240 data points.
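The trial counts above can be checked with a few lines of arithmetic (a sketch; the factorization into sessions and blocks follows the experiment description):

```python
games = 4 * 4 * 2                    # point amounts x win probabilities x framings
trials_per_block = (games + 8) * 2   # plus 8 catch trials, each game presented twice
blocks = 3 * 4                       # three sessions of four blocks each
per_participant = trials_per_block * blocks
total = 19 * per_participant
print(trials_per_block, per_participant, total)  # 80 960 18240
```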
Dataset 2.
Experiment 2 used the same design, but with 58 (new) participants. Each trial had an initial amount of 19, 20, 21, 39, 40, 41, 59, 60, 61, 79, 80, or 81, and a winning probability of 0.3, 0.4, 0.6, or 0.7, forming 12 ∗ 4 ∗ 2 = 96 different game setups with either positive or negative framing. Each block contained 96 regular game trials (with each setup present once) and 8 catch trials.

There were two time limits (1 s or 3 s) and three need levels (0, 2800, or 3600 points). Blocks were repeated for each unique combination, yielding six blocks with a total of (96 + 8) ∗ 6 = 624 observations per participant. In total, there were thus 624 ∗ 58 = 36192 data points.
Quantum Decision Theory (QDT) (Yukalov and Sornette 2008) is an intrinsically probabilistic framework that is derived from treating prospects and the decision-maker's state of mind as vectors in a complex Hilbert space. A more detailed explanation and mathematical derivation is given in the Appendix, while we give a brief overview of the theory in this section. When facing a decision problem, the decision-maker is assumed to be in a decision-maker state Φ, which is a superposition of prospect states π_i, where each prospect state corresponds to an option in the decision problem. When the decision-maker makes a decision, Φ collapses into one of the prospect states, which means that he/she chooses the option corresponding to that prospect state.

The general formula for the probability of π_i under QDT is:

P(π_i) = f(π_i) + q(π_i)    (1)

where f is the utility factor and q is the attraction factor. There are also some constraints on the factors within the QDT framework. Since the P(π_i) are probabilities, they must satisfy:

∑_i P(π_i) = 1,  0 ≤ P(π_i) ≤ 1    (2)

As Yukalov and Sornette (2009) mention, quantum decision theory reduces to classical utility theory when the attraction factors vanish; thus the utility factors play the role of classical probabilities and must satisfy:

∑_i f(π_i) = 1,  0 ≤ f(π_i) ≤ 1    (3)

The attraction factors characterize the attractiveness of the prospects, which is based on irrational, subconscious factors. They follow the alternation law (Yukalov and Sornette 2016a):

∑_i q(π_i) = 0,  −1 ≤ q(π_i) ≤ 1    (4)

We will now introduce the formulas we use for the utility factor and the attraction factor.

Utility factors.
We use the utility function from Prospect Theory (Kahneman and Tversky 2013). The utility U_A of an option A has the following form in the gain-only or loss-only domain:

U_A = w(p_A) v(V_A^1) + (1 − w(p_A)) v(V_A^2)    (5)

where V_A^1 and V_A^2 are the two possible outcomes of option A, occurring with probability p_A and 1 − p_A, respectively. The value function v(x) reflects how people value gains and losses relative to a reference point (often set to 0), while the weighting function w(p) reflects people's attitude towards different probabilities, such as overestimating very small probabilities. There are many different variations of the value and weighting functions. One good combination is a power value function combined with a Prelec II weighting function. The value function v(x) has the following form:

v(x) = x^α for x ≥ 0 (α > 0), and v(x) = −λ(−x)^α for x < 0 (λ > 0)

The weighting function is:

w(p) = exp(−δ(−ln p)^γ),  δ > 0, γ > 0    (6)

We propose that the decision-maker does not always value the utility in the same way; there is a chance that the option with the lower utility value appears to have higher utility from the decision-maker's perspective. Therefore we supplement the model with the logistic function as a stochastic choice function. It mimics the partition function from statistical mechanics. The option with the higher utility value will have a higher f value:

f_A = 1 / (1 + e^{φ(U_B − U_A)})    (7)

The f value for the other option becomes:

f_B = 1 − f_A    (8)

There are four parameters that can be fitted in the utility factor: α, δ, γ, and φ. As the utility factor in our QDT model, we set the reference point at 0, so all options are considered only in the gain domain; we want the utility term to capture only the utility of the option, and by the experiment design no option can lead to any point loss. Note that we are not using CPT as the utility term, as we only use it in the gain domain.
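The utility factor above can be sketched in a few lines; this is a minimal illustration of Eqs. (5)-(8), with placeholder parameter values (α, δ, γ, φ) rather than the fitted ones:

```python
import math

def v(x, alpha=0.7, lam=2.0):
    """Power value function; lam is the loss-aversion weight (unused in the
    gain-only domain, included for completeness)."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def w(p, delta=1.0, gamma=0.7):
    """Prelec II probability weighting function, Eq. (6)."""
    return math.exp(-delta * (-math.log(p)) ** gamma)

def utility(p, v1, v2, alpha=0.7, delta=1.0, gamma=0.7):
    """Eq. (5): U = w(p) v(V^1) + (1 - w(p)) v(V^2)."""
    wp = w(p, delta, gamma)
    return wp * v(v1, alpha) + (1 - wp) * v(v2, alpha)

def f_gamble(U_gamble, U_sure, phi=1.0):
    """Eq. (7): logistic stochastic choice function."""
    return 1.0 / (1.0 + math.exp(phi * (U_sure - U_gamble)))
```

For a fair trial with initial amount 100 and win probability 0.4, the gamble utility is `utility(0.4, 100, 0)` and the sure utility is `v(40)`; `f_gamble` then maps their difference to a choice probability.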
In the baseline CPT model, in contrast, we set the reference point at the expected value of each gamble; the two possible outcomes of the gamble option thus fall into different domains, where the value function behaves differently.

Attraction factors.
Following the derivation in Yukalov and Sornette (2016b), the attraction factor has the following constraints:

q_A = min(f_A, f_B) cos(Δ_A),  q_A + q_B = 0    (9)

where Δ_A represents the argument of the uncertainty parts of the event in the Hilbert space. When there are only two options, cos(Δ_A) = −cos(Δ_B). The exact formula for Δ_A cannot be determined, as the irrational part of decision making is highly unstable and complex.

However, we can approximate it with the help of our prior knowledge of the decision problem, in addition to observations of human decision behavior. We propose that the attraction factor can be estimated by assembling different attraction components, where each component represents one subconscious factor that affects the decision-maker during the decision-making process. In theory, if we could address all such subconscious factors and model them with exact formulas, then cos(Δ_A) could be approximated perfectly. However, human minds are highly complex, and it is not realistic to find exact formulas that correctly describe them. One practical method is to define parameterized formulas for each attraction factor component and fit the parameters for each person.

We have addressed the primary subconscious factors in our decision problem and designed parameterized formulas. We first introduce some notation: π_A is the prospect state corresponding to the gamble option, and π_B is the prospect state corresponding to the sure option. STD is the standard deviation of the option (the sure option has 0 in this term). I_framing = 1 if the sure option is presented in a gain frame and −1 if a loss frame is presented. TL is the time limit set on making the decision; in our experiments this is either 1 or 3. I_previous = 1 if the player won the last gamble and −1 if he/she lost; when the player chose the sure option in the last game trial, I_previous = 0.
S_need is the difference between the current score and the minimum needed score. S_init is the initial amount of each game trial.

• The framing effect component.
The framing effect is a cognitive bias whereby the decision-maker changes their opinion of an option based on whether the option is presented in a positive or a negative frame (Tversky and Kahneman 1981). For example, losing 200 dollars out of a total of 1000 dollars could be described as "Keep 800 dollars" in a gain frame or "Lose 200 dollars" in a loss frame. Experiments show that people tend to avoid risk when a gain frame is presented, while they tend to seek risk when a loss frame is presented. In the experiments we study, the sure option is presented in either a gain frame or a loss frame. Thus the framing effect is an important subconscious factor in the decision-making process. We model this component with the following formula:

q_framing = −c_frame · I_framing · STD,  c_frame ≥ 0    (10)

In a loss frame, q_framing becomes greater when the uncertainty is larger, and vice versa, reflecting people's risk-seeking behavior when a loss frame is presented. On the other hand, in a gain frame, q_framing becomes greater when the risk in the option is small.

• The time pressure component.
A time constraint plays an important role during a decision-making process. People appear to be less rational when making decisions under extreme time pressure (Svenson and Maule 1993). The shorter the time constraint, the greater the pressure on the decision-maker. Diederich, Wyszynski, and Traub (2020) observe that the framing effect is stronger under a shorter time limit, while the time limit itself does not affect the decision strongly. Therefore we model the time pressure component as an amplifier of the framing effect component, with the following form:

q_time = exp(−c_time · TL),  c_time ≥ 0    (11)

• The memory effect component.
One of the most difficult parts of modeling human decision making is the involvement of memory. Different people have entirely different memories, and thus different feelings about the same option. In our experiments, the strongest memory is the result of the previous game trial. The participant is more willing to choose a sure option after a lost game trial, to avoid consecutive losses, while seeking more risk after winning a game to maintain the streak. The memory effect component is modeled as:

q_memory = c_memory · I_previous · STD    (12)

One important property of this component is that it also incorporates the uncertainty of the gamble into the formula, where the decision-maker perceives the uncertainty in different ways based on the outcome of the previous game trial.

• The need component.
In both Experiment 1 and Experiment 2, participants needed to reach the minimum needed score to get the bonus reward. This need criterion is obviously an important subconscious factor in the decision-making process. When there is still a large gap between the current score and the target score, the decision-maker is more inclined to choose the gamble option, and vice versa when the gap is small, as choosing the sure option would then be enough to reach the minimum needed score. The formula for this component is:

q_need = c_need · (S_need − S_init · STD · (1 − P_gamble))    (13)

The S_init · STD · (1 − P_gamble) term accounts for people's feeling of expected incoming points, which is subtracted from the real gap between the current points and the expected ones.

When combining the components, we assume an additive relationship between q_frame, q_memory, and q_need. The value of cos(Δ_A) lies between −1 and 1; thus we need a normalization function. In addition, cos(Δ_A) represents the relative difference between the two options in the mind space; thus we are only interested in the difference between the attraction factor values. The approximation of cos(Δ_A) is then

cos(Δ_A) = tanh(a · (q_total^A − q_total^B))    (14)

q_total = q_time · q_frame + q_memory + q_need    (15)

Next, we present the various methods employed.*

Data processing.
The raw experiment data from Diederich, Wyszynski, and Traub (2020) is in Microsoft Excel format, where each row contains the complete information about one game trial, including the initial amount, the probability of winning the gamble, the time limit, the minimum needed score in the game block, etc. We removed three subjects' data from Dataset 2 because of missing data entries; therefore there are only 55 subjects in Dataset 2. We then process the data, for both Datasets 1 and 2, with the following steps:

1. Group the data points according to the subject ID.
2. Calculate new attributes for each row, e.g., the gap between the current score and the required minimum score, or an entry recording the result of the previous game trial, which is used to calculate the memory effect component.
3. Shuffle the dataset and split it into six data blocks. We use k-fold cross-validation to evaluate our model: each data block is used as a test dataset once, while the other five are used for training.
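Putting the model pieces together, the attraction components of Eqs. (10)-(15) and their combination into a prospect probability can be sketched as follows. The parameter names (c_frame, c_time, c_memory, c_need, a) follow our reading of the paper's notation, and the values used are illustrative, not fitted:

```python
import math

def q_framing(I_framing, std, c_frame):
    return -c_frame * I_framing * std              # Eq. (10)

def q_time(TL, c_time):
    return math.exp(-c_time * TL)                  # Eq. (11), amplifies framing

def q_memory(I_previous, std, c_memory):
    return c_memory * I_previous * std             # Eq. (12)

def q_need(S_need, S_init, std, p_gamble, c_need):
    return c_need * (S_need - S_init * std * (1 - p_gamble))  # Eq. (13)

def q_total(I_framing, I_previous, TL, S_need, S_init, std, p_gamble, c):
    """Eq. (15): the time component multiplies the framing component."""
    return (q_time(TL, c["time"]) * q_framing(I_framing, std, c["frame"])
            + q_memory(I_previous, std, c["memory"])
            + q_need(S_need, S_init, std, p_gamble, c["need"]))

def prospect_probability(f_A, q_tot_A, q_tot_B, a):
    """Eqs. (9), (14), (1): P(pi_A) = f_A + min(f_A, 1 - f_A) cos(Delta_A),
    with cos(Delta_A) approximated by tanh of the attraction difference."""
    cos_delta = math.tanh(a * (q_tot_A - q_tot_B))
    return f_A + min(f_A, 1.0 - f_A) * cos_delta
```

Because the attraction term is bounded by min(f_A, 1 − f_A), the returned probability always stays in [0, 1], consistent with Eq. (2).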
Estimation of parameters.
Our estimation method and notation are inspired by (Vincent et al. 2016). The response at each game trial is denoted Φ_ij: Φ_ij = 1 if player i chooses the gamble option in game trial j, and 0 if the sure option is chosen in that game trial. We use a maximum likelihood method for estimating the parameters. The target function is:

Π_i^q = Π_j P_ij(π_A)^{Φ_ij} P_ij(π_B)^{1 − Φ_ij}    (16)

This estimation is done for every participant. As our goal is to evaluate the different attraction factor components, the attraction parameters to be estimated also differ between models. For example, if the attraction factor only consists of the memory effect component and the need component, only c_memory and c_need need to be estimated. The four utility parameters, however, are always estimated.

* The source code will be available upon request.

Table 1: Accuracy comparison for different QDT models, with attraction factors None, Time+Frame, Memory, Need, Time+Frame+Memory, and Time+Frame+Memory+Need, for (a) Dataset 1 and (b) Dataset 2.

Regularization.
One major difficulty for parameter estimation at the individual level was over-fitting. Since the amount of data for each individual subject is limited, a single response could greatly impact the estimate. To address this problem, we added a regularization term to the target function, which becomes:

Π_i^q = −log(Π_j P_ij(π_A)^{Φ_ij} P_ij(π_B)^{1 − Φ_ij}) + ∑_k |c_k|    (17)

Under the assumption that people with a similar educational background value utilities in a similar way, we also tried to fit the utility parameters at the aggregated level, using the sum of the individual-level target functions as the target function, but the result was not promising. Both estimation methods use the minimize function from the scipy.optimize library, with the Nelder-Mead algorithm. The tolerance was set to 1e-6 and the maximum number of iterations to 3000. The starting point was determined using a simple local grid search.

Evaluation.
In order to validate the model, we use K-fold cross-validation, with K = 6 for both Experiments 1 and 2. We have also implemented Random Forest, XGBoost, and a feed-forward neural network for the prediction problem and compared their performance with that of our model. For each of these machine learning models, we also used 6-fold cross-validation. The number of trees in the Random Forest was 100. The artificial neural network is a fully connected feed-forward network with one hidden layer of 10 neurons, using the ReLU activation function and trained with the Adam optimization algorithm. The models and hyperparameters were tuned using the Dataiku AutoML framework with scikit-learn as the back-end machine learning library. The information available to every model is the same as for the QDT models.
We measure the performance of the models with two measures: prediction accuracy and the ability to predict the option probabilities.
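The second measure bins game trials by their predicted gamble probability and compares each bin with the empirical choice proportion, as detailed under "Option probabilities" below. A minimal sketch of that grouping (function and variable names are ours):

```python
import numpy as np

def calibration_bins(predicted_p, chose_gamble, n_bins=10):
    """Group trials into n_bins intervals of predicted gamble probability
    and return the empirical gamble-choice proportion per bin. Ideally
    the proportion for bin b lies inside [b/n_bins, (b+1)/n_bins]."""
    bins = np.minimum((predicted_p * n_bins).astype(int), n_bins - 1)
    props = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            props[b] = chose_gamble[mask].mean()  # NaN for empty bins
    return props
```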
Accuracy.
Figure 1 shows the prediction accuracy of the different models. For the QDT models, we choose the option with the higher calculated option probability as the prediction. The prediction accuracy is calculated as the proportion of correctly predicted game trials in the test dataset. Figure 1 compares the accuracy of the CPT model, three data-driven models, and three QDT models with different attraction factor functions.

All the models perform better on the first experiment than on the second. There could be two reasons: the difference in data size, and repeated game trials. Dataset 1 has more data points per subject than Dataset 2 (960 versus 624), which means there is more information available for each subject in Dataset 1. There are also repeated game trials in Dataset 1, i.e., game trials with exactly the same settings. Although the player does not always choose the same option in repeated game trials, these should still provide extra information to the models, especially when the repeated trials exist in both the training and the testing dataset.

Among the three data-driven models, the ANN performs the worst and Random Forest achieves the highest accuracy on both datasets. The biggest issue for the ANN could be the small data size; there are just not enough data points for it to perform well. Random Forest with 100 decision trees performs surprisingly well on Dataset 1. On the other hand, all three QDT models perform better than the CPT model. The QDT model using all four attraction components performs best among all the models, surpassing Random Forest's performance on both datasets.

We treat the time effect and frame effect attraction factor components together as a single attraction component, because the time effect component only serves as a multiplier for the frame effect component. Tables 1a and 1b show the accuracy for the different attraction factor components and their combinations.
Figure 1: Accuracy comparison for different models, for (a) Dataset 1 and (b) Dataset 2.

Figure 2: The ability to predict the probabilities of picking each option. Game trials are grouped into ten groups according to the QDT-calculated option probability for the gamble option. For each group, the proportion of game trials in which the gamble option was chosen in the empirical data is plotted. The ideal result is that all ten points lie in the green interval. The vertical width of the green interval is 0.1, as each group represents a probability interval of length 0.1.

All three attraction factor components, when working on their own, increase the accuracy in comparison with logit-CPT. The time plus frame effect component performs best when used alone, while the contributions of the memory and need effect components are similar. When combining the memory component with the time+frame component, the performance increase is not significant for either dataset. After adding q_need, the QDT model achieves its highest prediction accuracy on both datasets.

We conclude that combining different attraction factor components is useful, but the performance improvement does not seem to be additive: combining two useful attraction factor components does not necessarily improve the prediction accuracy.

If the choices are probabilistic, there is also a hard barrier preventing the accuracy from being increased further. This is also mentioned by (Vincent et al. 2016) and (Murphy and ten Brincke 2018), and could be another explanation for the performance difference between the two datasets: game trials in Dataset 1 could be more biased, or easier to decide, than game trials in Dataset 2.

Option probabilities.
To evaluate the ability to calculate the option probabilities, we group all game trials into ten groups according to the predicted gamble option probability. The first group contains games for which the model predicts that the decision-maker has a 0-10% chance of picking the gamble option, the second group contains games with 10-20% option probability, and so on. We then count the empirical proportion of gamble choices within each group and check whether they actually fall into the predicted probability interval. For example, for the first group, the ideal case would be that only 0-10% of the game trials received the gamble option as the response in reality. If we assume a uniform distribution of predicted option probabilities within each interval, the optimal empirical choice proportion would be the midpoint of each interval, that is, 0.05 for the first interval and 0.15 for the second.

Figure 3: The distribution of utility factor values and attraction factor values, shown in the same plot.

Figure 2 shows the empirical choice proportion against the predicted option probability. We observe that nearly all the QDT models predict the option probability well: the empirical statistics always stay within the acceptable interval. The QDT model using all four attraction factor components performs best, as its empirical statistics are closest to the midpoint of each interval. On the other hand, CPT with the logit choice function seems to overestimate the probabilities on both datasets.

Effect of including catch trials.
Catch trials are game trials with unfair options. For example, a gamble-biased catch trial could have a gamble option of keeping 100 points with a 70% chance and a sure option of keeping 40 points. We found that including the catch trials in the training dataset boosts the performance of both the CPT and QDT models on Dataset 2. The catch trials have the potential to capture some influential subconscious effects: because there is a clearly better option, any response that chooses the "wrong" option could be caused by irrational aspects of the decision-maker's state of mind, such as strong risk aversion under an extreme time constraint, or a desire to avoid uncertainty when the participant's score is very close to the minimum needed score. The amount of such "abnormal" responses in the catch trials reflects how much the subconscious effects impact the participant; thus including the catch trials brings a performance boost.

Factor values and probability distribution.
Figure 3 shows the distribution of utility factor values f(π_gamble) and attraction factor values q(π_gamble) at an aggregated level for Dataset 1. The utility factor f(π) is defined between 0 and 1; thus, the orange line only exists in that interval. The utility factor values are distributed relatively uniformly, without a significant peak. There are comparably more game trials with a utility factor value greater than 0.6, which indicates that many participants in Dataset 1 are inclined towards the gamble option even without considering the impact of subconscious effects, even though every game trial is fair.

On the other hand, the attraction factor values follow a Gaussian-like distribution with mean zero. Almost no game trial or participant gives a very extreme attraction factor value. The reason could be that the absolute value of the attraction factor is bounded by the minimum of the two utility factors, which means that the absolute value of the attraction factor can never exceed 0.5. Besides, considering that there are many game trials with polarized utility factor values, with either a small gamble option utility factor or a small sure option utility factor, the small amplitudes of the attraction factors are not hard to understand. There are nearly equally many game trials on the positive and the negative side; one primary reason could be the intrinsically symmetric design of the experiments. For example, for the framing effect, half of the game trials have a sure option described in a loss frame, while the other half have a gain frame description. Therefore the values of the frame effect components are also distributed symmetrically.

Figure 4 demonstrates how the probability of the gamble option is distributed. The probabilities predicted by QDT are more polarized than those by CPT; there are many more game trials with very high or very low option probabilities.
In contrast, most of CPT's option probabilities concentrate in a relatively narrow interval, between 0.5 and 0.75. The attraction factors seem to add another level of preference on top of the classical utility preference, making one option much more preferable than the other.

Simulations using QDT-calculated option probabilities.
Figure 4: Number of game trials in each probability interval.

QDT is a probabilistic framework; the decision-maker does not make deterministic decisions. Therefore, we use the option probabilities calculated by QDT and CPT to simulate the experiments and compare the results with the empirical data. For both Datasets 1 and 2, 1000 simulations are done for each participant. We then analyzed the similarity between the simulated players' answers and the real participants' answers by inspecting the proportion of game trials on which the simulated player answered the same as the real participant. We counted the number of simulated players by response similarity, and Figure 5 shows the results. For both Datasets 1 and 2, the QDT model performs better than the CPT model; there are more simulated players with higher response similarity. We also observe that the distributions are multimodal. By the central limit theorem, we expect the distribution for a single participant to be normally distributed. When aggregating the results for all participants, the aggregated distribution becomes a mixture of normal distributions. The modes of the distributions indicate groups of participants that have been modeled similarly well by the model. For Dataset 1, the CPT model simulates one group of participants well, with a peak at 0.75, but has a much lower response similarity for the rest of the participants, as seen from the other peak at 0.625. On the other hand, all three modes of the QDT model's distribution lie at relatively higher similarities. This indicates that the QDT model predicts more polarized option probabilities, which also agrees with the result shown in Figure 4. The situation is the same for Dataset 2: both CPT and QDT have a peak at 0.625, but CPT has another peak at a lower response similarity of 0.5, which means there is a group of participants for which the CPT model predicts very even option probabilities between the gamble option and the sure option.
The added attraction factors capture the subconscious effects of these participants, and thus the option probabilities become more polarized, indicating that one option is more preferable than the other. Therefore, the QDT models consistently achieve a higher response similarity.
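The simulation and response-similarity procedure described above can be sketched as follows. This is a minimal re-implementation of the described idea, not the paper's code; the function name and toy probabilities are assumptions.

```python
import numpy as np

def response_similarity(p_gamble, real_choices, n_sims=1000, seed=0):
    """Simulate a player n_sims times from per-trial gamble probabilities and
    return, per simulation, the fraction of trials matching the real answers."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p_gamble, dtype=float)       # shape: (n_trials,)
    real = np.asarray(real_choices)             # 1 = gamble, 0 = sure option
    sims = rng.random((n_sims, p.size)) < p     # simulated gamble choices
    return (sims == real).mean(axis=1)          # similarity per simulation

# Toy example: 4 trials with model probabilities and one participant's answers.
sim = response_similarity([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0])
print(sim.mean())  # close to (0.9 + 0.9 + 0.8 + 0.8) / 4 = 0.85
```

Plotting a histogram of `sim` over all participants would give a distribution of the kind shown in Figure 5.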
Conclusion

We present a method for modeling human decision-making using QDT, with several different attraction factor components that capture a range of irrational and subjective aspects of decision-making. Experiments show that this modeling of the decision-maker's state of mind enables a more accurate prediction of individual decisions than classical CPT, which models only limited cognitive aspects. Moreover, the proposed model outperforms a range of data-driven methods.

The parameters are currently tuned individually. To allow faster adaptation to new individuals, a good option is hierarchical parameter estimation (Murphy and ten Brincke 2018), training first on the group level and then fine-tuning for each individual. Another future direction could be formulating more generally applicable attraction factor components that can be used in a variety of decision problems. It would also be interesting to combine the model with deep neural networks. One major problem in applying deep neural networks to predicting human decisions is the lack of data; our QDT model could serve as a cognitive model for generating massive amounts of human decision data. Approximating the attraction factor with neural networks could also be a direction worth exploring.

Figure 5: Simulation results for 1000 simulations, for (a) Dataset 1 and (b) Dataset 2. For each subject in each simulation, we calculate the proportion of game trials that have the same response as in the empirical dataset. The distribution of the response similarity is plotted.
References
Birnbaum, M. H. 2004. Tests of rank-dependent utility and cumulative prospect theory in gambles represented by natural frequencies: Effects of format, event framing, and branch splitting. Organizational Behavior and Human Decision Processes 95(1): 40–65.
Carbone, E. 1997. Investigation of stochastic preference theory using experimental data. Economics Letters 57: 305–311.
Conte, A.; Hey, J. D.; and Moffatt, P. G. 2011. Mixture models of choice under risk. Journal of Econometrics 162(1): 79–88.
Favre, M.; Wittwer, A.; Heinimann, H. R.; Yukalov, V. I.; and Sornette, D. 2016. Quantum decision theory in simple risky choices. PLOS ONE 11(12): e0168045.
Kahneman, D.; and Tversky, A. 2013. Prospect theory: An analysis of decision under risk. In Handbook of the Fundamentals of Financial Decision Making: Part I, 99–127. World Scientific.
Kolmogorov, A. N.; and Bharucha-Reid, A. T. 2018. Foundations of the Theory of Probability: Second English Edition. Courier Dover Publications.
Murphy, R. O.; and ten Brincke, R. H. 2018. Hierarchical maximum likelihood parameter estimation for cumulative prospect theory: Improving the reliability of individual risk parameter estimates. Management Science 64(1): 308–326.
Svenson, O.; and Maule, A. J., eds. 1993. Time Pressure and Stress in Human Judgment and Decision Making. Springer Science & Business Media.
Tversky, A.; and Kahneman, D. 1981. The framing of decisions and the psychology of choice. Science 211(4481): 453–458.
Vincent, S.; Kovalenko, T.; Yukalov, V. I.; and Sornette, D. 2016. Calibration of quantum decision theory, aversion to large losses and predictability of human choices. Swiss Finance Institute Research Paper (16-31).
Yukalov, V. I.; and Sornette, D. 2008. Quantum decision theory as quantum theory of measurement. Physics Letters A 372(46): 6867–6871.
Yukalov, V. I.; and Sornette, D. 2009. Processing information in quantum decision theory. Entropy 11(4): 1073–1120.
Yukalov, V. I.; and Sornette, D. 2010. Mathematical structure of quantum decision theory. Advances in Complex Systems 13(5): 659–698.
Yukalov, V. I.; and Sornette, D. 2016. Quantum probability and quantum decision-making. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 374(2058): 20150100.
A Appendix
This appendix is borrowed from Vincent et al. (2016); the theory was developed by Yukalov and Sornette in a series of articles (Yukalov and Sornette 2008; 2009; 2010). Quantum decision theory (QDT) has recently been introduced as an alternative formulation to existing theories. It is based on two essential ideas: (i) the intrinsically probabilistic nature of decision making and (ii) a generalisation of probabilities using the mathematics of Hilbert spaces that naturally accounts for entanglement between choices.
A.1 Mathematical structure of QDT
Definition 1 (Action ring). The action ring $\mathcal{A} = \{A_n : n = 1, \ldots, N\}$ is the set of intended actions, endowed with two binary operations:
- a reversible and associative addition;
- a non-distributive and non-commutative multiplication, which possesses a zero element called the empty action.

The interpretation of the sum $A + B$ is that $A$ or $B$ is intended to occur. The product $AB$ means that $A$ and $B$ will both occur. The zero element is the impossible action, so $AB = BA = 0$ means that the actions $A$ and $B$ cannot occur together: they are disjoint.

Definition 2 (Composite action and action modes).
When an action $A_n$ can be represented as a union (i.e., is the sum of several actions), it is referred to as composite; otherwise it is simple. The particular ways $A_{jn}$ of realizing a composite action $A_n$ are called the action modes and are disjoint simple elements:
$$A_n = \bigcup_{j=1}^{M_n} A_{jn}, \qquad M_n > 1.$$

Definition 3 (Elementary prospects).
An elementary prospect $e_\alpha$ is an intersection of separate action modes,
$$e_\alpha = \bigcap_n A_{\alpha_n n},$$
where the $A_{\alpha_n n}$ are action modes such that $e_\alpha e_\beta = 0$ if $\alpha \neq \beta$.

Definition 4 (Action prospect).
A prospect $\pi_n$ is an intersection of intended actions, each of which can be simple (represented by a single action mode) or composite:
$$\pi_n = \bigcap_j A_{n_j}.$$
To each action mode, we associate a mode state $|A_{jn}\rangle$ and its Hermitian conjugate $\langle A_{jn}|$. Action modes are assumed to be orthogonal and normalized to one, so that $\langle A_{jn} | A_{kn} \rangle = \delta_{jk}$. This allows us to define orthonormal basis states for the elementary prospects:
$$|e_\alpha\rangle = |A_{\alpha_1 1} \cdots A_{\alpha_N N}\rangle, \qquad \langle e_\alpha | e_\beta \rangle = \prod_n \delta_{\alpha_n \beta_n} = \delta_{\alpha\beta}.$$

Definition 5 (Mind space and prospect state).
The mind space is the Hilbert space
$$\mathcal{M} = \mathrm{Span}\{|e_\alpha\rangle\}.$$
To each prospect $\pi_n$, there corresponds a prospect state $|\pi_n\rangle \in \mathcal{M}$:
$$|\pi_n\rangle = \sum_\alpha a_\alpha |e_\alpha\rangle.$$

Definition 6 (Strategic state of mind).
The strategic state is a normalized fixed state of the mind space $\mathcal{M}$ describing a decision maker at a given time:
$$|\psi\rangle = \sum_\alpha c_\alpha |e_\alpha\rangle.$$
The strategic state characterizes a particular decision maker at a given time; it includes his/her personal attributes and is related to the information available to the decision maker.
A.2 Prospect probabilities
In the context of quantum decision theory, the preferences of a decision maker depend on his/her state of mind and on the available prospects. Those preferences are expressed through prospect operators.
Definition 7 (Prospect operator).
For each prospect $\pi_n$, we define the prospect operator
$$\hat{P}(\pi_n) = |\pi_n\rangle \langle \pi_n|.$$
By this definition, the prospect operator is self-adjoint. Its average over the state of mind defines the prospect probability $p(\pi_n)$:
$$p(\pi_n) = \left\langle \psi \left| \hat{P}(\pi_n) \right| \psi \right\rangle.$$
The decision maker is more likely to choose the prospect with the highest prospect probability. The probabilities should correspond to the frequency with which each prospect would be chosen if the choice could be made several times in the same state of mind.

By Definitions 5 and 6, we can distinguish two terms in the expression of $p(\pi_n)$: a utility factor $f(\pi_n)$ and an attraction factor $q(\pi_n)$:
$$p(\pi_n) = f(\pi_n) + q(\pi_n),$$
$$f(\pi_n) = \sum_\alpha |c_\alpha a_\alpha|^2, \qquad q(\pi_n) = \sum_{\alpha \neq \beta} c^*_\alpha a_\alpha a^*_\beta c_\beta.$$
Within the framework of quantum decision theory, the utility and attraction factors are subject to additional constraints:
$$f(\pi_n) \in [0, 1], \quad \sum_n f(\pi_n) = 1, \qquad q(\pi_n) \in [-1, 1], \quad \sum_n q(\pi_n) = 0.$$
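The decomposition p(π_n) = f(π_n) + q(π_n) from Definition 7 can be checked numerically. This is a toy sketch with arbitrarily chosen real coefficients in a two-dimensional mind space, not part of the theory itself.

```python
import numpy as np

# Toy numerical sketch of Definition 7 (coefficients chosen arbitrarily):
# p(pi_n) = |<pi_n|psi>|^2 splits into a utility term and an interference term.
c = np.array([0.8, 0.6])               # strategic state |psi>, normalized
a = np.array([0.9, np.sqrt(0.19)])     # prospect state |pi_n>, normalized

p = abs(np.vdot(a, c)) ** 2            # prospect probability <psi|P(pi_n)|psi>
f = np.sum(np.abs(c * a) ** 2)         # utility factor f(pi_n)
q_direct = sum(np.conj(c[i]) * a[i] * np.conj(a[j]) * c[j]
               for i in range(2) for j in range(2) if i != j).real

assert np.isclose(p, f + q_direct)     # p = f + q holds numerically
print(round(p, 4), round(f, 4), round(q_direct, 4))  # -> 0.9634 0.5868 0.3766
```

With complex coefficients the interference term q can turn negative, which is what allows QDT to depress or boost an option's probability relative to its classical utility.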