On the mathematics of the free-choice paradigm

Abstract

Chen and Risen pointed out a logical flaw affecting the conclusions of a number of past experiments that used the free-choice paradigm to measure choice-induced attitude change. They went on to design and implement a free-choice experiment that used a novel type of control group in order to avoid this logical pitfall. In this paper, we describe a method by which a free-choice experiment can be correctly conducted even without a control group.


arXiv [stat.OT]

ON THE MATHEMATICS OF THE FREE-CHOICE PARADIGM

PETER SELINGER AND KRISTOPHER TAPP

1. Introduction

Experimental design can be tricky to get right. A very potent illustration of this point is found in Chen and Risen’s recent identification of a logical flaw in a number of past free-choice experiments studying the psychological concept of cognitive dissonance [4]. This illustration is unique in many ways. First, the mistake is subtle yet elementary; in the simplest of the affected experiments, the mistake is equivalent to the error of misunderstanding the Monty Hall problem. Second, the mistake affected a fairly large number of research experiments performed over a span of five decades. Third, it challenged some of the experimental evidence for dissonance theory, which is a well known and celebrated area of social psychology. In fact, the term “cognitive dissonance” has migrated from the scientific realm into popular culture, where it occurs frequently in New York Times articles, is the title of a podcast, is featured in several Dilbert cartoons, and has been used to explain everything from why President Clinton was not impeached to why some weight loss programs work better than others.

In this paper, our first goal is expository – we wish to tell this interesting story, which seems to have largely escaped the notice of the mathematics community. We will describe the theory of cognitive dissonance, the experimental evidence on which it rests, and the flaw observed by Chen and Risen. Our second goal is to propose methods by which this mistake can be fixed. Chen and Risen already implemented a modified experiment that used a novel type of control group to avoid the logical pitfall. We will describe experimental designs by which the flaw can be avoided even without a control group. Our methods are based on the inherent symmetry of a free-choice experiment. Our third goal is to describe additional problems with all of these free-choice experiments (including our own methods and the Chen-Risen method), calling back into question whether any free-choice experiment can correctly measure the effects of cognitive dissonance.

Acknowledgements

The authors would like to thank Keith Chen and Jane Risen for their insightful suggestions and comments on this work.

2. Cognitive Dissonance

Festinger coined the term cognitive dissonance to describe the uncomfortable cognitive state that arises when one’s actions are inconsistent with one’s underlying attitudes/beliefs [7]. Dissonance Theory is largely about our tendency to reduce dissonance by shifting our underlying attitudes. For example, after you break up with a romantic partner, you might tell yourself that you never really liked him/her in the first place. That’s dissonance – you are shifting your attitude (how you really feel about your ex-partner) to make it more compatible with your action (breaking up with him/her). It helps you to avoid regret and move on.

If the end of a relationship seems too mundane, dissonance theory began with the end of the world. In 1955, Festinger learned of a cult, which he named The Seekers, who believed that God would destroy the earth on December 21 of that year, but aliens from the planet Clarion would arrive in a spaceship before the destruction to save the true believers. They really believed it. They gave away their belongings, divorced their non-believing spouses, and sacrificed all that they had in order to follow the specific instructions they had received from Clarion. One of Festinger’s students infiltrated the group and was able to observe their preparations for the earth’s annihilation [9]. They gathered to wait with excited anticipation for the spaceship that would save them at midnight. And they waited. A few minutes after midnight, they all reset their watches to agree with the one member whose watch still read before midnight. Several minutes later, when even the modified watches read past midnight, one member realized that he had not followed Clarion’s instruction to remove all metal objects in preparation for space flight. He still had a metal tooth filling. He removed it. After several more hours of nervous waiting, there was still no spaceship and no annihilation. How did they handle the gaping inconsistency they now felt between their actions and their new resignation that the earth would continue to spin? Cognitive dissonance doesn’t get much bigger than that. Festinger and his students had predicted that they would employ heroic measures to rationalize away the apparent inconsistency. And so they did. At 4:00am, they received a final message from Clarion: “This little group, sitting all night long, has spread so much goodness and light that the God of the Universe has spared the Earth from its destruction.” It was their group that had saved the earth!
As Festinger had predicted, The Seekers not only found a new consistency-restoring belief to cling to, but they used every possible media outlet to share their new belief with the world that they had saved.

Thus began a psychology theory which “has had an amazing fifty-year run.” [5] Dissonance theory rests on the results of hundreds of laboratory experiments, which fall into three experimental paradigms: free-choice, induced compliance, and effort justification. The error observed by Chen and Risen affects only the free-choice paradigm, which we will discuss in the next section. In the remainder of this section, we will briefly discuss the two unaffected paradigms.

The first induced compliance experiment was performed by Festinger and Carlsmith [8], using students as subjects. In their experiment, each subject was asked to perform a boring task involving turning pegs on a peg board, and then asked to convince another student that the task had been interesting. (In an elaborate ruse to convince the subject to tell this lie, s/he was told something like this: “The other student is waiting to be the next subject in this experiment. Unlike you, he is in the experimental group, which means that before performing the peg board task, he will be prepped by a confederate to believe that the task will be interesting. But the confederate did not show up, so can I hire you to play the role of the confederate?”) Finally, the subject answered debriefing questions, including questions about how interesting the task was. The result was that subjects paid $1 to lie generally came to actually believe themselves that the task had been interesting, while subjects paid $20 to lie did not. Presumably, cognitive dissonance was aroused by the inconsistency between finding the task boring but saying that it was interesting.
A student paid $20 had a rational explanation for this inconsistency (it’s worth lying for that much money) while a student paid $1 may have unconsciously reasoned something like this: “I’m not dumb enough to lie for one dollar, so I must have been telling the truth.” Dozens of other induced compliance experiments have been performed. The majority of them used a similar design in which students were convinced (by a similarly elaborate ruse) to write counter-attitudinal essays. It was generally found that students paid only a small amount to write such an essay tended to shift their beliefs in the direction of what they had written, while students paid a large amount did not.

The first effort justification experiment was performed by Aronson and Mills [1]. They found that students who endured a severe initiation to join what turned out to be a dull club ended up liking the club more than students who endured only a mild initiation. Presumably, these students had a larger degree of dissonance aroused by the inconsistency between their efforts and their group experience, and thus had stronger motivation to reduce dissonance by shifting their attitudes about the group.

The induced compliance and effort justification experiments, and the dissonance theory interpretation of their results, have been attacked on many fronts. Cooper’s book [5] provides a wonderful overview of five decades of probing and re-working of these experiments and their interpretations. Versions of the experiments have been performed on subjects wired to polygraphs, subjects who had unknowingly taken stimulants, subjects who had taken placebos but believed they had taken stimulants, subjects who underwent electric shock therapy, subjects with amnesia, children and draft-dodgers, and the outcomes have been correlated with the subjects’ degree of self-esteem, level of introversion/extroversion, and ethnicity, to name a few.
Cooper argues that dissonance theory largely survived the attacks in a modified form.

None of the attacks were mathematical in nature until the free-choice paradigm was challenged on probabilistic grounds in 2010.

3. The Free-Choice Paradigm

The first free-choice experiment was performed by Brehm [2], a student of Festinger. Dozens more free-choice experiments followed. In this section, we review the structure of a typical free-choice experiment.

A collection of objects is used, say 15 household objects. In the first stage, a subject is asked to rank these objects from 1 = most desirable to 15 = least desirable. In the second stage, the subject is asked to choose between two of these objects, and is usually told that s/he will be allowed to take the selected one home. For example, the subject might be asked to choose between the objects that s/he just ranked 7th best and 9th best (where the numbers 7 and 9 are pre-selected and constant over all subjects). Finally, in the third stage, the subject is again asked to rank all 15 objects. This process is repeated with many subjects. Each subject’s spread is calculated as the amount that the ranking of the chosen object improves (i.e., decreases) plus the amount that the ranking of the rejected object worsens (i.e., increases) between the first and third stage of the experiment.

Positive spread is interpreted as an indication that the subject shifted his/her ranking to make it more consistent with his/her object choice, presumably to reduce dissonance. For example, if I chose to take home the hairdryer instead of the toaster, then dissonance is aroused by the inconsistency between my choice of the hairdryer and my lingering memories of the reasons that I almost chose the toaster (I already own a hairdryer and the toaster sure looks shiny). I can reduce dissonance by downplaying these memories (the new hairdryer is better than the one that I own, and shiny things hurt my eyes). This is the kind of opinion-shift that causes positive spread and was interpreted as evidence of choice-induced attitude change resulting from dissonance.

In some experiments, positive average spread was taken as evidence for dissonance theory.
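As a minimal sketch of the spread calculation described above, a single subject's spread could be computed as follows. The function name `spread`, the dictionary representation of rankings, and the example subject are illustrative assumptions, not taken from any of the experiments.

```python
def spread(first_rank, third_rank, chosen, rejected):
    """Spread of one subject.

    first_rank and third_rank map each object to its position
    (1 = most desirable) in stages one and three.
    """
    # Amount by which the chosen object's ranking improves (decreases).
    chosen_gain = first_rank[chosen] - third_rank[chosen]
    # Amount by which the rejected object's ranking worsens (increases).
    rejected_loss = third_rank[rejected] - first_rank[rejected]
    return chosen_gain + rejected_loss


# Hypothetical subject: toaster ranked 7th and hairdryer 9th in stage one;
# the hairdryer is chosen, and in stage three the two objects trade places.
stage_one = {"toaster": 7, "hairdryer": 9}
stage_three = {"toaster": 9, "hairdryer": 7}
print(spread(stage_one, stage_three, chosen="hairdryer", rejected="toaster"))  # prints 4
```

Only the positions of the two comparison objects enter the calculation, so the other 13 objects are omitted from the example dictionaries.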
In other experiments, the spread was required to be more positive for an experimental group than for a control group. Members of a control group might skip stage two, or else in stage two might be asked to make a choice between a pair of irrelevant objects. Since they do not make the crucial choice in stage two, how is their spread defined? Notice that in stage two, the majority of subjects in the experimental group might be expected to make the consistent choice of the 7th-ranked object over the 9th; however, since 7 and 9 are fairly close together, a reasonable minority might make the reversal choice of the 9th over the 7th. For a member of the control group, spread can be defined as if this member had made the consistent choice; that is, spread equals the amount that the ranking of the originally-7th-ranked object decreases below 7 plus the amount that the ranking of the originally-9th-ranked object increases above 9. Brehm took this approach, and he compared the average spread of all of the control group to the average spread of only those members of the experimental group who had made a consistent choice (thereby assuring that spread was defined by the same formula for everyone). Other experiments have used a different approach in which members of the control group in stage two are randomly given either their 7th- or 9th-ranked object, and their spread is then defined as if they had chosen the object they were given.

Another common variation is to ask the subject in stages one and three to provide ratings of the objects rather than a ranking of the objects. For example, each of the 15 objects might be separately rated on a scale of 1 to 10.
When ratings are used, the choice in stage two is usually made between two objects that in stage one were rated similarly and highly (because difficult decisions are hypothesized to arouse more dissonance).

More recently, the free-choice setup was simplified to detect dissonance in monkeys and children, who are not capable of communicating a complete ranking of a collection of objects [6]. In fact, simplifying the setup made its hidden mathematical error become more noticeable. In the monkey version, the error became essentially equivalent to the Monty Hall problem, which finally led to its discovery. We will restrict our attention to the experiments on adults, and we refer to [15] for the story of monkeys and Monty Hall.

4. The Chen-Risen critique of free-choice experiments

The outcomes of these free-choice experiments on adults – positive average spread, or higher average spread for an experimental group than a control group – were taken as evidence that subjects shifted their rankings to reduce dissonance. Did you catch the mistake? Chen and Risen pointed out that these outcomes are predicted by a null hypothesis model in which subjects never change their minds. In their model, each subject has a never-changing true ranking of the objects. But random noise can cause the rankings that a subject provides in stages one and three to differ from this true ranking, and can cause the stage two choice to be inconsistent with this true ranking. Under natural hypotheses on the distributions by which this random noise is modeled, Chen and Risen showed that many of the outcomes of free-choice experiments which have previously been taken as evidence for dissonance theory are actually predicted by their model. Thus, the past experimental results do not provide evidence against the “nobody changes their minds” null hypothesis.

The key intuition behind Chen and Risen’s observation is this: a subject’s choice in stage two provides added probabilistic information about his/her true ranking. For example, suppose that in stage one, I rank the toaster 7th best and the hairdryer 9th best. At this point, your best guess is that I truly like the toaster 7th best and the hairdryer 9th best, although the truth could be otherwise due to random noise. Now suppose that in stage two, I select the hairdryer. At this point, how do you expect that I truly feel about toasters and hairdryers? Your best guess is that my true feelings are a sort of average of the feelings I indicated in stages one and two. After factoring in the stage two probabilistic information, you would guess that I truly like hairdryers more (and toasters less) than I indicated in stage one. Thus, you should predict that my stage three ranking will shift in the direction of my stage two choice – not because I will change my mind, but rather because my stage two choice indicated that my true feelings were always in this direction.

Chen and Risen went on to design and implement a free-choice experiment which avoided the error of its predecessors by using a novel type of control group; specifically, one whose members went through the same three stages, but with the order of the second and third stages reversed.
Thus, a member of the control group ranked the objects, then ranked the objects again, then chose between the ones which were ranked 7th and 9th best in the first ranking. Just as before, the subject’s spread was defined as the amount that the chosen object improved plus the amount that the rejected object worsened between the two rankings. Their null hypothesis (or at least a more precise formulation of it) predicts that the control group and experimental group should have the same average spread. After all, if subjects are acting like computer programs spitting out random perturbations of their true rankings, then there is no causality between the stages, so the order of the stages is irrelevant. But if cognitive dissonance really causes choice-induced attitude change, then the experimental group should have higher average spread than the control group. When the numbers were crunched, their experimental group had only nominally higher average spread, which provided only weak support for dissonance theory.

The free-choice paradigm is one of the three legs of the tripod on which dissonance theory rests. Chen and Risen’s novel control group represents one possible way to repair this leg. Their experiment could be re-implemented with more subjects in the hope of obtaining statistically significant results. But experiments are expensive. The purpose of our paper is to contribute to the conversation about how future free-choice experiments would best be conducted. In the next section, we present alternatives to Chen and Risen’s control group method.

5. A free-choice experiment with no control group

In this section, we demonstrate that it is possible to conduct a free-choice experiment even without a control group, in a manner that avoids the probabilistic error that Chen and Risen identified in past experiments.


We begin by more precisely describing a very general “nobody changes their minds” null hypothesis model. Let $n$ denote the number of objects (the previous discussion used $n = 15$). Let $S(n)$ denote the set of all $n!$ possible orderings of the objects. The rankings that a subject provides in stages one and three are samples from his/her ranking distribution, $r : S(n) \to [0, 1]$. For each $\sigma \in S(n)$, the value $r(\sigma)$ represents the probability that the subject will provide that ranking. Notice that the same distribution is used for stages one and three, which captures the key idea that the subjects never change their minds. Also notice that our model is more general than the Chen-Risen model, because we do not necessarily assume that the subject has a well defined “true ranking” from which the ranking distribution is obtained by adding random noise. It is an arbitrary distribution, so the only requirement is that $\sum_{\sigma \in S(n)} r(\sigma) = 1$.

In past free-choice experiments, all subjects have made their stage-two choices between the objects they just ranked in a pair of comparison positions (like 7 and 9), which was pre-selected and constant over all subjects. We now suggest three modifications of this design that make the experiment immune to the critique of Chen and Risen.

Proposition 1. Under the null hypothesis model, the expected average spread equals zero if the free-choice experiment is modified in any of these ways:

(1) All subjects make their stage-two choices between the same pre-selected pair of comparison objects (like hairdryer and toaster).

(2) Each subject makes his/her stage-two choice between the objects he/she just ranked in a pair of comparison positions (like 7 and 9) that is uniformly randomly chosen separately for each subject.

(3) Each possible pair of comparison positions is used, one for each subject. For example, with $n = 15$ objects, there are 105 pairs of distinct numbers between 1 and 15, so this experiment requires exactly 105 subjects.

Proof. To prove the first claim, notice that the subject’s two rankings (in stages one and three) are independent samples from the same distribution. Thus, exchanging the order of these two rankings does not affect the expected spread. But on the other hand, exchanging the order of the two rankings has the effect of multiplying the expected spread by $-1$; it follows that the expected spread for a single subject (under the null hypothesis model) can only be zero.

It follows that the expected spread for a single subject equals zero if the stage-two choice is made between a pair of comparison objects that is uniformly randomly chosen separately for each subject. This is because it equals zero for every possible pair of objects that could be chosen.

To prove the second claim, notice that the pair of comparison objects will depend here on the following two independent random processes:

• The pair of ranking positions is chosen uniformly at random.

• The subject’s stage-one ranking is sampled from his/her ranking distribution.

Since these two processes are independent, their order is irrelevant. If we imagine that the subject’s ranking is provided first, then it is easy to see that, for any fixed ranking s/he provides, uniformly randomly choosing a pair of positions from this fixed ranking is equivalent to uniformly randomly choosing a pair of objects.


Notice that the above arguments establish that the expected spread for a single subject equals zero in experiments (1) and (2). If all subjects are identical, then claim (3) follows immediately, since in this case the expected spread for a single subject in (2) equals the average of all subjects’ expected spreads in (3). With non-identical subjects, we justify claim (3) as follows, provided that the assignment of subjects to $(i, j)$-pairs is random. Assume for simplicity that $n = 15$, and consider the $105 \times 105$ matrix describing the expected spread for every subject using every possible $(i, j)$-pair. This matrix has a row for each subject and a column for each $(i, j)$-pair. Since the average of the entries of every row of this matrix equals zero, the expected average of 105 entries chosen one from each row (according to a random permutation of the numbers $1, \ldots, 105$ that assigns subjects to pairs) must also equal zero. □
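The symmetry argument can be checked exactly on a small instance. The following is a sketch under stated assumptions, not the authors' computation: it uses $n = 4$ objects, an arbitrary rational ranking distribution generated from a fixed seed, and it assumes that the stage-two choice is made by drawing one more independent ranking from the same distribution and taking the better-ranked object. The names `random_distribution` and `expected_spread` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
import random


def random_distribution(n, seed=0):
    """An arbitrary ranking distribution r on S(n) with exact rational weights."""
    rng = random.Random(seed)
    rankings = list(permutations(range(n)))  # each tuple lists objects best-first
    weights = [rng.randint(1, 20) for _ in rankings]
    total = sum(weights)
    return {sigma: Fraction(w, total) for sigma, w in zip(rankings, weights)}


def expected_spread(dist, pick_pair):
    """Exact expected spread of one subject under the null model.

    pick_pair(first) returns the two comparison objects given the stage-one
    ranking. ASSUMPTION: the stage-two choice is made by drawing one more
    independent ranking from dist and taking the better-ranked object.
    """
    pos = {s: {obj: k + 1 for k, obj in enumerate(s)} for s in dist}
    total = Fraction(0)
    for first, choice, third in product(dist, repeat=3):
        a, b = pick_pair(first)
        if pos[choice][a] < pos[choice][b]:
            chosen, rejected = a, b
        else:
            chosen, rejected = b, a
        value = (pos[first][chosen] - pos[third][chosen]) + (
            pos[third][rejected] - pos[first][rejected]
        )
        total += dist[first] * dist[choice] * dist[third] * value
    return total


n = 4
dist = random_distribution(n)

# Modification (1): a fixed pair of objects, here objects 0 and 1.
fixed_pair = expected_spread(dist, lambda first: (0, 1))

# Modifications (2)/(3): the expected spread for each pair of positions i < j.
by_positions = {
    (i, j): expected_spread(dist, lambda first, i=i, j=j: (first[i - 1], first[j - 1]))
    for i, j in combinations(range(1, n + 1), 2)
}

print(fixed_pair)                  # 0
print(sum(by_positions.values()))  # 0
```

Exact fractions are used instead of floating point so that the two zeros are exact rather than approximate. The individual entries of `by_positions` need not vanish; only their sum over all position pairs does.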

Each of the three modifications above represents a method for conducting a free-choice experiment without a control group. In all three cases, the null hypothesis model predicts an expected average spread of zero, so a measured average spread significantly above zero would be evidence against the null hypothesis, and might reasonably be attributed to choice-induced attitude change.

6. Balanced positive and negative spread

Dissonance theory has suffered from confusion as to what the expected spread would be if subjects never changed their minds. Let $i$ and $j$ denote the ranking positions used in stage two, with $1 \le i < j \le n$, and define $\Delta = j - i$. Many past experimenters have assumed that the expected spread should equal zero. It was first noted in [2] that regression to the mean can cause negative spread when $\Delta$ is large. For an extreme example, suppose that $n = 15$, $i = 1$ and $j = 15$. In this case, it is almost certain that the subject will make a consistent choice, in which case it is impossible for the spread to be positive, and regression to the mean will likely cause it to be negative.

Assuming that each subject’s ranking distribution is obtained from a true ranking by adding random noise in a manner that satisfies some natural hypotheses, Chen and Risen stated as a theorem that the expected spread is positive for every choice of $i, j$, but they mentioned that regression to the mean can invalidate their proof when $\Delta$ is large.

In a typical situation, one might be led to expect positive spread when $\Delta$ is small due to the probabilistic arguments of Chen and Risen, and negative spread when $\Delta$ is large due to regression to the mean, although the exact meaning of “large” and “small” might depend on the specific ranking distributions of the subjects.

Proposition 1(3) implies that choices of $i, j$ for which the expected spread is positive must be balanced by choices for which it is negative. In the remainder of this section, we illustrate this balance with a simulated example that attempts to model the type of random noise that affects human rankings in free-choice experiments.

Our simulation uses $n = 12$ objects. When asked to give a ranking of these 12 objects, our simulated subject begins with its true ranking and modifies it by repeating the following procedure: it flips a weighted coin, and if the coin lands heads, then it swaps the objects in a randomly chosen pair of adjacent positions. It stops making changes when the coin first lands tails. Thus, the volume of random noise is controlled by the coin’s probability of landing heads, which we denote $p$. Furthermore, the simulated subject’s choosing algorithm is the one induced by its ranking algorithm. In other words, when asked to choose between a pair of objects, it uses the above algorithm to rank all 12 objects, and then selects the better-ranked one of the pair.

In Table 1, we have calculated the expected spreads (rounded to three decimals) for all $\{i, j\}$ possibilities with $p = 0.8$. Prior to rounding, the 66 values in this table must sum to exactly zero by Proposition 1. Coincidentally, the rounded values sum to exactly zero as well.
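The simulated subject described above can be sketched as follows. This is an illustrative Monte Carlo version, not the exact calculation behind Table 1; the function names, the fixed seed, and the default number of trials are assumptions made for the example.

```python
import random


def noisy_ranking(true_ranking, p, rng):
    """Return a noisy copy of true_ranking (a list of objects, best first).

    While a coin with heads-probability p lands heads, swap the objects in a
    uniformly chosen pair of adjacent positions; stop at the first tails.
    Requires 0 <= p < 1 so that the loop terminates.
    """
    ranking = list(true_ranking)
    while rng.random() < p:
        k = rng.randrange(len(ranking) - 1)
        ranking[k], ranking[k + 1] = ranking[k + 1], ranking[k]
    return ranking


def simulate_spread(true_ranking, i, j, p, rng):
    """Spread of one simulated subject choosing between positions i < j (1-based)."""
    first = noisy_ranking(true_ranking, p, rng)
    a, b = first[i - 1], first[j - 1]
    # The choice is induced by the ranking algorithm: rank again, take the better one.
    choice = noisy_ranking(true_ranking, p, rng)
    chosen, rejected = (a, b) if choice.index(a) < choice.index(b) else (b, a)
    third = noisy_ranking(true_ranking, p, rng)
    return (first.index(chosen) - third.index(chosen)) + (
        third.index(rejected) - first.index(rejected)
    )


def average_spread(i, j, p=0.8, n=12, trials=100_000, seed=0):
    """Monte Carlo estimate of the expected spread for the position pair (i, j)."""
    rng = random.Random(seed)
    true_ranking = list(range(n))
    return sum(simulate_spread(true_ranking, i, j, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(average_spread(1, 2))   # a pair with small Delta
    print(average_spread(1, 12))  # the pair with the largest Delta
```

Averaging `average_spread(i, j)` over all 66 position pairs should give a value near zero, up to sampling error, in line with Proposition 1.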

Table 1. Expected spreads for the 12-object experiment with $p = 0.8$ (columns indexed by $i = 1, \ldots, 11$; rows indexed by $j$).

In summary, this example is intended to illustrate Proposition 1 in order to aid our understanding of a question about which there has been some confusion in the literature, namely, what spread one should expect under the assumption that subjects never change their minds.

7. Comparison of free-choice experiments

In this section, we compare several versions of the free-choice experiment. Some of our claims are justified by computer simulations of these experiments, in which subjects’ random noise is modeled by the coin flipping process described in the previous section. The three control-group-free experiments enumerated in Proposition 1 will be denoted as E1, E2 and E3 respectively. The control-group experiment of Chen and Risen will be denoted as E0.

As we have already pointed out, all four experiments are expected to detect no dissonance under the null hypothesis. This follows from Chen and Risen’s result (for E0) and from Proposition 1 (for E1–E3). However, the experiments may differ in sensitivity: how likely is each experiment to yield statistically significant evidence of dissonance in case the null hypothesis is false? Since this likelihood depends on the number of test subjects, to get a fair comparison, we assume that all four experiments use the same number of subjects. For simplicity, we assume that all four experiments use 15 objects and 105 subjects. In E0, these subjects are split between the experimental and control group.

One disadvantage of E0 comes from having half as many subjects in its experimental group (which increases the standard error by a factor of $\sqrt{2}$) and from needing to calculate the spread difference between two groups (which increases it by another factor of $\sqrt{2}$, assuming the two groups have similar standard deviations, which was the case in simulations).

The primary disadvantage of E2 is that many subjects are wasted on $(i, j)$-choices for which $\Delta$ is large enough that dissonance researchers would not hypothesize any spread due to choice-induced attitude change. Dissonance and attitude change are only theorized to emerge when the decision is hard enough that the subject feels a need to rationalize choosing one item over the other.

In computer simulations, E0 was more likely than E2 to produce statistically significant evidence of a choice-induced spread phenomenon, if this phenomenon was modeled to only appear when $\Delta$ was small. However, the winner depended completely on how we modeled the dependence on $\Delta$ of the choice-induced spread phenomenon, which felt like a fairly arbitrary decision. The decision could not be empirically guided, since there is still no valid experimental evidence in the literature of any type of choice-induced spread.

It might be possible to implement E1 so that it has the advantage but not the disadvantage of E2, by pre-selecting a pair of comparison objects that one expects to appear close together near the center of most subjects’ rankings. To achieve this, one could let the 15 items be ranked by a small number of test subjects, and then pick two items (e.g. toaster and hairdryer) that typically rank “near the middle.” The main experiment could then be done (with a different group of test subjects), always computing the toaster/hairdryer spread. If it bears out that subjects in the main experiment also typically rank toaster and hairdryer close together, then not many subjects will be wasted on $(i, j)$-choices for which $\Delta$ is large, and no subjects are wasted on a control group either.

One might think that E3 is an improvement on E2, since it eliminates the random selection of $(i, j)$-pairs, and thereby eliminates the worry that E2’s outcome could be blamed on a non-representative sampling of pairs. However, in simulations, E2 and E3 had similar standard deviations. Moreover, since the 105 subjects in E3 perform different tasks, it is not necessarily valid to use the usual standard error formula, $SE = \sigma/\sqrt{N}$, where $N = 105$ is the number of subjects.

8. Further problems with the free-choice paradigm

In this section, we mention a problem shared by all four of the free-choice exper-iments considered in the last section; namely, a positive outcome could be blamedon psychological phenomena other than dissonance. Minimizing the impact of suchother phenomena on free-choice experiments has always been of concern to disso-nance researchers, going back as far as Brehm’s original work [2].For example, suppose that the act of choosing between two objects doesn’tchange the subject’s true ranking, but does force the subject to think more carefullyabout the true positions of these objects in his/her true ranking.This behavior can easily be modeled with simulated subjects as in Section 6,using two coin-weightings, 0 ≤ p ≤ P ≤

1. The subject’s ﬁrst ranking is subjectto the larger volume, P , of random noise. The subject’s object choice is subjectto the smaller volume, p , because s/he is forced to think about it more carefully.In the second ranking, the positioning of the comparison objects is subject to thesmaller volume of noise, because the subject has thought more carefully about thesepositions. Since spread depends only on the positioning of the comparison objects,it is equivalent to make the second ranking overall subject to the smaller volume, p ,of random noise. Of course the second ranking for a member of the control groupof E0 is still subject to the larger volume, P .Experiments E0, E2 and E3 will all report positive outcomes in the presence ofthis “think about it more carefully” phenomenon, even though the subjects never change their true rankings. To see this without a computer, consider the extremecase where p = 0 (no random noise) and P = 1 (the limit case in which rankingsare completely random). In this case, it is straightforward to calculate analyticallythat the expected spread in E2 and in E3, and the expected “spread diﬀerence”between the experimental and control groups of E0, are all equal to 16 /

3. However,with milder parameter choices, E2 and E3 are fooled much more dramatically thanE0. For example, with p = 0 . P = 0 .

9, the expected spread diﬀerence for E0equals 0 .

01, while the expected spread for E3 equals 0.14. Thus, while all of theexperiments are subject to this criticism, when parameters are set to reasonablymodel human behavior, the ﬂaw appears to be much more problematic in E2 andE3 than in E0.Memory is another psychological process on which positive results of free-choiceexperiments might be blamed. A subject will obviously remember his/her stage-twochoice, and might be inclined to construct a stage-three ranking that is consistentwith this choice. Even without assuming any particular coin-ﬂipping model for hu-man behavior, it is obvious that this inclination could lead to positive experimentaloutcomes in E0, E2 and E3. For example, compared to the control group in E0, theexperimental group will exhibit increased spread caused by this inclination to movethe stage-three ranking in the direction of the stage-two choice. While this mightbe considered a special case of the choice-induced attitude change that free-choiceexperiments hope to measure, it is too special of a case because it can be blamedon memory rather than dissonance.We have not yet discussed E1 because its outcomes will depend on the manner inwhich true rankings vary from subject to subject, which can’t be modeled accuratelyin computer simulations, since it depends heavily on the particular 15 objects thatare used and how real people feel about them. Nevertheless, it is clear that E1 issubject in principle to the same criticisms as the other experiments. For example, ifeach subject’s true ranking is modeled as completely random, than E1 is equivalentto E2.9.

9. The auxiliary hypotheses of past free-choice experiments

The probabilistic arguments of Chen and Risen cast some doubt on the conclusions of all past free-choice experiments. Although E0, E1, E2, and E3 are immune to these criticisms, their results could be blamed on natural psychological processes other than dissonance. Can any type of free-choice experiment correctly measure choice-induced attitude change caused by dissonance? One potential glimmer of hope might come from reexamining the auxiliary hypotheses of past free-choice experiments. Here are summaries of a few of them:

(1) Brehm and Jones performed a free-choice experiment in which subjects rated music albums [3]. Before choosing between a pair of albums, some of the subjects were told that one of the pair was associated with a record company promotion, so they would receive a free pair of movie tickets if they happened to choose the promoted one (they were not told which one it was). Other subjects were not told about the promotion until after making their choice. Among both groups of subjects, some won the free movie tickets and some did not. The result was that subjects who knew about the promotion before making the choice and who received the movie tickets exhibited less positive spread than subjects in the other three categories. Presumably, a foreseeable positive benefit to their selection left them with less need to rationalize the choice.

(2) Steele, Hopp, and Gonzales used the free-choice paradigm to study the role of affirmation in dissonance theory [13]. Some of their subjects valued science highly, while others valued business highly. In both groups, some of the subjects performed the experiment wearing a lab coat, and some did not. The result was that subjects who valued science and wore a lab coat exhibited less positive spread than subjects in the other three categories. One interpretation is to regard dissonance as posing a threat to one’s self-system. The lab coat presumably served as a symbolic affirmation of a core value, thus reducing the need to decrease dissonance through attitude change.

(3) Stone used the free-choice paradigm to study the relationship of self-esteem to dissonance [14]. He found that subjects with high self-esteem showed less positive spread than subjects with low self-esteem, but only when the subjects were primed to think about personal, as opposed to social, standards.

(4) Heine and Lehman found that Japanese men exhibited less positive spread than Canadian men [10]. One interpretation involves the differences between independent and interdependent cultures. Kitayama, Snibbe, Markus and Suzuki tested this interpretation in a follow-up study in which they asked some of the Japanese men to rank the albums as they thought that most students would rank them [11]. Subjects given these instructions exhibited larger average spread than subjects who were asked to report personal rankings. Presumably, dissonance was aroused more by events that were felt to affect interpersonal relationships with others.

Given the known problems with the free-choice paradigm, what should be made of these results? A significant difference in spread between two groups must mean something. Since there are now several known possible interpretations of positive spread other than dissonance, one must decide whether any of these alternative interpretations can explain the experimental results. Chen and Risen noted that positive spread can occur because people maintain true rankings. Could self-esteem affect the degree to which subjects maintain true rankings? Positive spread can also occur because of memory or the “think about it more carefully” phenomenon. Does self-esteem correlate with memory? Could the chance to win a movie ticket make a subject think more carefully about which album has the promotion rather than about his/her true feelings for the albums? Could a lab coat make a science-valuing subject feel more obliged to construct a second ranking that is consistent with his/her object choice?

Although questions along these lines are important, in some cases the original interpretations of the experimental results feel more plausible than the alternative interpretations. Also, some of these auxiliary hypotheses were independently tested via the induced-compliance and effort-justification paradigms. We believe that it is a bit of a stretch to interpret all of the above results under a model in which there is no dissonance. Thus, taken together, the past free-choice experiments provide at least some evidence of choice-induced attitude change caused by dissonance.

Appendix A. Expected spread calculation

In this appendix, we describe the methods by which we derived the expected spread values in Table 1.

In this 12-object experiment, the subject provides three rankings (the second of which determines his/her stage-two choice). Thus, there are $(12!)^3 \approx 10^{26}$ possible outcomes of this experiment, so the straightforward approach of tabulating all possible outcomes on a computer does not work here. A statistical approach would be reasonable; for example, one could approximate each expected spread in the table as the average spread of 100,000 computer-simulated runs of the free-choice experiment. But this introduces approximation errors. In the remainder of this section, we instead describe an algorithm that calculates the exact value of each expected spread in the table.

The key simplification is to only track the ranking-positions of the two objects between which a stage-two choice is made (say “toaster” and “hairdryer”), treating the other 10 objects as indistinguishable. So instead of managing all $12!$ permutations of the 12 distinct objects, we will only manage the $12 \times 11 = 132$ members of the set, $S$, of “simplified rankings” defined as

$$S = \{ (a, b) \mid 1 \le a, b \le 12,\ a \ne b \},$$

where $a$ represents the position of the toaster, and $b$ the position of the hairdryer. For example, the element $(7, 3) \in S$ represents the ranking

$$(\ast, \ast, \text{hairdryer}, \ast, \ast, \ast, \text{toaster}, \ast, \ast, \ast, \ast, \ast).$$

We will consider the members of $S$ to be ordered in some fashion from 1 to 132.

We next define a pair of 132-by-132 matrices, called $Q$ and $M$, whose rows and columns will be indexed by the elements of $S$. For $s_1, s_2 \in S$, the entry $Q_{s_1,s_2}$ (in row $s_1$ and column $s_2$ of $Q$) represents the probability that the act of swapping a single randomly chosen adjacent pair of positions will convert $s_1$ into $s_2$. The entry $M_{s_1,s_2}$ represents the probability that the experiment’s random process (repeatedly swapping randomly chosen adjacent pairs until the coin first lands tails) will convert $s_1$ into $s_2$. It is straightforward to design an algorithm that explicitly calculates the entries of $Q$. We can then compute the entries of $M$ in terms of the entries of $Q$ by solving the recursive formula

$$M = (1 - p) I + p M Q$$

for $M$, which yields:

$$M = (1 - p)(I - pQ)^{-1}.$$

To avoid rounding errors, this matrix calculation could be done over the rational numbers, although the denominators of the entries of $M$ become rather unwieldy.

At first glance, the matrix $M$ might seem useful only for studying a free-choice experiment in which the two objects (“toaster” and “hairdryer”) are pre-selected. However, by exploiting the experiment’s symmetry, we will now use $M$ to study the free-choice experiment in which the two object-positions ($i$ and $j$) are pre-selected. It is convenient to re-name the objects as “1” through “12” in the order of the subject’s true ranking. An arbitrary ranking, $\sigma$, can thereby be considered as a bijection of $\{1, \dots, 12\}$, with the subject’s true ranking represented by the identity function, $\mathrm{Id}$.
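The matrices $Q$ and $M$ defined above can be built explicitly. The following is a minimal sketch rather than the authors’ implementation. It assumes that each of the $n - 1$ adjacent pairs of positions is equally likely to be swapped, and that $p$ is the probability that the coin lands heads (so that another swap is made), which is the reading suggested by the recursion $M = (1 - p) I + p M Q$. All function names are hypothetical. Exact rational arithmetic is used, as the text suggests.

```python
from fractions import Fraction


def simplified_rankings(n):
    """All pairs (a, b) of distinct positions in 1..n, in a fixed order."""
    return [(a, b) for a in range(1, n + 1) for b in range(1, n + 1) if a != b]


def swap_position(x, k):
    """New position of an object at position x after swapping positions k and k+1."""
    if x == k:
        return k + 1
    if x == k + 1:
        return k
    return x


def build_Q(n):
    """Q[r][c]: probability that one random adjacent swap turns state r into state c.

    Assumption: each of the n-1 adjacent pairs (k, k+1) is chosen with equal probability.
    """
    states = simplified_rankings(n)
    index = {s: r for r, s in enumerate(states)}
    size = len(states)
    Q = [[Fraction(0)] * size for _ in range(size)]
    for (a, b), r in index.items():
        for k in range(1, n):
            c = index[(swap_position(a, k), swap_position(b, k))]
            Q[r][c] += Fraction(1, n - 1)
    return states, Q


def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    size = len(A)
    aug = [list(row) + [Fraction(int(r == c)) for c in range(size)]
           for r, row in enumerate(A)]
    for col in range(size):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = 1 / aug[col][col]
        aug[col] = [x * scale for x in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def build_M(Q, p):
    """M = (1 - p) (I - pQ)^{-1}, the solution of M = (1 - p) I + p M Q."""
    size = len(Q)
    A = [[Fraction(int(r == c)) - p * Q[r][c] for c in range(size)]
         for r in range(size)]
    return [[(1 - p) * x for x in row] for row in invert(A)]


# Small demonstration with 4 objects (12 simplified rankings) instead of 12.
states, Q = build_Q(4)
M = build_M(Q, Fraction(1, 3))
print(len(states), sum(M[0]))  # number of states, and the first row sum of M
```

Calling `build_Q(12)` gives the 132-by-132 matrix of the text. For that size, exact inversion with `Fraction` entries can be slow because the denominators grow quickly; passing a floating-point `p` to `build_M` makes the same code run in ordinary floating-point arithmetic, at the cost of rounding error. Because a swap undoes itself, `Q` is symmetric, and each row of `Q` and of `M` sums to 1.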
Recall that the subject gives three rankings, $\sigma_1, \sigma_2, \sigma_3$, from which her spread, $s = \mathrm{spread}(\sigma_1, \sigma_2, \sigma_3)$, is calculated as:

$$s = \begin{cases} (j - i) - \bigl( \sigma_3 \circ \sigma_1^{-1}(j) - \sigma_3 \circ \sigma_1^{-1}(i) \bigr) & \text{if } \sigma_2 \circ \sigma_1^{-1}(i) < \sigma_2 \circ \sigma_1^{-1}(j), \\ \bigl( \sigma_3 \circ \sigma_1^{-1}(j) - \sigma_3 \circ \sigma_1^{-1}(i) \bigr) - (j - i) & \text{otherwise.} \end{cases}$$

Notice that $\mathrm{spread}(\sigma_1, \sigma_2, \sigma_3) = \mathrm{spread}(\sigma_1 \circ \tau, \sigma_2 \circ \tau, \sigma_3 \circ \tau)$ for any bijection $\tau$ of $\{1, \dots, 12\}$. In particular,

$$s = \mathrm{spread}(\sigma_1, \sigma_2, \sigma_3) = \mathrm{spread}(\mathrm{Id}, \sigma_2 \circ \sigma_1^{-1}, \sigma_3 \circ \sigma_1^{-1}).$$

Define $s_0 = (i, j) \in S$. In terms of simplified rankings, our list of all possible outcomes of the experiment is:

$$D = \{ (s_1, s_2, s_3) \mid s_i \in S \},$$

where $s_1 = \sigma_1^{-1}(s_0)$, $s_2 = \sigma_2(s_1)$ and $s_3 = \sigma_3(s_1)$, with permutations acting on members of $S$ in the natural way here. Our strategy is to calculate the spread and the probability of each of the $132^3 = 2{,}299{,}968$ outcomes in $D$, and then compute the expected spread of the experiment as the corresponding weighted sum.

Consider an arbitrary member $\beta = (s_1, s_2, s_3) = ((a_1, b_1), (a_2, b_2), (a_3, b_3)) \in D$. The spread of $\beta$ simplifies to:

$$s = \begin{cases} (j - i) - (b_3 - a_3) & \text{if } a_2 < b_2, \\ (b_3 - a_3) - (j - i) & \text{otherwise.} \end{cases}$$

The probability of $\beta$ equals $P_1 \cdot P_2 \cdot P_3$, where

$$P_1 = M_{s_0,s_1}, \quad P_2 = M_{s_1,s_2}, \quad \text{and} \quad P_3 = M_{s_1,s_3}.$$

To justify the first of these three equations, observe that $\sigma_1$ and $\sigma_1^{-1}$ are identically distributed because the probability that any sequence of adjacent swaps will occur is the same as the probability that the sequence will occur in reverse order.

References

1. E. Aronson and J. Mills, The effect of severity of initiation on liking for a group, Journal of Abnormal and Social Psychology (1959), 177–181.
2. J. W. Brehm, Postdecision changes in the desirability of alternatives, Journal of Abnormal and Social Psychology (1956), 384–389.
3. J. W. Brehm and R. A. Jones, The effect on dissonance of surprise consequences, Journal of Experimental Social Psychology (1970), 420–431.
4. M. K. Chen and J. L. Risen, How choice affects and reflects preferences: Revisiting the free-choice paradigm, Journal of Personality and Social Psychology (2010), 573–594.
5. J. Cooper, Cognitive dissonance: Fifty years of a classic theory, Sage Publishing, 2007.
6. L. C. Egan, L. R. Santos, and P. Bloom, The origins of cognitive dissonance: Evidence from children and monkeys, Psychological Science (2007), 978–983.
7. L. Festinger, A theory of cognitive dissonance, Stanford University Press, 1957.
8. L. Festinger and J. M. Carlsmith, Cognitive consequences of forced compliance, Journal of Abnormal and Social Psychology (1959), 203–210.
9. L. Festinger, H. W. Riecken, and S. Schachter, When prophecy fails, University of Minnesota Press, 1956.
10. S. J. Heine and D. R. Lehman, Culture, dissonance, and self-affirmation, Personality and Social Psychology Bulletin (1997), 389–400.
11. S. Kitayama, A. C. Snibbe, H. R. Markus, and T. Suzuki, Is there any free choice? Self and dissonance in two cultures, Psychological Science (2004), 527–533.
12. C. M. Steele, The psychology of self-affirmation: Sustaining the integrity of the self, Advances in Experimental Social Psychology, vol. 21, Academic Press, 1988, pp. 261–302.
13. C. M. Steele, H. Hopp, and J. Gonzales, Dissonance and the lab coat: Self-affirmation and the free choice paradigm, Unpublished manuscript, University of Washington. Cited in [12], 1986.
14. J. Stone, What exactly have I done? The role of self-attribute accessibility in dissonance, Cognitive dissonance: Progress on a pivotal theory in social psychology (E. Harmon-Jones and J. Mills, eds.), American Psychological Association, 1999, pp. 175–200.
15. K. Tapp, The mathematics of measuring self-delusion, Math Horizons (2013), 5–9.

Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada

Department of Mathematics, Saint Joseph’s University, 5600 City Ave., Philadelphia, PA 19131