On a Generalized Form of Subjective Probability
Russell J. Bowater
Independent researcher. Corresponding author. Contact via given email address or via the website: sites.google.com/site/bowaterfospage
Ludmila E. Guzmán-Pantoja
Assistant professor, Institute of Agro-industries, Technological University of the Mixteca, Carretera a Acatlima Km. 2.5, Huajuapan de León, Oaxaca, C.P. 69000, Mexico.
Abstract:
This paper is motivated by the questions of how to give the concept of probability an adequate real-world meaning, and how to explain a certain type of phenomenon that can be found, for instance, in Ellsberg’s paradox. It attempts to answer these questions by constructing an alternative theory to one that was proposed in earlier papers on the basis of various important criticisms that were raised against this earlier theory. The conceptual principles of the corresponding definition of probability are laid out and explained in detail. In particular, what is required to fully specify a probability distribution under this definition is not just the distribution function of the variable concerned, but also an assessment of the internal and/or the external strength of this function relative to other distribution functions of interest. This way of defining probability is applied to various examples and problems including, perhaps most notably, to a long-running controversy concerning the distinction between Bayesian and fiducial inference. The characteristics of this definition of probability are carefully evaluated in terms of the issues that it sets out to address.
Keywords:
Additivity of probabilities; Ellsberg’s paradox; Fiducial inference; Internal and external strength; Reference set; Similarity.
arXiv [stat.OT], Oct.
1. Introduction
Over the years the issue of how to give the concept of probability a real-world meaning has proved to be controversial, see for example Fine (1973), Gillies (2000) and Eagle (2011). Closely related to this issue is the problem of how to adequately elicit subjective probabilities in any given practical context. Various approaches have been suggested to tackle this latter problem, see for example Kadane and Wolfson (1998), Garthwaite, Kadane and O’Hagan (2005), O’Hagan et al. (2006) and Kynn (2008). A method incorporated into some of these approaches involves comparing the likeliness of any given event of interest with the likeliness of various given unions of outcomes of a standard experiment, e.g. drawing a ball out of an urn containing distinctly labelled balls or spinning what is known as a probability wheel (see Spetzler and Stael von Holstein 1975).
In Bowater (2017a) a definition of the probability of an event was proposed, namely type B probability, that was based around this elicitation method, and in particular, on ordering the similarities that are felt between the likeliness of the two events in various given event pairings. This definition was subsequently extended in Bowater (2017b) so that continuous probability distributions could be characterized in an analogous manner, and was applied to the problem of statistical inference both in Bowater (2017b) and Bowater and Guzmán (2018). However, the following criticisms have been raised against this definition of probability:
1) It is unclear how probabilities can be made to obey the additivity rule of probability, which is regarded as one of the main aims of the definition.
2) It is inconvenient that probabilities are only defined at evenly spaced points on the interval [0, 1] with the spacing between points being potentially quite large.
3) The dependency of probabilities on a reference set of events is unattractive.
4) The definition does not appear to be universal, e.g. the type of characterization this definition gives to continuous probability distributions was not extended to discrete or categorical probability distributions.
The main aim of the present paper is to substantially overhaul this definition of probability in a way that attempts to address these criticisms.
As well as trying to give probability an adequate real-world meaning, the work outlined in the present paper, as was the case in Bowater (2017a), is motivated by the question of how to explain a particular type of phenomenon which cannot be easily explained by applying the conventional mathematical definition of probability. One of the most standard (but perhaps one of the least convincing) instances of this type of phenomenon can be found in what is known as Ellsberg’s two colour or two urn example, see Ellsberg (1961). Given that some readers may not be familiar with this example it will now be briefly outlined.
Let us imagine that there are two urns that both contain 100 balls where each ball may be either red or black in colour. In the first urn the ratio of red to black balls is entirely unknown, i.e. there may be from 0 to 100 red or black balls in the urn. By contrast, in the second urn it is known that there are exactly 50 red balls and 50 black balls. An individual is asked to decide which urn he would prefer to randomly draw a ball out of if getting a red ball wins $100 while getting a black ball wins nothing, and which urn he would prefer if a black ball wins $100 while a red ball wins nothing.
In Ellsberg (1961) it is claimed that, first, the majority of people would prefer the second urn in response to both questions, which is a claim supported by later experiments, e.g. Fellner (1961), Becker and Brownson (1964) and Curley and Yates (1989), and, perhaps more importantly, that this behaviour cannot be assumed to be irrational. This type of behaviour is regarded by some as representing a paradox, and is in fact known as a version of Ellsberg’s paradox, as it goes against the idea that an individual would prefer the urn associated with the highest probability of winning the prize or be indifferent between urns that have the same probability of winning the prize.
Having clarified the motivation for this paper, let us give a brief description of its structure. The main theoretical principles of the definition of probability that will be proposed are laid out and explained in detail in the sections that immediately follow, in particular Sections 2.1 to 3.5. A substantive application of this definition of probability is then presented in Sections 3.6 and 3.7, and a possible extension to the main theory with regard to a special case is proposed in Section 3.8. The final two sections of the paper discuss how well the theory achieves the objectives that have been outlined in the present section.
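The conflict between Ellsberg’s two urn example and the conventional definition of probability can be checked with a short calculation. The sketch below is only illustrative: it assumes that the individual assigns a single subjective probability p to drawing a red ball from the first urn and ranks bets by expected winnings, and the names `expected_winnings` and `prefers_second_urn_for_both_bets` are hypothetical. Under these assumptions, no value of p makes the second urn strictly preferable for both bets.

```python
from fractions import Fraction

PRIZE = 100  # prize in dollars, as in the two urn example


def expected_winnings(p_win):
    """Expected winnings of a bet that pays PRIZE with probability p_win."""
    return PRIZE * p_win


def prefers_second_urn_for_both_bets(p_red_first_urn):
    """True if the second (50/50) urn is strictly better for both bets.

    Assumption: the first urn is described by one additive probability,
    so the probability of black is 1 minus the probability of red.
    """
    half = Fraction(1, 2)
    better_for_red = expected_winnings(half) > expected_winnings(p_red_first_urn)
    better_for_black = expected_winnings(half) > expected_winnings(1 - p_red_first_urn)
    return better_for_red and better_for_black


# Scan every probability of red that corresponds to 0, 1, ..., 100 red balls.
candidates = [Fraction(r, 100) for r in range(101)]
consistent = [p for p in candidates if prefers_second_urn_for_both_bets(p)]
print(consistent)  # no candidate survives, so the list is empty
```

The strict preference for the second urn requires both p < 0.5 and 1 - p < 0.5, which cannot hold together for any additive probability.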
2. Fundamental concepts
2.1. Disclaimer
While the theory that will be outlined in the present paper has a great deal in common with the theory outlined in Bowater (2017a, 2017b), it is nevertheless a theory that is intended as a substitute for, rather than an extension of, this earlier theory. As a result the definitions used in the present paper generally stand alone from those used in these earlier papers, and caution is recommended in using this earlier work to try to gain greater insight about the present work.
In contrast to Bowater (2017a, 2017b) where the concept of strength was developed separately for the probability of an event and for continuous probability distributions, here the concept of strength will be defined primarily as a concept that is applied to (cumulative) distribution functions.
A probability distribution will be defined by its distribution function and the strength of this function relative to other distribution functions of interest. The distribution function will be defined as having the standard mathematical properties of such a function. The definition of the concept of strength will be outlined and discussed in detail in Section 3, after some more fundamental concepts have been presented in the sections that immediately follow.
In the theory that will be developed, the probabilities of events will be analysed in the context of the discrete or continuous distribution functions to which they must be associated. This includes the simple case where the distribution function is defined by just the probability of a given event and that of its complement, i.e. a Bernoulli distribution function.
Let S(A, B) denote the similarity that a given individual feels there is between his confidence (or conviction) that an event A will occur and his confidence (or conviction) that an event B will occur. For any three events A, B and C, it will be assumed that an individual is capable of deciding whether or not the orderings S(A, B) > S(A, C) and S(A, B) < S(A, C) are applicable. The notation S(A, B) = S(A, C) will be used to represent the case where neither of these orderings applies. However, for any fourth event D, it will not be assumed, in general, that an individual is capable of deciding whether or not the orderings S(A, B) > S(C, D) and S(A, B) < S(C, D) are applicable. Therefore, a similarity S(A, B) can be categorized as a partially orderable attribute of any given pair of events A and B. This is essentially the same definition of the concept of similarity as used in Bowater (2017a, 2017b).
2.4. Reference sets of events
Definition 1: Discrete reference set of events
Let O = {O_1, O_2, ..., O_k} be a finite ordered set of k mutually exclusive and exhaustive events. It will be assumed that for any given three subsets O(1), O(2) and O(3) of the set O that contain the same number of events, the following is true:
\[ S\Big( \bigcup_{O_i \in O(1)} O_i ,\ \bigcup_{O_i \in O(2)} O_i \Big) = S\Big( \bigcup_{O_i \in O(1)} O_i ,\ \bigcup_{O_i \in O(3)} O_i \Big) \]
It now follows that the discrete reference set of events R is defined by
\[ R = \{ R(\lambda) : \lambda \in \Lambda \} \tag{1} \]
where R(λ) = O_1 ∪ O_2 ∪ ··· ∪ O_{λk} and Λ = {1/k, 2/k, ..., (k − 1)/k}.
It should be clear that any given individual could easily decide that the set of all the outcomes of drawing a ball out of an urn containing k distinctly labelled balls could be the set O.
Definition 2: Continuous reference set of events
Let V be a random variable that must take a value in the interval Λ = (0, 1). It will be assumed that, for any given three subsets Λ(1), Λ(2) and Λ(3) of the interval (0, 1) that have the same total length, the following is true:
\[ S( \{ V \in \Lambda(1) \}, \{ V \in \Lambda(2) \} ) = S( \{ V \in \Lambda(1) \}, \{ V \in \Lambda(3) \} ) \]
It now follows that the continuous reference set of events R is defined by equation (1) but with the set Λ defined as it is presently, i.e. as the interval (0, 1), and the event R(λ) defined to be the event {V < λ}.
Again, it should be clear that any given individual could easily decide that the outcome of spinning a wheel of unit circumference, as defined by the position on its circumference indicated by a fixed pointer in its centre, could be the variable V, assuming the position is measured as the distance in a given direction around the circumference from a given point on the circumference.
A scaling event L(λ) will be defined as the event {V* < λ}, where λ ∈ [0, 1] and V* has the same definition as the random variable V used in Definition 2 but with the added condition that it must be the outcome of a well-understood physical experiment, such as the outcome of spinning the type of wheel described in the previous section. Since what does or does not constitute a well-understood physical experiment is rather vague, the definition of a scaling event is open to criticism. The relevance of this criticism should be taken into account with respect to the way scaling events are used in the rest of this paper.
A discrete or continuous reference set R_1 will be defined as being compatible with a discrete or continuous reference set R_2 if Λ_1 ∩ Λ_2 ≠ ∅, where Λ_1 and Λ_2 are the sets of allowable values of λ for R_1 and R_2 respectively and, for all conceivable pairs of events E_1 and E_2, and all λ ∈ Λ_1 ∩ Λ_2, it holds that
\[ S(E_1, R_1(\lambda)) > S(E_2, R_1(\lambda)) \ \text{ if and only if } \ S(E_1, R_2(\lambda)) > S(E_2, R_2(\lambda)) \]
For example, we would expect a rational individual to decide that a reference set R_1 based on the outcomes of drawing out a ball from an urn containing 10 distinctly labelled balls is compatible with a reference set R_2 based on the outcomes of drawing out a ball from an urn containing 100 distinctly labelled balls, where the set Λ_1 ∩ Λ_2 would of course be equal to the set Λ_1 = {0.1, 0.2, ..., 0.9}. Also, a rational individual may well decide that a reference set R_1 based on the outcomes of drawing out a ball from an urn containing k distinctly labelled balls is compatible with a reference set R_2 based on the outcome of spinning the type of wheel described in Section 2.4, where the set Λ_1 ∩ Λ_2 would of course be equal to the set Λ_1 = {1/k, 2/k, ..., (k − 1)/k}.
On the other hand, similar to Ellsberg’s two urn example described in the Introduction, let us imagine that there are two urns that both contain k balls, where each ball has been marked with a number that is in the range from 1 to k.
In the first urn, the number of balls that have been marked with any given number is entirely unknown, i.e. there may be 0 to k balls marked with any given number, while in the second urn, similar to the example that has just been discussed, we know that there is exactly one ball that has been marked with any given number. Here, in comparison to the earlier examples, it would be expected that a much smaller proportion of rational individuals would be prepared to treat a reference set R_1 based on the outcomes of drawing out a ball from the first urn as being compatible with a reference set R_2 based on the outcomes of drawing out a ball from the second urn.
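The sets of allowable resolution levels for discrete reference sets, and their overlap in the compatibility condition, can be sketched in a few lines. This is an illustration only: the helper names `discrete_lambda` and `reference_event` are hypothetical, and exact fractions are used so that 1/10 and 10/100 are recognised as the same resolution level. The code covers only the overlap of the sets of λ values. The similarity orderings required for compatibility are subjective judgements and are not modelled.

```python
from fractions import Fraction


def discrete_lambda(k):
    """Allowable resolution levels {1/k, 2/k, ..., (k-1)/k} for an urn of k balls."""
    return {Fraction(j, k) for j in range(1, k)}


def reference_event(lam, k):
    """Labels of the outcomes in R(lam) = O_1 u O_2 u ... u O_{lam*k}."""
    count = lam * k
    if count.denominator != 1:
        raise ValueError("lam must be a multiple of 1/k")
    return set(range(1, int(count) + 1))


lam_10 = discrete_lambda(10)    # urn with 10 distinctly labelled balls
lam_100 = discrete_lambda(100)  # urn with 100 distinctly labelled balls
common = lam_10 & lam_100

# Every resolution level of the smaller urn is available for the larger one.
print(common == lam_10)
print(sorted(float(lam) for lam in common))

# The same resolution level picks out 3 of 10 balls or 30 of 100 balls.
print(len(reference_event(Fraction(3, 10), 10)), len(reference_event(Fraction(3, 10), 100)))
```

For a continuous reference set the allowable levels form the whole interval (0, 1), so its overlap with any discrete reference set is simply the discrete set itself.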
3. Strength of a distribution function
3.1. Overview
As mentioned in Section 2.2, in order to complete the definition of a probability distribution, the strength of its distribution function relative to other distribution functions of interest needs to be established. In the following sections, the strength of a distribution function will be defined in terms of the context in which the concept of strength is being applied.
3.2. When eliciting a distribution function
First, we will define the concept of strength in the case where a given individual is trying to elicit a distribution function for a given random variable X based on his own personal opinion. In this context, it would seem appropriate to use a concept of strength that will be referred to as internal strength. This concept will now be defined separately for continuous and for discrete distribution functions.
Definition 3: Internal strength for continuous distribution functions
Let a given continuous random variable X of possibly various dimensions have two proposed distribution functions F_X(x) and G_X(x). We define the set of events F[a] by
\[ F[a] = \Big\{ \{ X \in \mathcal{A} \} : \int_{\mathcal{A}} f_X(x)\,dx = a \Big\} \tag{2} \]
where {X ∈ 𝒜} is the event that X lies in the set 𝒜 and f_X(x) is the density function corresponding to F_X(x), and we define the set G[a] in the same way with respect to the distribution function G_X(x). It now follows that for a given discrete or continuous reference set of events R that are independent of X, the distribution function F_X(x) is defined as being internally stronger than the distribution function G_X(x) at the resolution level λ, where λ is any value in the set Λ corresponding to the set R, if
\[ \min_{A \in F[\lambda]} S(A, R(\lambda)) > \min_{A \in G[\lambda]} S(A, R(\lambda)) \tag{3} \]
To give an example of the application of this definition, let us imagine that a doctor is trying to elicit a distribution function for the change in average survival time X caused by the administration of a new drug in comparison to a standard drug. We will assume that the reference set of events R is based on the outcome of spinning the type of wheel described in Section 2.4, and that the resolution λ is some value in the interval [0.…, 0.…]. Let G_X(x) be the current proposed distribution function for X. The aim is therefore to try to adjust this distribution function so that it better represents what is known about the variable X, which we will regard as being equivalent to achieving some kind of overall increase in the similarities S(A, R(λ)) where A ∈ G[λ].
In particular, it is natural to put more attention on increasing the smaller of these similarities without lowering by too much, or at all, the larger of these similarities.
Hence, it would seem sensible to take another step in the elicitation process if an alternative distribution function F_X(x) is judged as being (according to Definition 3) internally stronger than the distribution function G_X(x). The distribution function F_X(x) would then become the current proposed distribution function for X, and the elicitation process would continue until no improvements to this distribution function can be made.
Definition 4: Internal strength for discrete distribution functions
Let a given discrete random variable X that can only take a value x that belongs to the finite or countable set {x_1, x_2, ...} have two proposed distribution functions F_X(x) and G_X(x). Also, let the events L_1(b_1), L_2(b_2), ... be scaling events (as defined in Section 2.5) that are independent of the variable X. Furthermore, we define the set of events F[a] by
\[ F[a] = \Big\{ \bigcup_{i=1}^{\infty} \big( L_i(b_i) \cap \{ X = x_i \} \big) \ \Big|\ \sum_{i=1}^{\infty} [\, b_i \in (0, 1) \,] \le 1 \ \wedge\ \sum_{i=1}^{\infty} b_i f_X(x_i) = a \Big\} \tag{4} \]
where f_X(x) is the probability mass function corresponding to F_X(x), and [ ] on the right-hand side of this equation denotes the indicator function, and we define the set of events G[a] in the same way with respect to G_X(x).
It now follows that, for a given discrete or continuous reference set of events R that are independent of X and the scaling events L_1(b_1), L_2(b_2), ..., the function F_X(x) is defined as being internally stronger than the function G_X(x) at the resolution λ, where λ ∈ Λ, if the condition in equation (3) is satisfied with respect to the definitions currently being used.
One of the reasons for the first predicate in the definition of F[a] in equation (4), i.e. the condition that at most one value in the set b_1, b_2, ... is not equal to 0 or 1, is that without this predicate there would be an event in the set F[a] that would be effectively equivalent to any of the scaling events L_1(a), L_2(a), ..., i.e. the event corresponding to b_i = a ∀ i. In other words, the event would have the very undesirable property of not depending on the distribution function of interest F_X(x). The practical importance of this issue will perhaps be more clearly seen when this definition of F[a] is used again in Section 3.3.
The application of Definition 4 of internal strength can be illustrated by imagining that an election for a state governor has five candidates, and a political analyst is trying to elicit probabilities for the events x_1, x_2, ..., x_5 of each one of these candidates winning. The reference set of events R and the resolution λ will be defined as in the previous example, and let the current proposed distribution function and mass function for the variable in question be G_X(x) and g_X(x) respectively.
At any given stage of the elicitation process, the smaller of the similarities in the set {S(A, R(λ)) : A ∈ G[λ]} will usually be caused by one or two of the probabilities in the set {g_X(x_i) : i = 1, 2, ..., 5} being relatively poor representations of the analyst’s beliefs. This being the case, it would seem natural that the next step in the elicitation process would be to try to lessen this important defect, which effectively means that we should try to increase the minimum similarity on the right-hand side of equation (3). Hence, it would seem sensible to allow an alternative distribution function F_X(x) to replace G_X(x) as the current proposed distribution function for X if it is (according to Definition 4) internally stronger than the distribution function G_X(x).
Although the concept of internal strength can be regarded as the basis of a natural way of eliciting distribution functions, it does not really provide us with a useful means of comparing the nature of distribution functions that have been already elicited for different random variables. Therefore, once a distribution function has been elicited, an alternative concept of strength is required so that the function can be interpreted in this more outward-looking context. This alternative concept of strength will be referred to as external strength. It is a concept that can be applied not only to distribution functions that need to be derived using the kind of systematic elicitation process referred to in the previous section, but also to distribution functions that are directly identified as providing the best representations of our beliefs, which will be referred to as ‘given’ distribution functions, e.g. the distribution function of a variable that represents the outcome of a well-understood physical process. As was the case for internal strength, the concept of external strength will be defined separately for continuous and for discrete distribution functions.
Definition 5: External strength for elicited or given continuous distribution functions
Let two continuous random variables X and Y of possibly different dimensions have elicited or given distribution functions F_X(x) and G_Y(y) respectively. We define the set of events F[a] as in equation (2), and we define the set G[a] in the same way with respect to the variable Y instead of X and the distribution function G_Y(y) instead of F_X(x).
It now follows that, for a given discrete or continuous reference set of events R that are independent of X and Y, the function F_X(x) is defined as being externally stronger than the function G_Y(y) at the resolution λ, where λ ∈ Λ, if
\[ S_F = \min_{A \in F[\lambda]} S(A, R(\lambda)) > \max_{A \in G[\lambda]} S(A, R(\lambda)) = S_G \tag{5} \]
This definition can be interpreted as meaning that if the function F_X(x) is judged as being externally stronger than the function G_Y(y) then, relative to the reference event R(λ), it better represents the uncertainty associated with the variable X than G_Y(y) represents the uncertainty associated with the variable Y.
In comparison to the definition of internal strength in equation (3), it is naturally appealing to have the maximization operator on the right-hand side of equation (5) instead of the minimization operator, as this of course implies that all the similarities in the set {S(A, R(λ)) : A ∈ F[λ]} are greater than any similarity in the set {S(A, R(λ)) : A ∈ G[λ]}. However, it would not have been sensible to have defined internal strength such that the maximization instead of the minimization operator appears on the right-hand side of equation (3), as using such a strong condition as the basis for an elicitation process would generally impede the ease with which such a process could develop.
To give an example of the application of Definition 5, let us compare a uniform distribution function F_X(x) over (0, 1) for the output X of a pseudo-random number generator that has been very carefully designed to produce approximately uniform random numbers in (0, 1) with a doctor’s elicited distribution function G_Y(y) for the change in average survival time Y caused by the administration of a new drug in comparison to a standard drug. The reference set of events R and the resolution λ will be defined as in the previous examples.
Under these assumptions, it would be expected that the similarities in the set {S(A, R(λ)) : A ∈ F[λ]} would all be regarded as being quite high. This is because the event R(λ) is the outcome of a well-understood physical experiment, i.e. a random spin of a wheel, while any event in the set F[λ] feels like it can almost be treated as though it is the outcome of a well-understood physical experiment. On the other hand, the doctor’s uncertainty about whether or not any given event in the set G[λ] will occur can be regarded as depending largely on his incomplete knowledge about highly complex biological processes in the human body. Therefore it would be expected that, according to Definition 5, the function F_X(x) would be judged as being externally stronger than the function G_Y(y), which can be interpreted as meaning that, relative to the spin-of-a-wheel event R(λ), the function F_X(x) performs better than the function G_Y(y) at representing the uncertainty that these functions are intended to represent.
Definition 6: External strength for elicited or given discrete distribution functions
Let X and Y be two discrete random variables that can only take values in the finite or countable sets x = {x_1, x_2, ...} and y = {y_1, y_2, ...} respectively, and let F_X(x) and G_Y(y) be elicited or given distribution functions for these two variables respectively. Also, let the events L_1(b_1), L_2(b_2), ... be scaling events that are independent of the variables X and Y. Furthermore, we define the set of events F[a] as in equation (4), and we define the set G[a] in the same way with respect to the variable Y and the distribution function G_Y(y).
It now follows that, for a given discrete or continuous reference set of events R that are independent of the variables X and Y and the scaling events L_1(b_1), L_2(b_2), ..., the function F_X(x) is defined as being externally stronger than the function G_Y(y) at the resolution λ, where λ ∈ Λ, if the condition in equation (5) is satisfied with respect to the definitions currently being used.
This definition can be applied to the motivating example referred to in the Introduction, i.e. Ellsberg’s two urn example. In particular, we will denote the outcomes of drawing a ball out of the first urn and the second urn in this example as the random variables X and Y respectively, and we will denote the distribution functions for these two variables as F_X(x) and G_Y(y) respectively. The reference set R and the resolution λ will be defined as in the earlier examples. Now let us imagine that, with regard to both the first and the second urns, an individual elicits a probability mass function that assigns a probability of 0.5 to both the events of drawing out a red ball and drawing out a black ball. This would seem to be quite a rational decision to make.
Since both F_X(x) and G_Y(y) effectively define a Bernoulli distribution function, the sets of events F[λ] and G[λ] will only contain two events. For example, the sets F[0.5] and G[0.5] simply contain the events of drawing out a red ball and drawing out a black ball from the first and second urns respectively. In applying Definition 6 of external strength to this particular case, i.e. the case where λ = 0.5, it should be fairly clear why the individual is likely to decide that the similarities between the spin-of-a-wheel event R(0.5) and the events in G[0.5] are higher than the similarities between R(0.5) and the events in F[0.5], and hence that the distribution function G_Y(y) is externally stronger than the distribution function F_X(x). A similar line of reasoning can be used to justify the same decision for other values of λ. If the individual prefers monetary rewards to be tied to the outcomes of whichever urn in Ellsberg’s two urn example has the externally stronger distribution function, then the ‘paradoxical’ behaviour identified by Ellsberg in this example can be accounted for by the definition of probability outlined in the present work.
We could also consider applying Definition 6 to the governor election example outlined in Section 3.2 under the assumption that the political analyst has already used Definition 4 of internal strength to elicit a distribution function H_Z(z) for the events z_1, z_2, ..., z_5 of each of the five candidates winning. With the reference set R and the resolution λ defined as in the previous examples, it should be fairly clear why this distribution function is likely to be considered externally weaker than the distribution function G_Y(y) from Ellsberg’s two urn example. However, it would be much less easy to predict whether any given political analyst would decide that the function H_Z(z) is externally stronger, weaker or neither stronger nor weaker than the distribution function F_X(x) from this earlier example.
3.4. Sensitivity to the choice of the reference set R and the resolution λ
In general, Definitions 3 to 6 of internal and external strength depend on the reference set of events R being used. More comments will be made with regard to this matter in Section 4.
However, for now, let us clarify that if, according to the definition given in Section 2.6, a discrete or continuous reference set R_1 is compatible with another discrete or continuous reference set R_2, then Definitions 3 to 6 will not be affected by whether the reference set R_1 or R_2 is used, provided that the resolution λ ∈ Λ_1 ∩ Λ_2, where Λ_1 and Λ_2 are as defined in Section 2.6.
With regard to the choice of the resolution level λ, it could be argued that the further λ is from the value 0.5, the greater the detail in which the characteristics of the distribution functions involved in Definitions 3 to 6 may be explored. On the other hand, it is known that people have difficulty in weighing up the uncertainty associated with events that are very unlikely or very likely to occur, which is a disadvantage that could apply if λ was less than say 0.05 or greater than say 0.95. Nevertheless, it would be expected that in many applications, Definitions 3 to 6 will be largely insensitive to the choice of the value of λ over the range [0.…, 0.…] for λ that has been used in the examples that have been considered so far.
Of course, not all distribution functions can be regarded as having been derived by some method of direct evaluation. Therefore, let us now turn our attention to defining the concept of strength in the case where we wish to compare the nature of distribution functions that have been derived using any type of method including through the use of a formal system of reasoning, e.g. derived by applying the standard rules of probability. In particular, this will be achieved by simply using a more general definition of the concept of external strength than the definitions of this concept presented in Section 3.3.
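As a recap of the conditions in equations (3) and (5), the following sketch contrasts internal and external strength. It rests on a strong simplifying assumption that is not part of the theory: each similarity S(A, R(λ)) is replaced by a hypothetical numerical score, whereas the theory only requires similarities to be partially orderable. The function names and all of the scores are invented for illustration.

```python
def internally_stronger(sims_f, sims_g):
    """Condition (3): the smallest similarity for F exceeds the smallest for G."""
    return min(sims_f) > min(sims_g)


def externally_stronger(sims_f, sims_g):
    """Condition (5): the smallest similarity for F exceeds the largest for G."""
    return min(sims_f) > max(sims_g)


# Hypothetical scores for the similarities S(A, R(lam)), one per event A.
known_urn = [0.9, 0.9]         # urn with exactly 50 red and 50 black balls
unknown_urn = [0.4, 0.4]       # urn with an unknown ratio of red to black
election_current = [0.3, 0.6, 0.5]   # analyst's current proposal
election_revised = [0.5, 0.6, 0.5]   # proposal after one elicitation step

# The known urn dominates the unknown urn, as in the two urn example.
print(externally_stronger(known_urn, unknown_urn))

# One elicitation step: the revision raises the weakest similarity, so it is
# internally stronger, but it does not satisfy the stricter condition (5).
print(internally_stronger(election_revised, election_current))
print(externally_stronger(election_revised, election_current))

# Neither function need be externally stronger than the other.
print(externally_stronger(election_current, unknown_urn),
      externally_stronger(unknown_urn, election_current))
```

With numerical scores, condition (5) always implies condition (3), which reflects why the weaker condition is the one used to drive an elicitation process.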
Definition 7: General definition of external strength
Let two random variables X and Y have distribution functions F_X(x) and G_Y(y) respectively. Also, let M_F and M_G be two sets of reasoning processes that could be used to measure the minimum similarity S_F and the maximum similarity S_G respectively, where these similarities are as defined in equation (5), and where the assumptions underlying this equation correspond to Definition 5 if the variables X and Y are continuous or to Definition 6 if these variables are discrete.
It now follows that, for a given discrete or continuous reference set of events R that satisfies the assumptions of Definition 5 if the variables X and Y are continuous or the assumptions of Definition 6 if these variables are discrete, the function F_X(x) is defined as being externally stronger than the function G_Y(y) at the resolution λ, where λ ∈ Λ, if
\[ \max_{M \in M_F} S_F > \max_{M \in M_G} S_G \tag{6} \]
where M ∈ M_F denotes ‘over all reasoning processes in the set M_F’, and similarly for M_G.
Clearly in the special case considered in Section 3.3, the sets M_F and M_G each contain only one reasoning process, which is the method of direct evaluation. More generally, though, we are faced with the problem that Definition 7 may depend on the choices that are made for the sets M_F and M_G. In many cases, this problem can be largely avoided by choosing the sets M_F and M_G to be large enough so that they contain all methods of reasoning that are relevant to measuring the similarities concerned. However, as will be illustrated in Section 3.6, this may be difficult to achieve if there are one or more potentially relevant methods of reasoning that are not well understood.
Observe that when distribution functions are derived by formal systems of reasoning rather than by a direct method of evaluation, the problem also arises that the distribution function for any given random variable may itself depend on which system of reasoning is used to derive it. Due to this possibility, the following definition is required.
Definition 8: Criterion for choosing between distribution functions
We will assume that $F_X(x)$ and $G_X(x)$ are two proposed distribution functions for the random variable $X$ that have been derived using two separate methods of reasoning. Under this assumption if, in Definition 7, the random variable $Y$ is assumed to be equivalent to $X$, and the sets $\mathcal{M}_F$ and $\mathcal{M}_G$ are regarded by the given individual who has the task of evaluating the similarities in equation (6) as containing all methods of reasoning that are relevant for this task, then the function $F_X(x)$ will be favoured over $G_X(x)$ as being the distribution function for $X$ if it is externally stronger than $G_X(x)$ according to Definition 7.
We can interpret this definition as meaning that $F_X(x)$ will be favoured over $G_X(x)$ as being the distribution function of $X$ if, relative to the reference event $R(\lambda)$, it better represents the uncertainty associated with the variable $X$ than the function $G_X(x)$.
In this section, we will apply the concept of external strength to the controversy about whether fiducial reasoning is of any use in circumstances where the fiducial distribution function is equal to a posterior distribution function corresponding to a given choice of the prior distribution function. We will concern ourselves only with the case where inferences need to be made about the mean $\mu$ of a normal density function that has a known variance $\sigma^2$ on the basis of a random sample $x$ of $n$ values drawn from the density function, since it will be seen that the issues that are explored in analysing this case are relevant to many other cases. The type of fiducial inference that will be applied will be subjective fiducial inference as outlined in Bowater and Guzmán (2018).
Let it be assumed that very little or nothing was known about $\mu$ before the sample $x$ was observed, and that this lack of knowledge is represented by a diffuse, symmetric prior density function for $\mu$ centred at some given prior median $\mu_0$. Assuming this has been done, let the corresponding prior and posterior distribution functions be denoted as $D(\mu)$ and $D(\mu \mid x)$ respectively. 
However, these distribution functions are not sufficient to define the prior and posterior distributions of $\mu$ under the definition of probability being considered. As has already been established, to complete these definitions we need to evaluate the strengths of these distribution functions relative to other distribution functions of interest. In the current context, it is clear that this needs to be done by applying Definition 7 of external strength.
To apply this definition, it will be assumed that there is only the method of direct evaluation in the set of reasoning processes $\mathcal{M}_{\mu}$ that will be used to measure the similarities in equation (6) with respect to the prior distribution function $D(\mu)$, and that there is only Bayesian reasoning in the set of reasoning processes $\mathcal{M}_{\mu \mid x}$ that will be used to measure these similarities with respect to the posterior distribution function $D(\mu \mid x)$. By Bayesian reasoning it is meant any system of reasoning that is related to the way that Bayes' theorem updates the prior to the posterior distribution function by combining it with the likelihood function. The reference set $R$ and the resolution $\lambda$ will be defined as in previous examples.
Under these assumptions, if the set of events $D_{\mu}[\lambda]$ is defined as the set $F[\lambda]$ was defined in equation (2) but with respect to the variable $\mu$ and the prior distribution function $D(\mu)$, then it would be expected that the similarities in the set $\{ S(A, R(\lambda)) : A \in D_{\mu}[\lambda] \}$ would all be regarded as being very low. In fact, we would expect that it would be difficult, if not impossible, to find any directly elicited distribution function (for any random variable in any context) that could be regarded as being externally weaker than the prior distribution function $D(\mu)$ according to Definition 7. 
This is because, apart from needing to satisfy the condition that it is diffuse and symmetric, the choice of the prior density function for $\mu$ when there is very little or no prior information about $\mu$ will be extremely arbitrary, implying that the definition of the events in the set $D_{\mu}[\lambda]$ will be just as arbitrary. For example, if $\lambda = 0.5$ then the set $D_{\mu}[\lambda]$ will contain the events $\{ \mu < \mu_0 \}$ and $\{ \mu > \mu_0 \}$, which clearly depend on the very arbitrary choice of the prior median $\mu_0$. In general, for all values of $\lambda$ in $[0.05, 0.95]$, the definition of the events in the set $D_{\mu}[\lambda]$ will also depend on the arbitrary decision that needs to be made about how diffuse the prior density function for $\mu$ should be.
Let $D_{\mu \mid x}[\lambda]$ be defined as $F[\lambda]$ was defined in equation (2) but with respect to the variable $\mu$ and the posterior distribution function $D(\mu \mid x)$. Since the posterior density function for $\mu$ is determined through Bayes' theorem simply by reweighting the prior density function for $\mu$, that is, by normalizing the density function that results from multiplying the prior density function by the likelihood function, it would seem difficult to apply a Bayesian reasoning process, i.e. a member of the set $\mathcal{M}_{\mu \mid x}$, to argue that the similarities in the set $\{ S(A, R(\lambda)) : A \in D_{\mu \mid x}[\lambda] \}$ should be generally larger than the similarities in the set $\{ S(A, R(\lambda)) : A \in D_{\mu}[\lambda] \}$. For this reason, under the assumptions that have been made, it can be argued that it should be difficult, if not impossible, to find a directly elicited distribution function that could be regarded as being externally weaker than the posterior distribution function $D(\mu \mid x)$ according to Definition 7.
It is common practice to try to approximate the posterior distribution function $D(\mu \mid x)$ with a distribution function $C(\mu \mid x)$ that is the result of using Bayes' theorem to update a prior density function of the form $c(\mu) = \text{constant} \;\; \forall \mu \in (-\infty, \infty)$. 
We should first note that, under the definition of probability being used in the present work, it would seem inappropriate to refer to $C(\mu \mid x)$ as a posterior distribution function, since it is based on a prior density function $c(\mu)$ that does not follow the standard mathematical rules of probability; in particular, it is an improper density function. Second, since the function $C(\mu \mid x)$ is only being used to approximate the function $D(\mu \mid x)$, its external strength relative to other distribution functions must be inherited from $D(\mu \mid x)$, i.e. it must be roughly determined by the external strength of $D(\mu \mid x)$ relative to the functions in question.
We will now turn to the application of subjective fiducial inference to the case of interest. The terminology that will be used corresponds to Bowater and Guzmán (2018); nevertheless, the way that subjective fiducial inference will be applied to this case is equivalent to what was outlined in both Bowater (2017b) and Bowater and Guzmán (2018).
Since the sample mean $\bar{x}$ is a sufficient statistic for $\mu$, it can be assumed to be the fiducial statistic in this case. Defining the primary random variable (primary r.v.) 
$\Gamma$ as having a standard normal density means that it can be assumed that the data $x$ is generated by the following data generating algorithm:
1) Generate a value $\gamma$ for the random variable $\Gamma$ by randomly drawing this value from the standard normal density function.
2) Determine $\bar{x}$ by setting $\Gamma$ equal to $\gamma$ and $X$ equal to $\bar{x}$ in the transformation
$$X = \mu + (\sigma / \sqrt{n})\, \Gamma \qquad (7)$$
3) Generate the data set $x$ by conditioning the joint density function of this data set given $\mu$ on the already generated value of the sample mean $\bar{x}$.
It now follows that the fiducial distribution function of $\mu$ is determined, according to the general rule given in Bowater and Guzmán (2018), by setting $X$ equal to $\bar{x}$ and treating $\mu$ as a random variable in equation (7), which implies that this distribution function is defined by the expression
$$\mu \mid \sigma^2, x \sim N(\bar{x}, \sigma^2 / n)$$
This distribution function of $\mu$ is the same as the function $C(\mu \mid x)$ that was defined earlier. However, to evaluate the external strength of this distribution function relative to other distribution functions, it will now be assumed that fiducial reasoning is the only type of reasoning in the set of reasoning processes $\mathcal{M}_C$ that, under the definition in equation (6), will be used to measure the similarities in the set $\{ S(A, R(\lambda)) : A \in C_{\mu \mid x}[\lambda] \}$, where $C_{\mu \mid x}[\lambda]$ is defined as $F[\lambda]$ was defined in equation (2) but with respect to the variable $\mu$ and the distribution function $C(\mu \mid x)$. By fiducial reasoning it is meant any system of reasoning that directly attempts to justify the fiducial argument, which will be interpreted to be the argument that the density function of the primary r.v. $\Gamma$ should be the same both before and after the data $x$ has been observed.
To perform the task just mentioned, let us proceed by reanalysing one of the abstract scenarios that were outlined in Bowater (2017b). 
In the scenario in question, it is imagined that someone, who will be referred to as the selector, randomly draws a ball out of an urn containing 7 red balls and 3 blue balls and then, without looking at the ball, hands it to an assistant. The assistant, by contrast, looks at the ball, but conceals it from the selector, and then places it under a cup. The selector believes that the assistant smiled when he looked at the ball. Finally, the selector is asked to assign a probability to the event that the ball under the cup is red. We assume that it was known from the outset that the aim of this exercise was for the selector to assign a probability to this particular event.
In this scenario, let us now imagine that, relative to other distribution functions of interest, the selector wishes to evaluate the external strength of the Bernoulli distribution function $B_Y(y)$ that corresponds to assigning a probability of 0.7 to the event that the ball under the cup is red ($y = 1$), and a probability of 0.3 to the event that it is blue ($y = 0$). This means that he will need to evaluate the similarities in the set $\{ S(A, R(\lambda)) : A \in B[\lambda] \}$, where the set $B[\lambda]$ is defined as $F[\lambda]$ was defined in equation (4) but with respect to the variable $Y$ and the distribution function $B_Y(y)$, which of course implies that it can only contain two events.
In doing this, it will be assumed that the selector takes into account the fact that a smile by the assistant would be information that could imply that it is less likely or more likely that the ball under the cup is red. Therefore, his evaluation of the similarities in question must depend on his subjective judgement regarding the meaning of the assistant's supposed smile. Nevertheless, he may feel that, if the assistant had indeed smiled, he would not really have understood the smile's meaning. 
In this case, it would seem rational for him to conclude that the similarities in the set $\{ S(A, R(\lambda)) : A \in B[\lambda] \}$ could be at least approximately evaluated by making the assumption that he had put the ball directly under the cup rather than giving the assistant an opportunity to look at the ball. Under this assumption, since along with the event $R(\lambda)$, the propensity of either of the two events in $B[\lambda]$ to occur would only depend on the outcome of a well-understood physical experiment, it would be expected that he would regard both of the similarities in the set $\{ S(A, R(\lambda)) : A \in B[\lambda] \}$ as being equal or very close to the highest possible similarity that can exist between two events, which is a conclusion that therefore could be justified as being valid or approximately valid in the scenario that is of genuine interest.
Returning to the evaluation of the relative external strength of the fiducial distribution function $C(\mu \mid x)$, let us assume that, in step 1 of the data generating algorithm outlined above, the value $\gamma$ of the primary r.v. $\Gamma$ is generated by a well-understood physical experiment, which is usually a reasonable assumption to make. We will now make an analogy between the uncertainty about the value of $\gamma$ after the data has been observed in this case, and the uncertainty about the colour of the ball under the cup in the scenario that has just been outlined. In particular, given that little or nothing was known about $\mu$ before the data $x$ was observed, the event of observing the data should be akin to the event of the selector believing that the assistant smiled when he looked at the ball in question, and hence this event should have little or no meaning in terms of its effect on the uncertainty that is felt about the value of $\gamma$. As a result if, after the data has been observed, $\Gamma$ is assigned the same distribution function as before the data has been observed, i.e. 
a standard normal distribution function, then it would be expected that the relative external strength of this function would be regarded as being similar to the relative external strength of the function $B_Y(y)$. Since the distribution function $C(\mu \mid x)$ is fully defined by this distribution function for $\Gamma$ and known constants, it can therefore be argued that the similarities in the set $\{ S(A, R(\lambda)) : A \in C_{\mu \mid x}[\lambda] \}$ as defined earlier should all be regarded as being equal or quite close to the highest possible similarity that can exist between two events.
This conclusion could hardly be more different from the conclusion that was reached when the relative external strength of the same distribution function $C(\mu \mid x)$ was evaluated by effectively taking into account its approximation to the distribution function $D(\mu \mid x)$, and then applying only Bayesian reasoning. On account of this, and in accordance with the definition of a probability distribution given in Section 2.2, it could be proposed that the posterior distribution for $\mu$ that corresponds to the use of a flat improper prior density for $\mu$ would be better described as the fiducial distribution for $\mu$, since the relative external strength of the distribution function in question $C(\mu \mid x)$ would be naturally justified using fiducial rather than Bayesian reasoning, and arguably rather than any other currently known form of statistical reasoning, if all these reasoning processes were included in the set $\mathcal{M}_C$.
We will now apply Definition 8 to an example where there are two possible distribution functions for the same random variable. In particular, let it be imagined that in the case analysed in the previous section, there is now a notable but quite low level of prior belief that $\mu$ will not be a very long distance from a given value $\mu_0$. We will assume that in applying the Bayesian method, this prior belief about $\mu$ is represented as a normal prior density function for $\mu$ with mean $\mu_0$ and a moderate to large variance. 
Let the resulting posterior distribution function be denoted by $I(\mu \mid x)$. Alternatively, we could apply the fiducial method to this problem, under the assumption that it may be adequate to not take into account the prior belief about $\mu$ in forming a post-data distribution function for $\mu$. Therefore, again the fiducial distribution function for $\mu$ will be $C(\mu \mid x)$ as defined in Section 3.6.
Let $I_{\mu \mid x}[\lambda]$ be defined as $F[\lambda]$ was defined in equation (2) but with respect to the variable $\mu$ and the distribution function $I(\mu \mid x)$. Now, in applying Definition 8 to choose which is the most appropriate distribution function for $\mu$ out of $I(\mu \mid x)$ and $C(\mu \mid x)$ after the data has been observed, let us assume that the set of reasoning processes that will be used to evaluate both the set of similarities $\{ S(A, R(\lambda)) : A \in I_{\mu \mid x}[\lambda] \}$ and the set $\{ S(A, R(\lambda)) : A \in C_{\mu \mid x}[\lambda] \}$ contains both Bayesian and standard fiducial reasoning but no other method of reasoning. It is clear though that the former set of similarities can only be evaluated indirectly using fiducial reasoning, while the latter set of similarities can only be evaluated indirectly using Bayesian reasoning.
If we apply Bayesian reasoning to evaluate the similarities $\{ S(A, R(\lambda)) : A \in I_{\mu \mid x}[\lambda] \}$, then since choosing a prior density function to represent the prior beliefs about $\mu$ in question is still fairly arbitrary, it would be expected that these similarities will be regarded in general as being only moderately higher than the similarities in the set $\{ S(A, R(\lambda)) : A \in D_{\mu \mid x}[\lambda] \}$, under the assumption that these latter similarities are evaluated in the context of the case considered in Section 3.6.
To evaluate the similarities $\{ S(A, R(\lambda)) : A \in C_{\mu \mid x}[\lambda] \}$ in the present context by applying fiducial reasoning, it again seems sensible to first analyse the relative external strength of the standard normal distribution function as the distribution function of the primary 
r.v. $\Gamma$ after the data has been observed. In particular, it would seem without doubt that the presence of the prior beliefs about $\mu$ in question should have the effect of lessening our degree of comfort in assuming that this distribution function is still standard normal. However, since the prior beliefs about $\mu$ have been assumed to be quite vague, this effect may be considered negligible if the sample size is large, and even if the sample size is small, it may still not be considered as being a very large effect. Therefore, although the similarities in the set $\{ S(A, R(\lambda)) : A \in C_{\mu \mid x}[\lambda] \}$ may be regarded in general as being lower than if these similarities were evaluated in the context of the case considered in Section 3.6, they nevertheless may be regarded in general as being still substantially higher than the similarities in the set $\{ S(A, R(\lambda)) : A \in I_{\mu \mid x}[\lambda] \}$, where these latter similarities must of course be evaluated either by using Bayesian reasoning, or by treating the distribution function $I(\mu \mid x)$ as an approximation to $C(\mu \mid x)$ under fiducial reasoning. This can be interpreted as meaning that according to Definition 7, the distribution function $C(\mu \mid x)$ would be regarded as being externally stronger than the function $I(\mu \mid x)$, which in turn would imply that, according to Definition 8, the function $C(\mu \mid x)$ would be favoured over $I(\mu \mid x)$ as being the distribution function of $\mu$ after the data has been observed.
The case has therefore been made in this section that, if Bayesian and standard fiducial reasoning are the only allowable reasoning processes in evaluating the similarities of interest, then even when there is notable prior information about $\mu$, it may not in fact be worth taking into account this information in establishing a post-data distribution function for $\mu$, due to the detrimental effect this may have on the relative external strength of the distribution function concerned. 
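To make the comparison in this section more concrete, the following minimal sketch computes the parameters of the fiducial distribution function $C(\mu \mid x)$ and of the posterior distribution function $I(\mu \mid x)$ side by side. It rests on assumptions that are not part of the text: the prior standard deviation `tau`, the value `mu_true` used to simulate the data and the sample sizes are all hypothetical, and the posterior is obtained from the standard conjugate normal update. The sketch only shows how numerically close the two distribution functions can be; it does not measure external strength, which depends on subjectively assessed similarities.

```python
import math
import random

def fiducial_params(xbar, sigma, n):
    """Mean and variance of C(mu | x) = N(xbar, sigma^2 / n)."""
    return xbar, sigma ** 2 / n

def posterior_params(xbar, sigma, n, mu0, tau):
    """Mean and variance of I(mu | x) under a N(mu0, tau^2) prior
    (standard conjugate update for a normal mean with known variance)."""
    precision = 1.0 / tau ** 2 + n / sigma ** 2
    mean = (mu0 / tau ** 2 + n * xbar / sigma ** 2) / precision
    return mean, 1.0 / precision

def simulate_xbar(mu, sigma, n, rng):
    """Steps 1 and 2 of the data generating algorithm: draw gamma from
    a standard normal density, then apply equation (7)."""
    gamma = rng.gauss(0.0, 1.0)
    return mu + (sigma / math.sqrt(n)) * gamma

# Hypothetical illustration values (assumptions, not taken from the paper).
rng = random.Random(0)
mu_true, sigma, mu0, tau = 2.0, 1.0, 0.0, 5.0

for n in (5, 50, 500):
    xbar = simulate_xbar(mu_true, sigma, n, rng)
    c_mean, c_var = fiducial_params(xbar, sigma, n)
    i_mean, i_var = posterior_params(xbar, sigma, n, mu0, tau)
    # Differences between the fiducial and posterior parameters.
    print(n, c_mean - i_mean, c_var - i_var)
```

Under these assumptions the two sets of parameters approach each other as `n` or `tau` grows, so any preference for $C(\mu \mid x)$ over $I(\mu \mid x)$ would have to come from the reasoning used to justify each function rather than from a large numerical difference between them.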
The concept of external strength has been defined as an ordinal measurement, i.e. using the definitions that have been given we are able to rank distribution functions in terms of their external strength. However, a question that naturally arises is whether it is possible to measure on a continuous scale some kind of characteristic that incorporates the essence of the concept of external strength. In this section, we will only consider how this could be done in the special case where we wish to compare distribution functions that have already been classified as even-similarity distribution functions according to the following definition.
Definition 9: An even-similarity distribution function
If $F_X(x)$ is the distribution function of a random variable $X$ then, for a given reference set of events $R$ that satisfies the assumptions of Definition 3 or Definition 4 depending respectively on whether $X$ is a continuous or a discrete variable, the function $F_X(x)$ will be defined as being an even-similarity distribution function at the resolution level $\lambda$, if the minimum similarity $\underline{S}_F$ is equal to the maximum similarity $\overline{S}_F$ according to the notation used in equation (5).
A continuous measure of external strength could now be defined in the following way.
Definition 10: Proposed continuous measure of external strength
With respect to a given reference set of events $R$ and a given resolution $\lambda$, let $F_X(x)$, $G_Y(y)$ and $H_Z(z)$ be even-similarity distribution functions for any three given random variables $X$, $Y$ and $Z$ respectively, where $X$ and $Z$ are independent from each other and, according to Definition 7, the function $G_Y(y)$ is not externally weaker than $F_X(x)$ and is not externally stronger than $H_Z(z)$. We define $F[\lambda]$ according to Definition 3 or Definition 4 depending respectively on whether $X$ is a continuous or a discrete variable, and we define $G[\lambda]$ and $H[\lambda]$ in the same way but with regard to the variables $Y$ and $Z$ and the distribution functions $G_Y(y)$ and $H_Z(z)$ respectively, where the two sets of scaling events $L(b_1), L(b_2), \ldots$ that are possibly used in the definitions of $F[\lambda]$ and $H[\lambda]$ are independent from each other.
Now, if the event $R(\lambda)$ is independent of the variables $X$, $Y$ and $Z$, and of the scaling events that may have been used to define $F[\lambda]$, $G[\lambda]$ or $H[\lambda]$, then for any three events $A$, $B$ and $C$ that are members of the sets $F[\lambda]$, $G[\lambda]$ and $H[\lambda]$ respectively, it would be reasonable to assume in general that a value of $\alpha \in [0, 1]$ could be found that satisfies the condition:
$$S(B, R(\lambda)) = S\big( (A \cap L(\alpha)^c) \cup (C \cap L(\alpha)),\; R(\lambda) \big) \qquad (8)$$
where $L(\alpha)$ is a scaling event that is independent of the variable $X$, the variable $Z$, the event $R(\lambda)$ and the scaling events that may have been used to define the sets $F[\lambda]$ and $H[\lambda]$, and where $L(\alpha)^c$ denotes the complement of $L(\alpha)$.
This value of $\alpha$ could then be interpreted as a continuous measure of the external strength of the distribution function $G_Y(y)$ relative to both the distribution functions $F_X(x)$ and $H_Z(z)$. For instance, a small value of $\alpha$ would indicate that the external strength of the function $G_Y(y)$ is closer to that of the function $F_X(x)$ than to that of the function $H_Z(z)$, while a large value of $\alpha$ would indicate that the external strength of $G_Y(y)$ is closer to that of $H_Z(z)$ than to that of $F_X(x)$. However, it will be assumed that this continuous measure of external strength may only be applied in cases where it cannot or does not contradict the general definition of external strength, i.e. Definition 7.
To give an example, we can apply Definition 10 to the case considered in Section 3.6. In particular, it would appear acceptable to assume that the function $F_X(x)$ in this definition could be the prior or posterior distribution function $D(\mu)$ or $D(\mu \mid x)$, the function $G_Y(y)$ could be the fiducial distribution function $C(\mu \mid x)$, and the function $H_Z(z)$ could be the Bernoulli distribution function $U_Z(z)$ that corresponds to assigning a probability of $\lambda$ to the event of drawing a ball out of an urn containing $k$ distinctly labelled balls that belongs to a given subset of $\lambda k$ balls ($z = 1$), and a probability of $1 - \lambda$ to the complement of this event ($z = 0$). 
If it is also assumed that the similarity on the left-hand side of equation (8) is evaluated by using fiducial reasoning, and the similarity on the right-hand side of this equation is evaluated directly if $F_X(x)$ is taken to be $D(\mu)$, or by using Bayesian reasoning if $F_X(x)$ is taken to be $D(\mu \mid x)$, then by using the same type of principles that were explained in Section 3.6, it could be argued that $\alpha$ should be equal or close to one, which could be interpreted as meaning that the external strength of $C(\mu \mid x)$ is equal to that of $U_Z(z)$, or is at least much closer to that of $U_Z(z)$ than to that of $D(\mu)$ or $D(\mu \mid x)$.
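As a check on equation (8), the following worked step suggests why the mixed event on the right-hand side of that equation remains comparable with the reference event $R(\lambda)$ for every value of $\alpha$. It is a sketch under assumptions that are not stated in this section: that the events $A$ and $C$ each have probability $\lambda$ under $F_X(x)$ and $H_Z(z)$ respectively, and that the scaling event $L(\alpha)$ has probability $\alpha$.

```latex
\begin{align*}
P\big( (A \cap L(\alpha)^c) \cup (C \cap L(\alpha)) \big)
  &= P\big(A \cap L(\alpha)^c\big) + P\big(C \cap L(\alpha)\big) \\
  &= P(A)\,(1 - \alpha) + P(C)\,\alpha \\
  &= \lambda\,(1 - \alpha) + \lambda\,\alpha \;=\; \lambda .
\end{align*}
```

The first equality uses the fact that the two parts of the union are disjoint, and the second uses the independence of $L(\alpha)$ from the variables $X$ and $Z$. Under the same assumptions, setting $\alpha = 0$ reduces the mixed event to $A$ and setting $\alpha = 1$ reduces it to $C$ (up to events of probability zero), which agrees with the interpretation of small and large values of $\alpha$ given above.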
4. Discussion
We will now discuss how well the theory outlined in the present paper addresses criticisms 1 to 4, listed in the Introduction, of the definition of probability outlined in Bowater (2017a, 2017b).
Criticism 1: Satisfying the additivity rule of probability
Obeying the additivity rule is no longer a goal for the theory, as was the case in Bowater (2017a), but rather an assumption upon which the definition of probability is constructed. In particular, to guarantee that this assumption is satisfied, this definition has been based exclusively on probability distribution functions instead of also basing it on the probabilities of events in isolation. Nevertheless, in any given situation, the adequacy of making the assumption that probabilities are additive is reflected in the relative external strengths that are associated with the distribution functions concerned. For example, if $F_X(x)$ is the distribution function of a given random variable $X$, but the assumption that the probabilities of $X$ lying in different subsets of the sample space of $X$ are additive was difficult to make, then we would not expect all the similarities in the set $\{ S(A, R(\lambda)) : A \in F[\lambda] \}$ to be regarded as being close to the highest similarity that can exist between two events, where the reference set $R$ and the resolution $\lambda$ are defined as in earlier examples.
Criticism 2: Precision of probability values
Unlike in Bowater (2017a, 2017b), where probabilities were only defined at potentially quite widely spaced points on the continuous interval $[0, 1]$, probabilities can now take any value in this interval even when the reference set $R$ is discrete, as the precision by which probabilities can be measured is no longer determined by how the set $R$ is defined, as was the case in these earlier papers.
However, in contrast to this earlier work, there is no guarantee that the probability that is elicited for any given event is a unique value. This is because there may be a set $F^{*}$ of possible distribution functions $F_X(x)$ for a given variable $X$, each member of which is regarded as being internally stronger than any function $F_X(x)$ not in this set, but not internally stronger than any other function $F_X(x)$ within this set. It would be hoped though that usually the distribution functions in the set $F^{*}$ would be fairly similar to each other. In this type of situation, it is advisable that any statistical analysis that requires a distribution function for $X$ as an input incorporates a sensitivity analysis over the functions $F_X(x)$ in the set $F^{*}$.
Criticism 3: Dependence of probabilities on the reference set
As was the case in Bowater (2017a, 2017b), probabilities depend in general on the reference set of events $R$ with respect to which they are defined. This is due to the fact that, in general, the relative internal and external strengths of a distribution function depend on the reference set $R$ that is being used. As alluded to in Section 3.4, this issue, though, is made substantially less important by taking into account that reference sets may often be regarded as being compatible according to the definition given in Section 2.6.
We could of course attempt to remove this dependence completely by defining the reference set of events $R$ under the added condition that the set of events $O = \{ O_1, O_2, \ldots, O_k \}$ in the case where $R$ is discrete, or the continuous variable $V$ in the case where $R$ is continuous, must be the outcomes or outcome of a well-understood physical experiment. For a continuous reference set $R$, this would mean that the set $R$ would be composed entirely of scaling events according to the definition in Section 2.5. However, placing this extra condition on the set $R$ would not appear to be that helpful for at least two reasons.
First, since the definition of the set $F[a]$ for a continuous distribution function given in equation (2) does not depend on the concept of a scaling event, the definitions of the set $R$ given in Section 2.4 allow us at least to define the concepts of relative internal and external strength of a continuous distribution function without entering into a potentially woolly discussion about when an outcome or set of outcomes can or should be classified as being generated by a well-understood physical experiment. 
The second reason for using these earlier given definitions of the set $R$ is that, in some situations, it may be useful to base assessments of uncertainty on a set $R$ that contains events that are not associated at all with the outcome or outcomes of a well-understood physical experiment.
In particular, if the goal of an individual is to communicate his personal uncertainty about a random variable to others, then this may not be easy to do if he has evaluated the distribution function of the variable as being relatively externally weak. Therefore, the individual may wish to find an alternative reference set $R$ that contains events associated with a standardized form of uncertainty that can be clearly appreciated by many people, but also with respect to which he would consider the distribution function of the variable concerned as being relatively externally strong.
As alluded to in Section 3.3, if the reference set of events $R$ is based on the outcome of spinning the type of wheel described in Section 2.4, then for any resolution $\lambda$ in the interval $[0.05, 0.95]$, it may be difficult for an individual to regard the distribution function $H_Z(z)$ associated with the governor election example, where $z = \{ z_1, z_2, \ldots, z_5 \}$ are the events of each of the five candidates winning, as being relatively externally strong. To give an example of the argument that has just been put forward, let us now change the reference set of events $R$ to the reference set described in Section 2.6 that is based on the outcomes of drawing a ball out of an urn that is known to only contain balls marked with a number in the range 1 to $k$, but for which the number of balls marked with any given number is entirely unknown. Under this assumption, it would seem plausible that, for any given resolution $\lambda$ in $[0.05, 0.95]$ permitted by the definition of the set $R$, a rational individual possibly could regard the distribution function $H_Z(z)$ just referred to as being relatively externally strong. Given that the events in the reference set $R$ are also associated with a fairly standard and easily understood type of uncertainty, an individual may feel it is easier to convey his personal uncertainty about the outcome of the governor election to others by using this alternative reference set rather than the original reference set.
Criticism 4: Lack of universality of the definition
The present work has addressed the lack of universality of the definition of probability outlined in Bowater (2017a, 2017b) by defining the concepts of internal and external strength so that they can be applied not just to continuous but also to discrete distribution functions, while at the same time eliminating the rather cumbersome notion advocated in these earlier papers that the concept of strength can also be applied to the probability of an individual event without any consideration of its association with a specified distribution function. Nevertheless, if desired, we can of course define this distribution function to be just the probability of the event and that of its complement, i.e. a Bernoulli distribution function.
5. Some closing remarks
Returning to the overall motivation for the theory outlined in the present paper and for the earlier theory that was outlined in Bowater (2017a, 2017b), it is hoped, even more than was the case for this earlier theory, that the present theory gives the concept of probability a natural and useful real-world meaning. Moreover, it was shown in Section 3.3 how the present theory can account for a rational preference for the second urn in Ellsberg's two urn example, and how, by accounting for the same type of phenomenon, it can justify the use of fiducial rather than Bayesian reasoning in the examples that were discussed in Sections 3.6 and 3.7. Therefore, it is hoped that this theory has adequately achieved all the goals that were set out at the start of this paper.
References
Becker, S. W. and Brownson, F. O. (1964). What price ambiguity? or the role of ambiguity in decision-making. Journal of Political Economy, 62–73.
Bowater, R. J. (2017a). A formulation of the concept of probability based on the use of experimental devices. Communications in Statistics: Theory and Methods, 4774–4790.
Bowater, R. J. (2017b). A defence of subjective fiducial inference. AStA Advances in Statistical Analysis, 177–197.
Bowater, R. J. and Guzmán, L. E. (2018). Multivariate subjective fiducial inference. arXiv.org (Cornell University Library), Statistics Theory, arXiv:1804.09804.
Curley, S. P. and Yates, J. F. (1989). An empirical evaluation of descriptive models of ambiguity reactions in choice situations. Journal of Mathematical Psychology, 397–427.
Eagle, A. (2011). Philosophy of Probability: Contemporary Readings, Routledge, London.
Ellsberg, D. (1961). Risk, ambiguity and the Savage axioms. Quarterly Journal of Economics, 643–669.
Fellner, W. (1961). Distortion of subjective probabilities as a reaction to uncertainty. Quarterly Journal of Economics, 670–689.
Fine, T. L. (1973). Theories of Probability: an Examination of Foundations, Academic Press, New York.
Garthwaite, P. H., Kadane, J. B. and O'Hagan, A. (2005). Statistical methods for eliciting probability distributions. Journal of the American Statistical Association, 680–700.
Gillies, D. (2000). Philosophical Theories of Probability, Routledge, London.
Kadane, J. B. and Wolfson, L. J. (1998). Experiences in elicitation. Journal of the Royal Statistical Society, Series D, 3–19.
Kynn, M. (2008). The 'heuristics and biases' bias in expert elicitation. Journal of the Royal Statistical Society, Series A, 239–264.
O'Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., et al. (2006). Uncertain Judgements: Eliciting Experts' Probabilities, Wiley, Chichester.
Spetzler, C. S. and Stael von Holstein, C. A. S. (1975). Probability encoding in decision analysis. Management Science, 22