Organic fiducial inference

Abstract

A substantial generalisation is put forward of the theory of subjective fiducial inference as it was outlined in earlier papers. In particular, this theory is extended to deal with cases where the data are discrete or categorical rather than continuous, and cases where there was important pre-data knowledge about some or all of the model parameters. The system for directly expressing and then handling this pre-data knowledge, which is via what are referred to as global and local pre-data functions for the parameters concerned, is distinct from that which involves attempting to directly represent this knowledge in the form of a prior distribution function over these parameters, and then using Bayes' theorem. In this regard, the individual attributes of what are identified as three separate types of fiducial argument, namely the strong, moderate and weak fiducial arguments, form an integral part of the theory that is developed. Various practical examples of the application of this theory are presented, including examples involving binomial, Poisson and multinomial data. The fiducial distribution functions for the parameters of the models in these examples are interpreted in terms of a generalised definition of subjective probability that was set out previously.


Russell J. Bowater

Independent researcher, Sartre 47, Acatlima, Huajuapan de León, Oaxaca, C.P. 69004, Mexico. Email address: as given on arXiv.org. Personal website: sites.google.com/site/bowaterfospage


Keywords:

Data generating algorithm; Fiducial statistic; Generalised subjective probability; Global and local pre-data functions; Primary random variable; Types of fiducial argument.

1. Introduction

The theory of subjective fiducial inference was first proposed in Bowater (2017b), and was then modified and extended to deal with more general inferential problems in which various parameters are unknown in Bowater (2018a). A further analysis that supports the adoption of this approach to inference is provided in Bowater and Guzmán (2018b). References to loosely related work in the general area of fiducial inference can be found in the first two of these three papers.

The aim of the present work is to substantially generalise this theory of inference as it was defined in Bowater (2018a). In particular, this theory will be extended to deal with cases where the data are discrete or categorical rather than continuous, and cases where there was important knowledge about some or all of the model parameters before the data were observed. Such knowledge, which will be termed ‘pre-data knowledge’, will be treated as being distinct from ‘prior knowledge’, since the use of this latter term usually exclusively implies that inferences will be made under the Bayesian paradigm.

The development of the earlier theory will be at a level that is sufficient to justify the theory being renamed as ‘organic fiducial inference’. Also, the use of the word ‘subjective’ in the original name caused confusion as for some this meant that the theory must substantially depend on personal beliefs, or in some other way must be far from being objective. As was explained in Bowater (2018a) and Bowater and Guzmán (2018b), this was not the case for the original theory, and is not generally the case for the theory that is about to be presented. The word ‘organic’ in the new name, however, still emphasizes that the theory is designed for living subjects, e.g. humans, and not for robots.

For cases in which nothing or very little was known about the model parameters before the data were observed, the motivation for this paper is similar to how the need for the work in Bowater (2018a) was justified, that is, it is motivated by the severe criticisms that generally can be made in these cases against the frequentist and Bayesian approaches to inference. These criticisms, some of which are well known, were set out in Section 4 of Bowater (2017b) and Sections 2 and 7 of Bowater (2018a), and to save space they will not be repeated here.

In other cases that will be of interest, i.e. where there is moderate to strong pre-data knowledge about some or all of the model parameters, conventional schools of inference can also be inadequate. In particular, frequentist theory is a generally inflexible framework for incorporating such knowledge into the inferential process. For example, it has proved, on the whole, very difficult to adapt the confidence interval approach to situations where we simply know, before the data were observed, that values in a given subset of the natural space of a parameter of interest are impossible, see for example Mandelkern (2002) and the references therein. On the other hand, while our pre-data knowledge about some or all of the model parameters may be substantial, it may not be comprehensive enough in many situations to be adequately incorporated into a Bayesian analysis by placing a prior density function over the parameters in question.

Let us now summarise the structure of the paper. Some brief comments about the concept of probability that will be used in the paper are made in the following section. Further concepts, principles and definitions that underlie the theory of organic fiducial inference in cases where only one model parameter is unknown are presented and discussed in Section 3. In relation to earlier work, an account is then given in Section 4 of how this methodology is extended to include cases where various parameters are unknown.

In the second half of the paper, the theory is applied to various examples. In particular in Sections 5 and 6, problems of inference based on both continuous and discrete data are examined where nothing or very little was known about the model parameters before the data were observed. Examples are then discussed in Section 7 where the natural parameter space is restricted as a result of pre-data knowledge about the case in question, and finally in Section 8, the impact of more general forms of pre-data knowledge about the model parameters is illustrated.

2. Generalised subjective probability

The definition of probability upon which the theory of organic fiducial inference will be based is the definition of subjective probability that was presented in Bowater and Guzmán (2018b), although the key concept of similarity that this definition relies on was introduced in Bowater (2017a), and discussed in Bowater (2017b) and Bowater (2018a). For the sake of convenience, this definition of probability will be referred to as generalised subjective probability.

Under this definition, a probability distribution is defined by its distribution function, which has the usual mathematical properties of such a function, and the strength of this function relative to other distribution functions of interest. In very loose terms, the strength of a distribution function is essentially a measure of how well the distribution function represents a given individual’s uncertainty about the random variable concerned. In this paper, we will be primarily interested in the external strength of a continuous distribution function as specified by Definitions 5 and 7 of Bowater and Guzmán (2018b). To avoid repeating all the technical details, the reader is invited to examine these definitions as well as the application of these definitions to a fundamental problem of statistical inference in Sections 3.6 and 3.7 of this earlier paper.

Although generalised subjective probability will be the adopted definition of probability, the concept of strength will not be explicitly discussed in the sections that immediately follow in order to give a more digestible introduction to the other main concepts that underlie organic fiducial inference. Instead, the role of this definition of probability in organic fiducial inference will be fully examined when this method of inference is applied to examples later in the paper.

3. Univariate organic fiducial inference

3.1. Sampling model and data generation

In general, it will be assumed that a sampling model that depends on one or various unknown parameters $\theta_1, \theta_2, \ldots, \theta_k$ generates the data $x$. Let the joint density or mass function of the data given the true values of $\theta_1, \theta_2, \ldots, \theta_k$ be denoted as $g(x \mid \theta_1, \theta_2, \ldots, \theta_k)$. For the moment, though, we will assume that the only unknown parameter in the model is $\theta_j$, either because there are no other parameters in the model, or because the true values of the parameters in the set $\theta_{-j} = \{\theta_1, \ldots, \theta_{j-1}, \theta_{j+1}, \ldots, \theta_k\}$ are known.

In a change from Bowater (2018a), the following more general definition of a fiducial statistic will be applied.

Definition 1: A fiducial statistic

A fiducial statistic $Q(x)$ will be defined as being the only statistic in a sufficient set of one-dimensional statistics that is not an ancillary statistic. Of course, given this requirement, there may not exist any possible choice for this kind of statistic. However, in this paper, we will only consider cases where this definition can be successfully applied. In other cases, a way of defining the fiducial statistic is to allow it to be any one-to-one function of a unique maximum likelihood estimator of $\theta_j$. This latter criterion was applied in Section 5.7 of Bowater (2018a).

We will also make a more general assumption about the way in which the data were generated than in this earlier paper.

Assumption 1: Data generating algorithm

Independent of the way in which the data were actually generated, it will be assumed that the data set $x$ was generated by the following algorithm:

1) Simulate the values of the ancillary complements, if any exist, of a given fiducial statistic $Q(x)$.

2) Generate a value $\gamma$ for a continuous one-dimensional random variable $\Gamma$, which has a density function $\pi_0(\gamma)$ that does not depend on the parameter $\theta_j$.

3) Determine a value $q(x)$ for the fiducial statistic $Q(x)$ by setting $\Gamma$ equal to $\gamma$ and $q(x)$ equal to $Q(x)$ in the following expression, which effectively defines the distribution function of $Q(x)$:

$$Q(x) = \varphi(\Gamma, \theta_j) \tag{1}$$

where the function $\varphi(\Gamma, \theta_j)$ is defined so that it satisfies the following conditions:

Assumption 1.1: Conditions on the function $\varphi(\Gamma, \theta_j)$

a) The distribution function of $Q(x)$ as defined by the expression in equation (1) is equal to what it would have been if $Q(x)$ had been determined on the basis of the data set $x$.

b) The only random variable upon which $\varphi(\Gamma, \theta_j)$ depends is the variable $\Gamma$.

4) Generate the data set $x$ by conditioning the sampling density or mass function $g(x \mid \theta_1, \theta_2, \ldots, \theta_k)$ on the already generated value for $Q(x)$ and the values of any ancillary complements of $Q(x)$.

Observe that Assumption 1.1 differs from the corresponding assumption in Bowater (2018a) due to the absence of a condition that is similar to condition (c) of Assumption 1.1 in this earlier paper.

In the context of the above algorithm, the variable $\Gamma$ will be referred to as a primary random variable (primary r.v.), which is consistent with how this term was used in Bowater (2017b), Bowater (2018a) and Bowater and Guzmán (2018b). To clarify, if this algorithm was rewritten so that the value $\gamma$ of the variable $\Gamma$ was generated by setting it equal to a deterministic function of an already generated value for $Q(x)$ and the parameter $\theta_j$, then $\Gamma$ would not be a primary r.v.
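
As a hedged illustration of the algorithm in Assumption 1, the following Python sketch runs its four steps for a normal sample with known standard deviation. It takes the fiducial statistic to be the sample mean and $\varphi(\Gamma, \mu) = \mu + (\sigma/\sqrt{n})\Gamma$ with $\Gamma \sim N(0, 1)$, which is the form this function takes in the normal example of Section 5. The function name `generate_normal_data` and its arguments are assumptions made for the illustration. Step 4 relies on the standard fact that, for normal data, the deviations from the sample mean are independent of the sample mean.

```python
import math
import random


def generate_normal_data(mu, sigma, n, rng):
    """Sketch of the data generating algorithm for N(mu, sigma^2) data, sigma known."""
    # Step 1: the sample mean is sufficient on its own here, so there are
    # no ancillary complements to simulate.
    # Step 2: draw the primary random variable; its density does not depend on mu.
    gamma = rng.gauss(0.0, 1.0)
    # Step 3: determine the fiducial statistic through Q(x) = phi(Gamma, mu).
    q = mu + (sigma / math.sqrt(n)) * gamma
    # Step 4: generate the data conditional on the sample mean being equal to q.
    z = [rng.gauss(0.0, sigma) for _ in range(n)]
    z_bar = sum(z) / n
    x = [q + (zi - z_bar) for zi in z]
    return gamma, q, x


rng = random.Random(1)  # arbitrary seed, for reproducibility only
gamma, q, x = generate_normal_data(mu=5.0, sigma=2.0, n=10, rng=rng)
```

By construction the sample mean of `x` equals `q`. Solving `q = mu + (sigma / sqrt(n)) * gamma` for `mu` at a given `gamma` is the inversion on which the fiducial arguments operate.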

Although the fiducial argument is usually considered to be a single argument, in this section we will clarify and develop the argument by breaking it down into three separate but related sub-arguments.

Definition 2(a): Strong or standard fiducial argument

This is the argument that the density function of the primary r.v. $\Gamma$ after the data have been observed, i.e. the post-data density function of $\Gamma$, should be equal to the pre-data density function of $\Gamma$, i.e. the density function $\pi_0(\gamma)$ as defined in step 2 of the algorithm in Assumption 1. In the case where nothing or very little was known about the parameter $\theta_j$ before the data were observed, justifications for this argument, without using Bayesian reasoning, were outlined in Section 3.1 of Bowater (2017b), Section 6 of Bowater (2018a) and Section 3.6 of Bowater and Guzmán (2018b), and therefore will not be repeated here.

Definition 2(b): Moderate fiducial argument

This type of fiducial argument will be assumed to be only applicable to cases where values of the primary r.v. $\Gamma$ that were possible before the data were observed, i.e. values in the set $\{\gamma : \pi_0(\gamma) > 0\}$, are made impossible by the act of observing the data. Under this condition, it is the argument that, over the set of values of $\Gamma$ that are still possible given the data, the relative height of the post-data density function of $\Gamma$ should be equal to the relative height of the pre-data density function of $\Gamma$.

It is an argument that can certainly be viewed as being less attractive than the strong fiducial argument as its use implies that our beliefs about $\Gamma$ will be modified by the data. Nevertheless, it will be made clear in Section 7.1 how this argument can be adequately justified without using Bayesian reasoning in an important class of cases.

Definition 2(c): Weak fiducial argument

This argument will be assumed to be only applicable to cases where the use of neither the strong nor the moderate fiducial argument is considered to be appropriate. It is the argument that, over the set of values of the primary r.v. $\Gamma$ that are possible given the data, the relative height of the post-data density function of $\Gamma$ should be equal to the relative height of the pre-data density function of $\Gamma$ multiplied by weights on the values of $\Gamma$ that are determined from a function over the parameter $\theta_j$ that was specified before the data were observed. The precise way in which these weights over the values of $\Gamma$ are formed will be defined in Section 3.4.

Similar to the strong and moderate fiducial arguments, this type of fiducial argument can be adequately justified without using Bayesian reasoning in many important cases. Such a justification and examples of the cases in question will be presented in Section 8.

In the theory of organic fiducial inference, it will be assumed that pre-data knowledge about the parameter $\theta_j$ is expressed through what will be called a global pre-data function and a local pre-data function for $\theta_j$, which have the following definitions.

Definition 3: Global pre-data (GPD) function

The global pre-data (GPD) function $\omega_G(\theta_j)$ is any given non-negative and locally integrable function over the space of the parameter $\theta_j$. It is a function that only needs to be specified up to a proportionality constant, in the sense that if it is multiplied by a positive constant, then the value of the constant is redundant. If $\omega_G(\theta_j) = 0$ for all $\theta_j \in A$ where $A$ is a given subset of the real line, then this implies that it was regarded as being impossible that $\theta_j \in A$ before the data $x$ were observed. Unlike a Bayesian prior density, it is not controversial to use a GPD function that is not globally integrable.

In many cases, the GPD function will have the following simple form:

$$\omega_G(\theta_j) = \begin{cases} 0 & \text{if } \theta_j \in A \\ b & \text{otherwise} \end{cases} \tag{2}$$

where the set $A$ may be empty and $b > 0$.

Definition 4: Local pre-data (LPD) function

The local pre-data (LPD) function $\omega_L(\theta_j)$ is a function of the parameter $\theta_j$ that has the same mathematical properties as the GPD function, i.e. it is a non-negative and locally integrable function over the space of $\theta_j$ that only needs to be specified up to a proportionality constant. Its role is to complete the definition of the joint post-data density function of the primary r.v. $\Gamma$ and the parameter $\theta_j$ in cases where using either the strong or moderate fiducial argument alone is not sufficient to achieve this. For this reason, the LPD function is in fact redundant in many situations. We describe this function as being ‘local’ because it is only used in the inferential process under the condition that $\gamma$ equals a specific value, and with this condition in place and given the data $x$, the parameter $\theta_j$ usually must lie in a compact set that is contained in a very small region of the real line. It will be seen that because of this, even if the LPD function is not redundant, its influence on the inferential process will usually be relatively minor.

Given the data $x$, the fiducial density function of the parameter $\theta_j$ conditional on the parameters in the set $\theta_{-j}$ being known, i.e. the density function $f(\theta_j \mid \theta_{-j}, x)$, will be defined according to the following two compatible principles.

Principle 1 for defining the fiducial density $f(\theta_j \mid \theta_{-j}, x)$

This principle requires that the following condition is satisfied.

Condition 1

Let $G_x$ and $H_x$ be the sets of all values of $\Gamma$ and $\theta_j$ respectively that are possible given the value of the fiducial statistic $q(x)$ and its ancillary complement, if it exists, that are calculated on the basis of the data $x$. In defining these sets, it is assumed that values of $\theta_j$ that were regarded as being impossible before the data were observed cannot be made possible by observing the data. Given this notation, the present condition is satisfied if, on substituting the variable $Q(x)$ in equation (1) by the value $q(x)$, this equation would define a bijective mapping between the set $G_x$ and the set $H_x$.

Under Condition 1, the fiducial density function $f(\theta_j \mid \theta_{-j}, x)$ is defined by setting $Q(x)$ equal to $q(x)$ in equation (1), and then treating the value $\theta_j$ as being a realisation of the random variable $\Theta_j$, to give the expression:

$$q(x) = \varphi(\Gamma, \Theta_j)$$

except that, instead of $\Gamma$ necessarily having the density function $\pi_0(\gamma)$ as defined in step 2 of the algorithm in Assumption 1, it will be assumed to have the following density function:

$$\pi_1(\gamma) = \begin{cases} c\, \omega_G(\theta_j(\gamma))\, \pi_0(\gamma) & \text{if } \gamma \in G_x \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

where $\theta_j(\gamma)$ is the value of $\theta_j$ that maps on to the value $\gamma$, the function $\omega_G(\theta_j(\gamma))$ is the GPD function as introduced by Definition 3, and $c$ is a normalising constant.

The function $\pi_1(\gamma)$ will be regarded as being the post-data density function of $\Gamma$. Also, in the definition of the weak fiducial argument, i.e. Definition 2(c), the function over $\theta_j$ that is used to determine the weights on the $\gamma$ values in the construction of this density function for $\Gamma$ will now be identified as being the GPD function.

Observe that if the GPD function is neutral, i.e. it has the form given in equation (2), then over the set $G_x$, the density $\pi_1(\gamma)$ will be equal to the pre-data density $\pi_0(\gamma)$ conditioned to lie in this set. For this type of GPD function, if

$$G_x = \{\gamma : \pi_0(\gamma) > 0\} \tag{4}$$

then clearly the procedure for making inferences about $\theta_j$ will depend on the strong fiducial argument, otherwise it will depend on the moderate fiducial argument. Alternatively, if the GPD function is not equal to a positive constant over the set $H_x$, then we can see that inferences about $\theta_j$ will be made by using the weak fiducial argument.

Furthermore notice that if, on substituting the variable $Q(x)$ by the value $q(x)$, equation (1) defines an injective mapping from the set $\{\gamma : \pi_0(\gamma) > 0\}$ to the space of the parameter $\theta_j$, then the GPD function $\omega_G(\theta_j)$ expresses in effect our pre-data beliefs about $\theta_j$ relative to what is implied by the strong fiducial argument. By doing so, it determines whether the strong, moderate or weak fiducial argument is used to make inferences about $\theta_j$, and also the way in which the latter two arguments influence the inferential process.

In this respect, under the same assumption concerning equation (1), it can be seen that if the pre-data density $\pi_0(\gamma)$ is a uniform density for $\Gamma$ over $(0, 1)$, and

$$d = \int_{\gamma \in D} \omega_G(\theta_j(\gamma))\, d\gamma \quad \text{and} \quad e = \int_{\gamma \in E} \omega_G(\theta_j(\gamma))\, d\gamma$$

where $D$ and $E$ are non-empty subsets of the interval $(0, 1)$ such that the events $\{\Gamma \in D\}$ and $\{\Gamma \in E\}$ are assigned the same probability by the density $\pi_0(\gamma)$, then assuming that $e$ is not zero, the probability of the event $\{\Gamma \in D\}$ will be $d/e$ times the probability of the event $\{\Gamma \in E\}$ after the data have been observed.

Finally, it should be noted that in the theory of subjective fiducial inference as outlined in Bowater (2018a), the density $\pi_1(\gamma)$ is effectively always defined to be equal to the density $\pi_0(\gamma)$, i.e. the only type of fiducial argument that this earlier theory relies on is the strong fiducial argument.

Principle 2 for defining the fiducial density $f(\theta_j \mid \theta_{-j}, x)$

This principle requires that the following two conditions are satisfied.

Condition 2(a)

Given the value $q(x)$ for the variable $Q(x)$, it is required that

$$H_x = \{\theta_j : (\exists\, \gamma \in G_x)[\theta_j \in \theta_j(\gamma)]\}$$

where $G_x$ and $H_x$ are as defined in Condition 1, and $\theta_j(\gamma)$ is the set of values of $\theta_j$ that map on to the value $\gamma$ according to equation (1).

Condition 2(b)

The GPD function $\omega_G(\theta_j)$ must be equal to a positive constant over the set $H_x$.

Under Conditions 2(a) and 2(b), the fiducial density function $f(\theta_j \mid \theta_{-j}, x)$ is defined by

$$f(\theta_j \mid \theta_{-j}, x) = \int_{\gamma \in G_x} \omega^*(\theta_j \mid \gamma)\, \pi_1(\gamma)\, d\gamma \tag{5}$$

where $\pi_1(\gamma)$ is as defined in equation (3), although $\omega_G(\theta_j(\gamma))$ will always be equal to a positive constant in this equation, and the conditional density function $\omega^*(\theta_j \mid \gamma)$ is defined by

$$\omega^*(\theta_j \mid \gamma) = \begin{cases} c(\gamma)\, \omega_L(\theta_j) & \text{if } \theta_j \in \theta_j(\gamma) \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

where $\omega_L(\theta_j)$ is the LPD function as introduced by Definition 4, and $c(\gamma)$ is a normalising constant, which clearly must depend on the value of $\gamma$.

It can be seen that the density function $f(\theta_j \mid \theta_{-j}, x)$ as defined by equation (5) is formed by marginalising, with respect to $\gamma$, a joint density of $\Gamma$ and $\theta_j$ that is based on $\omega^*(\theta_j \mid \gamma)$ being the conditional density of $\theta_j$ given $\gamma$, and on $\pi_1(\gamma)$ being the marginal density of $\Gamma$. Similar to what was the case under Principle 1, if the condition in equation (4) is satisfied, then the density $\pi_1(\gamma)$ will be equal to the density $\pi_0(\gamma)$, i.e. the density function $f(\theta_j \mid \theta_{-j}, x)$ is determined on the basis of the strong fiducial argument, otherwise it is determined on the basis of the moderate argument. However, in contrast to what was the case under Principle 1, the weak fiducial argument is never used to make inferences about $\theta_j$.

Also, we can observe that the density function $\omega^*(\theta_j \mid \gamma)$ defined in equation (6) is formed by normalising the LPD function after $\theta_j$ has been restricted to lie in the subset $\theta_j(\gamma)$. The role of the density $\omega^*(\theta_j \mid \gamma)$ is therefore to make use of the nature of the LPD function to distribute $\theta_j$ over those values of $\theta_j$ that are consistent with any given value of $\gamma$. For this reason, it is assumed that the LPD function $\omega_L(\theta_j)$ is chosen to reflect what we believe about $\theta_j$. In particular, these beliefs are assumed to be our pre-data rather than post-data beliefs about $\theta_j$, as otherwise it is evident that, in general, we would be guilty of making inferences about $\theta_j$ by using the data twice. As alluded to in Definition 4, the sets $\theta_j(\gamma)$ will in general be compact sets that are usually wholly contained within very small regions of the real line.

Furthermore, it is worth noting that if Condition 2(b) is satisfied, then Principle 1 is essentially a special case of Principle 2. This is because to apply Principle 1 it is required that Condition 1 holds, and if it does then, first, Condition 2(a) must hold, second, the density $\omega^*(\theta_j \mid \gamma)$ could be regarded as converting itself into a point mass function at the value $\theta_j(\gamma)$, and third, as a result of this, the joint density function of $\Gamma$ and $\theta_j$ in equation (5) effectively becomes a univariate density function. Therefore, the integration of this latter function with respect to $\gamma$ would be naturally regarded as being redundant.

As a final point, we need to acknowledge the fact that important cases exist in which neither Condition 1 is satisfied nor Conditions 2(a) and 2(b) are both satisfied. If Condition 2(a) does not hold, then we have a problem that could be described as ‘spillage’ due to the fact that the set $H_x$ will be a proper subset of $\{\theta_j : (\exists\, \gamma \in G_x)[\theta_j \in \theta_j(\gamma)]\}$, and therefore this latter set ‘spills out’ of the set $H_x$. How to deal with this problem of spillage will be returned to in Section 7.2, and how to deal with cases where Condition 2(b) does not hold will be discussed in Section 8.
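
To make Principle 1 concrete, the sketch below evaluates equation (3) on a grid for a case that is assumed purely for illustration: a normal mean $\mu$ with known $\sigma$, for which the fiducial statistic is the sample mean and $\bar{x} = \mu + (\sigma/\sqrt{n})\Gamma$ with $\Gamma \sim N(0, 1)$. The function names, the grid limits, the numerical inputs and the three GPD functions are hypothetical choices rather than ones taken from the paper, and the non-constant GPD function in particular is arbitrary.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def trapz(xs, ys):
    """Trapezoidal rule for ys evaluated at increasing xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def fiducial_density(xbar, sigma, n, gpd, lo=-8.0, hi=8.0, m=4001):
    """Grid approximation to the fiducial density of mu under Principle 1."""
    scale = sigma / math.sqrt(n)
    gammas = [lo + i * (hi - lo) / (m - 1) for i in range(m)]
    # theta_j(gamma): the value of mu that maps on to gamma.
    mus = [xbar - scale * g for g in gammas]
    # Equation (3) up to its normalising constant: GPD weight times pre-data density.
    weights = [gpd(mu) * normal_pdf(g) for mu, g in zip(mus, gammas)]
    total = trapz(gammas, weights)
    post_gamma = [w / total for w in weights]
    # Change of variable from gamma to mu: |d gamma / d mu| = 1 / scale.
    density = [p / scale for p in post_gamma]
    return mus[::-1], density[::-1]  # reversed so that mu is increasing


def neutral(mu):        # neutral GPD, A empty: strong fiducial argument
    return 1.0


def non_negative(mu):   # neutral GPD with A = (-inf, 0): moderate argument
    return 1.0 if mu >= 0.0 else 0.0


def peaked(mu):         # arbitrary non-constant GPD: weak argument
    return 1.0 / (1.0 + mu * mu)


# Hypothetical inputs: xbar = 0.3, sigma = 1, n = 4.
mus_s, dens_s = fiducial_density(0.3, 1.0, 4, neutral)
mus_m, dens_m = fiducial_density(0.3, 1.0, 4, non_negative)
mus_w, dens_w = fiducial_density(0.3, 1.0, 4, peaked)
```

With the neutral GPD function the result is simply the pre-data density of $\Gamma$ carried over to $\mu$. The restricted version removes the values of $\gamma$ that correspond to $\mu < 0$ and renormalises. The non-constant version reweights the remaining values of $\gamma$, which is the distinction drawn between the three types of fiducial argument.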

4. Multivariate organic fiducial inference

We will now consider the case where all the parameters $\theta_1, \theta_2, \ldots, \theta_k$ in the sampling model are unknown.

Definition 5: Joint fiducial density functions

Under the assumption that Principles 1 or 2, or any natural variations on these principles, can be used to define the full conditional fiducial densities

$$f(\theta_j \mid \theta_{-j}, x) \quad \text{for } j = 1, 2, \ldots, k \tag{7}$$

and that this set of conditional densities determines a joint density function for the parameters $\theta_1, \theta_2, \ldots, \theta_k$, this latter density function will be defined as being the joint fiducial density function of these parameters, and will be denoted as $f(\theta_1, \theta_2, \ldots, \theta_k \mid x)$. It can be easily shown that this density function will always be unique.

To corroborate that the set of full conditional densities in equation (7) actually determines a joint density function for the parameters concerned, the analytical or the computational method that were proposed for this purpose in Bowater (2018a) could be applied. These methods will now be briefly described.

An analytical method

Under the assumption that the set of full conditional densities in equation (7) can be expressed analytically, a way of establishing whether they determine a joint density function for $\theta_1, \theta_2, \ldots, \theta_k$ is simply to propose an analytic expression for such a density function, derive the full conditional densities of the proposed density function, and see if they match the full conditionals in equation (7).

Statement about incompatible full conditional densities

It is not acknowledged in the following subsection or in Section 6.3 that the stationary density of an ergodic Gibbs sampler is affected by the scanning order of the variables on which the sampler is based when the full conditional densities concerned are incompatible. These sections will be rewritten in due course to take this important issue into account. Nevertheless, doing so will not affect the relevance of the results that are currently presented in the example in the latter section.

A computational method

A more general method for establishing whether the full conditional densities in equation (7) determine a joint density function for the parameters concerned is based on attempting to generate random samples from this joint density by applying the Gibbs sampler (Geman and Geman 1984 and Gelfand and Smith 1990) to the full conditionals in question. Of course, the Gibbs sampler, assuming that it is irreducible and aperiodic, will only converge to a unique stationary density if the joint density $f(\theta_1, \theta_2, \ldots, \theta_k \mid x)$ actually exists (and the reverse is also true). For this reason, we now choose to redefine the problem as being one of trying to establish whether the Gibbs sampler converges to a unique stationary density on the basis of the observed behaviour of this sampler.

This may also seem to be a difficult problem to resolve. However, in a more conventional application of the Gibbs sampler, we are faced with the similar problem of whether the sampler converges to its unique stationary density in a reasonable amount of time, i.e. before a large pre-specified number of cycles of the sampler have been completed. This is the reason why a substantial number of techniques have been developed to assess whether Monte Carlo Markov chains, such as the Gibbs sampler, converge to their unique stationary densities within a given finite number of cycles, see for example Gelman and Rubin (1992) and Brooks and Roberts (1998).

Obviously, if there is the added complication that we are not completely sure that the Gibbs sampler has a unique stationary density, then it would seem appropriate that we use these convergence diagnostics more intensively. On the whole though, if in the context of having already taken into account how the full conditional densities in equation (7) were formed, the use of such diagnostics can give us a high degree of confidence that the Gibbs sampler has converged to a unique stationary density, then of course we should have a high degree of confidence that the joint fiducial density $f(\theta_1, \theta_2, \ldots, \theta_k \mid x)$ does indeed exist.

An important benefit of using the Gibbs sampling method that has just been described is that to calculate expectations of interest with respect to this joint fiducial density, we will often need to rely on simulation methods such as the Gibbs sampler. Therefore by using this Gibbs sampling method, two goals can be achieved simultaneously.
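
The computational method can be sketched as follows for the full conditional fiducial densities of a normal mean and variance that are derived in Section 5, namely $\mu \mid \sigma^2, x \sim N(\bar{x}, \sigma^2/n)$ and a scaled inverse $\chi^2$ density with $n$ degrees of freedom for $\sigma^2 \mid \mu, x$. This is a minimal illustration rather than the implementation used in the paper. The function names, the data, the starting values and the chain lengths are all assumptions, and the diagnostic is a simplified form of the potential scale reduction factor of Gelman and Rubin (1992) without their degrees-of-freedom correction.

```python
import math
import random


def gibbs_normal(x, n_iter, sigma2_init, rng):
    """Gibbs sampler over the full conditionals of (mu, sigma^2) for normal data."""
    n = len(x)
    xbar = sum(x) / n
    sigma2 = sigma2_init
    mu_draws, sigma2_draws = [], []
    for _ in range(n_iter):
        # mu | sigma^2, x ~ N(xbar, sigma^2 / n)
        mu = rng.gauss(xbar, math.sqrt(sigma2 / n))
        # sigma^2 | mu, x: scaled inverse chi-square with n degrees of freedom
        # and scale sigma_hat^2, i.e. sum((x_i - mu)^2) divided by a chi2_n draw.
        ss = sum((xi - mu) ** 2 for xi in x)
        chi2 = rng.gammavariate(n / 2.0, 2.0)  # chi-square with n d.o.f.
        sigma2 = ss / chi2
        mu_draws.append(mu)
        sigma2_draws.append(sigma2)
    return mu_draws, sigma2_draws


def scale_reduction(chains):
    """Simplified potential scale reduction factor for equal-length chains."""
    m, length = len(chains), len(chains[0])
    means = [sum(c) / length for c in chains]
    grand = sum(means) / m
    between = length / (m - 1) * sum((mj - grand) ** 2 for mj in means)
    within = sum(sum((v - mj) ** 2 for v in c) / (length - 1)
                 for c, mj in zip(chains, means)) / m
    pooled = (length - 1) / length * within + between / length
    return math.sqrt(pooled / within)


data = [4.9, 5.3, 6.1, 4.4, 5.8, 5.0, 5.6, 4.7, 5.2, 6.0]  # made-up sample
rng = random.Random(2019)  # arbitrary seed
burn_in = 500
mu_chains = []
for start in (0.01, 1.0, 100.0, 10000.0):  # deliberately dispersed starting values
    mu_draws, _ = gibbs_normal(data, 5000, start, rng)
    mu_chains.append(mu_draws[burn_in:])
r_hat = scale_reduction(mu_chains)
```

A value of `r_hat` close to 1 across dispersed starting values is consistent with convergence to a single stationary density, although it cannot prove it. The retained draws can then be reused to estimate expectations with respect to the joint fiducial density, which is the second of the two goals mentioned above.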

5. An example with continuous data and little pre-data knowledge

We will now apply the methodology put forward in the previous sections to some exam-ples. To begin with, let us consider the standard problem of making inferences about themean µ of a normal density function, when its variance σ is unknown, on the basis of asample x of size n , i.e. x = ( x , x , . . . , x n ), drawn from the density function concerned.Although the way in which the theory of subjective ﬁducial inference can be used tosolve this problem was detailed in Bowater (2018a), let us quickly place this problem inthe context of the type of inference that is the subject of the present paper, i.e. organicﬁducial inference. 16f σ is known, a suﬃcient statistic for µ is the sample mean ¯ x , which therefore canbe assumed to be the ﬁducial statistic Q ( x ). Based on this assumption, equation (1) canbe expressed as ¯ x = ϕ (Γ , µ ) = µ + ( σ/ √ n )Γ (8)where Γ ∼ N (0 , µ before the data x were observed, it is quite natural to specify the GPD function for µ as follows: ω G ( µ ) = a , µ ∈ ( −∞ , ∞ ), where a >

0. Furthermore, since equation (8)will always satisfy Condition 1, the ﬁducial density f ( µ | σ , x ) can be always determinedby Principle 1. In particular, as the GPD function is neutral, and the condition inequation (4) will be satisﬁed, the ﬁducial density in question is derived under this principleby applying the strong ﬁducial argument. As a result, it can be easily shown that thisﬁducial density is deﬁned by µ | σ , x ∼ N (¯ x, σ /n )On the other hand, if µ is known, a suﬃcient statistic for σ is ˆ σ = (1 /n ) (cid:80) ni =1 ( x i − µ ) which will be assumed to be Q ( x ). Based on this assumption, equation (1) can beexpressed as ˆ σ = ϕ (Γ , σ ) = ( σ /n )Γwhere Γ ∼ χ n . Under the assumption of no or very little pre-data knowledge about σ ,it is quite natural to specify the GPD function for σ as follows: ω G ( σ ) = b if σ ≥ b >

0. Furthermore, we can see that Principle 1 will be againalways applicable, and as the GPD function is neutral and the condition in equation (4)will be satisﬁed, the ﬁducial density f ( σ | µ, x ) is derived under this principle by againcalling on the strong ﬁducial argument. As a result, it can be easily shown that thisﬁducial density is a scaled inverse χ density function with n degrees of freedom andscaling parameter equal to ˆ σ . 17inally, by using the analytical method outlined in Section 4, it can be easily estab-lished that the conditional density functions f ( µ | σ , x ) and f ( σ | µ, x ) that have justbeen deﬁned determine a joint ﬁducial density for µ and σ , and by integrating overthis joint density function, it can be deduced that the marginal ﬁducial density for µ isdeﬁned by µ | x ∼ t n − (¯ x, s/ √ n ) (9)where s is the sample standard deviation, i.e. it is the well-used non-standardised Student t density function with n − x and scalingparameter equal to s/ √ n .The full conditional ﬁducial densities for many other problems of inference are natu-rally obtained in a similar way, i.e. under Principle 1, with a neutral GPD function andapplying the strong ﬁducial argument. For example, the full conditional ﬁducial densitiesthat were put forward in all the applications of subjective ﬁducial inference that werediscussed in Bowater (2018a) can be derived either exactly or approximately under thesame assumptions.Let us now turn to the issue of how to interpret the joint ﬁducial density functions f ( θ , θ , . . . , θ k | x ) that can be derived under these assumptions in terms of the frameworkof generalised subjective probability, i.e. the deﬁnition of probability outlined in Bowaterand Guzm´an (2018b). 
In accordance with what was explained back in Section 2, to complete the definition of any fiducial or posterior distribution within this framework, we require both the distribution function of the variables concerned and an assessment of the external strength of this function relative to other distribution functions of interest.

With regard to the main example of the present section, a detailed evaluation of the external strength of the fiducial distribution function of $\mu$ given $\sigma^2$, i.e. $F(\mu \,|\, \sigma^2, x)$, was provided in Bowater and Guzmán (2018b). In particular, it was shown how it can be argued that if the compound events $R(\lambda)$ in the reference set $R$ (using the notation of this earlier paper) are made up of the outcomes of a well-understood physical experiment, e.g. the positions of a wheel after it has been spun, then, for any resolution $\lambda \in [0.05, 0.95]$, the relative external strength of the fiducial distribution function $F(\mu \,|\, \sigma^2, x)$ should be judged as being at a level that is close to the highest attainable level. On the basis of the arguments presented in Bowater (2018a) and Bowater and Guzmán (2018b), the same conclusion can be reached about the relative external strength of the fiducial distribution function of $\sigma^2$ given $\mu$, i.e. $F(\sigma^2 \,|\, \mu, x)$.

Since the joint fiducial distribution function of $\mu$ and $\sigma^2$ is fully defined by two distribution functions, namely $F(\mu \,|\, \sigma^2, x)$ and $F(\sigma^2 \,|\, \mu, x)$, that, under the assumptions that have been made about the reference set $R$ and the resolution $\lambda$, can both be argued as being externally very strong, then under the same assumptions, it can be argued that this joint distribution function should also be regarded as being externally very strong. In loose terms, this means that the joint distribution of $\mu$ and $\sigma^2$ in question should be regarded as being close in nature to the kind of probability distribution that would be placed over the outcomes of the physical experiment on which the reference set $R$ is based.
By generalising the same line of reasoning (see Bowater 2018a for clarification), similar conclusions can be reached about the relative external strengths of the joint fiducial distribution functions $F(\theta_1, \theta_2, \ldots, \theta_k \,|\, x)$ that can be derived for other problems that satisfy the criteria of the cases that have been considered in the present section, e.g. the problems discussed in Bowater (2018a).
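
As an informal numerical check on the normal example of this section, the two full conditional fiducial densities, i.e. the normal density for $\mu$ given $\sigma^2$ and the scaled inverse $\chi^2$ density for $\sigma^2$ given $\mu$, can be sampled alternately in a Gibbs scheme, and the resulting draws of $\mu$ compared with the Student $t$ density in equation (9). The following is only a minimal sketch under stated assumptions: the function name `gibbs_normal_fiducial`, the burn-in length and the seed are illustrative choices that do not come from the paper, and only the Python standard library is used.

```python
import math
import random
import statistics


def gibbs_normal_fiducial(x, n_draws, burn_in=500, seed=0):
    """Alternately sample mu | sigma^2, x and sigma^2 | mu, x.

    mu | sigma^2, x   ~ N(xbar, sigma^2 / n)
    sigma^2 | mu, x   = n * sigma2_hat / Gamma, with Gamma ~ chi-square(n)
    (i.e. a scaled inverse chi-square density with n degrees of freedom).
    Returns a list of (mu, sigma2) pairs after discarding the burn-in.
    """
    rng = random.Random(seed)
    n = len(x)
    xbar = statistics.fmean(x)
    mu = xbar  # illustrative starting point
    draws = []
    for it in range(burn_in + n_draws):
        # Sufficient statistic for sigma^2 when mu is treated as known.
        sigma2_hat = sum((xi - mu) ** 2 for xi in x) / n
        # A chi-square(n) variate is a gamma variate with shape n/2, scale 2.
        chi2 = rng.gammavariate(n / 2.0, 2.0)
        sigma2 = n * sigma2_hat / chi2
        # Normal draw for mu given the current value of sigma^2.
        mu = rng.gauss(xbar, math.sqrt(sigma2 / n))
        if it >= burn_in:
            draws.append((mu, sigma2))
    return draws
```

If the marginal density of $\mu$ is as given in equation (9), then for $n > 3$ the draws of $\mu$ should have a mean close to $\bar{x}$ and a variance close to $(s^2/n)(n-1)/(n-3)$, which is the variance of the non-standardised Student $t$ density concerned.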

6. Examples with discrete data and little pre-data knowledge

In this section, organic fiducial inference will be applied to examples in which the data $x$ are discrete, and where nothing or very little was known about the model parameters before the data were observed.

6.1. Inference about a binomial proportion

First, let us consider the problem of making inferences about the population proportion of successes $p$ on the basis of observing $x$ successes in $n$ trials, where the probability of observing any given number of successes $y$ follows the usual definition of the binomial mass function as specified by:

$$g(y \,|\, p) = \binom{n}{y} p^{y} (1 - p)^{n - y} \quad \text{for } y = 0, 1, \ldots, n$$

As the value $x$ is clearly a sufficient statistic for $p$, it will be assumed to be the fiducial statistic $Q(x)$. Based on this assumption, equation (1) can be expressed as

$$x = \varphi(\Gamma, p) = \min \Big\{ z : \Gamma < \sum_{y=0}^{z} g(y \,|\, p) \Big\} \qquad (10)$$

where $\Gamma \sim U(0, 1)$. Under the assumption of no or very little pre-data knowledge about $p$, it is again quite natural that the GPD function has the following form: $\omega_G(p) = a$ if $0 \leq p \leq 1$, where $a > 0$. This time, though, since equation (10) will never satisfy Condition 1 for any choice of the GPD function and for any value of $x$, we can never apply Principle 1. On the other hand, this equation together with the specified GPD function will satisfy Condition 2(a) for all possible values of $x$, and since Condition 2(b) will also hold for all $x$, Principle 2 can always be applied. Furthermore, as the condition in equation (4) will also be satisfied, inferences will be made about $p$ under this principle by using the strong fiducial argument.

As a result, by placing the present case in the context of the general definition of the fiducial density $f(\theta_j \,|\, \theta_{-j}, x)$ given in equations (5) and (6), we obtain the following expression for the fiducial density $f(p \,|\, x)$:

$$f(p \,|\, x) = \int_0^1 \omega^{*}(p \,|\, \gamma) \, d\gamma \qquad (11)$$

where

$$\omega^{*}(p \,|\, \gamma) = \begin{cases} c(\gamma)\, \omega_L(p) & \text{if } p \in p(\gamma) \\ 0 & \text{otherwise} \end{cases} \qquad (12)$$

Of course, to be able to complete this definition, an LPD function $\omega_L(p)$ needs to be specified. Observe that any choice for this function that satisfies the very loose requirements of Definition 4 will lead to a fiducial density $f(p \,|\, x)$ that is valid for any $n \geq 1$ and any $x = 0, 1, \ldots, n$. Nevertheless, to provide two practical examples, we will choose to highlight the two LPD functions that are defined by

$$\omega_L(p) = b \quad \text{if } 0 \leq p \leq 1 \qquad (13)$$

where $b > 0$, and by

$$\omega_L(p) = 1 / \sqrt{p(1 - p)} \quad \text{if } 0 \leq p \leq 1 \qquad (14)$$

It will generally not be easy to obtain a closed-form expression for the fiducial density $f(p \,|\, x)$ for any given value of $x$. However, drawing random values from this density function will be generally fairly straightforward.

In this respect, the histograms in Figures 1(a) and 1(b) were each formed on the basis of one million independent random values drawn from the density function $f(p \,|\, x)$, with $n$ being equal to 10 and the observed $x$ being equal to 1. The results in Figure 1(a) depend on choosing the LPD function to be the one given in equation (13), while the results in Figure 1(b) depend on this function being as defined in equation (14).
The dashed-line curves in these figures represent the posterior density for $p$ that corresponds to the prior density for $p$ being uniform on $(0, 1)$, while the solid-line curves represent the posterior density for $p$ that corresponds to the prior density for $p$ being the Jeffreys prior for the case in question, i.e. the density function that is proportional to the function for $p$ in equation (14).

It can be seen from these figures that, although the posterior density for $p$ is highly sensitive to which of the two prior densities is used, the fiducial density for $p$ barely moves depending on whether the LPD function is proportional to the uniform prior, or whether it is proportional to the Jeffreys prior for this case. Moreover, we can observe that the two fiducial densities for $p$ in question both closely approximate the posterior density for $p$ that is based on this Jeffreys prior.

Figure 1: Samples from the organic fiducial density of a binomial proportion. Panels (a) and (b): density against $p$.

Similar to the previous section, let us now turn to the issue of how to interpret the fiducial density $f(p \,|\, x)$ in terms of the framework of generalised subjective probability. It will be assumed that the reference set $R$ and the range of the resolution $\lambda$ are as defined in this earlier section.

To begin with, on the basis of the lines of reasoning presented in Bowater (2018a) and Bowater and Guzmán (2018b), it can be argued that the relative external strength of the distribution function that corresponds to the post-data density of the primary r.v. $\Gamma$, i.e. the uniform density function $\pi_1(\gamma)$, should be judged as being at a level that is close to the highest attainable level, which loosely means that arguably this density should be an extremely good representation of our post-data beliefs about $\Gamma$. On the other hand, given that it is being assumed that we have no or very little pre-data knowledge about $p$, it will not be easy to find an LPD function $\omega_L(p)$ that adequately represents our pre-data beliefs about $p$.
Therefore, it would be expected that, similar to any prior distribution function that could be chosen for $p$ in this type of situation, the distribution functions that correspond to the conditional densities $\omega^{*}(p \,|\, \gamma)$ defined in equation (12) would be judged as being externally quite weak.

Nevertheless, since these latter distribution functions are defined over intervals for $p$ that will be generally much shorter than the interval for $p$ over which the prior distribution function for $p$ must be defined, i.e. the interval $(0, 1)$, they would be expected to be less externally weak than a prior distribution function for $p$. Moreover, since in cases where $n$ is not very small and $x$ is not equal to 0 or $n$, the role of the LPD function $\omega_L(p)$ could be described as being heavily subordinate to the role of the density $\pi_1(\gamma)$ in the construction of the joint density of $p$ and $\gamma$ in equation (11), it can be argued that, in these cases, the distribution function that corresponds to the fiducial density $f(p \,|\, x)$ should be regarded as being externally very strong. In loose terms, this means that the fiducial probability of $p$ lying in any given interval of moderate width should be regarded as being close in nature to the probabilities of the events contained in the reference set $R$.

By contrast, since the posterior density for $p$ is effectively obtained through Bayes' theorem by simply reweighting the prior density for $p$, that is, by normalising the density function that results from multiplying this prior density function by the likelihood function, it would seem difficult to use a form of reasoning that is compatible with the Bayesian paradigm to argue that the relative external strength of the posterior distribution function for $p$ should not be heavily dependent on the relative external strength of the prior distribution function for $p$, which as already mentioned would be expected to be externally quite weak.
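
To make the remark about drawing random values from $f(p \,|\, x)$ more concrete, one possible sampling scheme is sketched here. It rests on assumptions that should be stated plainly: that $c(\gamma)$ in equation (12) simply normalises $\omega_L(p)$ over the set $p(\gamma)$, and that $p(\gamma)$ is the interval of values of $p$ for which equation (10) returns the observed $x$ when $\Gamma = \gamma$. Under these assumptions, a value $\gamma$ is drawn from the uniform density on $(0, 1)$, the end points of $p(\gamma)$ are found by bisection, and $p$ is then drawn from the LPD function restricted to this interval. The function names and the bisection settings are illustrative only.

```python
import math
import random


def binom_cdf(z, n, p):
    """Binomial distribution function: sum of g(y | p) for y = 0, ..., z."""
    if z < 0:
        return 0.0
    if z >= n:
        return 1.0
    return sum(math.comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(z + 1))


def solve_p(z, n, gamma, iters=60):
    """Bisection for the p at which binom_cdf(z, n, p) equals gamma.

    For z < n the distribution function is decreasing in p, so the root
    lies to the right of any p at which the function still exceeds gamma.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binom_cdf(z, n, mid) > gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_interval(x, n, gamma):
    """Interval of p for which equation (10) returns x when Gamma = gamma,
    i.e. binom_cdf(x - 1, n, p) <= gamma < binom_cdf(x, n, p)."""
    p_lo = 0.0 if x == 0 else solve_p(x - 1, n, gamma)
    p_hi = 1.0 if x == n else solve_p(x, n, gamma)
    return p_lo, p_hi


def draw_fiducial_p(x, n, rng, lpd="uniform"):
    """One draw of p: gamma ~ U(0, 1), then p from the LPD function
    normalised over the interval p(gamma)."""
    gamma = rng.random()
    p_lo, p_hi = p_interval(x, n, gamma)
    u = rng.random()
    if lpd == "uniform":  # LPD function of equation (13)
        return p_lo + u * (p_hi - p_lo)
    if lpd == "jeffreys":  # LPD function of equation (14)
        # The integral of 1/sqrt(p(1-p)) is 2*asin(sqrt(p)), so asin(sqrt(p))
        # is uniformly distributed over the transformed interval.
        a, b = math.asin(math.sqrt(p_lo)), math.asin(math.sqrt(p_hi))
        return math.sin(a + u * (b - a)) ** 2
    raise ValueError("unknown LPD function")
```

With $n = 10$ and $x = 1$, repeated calls such as `draw_fiducial_p(1, 10, random.Random(0))` would give values of the kind summarised by a histogram like the one described for Figure 1(a), although the sketch is not claimed to be the procedure that was actually used to produce that figure.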
We will now consider the problem of making inferences about an unknown event rate $\tau$ on the basis of observing $x$ events over a time period of length $t$, where the probability of observing any given number of events $y$ over a period of this length follows the usual definition of the Poisson mass function as specified by:

$$g(y \,|\, \tau) = (\tau^{y} / y!) \exp(-\tau) \quad \text{for } y = 0, 1, 2, \ldots$$

Again, since the data set to be analysed consists of a single value $x$, this value will be assumed to be the fiducial statistic $Q(x)$. Based on this assumption, equation (1) can be expressed in a way that is similar to equation (10), i.e.

$$x = \varphi(\Gamma, \tau) = \min \Big\{ z : \Gamma < \sum_{y=0}^{z} g(y \,|\, \tau) \Big\} \qquad (15)$$

where $\Gamma \sim U(0, 1)$. Under the assumption of no or very little pre-data knowledge about $\tau$, the GPD function will again be specified in the following way: $\omega_G(\tau) = a$ for $\tau > 0$, where $a > 0$. Similar also to the previous problem, the nature of equation (15) means that Principle 1 can never be applied for any choice of the GPD function, but the particular choice that has been made for this latter function means that Principle 2 can always be applied, and in particular, inferences will be made about $\tau$ under this principle by using the strong fiducial argument.

As a result, the expressions that define the fiducial density $f(\tau \,|\, x)$ are identical to the expressions in equations (11) and (12) except that the proportion $p$ is replaced by the event rate $\tau$. Although any choice for the LPD function $\omega_L(\tau)$ that conforms to Definition 4 will imply that this fiducial density is valid for any $x = 0, 1, 2, \ldots$, let us choose to highlight the consequences of using the two LPD functions that are defined by

$$\omega_L(\tau) = b \quad \text{if } \tau > 0 \qquad (16)$$

where $b > 0$, and by

$$\omega_L(\tau) = 1 / \sqrt{\tau} \quad \text{if } \tau > 0 \qquad (17)$$

The histograms in Figures 2(a) and 2(b) each summarise a sample of random values drawn from the density function $f(\tau \,|\, x)$, under the assumption that two events were observed over a given period of length $t$, i.e. $x = 2$, with the LPD functions that underlie the results in these two figures being defined by equation (16) and by equation (17) respectively. In these figures, the dashed-line curves represent the posterior density for $\tau$ that corresponds to the prior density for $\tau$ being the function for $\tau$ in equation (16), while the solid-line curves represent this posterior density when the prior density for $\tau$ is the function for $\tau$ in equation (17), i.e. the Jeffreys prior for the case in question. Observe that the use of these two prior densities for $\tau$ is controversial as they are both improper.

Figure 2: Samples from the organic fiducial density of a Poisson event rate. Panels (a) and (b): density against $\tau$.

It is evident that there is almost no difference between the two histograms in Figures 2(a) and 2(b), and as was the case for the histograms in Figures 1(a) and 1(b), they are both closely approximated by the posterior density that is based on the Jeffreys prior for the problem of interest.
Furthermore, using a very similar line of reasoning to the one that was used in Section 6.1 to argue that, under certain assumptions, the distribution function that corresponds to the fiducial density $f(p \,|\, x)$ should be regarded as being externally very strong, it can also be argued, under the same assumptions about the set $R$ and the resolution $\lambda$, that if $x > 0$, the distribution function that corresponds to the fiducial density of current interest, i.e. $f(\tau \,|\, x)$, should also be regarded as being externally very strong.

To conclude this section, let us consider the problem of making inferences about the population proportions $p = (p_1, p_2, \ldots, p_{k+1})'$ of all the $k + 1$ outcomes of an experiment, where $p_i$ is the proportion of times outcome $i$ is generated by the experiment, based on observing any given sample of counts $x = (x_1, x_2, \ldots, x_{k+1})'$ of these outcomes, where $x_i$ is the number of times outcome $i$ is observed, and the probability of observing this sample follows the usual definition of the multinomial mass function as specified by:

$$g(x \,|\, p) = \frac{n!}{x_1! \, x_2! \cdots x_{k+1}!} \prod_{i=1}^{k+1} p_i^{x_i} \quad \text{for } x_1, x_2, \ldots, x_{k+1} \in \mathbb{Z}_{\geq 0}, \text{ where } n = \sum_{i=1}^{k+1} x_i$$

Given that $p_{k+1} = 1 - \sum_{i=1}^{k} p_i$, let us define the complete set of model parameters as being the set $\{p_1, p_2, \ldots, p_k\}$. Now, if it is assumed that all the proportions in this set are known except $p_j$, a set of sufficient statistics for $p_j$ would be $\{x_j, x_j + x_{k+1}\}$. However, $x_j + x_{k+1}$ is an ancillary statistic, and therefore according to Definition 1, it can be assumed that $x_j$ is the fiducial statistic $Q(x)$. Under this assumption, and taking into account that the quantity $p_j + p_{k+1}$ is known, it is convenient to express the definition of the conditional fiducial density $f(p_j \,|\, p_{-j}, x)$, where $p_{-j} = \{p_1, \ldots, p_{j-1}, p_{j+1}, \ldots, p_k\}$, in terms of the fiducial density $f(r_j \,|\, p_{-j}, x)$, where $r_j = p_j / (p_j + p_{k+1})$. This is because the definition of this latter fiducial density is equivalent to the definition of the fiducial density $f(p \,|\, x)$ in equations (11) and (12) except that $p$, $x$ and $n$ in this earlier definition are substituted by $r_j$, $x_j$ and $x_j + x_{k+1}$ respectively.

In this way, the set of full conditional fiducial densities for this problem can be determined, i.e.
the set

$$f(p_j \,|\, p_{-j}, x) \quad \text{for } j = 1, 2, \ldots, k \qquad (18)$$

On the basis of having done this, the histograms in Figures 3(a)-(d) summarise a sample of three million realisations of all the parameters of a multinomial distribution function with $k = 4$ that was obtained by excluding an initial burn-in sample of 500 of such random vectors from one run of a Gibbs sampler applied to this set of full conditional densities. The sample of counts $x$ was (0, , , , )$'$, and to complete the definition of these conditional fiducial densities, the LPD functions concerned, i.e. $\{\omega_L(p_j) : j = 1, 2, 3, 4\}$, were all chosen to have the form of the LPD function given in equation (13). The Gibbs sampler in question was also run various times more from different starting points, and the results provided no evidence to suggest that the sampler was failing to converge to a unique stationary density function. Therefore, it would seem reasonably safe to assume that the full conditional densities in equation (18) determine a joint fiducial density for the parameters concerned, and that we have succeeded in generating a series of random vectors from this density function.

The solid-line curves in Figures 3(a)-(d) represent the marginal posterior densities for each of the parameters $p_1$, $p_2$, $p_3$ and $p_4$ respectively when the joint prior density for these parameters is the Jeffreys prior for the case in question, i.e. a symmetric Dirichlet density with concentration parameter $\alpha$ equal to 0.5. On the other hand, the long-dashed and short-dashed curves in these figures represent these marginal posterior densities when the joint prior density concerned is, respectively, a uniform density and the Perks prior density, i.e. a symmetric Dirichlet density with $\alpha$ equal to $1/(k + 1)$. For any given value of $k$, the use of the uniform prior density was advocated for example by Tuyl (2017), while the use of the Perks prior density was advocated for example by Berger et al.
(2015).

It can be seen that the histograms for the proportions $p_2$, $p_3$ and $p_4$ in Figures 3(b)-(d) are closely approximated by the marginal posterior densities corresponding to each of these parameters when the joint prior density is the Jeffreys prior for this case, whereas the histogram for the proportion $p_1$ in Figure 3(a) is only loosely approximated by the marginal posterior density for $p_1$ derived on the basis of this prior density. Also, the covariances between all the proportions $p = (p_1, \ldots, p_5)'$, except those involving the parameter $p_1$, were found to be very similar between the joint fiducial density and the joint posterior density in question. Furthermore, additional simulations showed that the joint fiducial density in this example was not very sensitive to the choice of the LPD functions concerned, i.e. $\{\omega_L(p_j) : j = 1, 2, 3, 4\}$.

Figure 3: A sample from a joint organic fiducial density of multinomial proportions obtained using the Gibbs sampler. Panels (a)-(d): density against $p_1$, $p_2$, $p_3$ and $p_4$ respectively.

Before proceeding, let us assume that the reference set $R$ and the resolution $\lambda$ are defined as in previous sections. Now, given the natural relationship that exists between any of the full conditional densities in equation (18) and the fiducial density for a binomial proportion defined in equations (11) and (12), a similar line of reasoning to the one outlined in Section 6.1 can be used to argue that the distribution functions that correspond to the densities in equation (18) should all be regarded as being externally very strong provided that $x_j > 0$, $x_{k+1} > 0$ and $x_j + x_{k+1}$ is not very small for all values of $j$.
The first of these conditions of course does not hold for $j = 1$ in the example that has been highlighted, since $x_1 = 0$, but this example was not chosen to represent the most ideal scenario. Furthermore, since the joint fiducial distribution function of all the proportions $p$ is fully defined by the full conditional densities in equation (18), a similar line of reasoning to the one mentioned in Section 5 can be used to argue that this joint distribution function should also be regarded as being externally very strong provided that the aforementioned conditions on the counts $x$ hold, and the total count $n$ is not very small relative to the number of proportions $k + 1$.

Finally, it needs to be taken into account that the joint fiducial distribution function in question is potentially sensitive to which of the population proportions is defined to be the proportion $p_{k+1}$. However, extensive simulations that were conducted showed that the effect of this choice of parameterisation was generally negligible, and was only found to be slightly more than negligible in certain cases where the total count $n$ was less than the number of proportions $k + 1$. Moreover, this issue can be easily resolved by always applying the criterion of designating the proportion $p_{k+1}$ so that its corresponding count $x_{k+1}$ is the highest or equal highest out of all the counts $x$. As the count $x_{k+1}$ is always one of the two counts that are used to form each of the full conditional fiducial densities in equation (18), this criterion is justifiable from a statistical viewpoint, and it also guarantees that the case is avoided where the count $x_{k+1} = 0$ and at least one of the remaining counts equals zero, which would imply that at least one of these conditional fiducial densities is undefined.
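
The Gibbs scheme over the full conditional densities in equation (18) can be sketched along the following lines. This is only an illustration under stated assumptions: each conditional draw of $r_j = p_j / (p_j + p_{k+1})$ is made as a binomial fiducial draw with $x_j$ successes in $x_j + x_{k+1}$ trials and the uniform LPD function of equation (13), with $c(\gamma)$ taken to normalise this function over the interval of values consistent with equation (10). The function names, the starting vector and the counts used in any call are hypothetical, and the counts of the example in the text are not reproduced.

```python
import math
import random


def binom_cdf(z, n, p):
    """Binomial distribution function at z for n trials and proportion p."""
    if z < 0:
        return 0.0
    if z >= n:
        return 1.0
    return sum(math.comb(n, y) * p ** y * (1 - p) ** (n - y) for y in range(z + 1))


def solve_p(z, n, gamma, iters=50):
    """Bisection for the p at which binom_cdf(z, n, p) equals gamma."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binom_cdf(z, n, mid) > gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_binomial_fiducial(x, n, rng):
    """One fiducial draw for a binomial proportion with a uniform LPD function."""
    gamma = rng.random()
    lo = 0.0 if x == 0 else solve_p(x - 1, n, gamma)
    hi = 1.0 if x == n else solve_p(x, n, gamma)
    return lo + rng.random() * (hi - lo)


def gibbs_multinomial_fiducial(counts, n_draws, burn_in=500, seed=0):
    """Gibbs sampler over the proportions (p_1, ..., p_{k+1}).

    counts[-1] plays the role of x_{k+1}; each update redraws
    r_j = p_j / (p_j + p_{k+1}) with p_j + p_{k+1} held fixed.
    """
    k = len(counts) - 1
    x_last = counts[k]
    if any(counts[j] + x_last == 0 for j in range(k)):
        raise ValueError("a full conditional density is undefined")
    rng = random.Random(seed)
    p = [1.0 / (k + 1)] * (k + 1)  # illustrative starting vector
    out = []
    for it in range(burn_in + n_draws):
        for j in range(k):
            total = p[j] + p[k]  # known once all other proportions are fixed
            r = draw_binomial_fiducial(counts[j], counts[j] + x_last, rng)
            p[j] = r * total
            p[k] = total - p[j]
        if it >= burn_in:
            out.append(tuple(p))
    return out
```

In line with the criterion described above, the count placed last in `counts` would normally be the highest or equal highest count, which also ensures that the check at the start of the sampler does not fail unless all the counts are zero.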
7. Examples with restricted parameter spaces

Let us now turn our attention to examples of the application of organic fiducial inference in which it was known, before the data were observed, that values in a given subset of the natural space of the model parameters were impossible, but apart from this, nothing or very little was known about these parameters. In relation to this issue, the importance of the need to make inferences about a normal mean $\mu$ when there is a lower bound on $\mu$, and about a Poisson rate parameter $\tau$ when there is a positive lower bound on $\tau$, has been underlined by practical examples from the field of quantum physics that are described, for example, in Mandelkern (2002). These examples motivate what will be examined in the present section.

With regard to the example considered in Section 5, let us change what is assumed to have been known about the mean $\mu$ before the data were observed to the assumption that, for any given value of the variance $\sigma^2$, it was known that $\mu > \mu_0$, where $\mu_0$ is a given finite constant, but apart from this, nothing or very little was known about $\mu$. In this situation, it is quite natural to specify the GPD function for $\mu$ as follows:

$$\omega_G(\mu) = \begin{cases} a & \text{if } \mu > \mu_0 \\ 0 & \text{if } \mu \leq \mu_0 \end{cases}$$

where $a > 0$. Although, as was the case in Section 5, this GPD function is neutral, this time the condition in equation (4) will never hold, and therefore the fiducial density $f(\mu \,|\, \sigma^2, x)$ is derived under Principle 1 by using the moderate rather than the strong fiducial argument. The consequence of this in terms of the definition of the marginal fiducial density for $\mu$ is that this density function becomes simply the marginal density function for $\mu$ defined in equation (9) conditioned to lie in the interval $(\mu_0, \infty)$. However, it is of interest to examine the potential effect on the relative external strength of this marginal density function due to the use of the moderate rather than the strong fiducial argument in constructing the conditional density $f(\mu \,|\, \sigma^2, x)$.

In this regard, let us remember that in the definition of the function $\varphi(\Gamma, \mu)$ in equation (8) it was assumed that the pre-data density function of the primary r.v. $\Gamma$, i.e. the function $\pi_0(\gamma)$, is a standard normal density function. Now, on observing the sample mean $\bar{x}$, we immediately know that the value $\gamma$ generated in step 2 of the algorithm in Assumption 1 must be less than the value $\gamma_0 = (\sqrt{n}/\sigma)(\bar{x} - \mu_0)$.

The moderate fiducial argument in this situation, i.e. the argument that the relative height of the post-data density function of $\Gamma$, i.e. the function $\pi_1(\gamma)$, in the interval $(-\infty, \gamma_0)$ should be equal to the relative height of the pre-data density function $\pi_0(\gamma)$ over this interval, is similar (but not identical) to the Bayesian argument that the relative height of a density function for a fixed parameter $\theta$ should not be affected by learning that a given subset of values for $\theta$ are impossible, apart from it of course becoming equal to zero over this subset.
Although this type of Bayesian argument has been criticised as being overly simplified due to the fact that it does not take into account the manner in which we learn that values in the particular subset are impossible, see for example Shafer (1985), it is an argument that is considered to be almost universally acceptable. For this reason, under the same assumptions about the reference set $R$ and the resolution $\lambda$ as made in previous sections, it can be argued that the density function $\pi_1(\gamma)$, i.e. a standard normal density for $\gamma$ truncated to the interval $(-\infty, \gamma_0)$, in the context of being a representation of what is believed about $\gamma$ after the data are observed, should be regarded as being externally very strong. As a result, under the same assumptions, the case can be made that the joint fiducial density of $\mu$ and $\sigma^2$ in the present example, and the marginal densities that can be derived from this joint density, should also be regarded as being externally very strong.

Clearly the same type of reasoning can be applied to many other problems of inference over restricted parameter spaces that are similar to the problem that has just been discussed.

Returning to the problem of making inferences about a Poisson rate parameter $\tau$ that was discussed in Section 6.2, let us now assume that before the data were observed, it was known that $\tau > \tau_0$, where $\tau_0$ is a given positive constant, but apart from this, nothing or very little was known about $\tau$. Again, as was the case in Section 6.2, it is clear that Principle 1 cannot be applied to determine the fiducial density of $\tau$.

Observe that, in this new situation, the set $H_x$ as defined in Condition 1, where the parameter $\theta_j$ in this definition is $\tau$, is the set $\{\tau : \tau > \tau_0\}$, and that it is natural to specify the GPD function $\omega_G(\tau)$ so that Condition 2(b) is satisfied.
However, in contrast to the example outlined in Section 6.2, the definition of the function $\varphi(\Gamma, \tau)$ given in equation (15) implies that the set $G_x$ as defined in Condition 1 does not satisfy Condition 2(a), and therefore we have the problem of 'spillage' that was referred to at the end of Section 3.4.

The first step of a very straightforward way of trying to circumvent this difficulty is to make inferences about $\tau$ in an artificial scenario, namely the scenario considered in Section 6.2. In doing this, it will be assumed that the LPD function is chosen to represent as best as possible a general situation where nothing or very little was known about the parameter $\tau$ over the interval $(0, \infty)$ before the data were observed, e.g. the LPD function given in equation (16) or equation (17). Having determined a fiducial density for $\tau$ over the interval $(0, \infty)$ by using this method, we then simply condition this density to lie in the interval $(\tau_0, \infty)$ to thereby obtain a fiducial density for $\tau$ that corresponds to the problem at hand.

Although in applying this strategy we do not directly use any of the three types of fiducial argument outlined in Section 3.2, if the same strategy were applied to the example discussed in Section 7.1, which of course would not require the use of an LPD function, then the fiducial density of $\mu$ conditional on $\sigma^2$ being known, i.e. the density $f(\mu \,|\, \sigma^2, x)$, would be the same as is obtained by using the approach put forward in this previous section, i.e. an approach that is based on the moderate fiducial argument. On the other hand, the strategy has the clear disadvantage that it depends on expressing pre-data knowledge about a parameter of interest via the GPD function, and possibly also via the LPD function, with regard to an artificial scenario rather than the scenario that is actually under consideration.
Nevertheless, under the same assumptions about the reference set $R$ and the resolution $\lambda$ as made in previous sections, it still can be argued that if, in the present example, the observed count $x$ is greater than zero and is not very small relative to the threshold $\tau_0$, then the fiducial density for $\tau$ that results from using this strategy should be regarded as being externally quite strong.

To give a good practical example of the application of the strategy that has just been discussed, let us suppose that the threshold $\tau_0$, which will be regarded as the event rate for the background noise over a time length $t$, needs to be estimated on the basis of a Poisson count $x_0$ collected over a period of length $\alpha$ times $t$ when only background noise could be present, where $\alpha$ is a given value. Since it will be assumed that $\tau_0$ can take any positive value, the fiducial density of $\tau_0$ formed on the basis of the data $x_0$, i.e. the density $f(\tau_0 \,|\, x_0)$, is defined in the same way as the fiducial density $f(\tau \,|\, x)$ was defined in Section 6.2. Taking into account also a Poisson count $x$ collected over a period of length $t$ when a signal should be present, we will then be interested in making inferences about the event rate $\tau = \tau_0 + \tau_1$ over this time period, which will be regarded as the event rate for background noise plus the signal. Due to the fact that $\tau_1$ will be assumed to be a positive event rate, namely the event rate for the signal only, the parameter $\tau$ must be greater than $\tau_0$, and so it will be assumed that the fiducial density of $\tau$ formed on the basis of the data $x$ and conditioned on $\tau$ being greater than $\tau_0$, i.e. the density $f(\tau \,|\, \tau_0, x)$, is determined using the method described in the present section.
Given these definitions, the joint fiducial density of $\tau$ and $\tau_0$ can therefore be expressed as

$$f(\tau, \tau_0 \,|\, x, x_0) = f(\tau \,|\, \tau_0, x) \, f(\tau_0 \,|\, x_0)$$

To illustrate a specific case, Figures 4(a) and 4(b) show histograms of one million independent random values drawn from, respectively, the marginal density of $\tau = \tau_0 + \tau_1$ and the marginal density of $\tau_1$ over this joint fiducial density, assuming that the LPD function that was used to form both of the densities $f(\tau \,|\, \tau_0, x)$ and $f(\tau_0 \,|\, x_0)$ was the simple step function given in equation (16) and that $\alpha = 4$, $x_0 = 3$ and $x = 2$. The solid-line and dashed-line curves in Figure 4(a) represent the posterior density of $\tau$ that corresponds, respectively, to the use of the Jeffreys prior for the case when $\tau$ is unrestricted over the interval $(0, \infty)$ and to the use of this prior density with the condition that $\tau > 0.75$, where 0.75 ($= x_0/\alpha$) is clearly the maximum likelihood estimate of $\tau_0$. These curves have been added to this figure only because we know that, under the conditions in question, they closely approximate the fiducial densities for $\tau$ when the LPD function being considered is used. In particular, comparing the lower tails of the histogram and the dashed-line curve in Figure 4(a) highlights the extra uncertainty that is introduced by taking into account the statistical error in the estimation of $\tau_0$.
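
The strategy of this section, applied to the background noise example, can be sketched as follows. Several assumptions are made that go beyond what is stated in the text: the end points of the interval of rates consistent with equation (15) are found by bisection, the LPD function is the step function of equation (16), the background count $x_0$ is treated as a count for the rate $\alpha \tau_0$ so that a draw of $\tau_0$ is a fiducial draw for this rate divided by $\alpha$, and the conditioning on $\tau > \tau_0$ is carried out by simple rejection. All function names are illustrative.

```python
import math
import random


def pois_cdf(z, tau):
    """Poisson distribution function: sum of g(y | tau) for y = 0, ..., z."""
    if z < 0:
        return 0.0
    term = math.exp(-tau)
    total = term
    for y in range(1, z + 1):
        term *= tau / y
        total += term
    return total


def solve_tau(z, gamma, iters=60):
    """Bisection for the rate at which pois_cdf(z, tau) equals gamma."""
    lo, hi = 0.0, 1.0
    while pois_cdf(z, hi) > gamma:  # widen the bracket until it holds the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pois_cdf(z, mid) > gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_fiducial_rate(x, rng):
    """One unrestricted fiducial draw of a Poisson rate given a count x,
    using a uniform LPD function over the interval consistent with gamma."""
    gamma = rng.random()
    lo = 0.0 if x == 0 else solve_tau(x - 1, gamma)
    hi = solve_tau(x, gamma)
    return lo + rng.random() * (hi - lo)


def draw_background_and_total(x0, alpha, x, rng):
    """One draw of (tau0, tau) from f(tau | tau0, x) f(tau0 | x0).

    tau0 is the background rate per period of length t, estimated from a
    count x0 over a period of length alpha * t; tau is the total rate,
    conditioned to exceed tau0 by rejection.
    """
    tau0 = draw_fiducial_rate(x0, rng) / alpha
    while True:
        tau = draw_fiducial_rate(x, rng)
        if tau > tau0:
            return tau0, tau
```

For the specific case in the text, each call `draw_background_and_total(3, 4.0, 2, rng)` returns one pair of values, and the difference between the second and the first value of a pair is a draw of the signal rate.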

8. An example with two diﬀerent GPD functions that are non-neutral

To give a final example of the application of organic fiducial inference, let us again return to the problem of inference considered in Section 5, and let us assume that the GPD function $\omega_G(\mu)$ used to determine the fiducial density of the mean $\mu$ given the variance $\sigma^2$, i.e. the density $f(\mu \,|\, \sigma^2, x)$, is one of the two step functions defined by

$$\omega_G(\mu) = \begin{cases} a & \text{if } \mu > 0 \\ 1 & \text{otherwise} \end{cases} \qquad (19)$$

and by

$$\omega_G(\mu) = \begin{cases} a & \text{if } -b < \mu < b \\ 1 & \text{otherwise} \end{cases} \qquad (20)$$

where $a$ is any given constant greater than one, and $b$ is any given positive constant.

Figure 4: Samples from marginal organic fiducial densities of Poisson event rates. Panels (a) and (b): density against the event rate concerned.

As a way of interpreting either of these two GPD functions, it can be observed that if there is an interval of values $(\gamma_0, \gamma_1)$ for the primary r.v. $\Gamma$ such that $\omega_G(\mu) = 1$ for all $\mu \in \{\mu(\gamma) : \gamma \in (\gamma_0, \gamma_1)\}$, where in keeping with earlier notation $\mu(\gamma)$ is the value of $\mu$ that maps on to the value $\gamma$ given the data $x$, and there is another interval $(\gamma_2, \gamma_3)$ for $\gamma$ such that $\omega_G(\mu) = a$ for all $\mu \in \{\mu(\gamma) : \gamma \in (\gamma_2, \gamma_3)\}$, then the probability of the event $\{\gamma \in (\gamma_2, \gamma_3)\}$ divided by the probability of the event $\{\gamma \in (\gamma_0, \gamma_1)\}$ will be regarded as being $a$ times larger after the data are observed than before step 2 of the algorithm in Assumption 1 was implemented.

Clearly the GPD function in equation (19) can be used to represent the scenario in which nothing or very little was known about $\mu$ before the data were observed, except that it was known that, when the data are observed, positive values of $\mu$ would be regarded as being more likely and negative values of $\mu$ less likely than as required to be able to accept the strong fiducial argument.
On the other hand, if for example $b$ is chosen to be small, the GPD function in equation (20) could be used to represent a scenario where there was little or no pre-data knowledge about $\mu$ except that it was known that, when the data are observed, values of $\mu$ lying in a narrow interval centred at zero, which could be the value of $\mu$ that corresponds to the null effect of a treatment compared to a control, would be regarded as being more likely and values of $\mu$ lying outside of this interval less likely than as assumed by the strong fiducial argument.

On the basis of either of the GPD functions in equations (19) and (20), the fiducial density $f(\mu \,|\, \sigma^2, x)$ is derived under Principle 1 by applying the weak fiducial argument. In particular, the two forms of this fiducial density that correspond to using these two GPD functions are the same as the two forms of the posterior density for $\mu$ given $\sigma^2$ that result from treating these GPD functions as prior densities for $\mu$ under the Bayesian paradigm. However, there are at least two good reasons why it is better to regard these densities as being fiducial densities backed by the methodology outlined in Section 3.4, rather than posterior densities backed by standard Bayesian theory.

First, if the GPD functions in equations (19) and (20) are treated as being prior densities then these density functions must be improper. This is also one of a number of criticisms that could be applied to the interpretation of the fiducial density $f(\mu \,|\, \sigma^2, x)$ derived in Section 5 as being a posterior density for $\mu$, as the required prior density for $\mu$ in this case would be a flat improper density for $\mu$ over the interval $(-\infty, \infty)$.
More specifically, though, it would seem particularly awkward to try to justify either of the improper prior densities for $\mu$ that correspond to the GPD functions being presently considered as being a natural approximation to a proper prior density, or some kind of natural limit of allowing a hyperparameter of a proper prior density to tend to infinity. This is due to the discontinuity that occurs at zero for the function in equation (19), and the discontinuities that occur at $-b$ and $b$ for the function in equation (20).

The second reason why it is better to use fiducial rather than Bayesian reasoning in the cases under consideration is that the fiducial densities $f(\mu \mid \sigma, x)$ that correspond to the GPD functions in equations (19) and (20) can be regarded as being based on a set of conditional versions of these densities derived using the moderate fiducial argument. In particular, under the GPD function in equation (19), the fiducial density for $\mu$ when $\mu$ is conditioned to lie in one of the intervals $(-\infty, 0)$ or $(0, \infty)$ would be derived using the moderate fiducial argument, while, under the GPD function in equation (20), the fiducial density for $\mu$ when $\mu$ is conditioned to lie in one of the subsets $(-b, b)$ or $(-\infty, -b) \cup (b, \infty)$ would also be derived using this type of fiducial argument. Taking into account the intuitive appeal of the moderate fiducial argument that was discussed in Section 7.1, the case can be made that the partial dependence on this argument that has been identified should mean that, under the same assumptions about the reference set $R$ and the resolution $\lambda$ as made in previous sections, the relative external strength of the fiducial density $f(\mu \mid \sigma, x)$, when $\mu$ is unrestricted over the whole of the real line, and when either of the GPD functions in question is used, should be regarded as being reasonably high in many situations where the use of the GPD function concerned is considered to be adequate.

The same line of reasoning can also be applied in assessing the relative external strength of the fiducial density of any given parameter $\theta_j$ of any given sampling model $g(x \mid \theta_1, \theta_2, \ldots, \theta_k)$ conditional on all other parameters, provided that such a density for $\theta_j$ can be derived under Principle 1, and the GPD function for $\theta_j$ is a step function with at least two steps that have distinct non-zero heights. Furthermore, if the GPD function for $\theta_j$ is allowed to take any form that simply satisfies the loose requirements of Definition 3, then despite this line of reasoning being in general no longer applicable, the capacity to express pre-data knowledge about $\theta_j$ in a way that is distinct from placing a prior density over $\theta_j$ under the Bayesian paradigm will be generally retained.

On the other hand, if Condition 1 is not satisfied then, since Conditions 2(a) and 2(b) can only be satisfied by special, albeit quite important, forms of the GPD function for $\theta_j$, e.g.
the simple choices made for this function in the cases considered in Section 6, it is clear that over all possible choices for this function, we will not be able in general to make inferences about $\theta_j$ by directly using the methodology outlined in Section 3.4. However, in this general case, we can use a similar strategy to the one outlined in Section 7.2 by first using Principle 2 to construct a fiducial density $f(\theta_j \mid \theta_{-j}, x)$ that would be appropriate in the artificial scenario in which it is assumed that there was little or no pre-data knowledge about $\theta_j$, and then normalising the density function that results from multiplying this preliminary fiducial density for $\theta_j$ by the GPD function for $\theta_j$ that corresponds to the actual scenario being considered. For a similar reason to that which has just been outlined combined with reasoning given in Section 7.2, this type of strategy would appear to be particularly attractive if this latter GPD function for $\theta_j$ is a step function, although it generally offers a useful alternative way of taking into account pre-data knowledge about $\theta_j$ over all choices for this function.
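
As a rough numerical illustration of the multiply-and-normalise strategy just described, the following Python sketch reweights a preliminary density by a step GPD function on a grid. It is not taken from the methodology of this paper. The function names, the choice of a normal preliminary density with centre 0.5 and scale 1, and the values a = 3 and b = 0.5 are all hypothetical, and the two step functions are only written in the style of equations (19) and (20).

```python
import math

import numpy as np


def gpd_half_line(mu, a):
    """Step function in the style of equation (19): a for mu > 0, 1 otherwise.

    Whether the boundary point mu = 0 takes the value a or 1 is an assumption;
    it does not affect the normalised density.
    """
    return np.where(mu > 0.0, a, 1.0)


def gpd_interval(mu, a, b):
    """Step function in the style of equation (20): a for -b < mu < b, 1 otherwise."""
    return np.where(np.abs(mu) < b, a, 1.0)


def normal_density(mu, centre, scale):
    """Normal density, used here as a hypothetical preliminary density."""
    z = (mu - centre) / scale
    return np.exp(-0.5 * z**2) / (scale * math.sqrt(2.0 * math.pi))


def reweight(grid, preliminary, gpd_values):
    """Multiply the preliminary density by the GPD values and renormalise.

    The normalising constant is approximated by a Riemann sum on a uniform grid.
    """
    unnormalised = preliminary * gpd_values
    step = grid[1] - grid[0]
    return unnormalised / (unnormalised.sum() * step)


# Hypothetical set-up: a wide uniform grid and a preliminary density N(0.5, 1).
grid = np.linspace(-10.0, 10.0, 20001)
step = grid[1] - grid[0]
preliminary = normal_density(grid, centre=0.5, scale=1.0)

density_19 = reweight(grid, preliminary, gpd_half_line(grid, a=3.0))
density_20 = reweight(grid, preliminary, gpd_interval(grid, a=3.0, b=0.5))

# Mass assigned to positive values of mu under the (19)-style step function.
prob_positive = density_19[grid > 0.0].sum() * step
print(f"mass on mu > 0: {prob_positive:.3f}")
```

Within each step of the GPD function the reweighted density is proportional to the preliminary density, so only the relative masses of the steps change. A grid is used purely for simplicity; for a step function the normalising constant could instead be obtained from the distribution function of the preliminary density.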

9. Closing comment

Since the theory of organic fiducial inference is a generalisation of the theory of subjective fiducial inference, issues that were identified in the final section of Bowater (2018a) as being relevant to the further development of this latter theory, i.e. the coherence of inferences based on subsets of the data set of interest, alternative definitions of the fiducial statistic and computational issues, also apply to the theory that has been put forward in the present paper. To save space the reader is referred to this earlier paper for a discussion of these issues.

References

Berger, J. O., Bernardo, J. M. and Sun, D. (2015). Overall objective priors. Bayesian Analysis, 189–221.

Bowater, R. J. (2017a). A formulation of the concept of probability based on the use of experimental devices. Communications in Statistics: Theory and Methods, 4774–4790.

Bowater, R. J. (2017b). A defence of subjective fiducial inference. AStA Advances in Statistical Analysis, 177–197.

Bowater, R. J. (2018a). Multivariate subjective fiducial inference. arXiv.org (Cornell University), Statistics, arXiv:1804.09804.

Bowater, R. J. and Guzmán, L. E. (2018b). On a generalized form of subjective probability. arXiv.org (Cornell University), Statistics, arXiv:1810.10972.

Brooks, S. P. and Roberts, G. O. (1998). Convergence assessment techniques for Markov chain Monte Carlo. Statistics and Computing, 319–335.

Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 398–409.

Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 457–472.

Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 721–741.

Mandelkern, M. (2002). Setting confidence intervals for bounded parameters (with discussion). Statistical Science, 149–172.

Shafer, G. (1985). Conditional probability (with discussion). International Statistical Review, 261–277.

Tuyl, F. (2017). A note on priors for the multinomial model. The American Statistician, 71