Equational Axioms for Expected Value Operators
Jan A. Bergstra
Informatics Institute, University of Amsterdam∗

December 25, 2018
Abstract
An equational axiomatisation of probability functions for one-dimensional event spaces in the language of signed meadows is expanded with conditional values. Conditional values constitute a so-called signed vector meadow. In the presence of a probability function, equational axioms are provided for expected value, variance, covariance, and correlation squared, each defined for conditional values. Finite support summation is introduced as a binding operator on meadows which simplifies formulating requirements on probability mass functions with finite support. Conditional values are related to probability mass functions and to random variables. The definitions are reconsidered in a finite dimensional setting.
Keywords and phrases:
Boolean algebra, signed meadow, vector meadow, probability function, probability mass function, conditional value.
In [4] a proposal is made for a loose algebraic specification of probability functions in the context of signed meadows. The objective of this paper is to proceed on the basis of the results of [4] and to provide an account of some basic elements of probability calculus including probability mass functions, probability functions, expected value operators, variance, covariance, correlation, independence, sample space, and random variable. Ample use is made of the special properties of meadows, most notably 1/0 = 0. In the conventional approach (i) one starts from a sample space S; an event space E is introduced as a subset of the power set of S. (ii) Then probability functions are defined over event spaces and (iii) discrete random variables are introduced as real functions on S with a countable support. Given these ingredients, the conventional definitions of the notions listed above can be given.

∗ Email: [email protected], [email protected]. This is a significantly rewritten and improved version under a new title of a previous report with title "Conditional values in signed meadow based probability calculus" (https://arxiv.org/abs/1609.02812v3).

In Section 2 meadows are discussed and so-called signed vector meadows are introduced. A novel binding operator, called finite support summation, is introduced and examples of its use are provided. In Section 3 the notion of a probability mass function (PMF) with finite support is introduced and its formal specification in the setting of meadows is provided with the help of finite support summation. By default a PMF is assumed to be univariate. Marginalisation is defined as a family of transformations from a PMF with more than one argument to a PMF with a smaller number of arguments.
Expected value and variance are defined as functionals on (univariate) PMFs, and covariance and correlation are defined as functionals on bivariate PMFs. Having developed an account of PMFs independently of axioms for probability functions, Section 4 proceeds with a recall from [4] of the combination of an event space (a Boolean algebra) and a value space (a meadow), and the equational specification of a probability function. Two versions of Bayes' rule are considered and the relative position of these statements w.r.t. the various axioms is examined. A conditional operator is applied to events and the results of the operator are collected in an additional sort CV of so-called conditional values (CVs), which constitutes a so-called finite dimensional vector space meadow. Thinking in terms of outcomes of a probabilistic process one may assume that the process produces as an outcome an entity of some sort. Events from an event space E represent assessments about the outcome. It is plausible that besides Boolean assessments also values, for instance rationals or reals, are considered attributes of an outcome. A CV directly relates values to events. In the presence of a probability function two equations specify the expected value of a CV. From [4] the specification of probability function families relative to an arity family is imported, and in Section 5 a corresponding axiomatisation for expected value operators is provided for the finite dimensional case.

According to [4] the equations of BA + Mb + Sign + ABS + PFBC_P + PFA_P constitute a finite equational basis for the class of Boolean algebra based, real valued probability functions, and the proof theoretic results, viz. soundness and completeness, concerning signed meadows of [2, 3] extend to the case with Boolean algebra based probability functions.
The axiom system BA + Mb + Sign + ABS + PFBC_P + PFA_P is merely a particular formalisation of Kolmogorov's axioms for probability theory phrased in the context of meadows, and the completeness result asserts the completeness of this particular formalisation w.r.t. its standard model. The main result of the paper is to provide an extension of this axiomatisation with conditional values and expected value operators E_P.

By working in first order equational logic, I intend to provide and support a new axiomatic approach to the elementary theory of probability. The objective of formalisation and axiomatisation in this paper is not inherited from an overarching intention to avoid mistakes, as is the conventional rationale for formalisation in computer science. Instead the objective is to use the axiomatic approach to obtain maximal clarity about assumptions, working hypotheses, patterns of reasoning, and patterns of calculation. In order to develop a valid presentation from the point of view of formal logic it may be practical to write all assertions in formal notation and to have only rudimentary basics explained in conventional mathematical terms. And in the presence of formalised fragments of text, adjacent fragments written in conventional mathematical style may appear to lack rigour. Nevertheless a balance with readability is required, thus leaving room for ad hoc conventions. I will not distinguish between names for constants and functions of meadows and their mathematical counterparts. Rather than writing, say, R ⊨ t = r, in cases where ordinary mathematics suggests writing t = r and provided there is no risk of confusion, "t = r" is preferred. On the other hand sort names, e.g. E for events, will be distinguished from the corresponding carriers (e.g.
‖E‖ in the case of E) and a specific probability function intended to serve as an interpretation of P will be referred to as P.

Below equational logic is applied with the following objectives in mind: (i) to demonstrate that an axiomatic approach in terms of equational logic to elementary probability calculus is both feasible and attractive, (ii) to illustrate the compatibility of an axiomatic approach to probability calculus with conventional mathematical style and notation, and (iii) to provide optimal clarity about the assumptions which underlie the various definitions, while (iv) using meadows as a tool throughout the presentation.

Numbers will be viewed as elements of a meadow rather than as elements of a field. For the introduction of meadows and elementary theory about meadows I refer to [7, 2, 3] and the papers cited there. I will copy the tables of equational axioms for meadows and for the sign function which plays a central role below. With (R, s) the expansion of the meadow R with the sign function is denoted.

(x + y) + z = x + (y + z)  (1)
x + y = y + x  (2)
x + 0 = x  (3)
x + (−x) = 0  (4)
(x · y) · z = x · (y · z)  (5)
x · y = y · x  (6)
1 · x = x  (7)
x · (y + z) = x · y + x · z  (8)
(x⁻¹)⁻¹ = x  (9)
x · (x · x⁻¹) = x  (10)

Table 1: Md: axioms for a meadow

x² = x · x  (11)
x/y = x · y⁻¹  (12)
1(x) = x/x  (13)
0(x) = 1 − x/x  (14)
x ⊳ y ⊲ z = 1(y) · x + 0(y) · z  (15)

Table 2: DO: axioms for derived operators

The following completeness result was obtained in [3].

Theorem 1.
A conditional equation in the signature of signed meadows is valid in (R, s) if and only if it is provable from the axiom system Mb + Sign.

The axioms in Table 1 specify the variety of meadows, while Table 2 introduces some function symbols by means of defining equations serving as explicit definitions for derived operations. Table 3 specifies the sign function, and Table 4 introduces the absolute value function. Following [2], a meadow that satisfies the (nonequational) implication IL from Table 5 is called a cancellation meadow.

s(1(x)) = 1(x)  (16)
s(0(x)) = 0(x)  (17)
s(−1) = −1  (18)
s(x⁻¹) = s(x)  (19)
s(x · y) = s(x) · s(y)  (20)
0(s(x) − s(y)) · s(x + y) = 0(s(x) − s(y)) · s(x)  (21)

Table 3: Sign: axioms for the sign operator

|x| = s(x) · x  (22)

Table 4: ABS: defining axiom for the absolute value operator

Let e_1, . . . , e_n be a series of pairwise distinct objects outside the meadow M. The meadow M⟨e_1, . . . , e_n⟩ is defined as a direct sum of copies of M:

M⟨e_1, . . . , e_n⟩ = e_1 M ⊕ . . . ⊕ e_n M

Here the e_i serve as new constants for orthogonal (e_i · e_j = 0 for i ≠ j) idempotents (e_i · e_i = e_i) such that the set {e_1, . . . , e_n} is complete (e_1 + . . . + e_n = 1). Moreover it is assumed that s(e_i) = e_i. Elements of this structure are given by sequences (l_1, . . . , l_n) ∈ Mⁿ representing the object e_1 · l_1 + . . . + e_n · l_n. The meadow operations and sign are performed coordinate-wise, e.g. s(e_1 · l_1 + . . . + e_n · l_n) = e_1 · s(l_1) + . . . + e_n · s(l_n), thus obtaining an n-dimensional vector space over M. For n = 1 the construction is trivial: M⟨e_1⟩ ≅ M. For n >
1, and assuming that M is non-trivial (M ⊭ 0 = 1), the resulting structures are not cancellation meadows, i.e. M⟨e_1, . . . , e_n⟩ ⊭ IL. If the number of idempotents of a meadow is finite it is even, because with the idempotency of e comes that 1 − e is also an idempotent. Thus we may assume that n is even.

x ≠ 0 → x · x⁻¹ = 1

Table 5: IL: inverse law

M(s)⟨e_1, . . . , e_n⟩ is the expansion of M⟨e_1, . . . , e_n⟩ with (new) names for the orthogonal idempotents. Now R(s)⟨e_1, . . . , e_n⟩ ⊨ Mb + Sign + E⟨e_1,...,e_n⟩, where E⟨e_1,...,e_n⟩ captures the mentioned identities involving the e_i: idempotence for the e_i, orthogonality for e_i and e_j with i ≠ j, completeness, and the equations s(e_i) = e_i.

Problem 1.
Is the axiom system Mb + Sign + E⟨e_1,...,e_n⟩ complete for the equational theory of the structure R(s)⟨e_1, . . . , e_n⟩?

The completeness result of [3] seems to carry over without complications, but there are significant details to be adapted. Σ_m denotes the signature for meadows and Σ_{m+s} denotes its extension with name and arity of the sign function. For a structure A and a signature Γ, A|Γ is the reduct of A to Γ. Soundness for Mb + Sign + E⟨e_1,...,e_n⟩ plus completeness in the case of n = 0 implies that the equational theory of the reduced vector meadows (R(s)⟨e_1, . . . , e_n⟩)|Σ_{m+s} is the same for all n. The situation might be different for conditional equations, however:

Problem 2.
Are the conditional equational theories of the structures (R(s)⟨e_1, . . . , e_n⟩)|Σ_{m+s} the same for all n?

With disjunctive assertions (among which IL, written as 1(x) = 0 ∨ 1(x) = 1) discrimination between vector meadows of different dimension is possible. Let φ ≡_def x · x = x ∧ y · y = y ∧ x + y = 1 ∧ x · y = 0 → (x = 0 ∨ y = 0). Then for n ≥ 2: R(s)⟨e_1, . . . , e_n⟩ ⊭ φ, while for n = 0: R(s)⟨⟩ ⊨ φ.

The expression language may be extended with lambda abstraction, thereby introducing λx.t as an expression denoting the function which maps v ∈ V to [v/x]t, i.e. the result of substituting v for x in t. A disadvantage of this approach is that it imports typed λ-calculus, definitely a non-trivial subject. Another option is to use L x.t to represent the same function. Now if y does not occur freely in t, then L y.([y/x]t) constitutes a different representation of the same function, i.e. unlike in the λ-calculus, alpha conversion does not apply to L x.t. In statistical theory Jeffrey's notation t[•], with t[−] a context with zero or more "holes", stands for λx.t[x], with x a fresh variable. Finally, function abstraction may be left implicit when a specific binding mechanism is employed. When FSS is to be applied "to a term t" these four options lead to the different notations Σ⋆(λx.t), Σ⋆(L x.t), Σ⋆ t[•] and Σ⋆_x t, respectively. There is no need to choose a single convention from these four options and below it is supposed to be clear from the context which one of these conventions is used in each particular case.
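The derived operators of Table 2 are easy to experiment with. The following is a minimal sketch in Python, using exact rational arithmetic; the only assumption beyond the tables is that the inverse is totalised by letting 0 be its own inverse, as in meadows:

```python
from fractions import Fraction

def inv(x):
    # Totalised meadow inverse: 0 is its own inverse, so that
    # axiom (10), x·(x·x⁻¹) = x, holds for every x.
    return Fraction(0) if x == 0 else 1 / Fraction(x)

def one(x):   # derived operator 1(x) = x/x: 1 for nonzero x, 0 at 0
    return x * inv(x)

def zero(x):  # derived operator 0(x) = 1 − x/x: 1 at 0, 0 elsewhere
    return 1 - one(x)

def cond(x, y, z):  # conditional x ⊳ y ⊲ z = 1(y)·x + 0(y)·z, axiom (15)
    return one(y) * x + zero(y) * z

# spot-check the inverse axioms (9) and (10) and the conditional (15)
for v in [Fraction(0), Fraction(2), Fraction(-3, 4)]:
    assert inv(inv(v)) == v          # (x⁻¹)⁻¹ = x
    assert v * (v * inv(v)) == v     # x·(x·x⁻¹) = x
assert cond(5, 0, 7) == 7 and cond(5, 3, 7) == 5
```

The helper names `inv`, `one`, `zero` and `cond` are illustrative choices, not notation from the paper.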
Given a meadow M and a term t in which the variable x may or may not occur, it may be useful to determine the sum of all substitution instances (or rather interpretations) [v/x]t with v ∈ ‖M‖. This sum is unambiguously defined, however, if the support in M of L x.t is finite, that is, if there are only finitely many values v ∈ ‖M‖ such that [v/x]t is nonzero. The expression Σ⋆_x t denotes in M the sum of all [v/x]t if at most finitely many of these substitutions [v/x]t yield a non-zero value, and 0 otherwise.

The Σ⋆_x operator will be referred to as finite support summation (FSS). At this stage we have little information about the logical properties of this binding mechanism on terms, but it is semantically unproblematic, being well-defined in each meadow, and it will be used below for presenting several definitions. We first notice some technical facts concerning FSS, assuming the interpretation of equations is performed in an arbitrary cancellation meadow M.

1. L x.t has finite support iff L x.(t/t) has finite support.
2. Σ⋆_x 0(x) = 1.
3. Σ⋆_x 1(x) sums 1 over the non-zero elements of M and therefore vanishes modulo p in a meadow of characteristic p.
4. Σ⋆_x 1(x) = 0 if and only if M is infinite.
5. Σ⋆_x 1(x) = −1 if M is finite.
6. Σ⋆_x (t + 0(x)) = (Σ⋆_x t) + 1 if and only if L x.t has finite support.
7. Σ⋆_x (t + 0(x)) = Σ⋆_x t if and only if L x.t has infinite support.
8. If x ∉ FV(t) then Σ⋆_x (r · t) = (Σ⋆_x r) · t.
9. Σ⋆_x (0(x) · t) = [0/x]t, so in particular Σ⋆_x (0(x) · t) = t when x ∉ FV(t); moreover Σ⋆_x (1(x) · t) = (Σ⋆_x t) − [0/x]t whenever L x.t has finite support.
10. If both L x.t and
L x.r have finite support then (Σ⋆_x t) + (Σ⋆_x r) = Σ⋆_x (t + r).

If, moreover, M is signed:

1. Σ⋆_x 1(x) = 0, because a signed meadow is infinite.
2. Consider the context C[−] with C[X] = 1(Σ⋆_x |X|) ⊳ (Σ⋆_x (X + 0(x)) − Σ⋆_x X) ⊲ 1; then C[t] = 1 if and only if the support of L x.t is nonempty, and otherwise C[t] = 0.
3. C[t] · (Σ⋆_x (t + 0(x)) − (Σ⋆_x t)) · (1 − Σ⋆_x s(t)²) = 0 if and only if the support of L x.t is a singleton.
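Genuine FSS quantifies over all of ‖M‖ and is therefore not computable in general. Still, for terms whose support is known to lie inside a given finite candidate set, the operator can be sketched as follows; the candidate set is an assumption supplied by the caller, not part of the definition, and the helper names are illustrative:

```python
from fractions import Fraction

def zero(x):          # derived operator 0(x): 1 at x = 0, else 0
    return Fraction(1) if x == 0 else Fraction(0)

def fss(term, candidates):
    # Finite support summation, assuming the support of `term`
    # is contained in the finite candidate set; genuine FSS over an
    # infinite meadow is not computable.
    return sum(term(v) for v in candidates)

t = lambda x: zero(x - 1) + zero(x - 3)   # support {1, 3}
r = lambda x: 2 * zero(x - 3)             # support {3}
C = [Fraction(k) for k in range(-5, 6)]   # assumed candidate support

assert fss(t, C) == 2
# fact 6: adding 0(x) raises a finite-support sum by 1
assert fss(lambda x: t(x) + zero(x), C) == fss(t, C) + 1
# fact 10: FSS is additive on finite supports
assert fss(lambda x: t(x) + r(x), C) == fss(t, C) + fss(r, C)
```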
Proposition 1.
L x.t has finite support in Q if and only if it has finite support in R.

Proof. Because Q is a substructure of R, the number of non-zero values of λx.t in Q cannot exceed the number of non-zero values in R, so the "if" part is immediate. Now for "only if" suppose that λx.t has infinitely many non-zero values in R. In [2] it is shown that a non-zero term t(x) is provably equal to a sum of simple fractions, i.e. fractions for which numerator and denominator are each non-zero polynomials. This implies that λx.t is discontinuous at at most finitely many arguments, so that it must be non-zero at some real argument r where it is continuous at the same time. This implies that λx.t is non-zero in some neighbourhood (r − ε, r + ε) of r, so that it is non-zero on the infinitely many rational arguments in this same neighbourhood. It follows that λx.t has infinite support in Q.

Problem 3.
Is there a context C[−] (not involving s) such that for all meadow expressions t without sign and for all cancellation meadows (in particular those with non-zero characteristic) C[t] = 0 if t has empty support and C[t] = 1 otherwise?

Problem 4.
Consider the meadow R enriched with FSS. Is equality between closed terms for this structure computably enumerable, and if so, is it decidable?

Problem 5.
Consider the meadow Q enriched with FSS. Is equality between closed terms for this structure decidable?

The multivariate case of FSS operations requires separate definitions for each number of variables because a stepwise reduction to the definition for the univariate case is unfeasible. To demonstrate this difficulty we consider the bivariate case only, the case with three or more variables following the same pattern. In a meadow M, Σ⋆_{x,y} t produces 0 if for infinitely many pairs of values a, b ∈ ‖M‖ the value of [a/x][b/y]t is nonzero; otherwise it produces the sum of the finitely many nonzero values thus obtained.

The need for expressions of the form Σ⋆_{x,y} t transpires from an elementary example, which demonstrates that a 2-dimensional FSS cannot simply be reduced to a composition of 2 occurrences of a 1-dimensional FSS. Let t(x, y) = 0(x) · 0(y) + 0(1 − x). Because t(1, y) = 1 for all y, t(x, y) is nonzero on infinitely many pairs of values, so that

Σ⋆_{x,y} t(x, y) = 0.

Now notice that Σ⋆_y t(0, y) = 1, Σ⋆_y t(1, y) = 0, and, if x ≠ 0 and x ≠ 1, Σ⋆_y t(x, y) = 0. It follows that

Σ⋆_x Σ⋆_y t(x, y) = 1.

The main application of FSS in this paper is to enable the following definition of what it means for a term to represent a finitely supported probability mass function. Probability mass function will be abbreviated as PMF. Finitely supported PMFs constitute a special case of "arbitrary" PMFs, a more general notion which cannot easily be defined on an arbitrary signed meadow, and which will not be used in the sequel.
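The bivariate counterexample just given can be checked numerically. The following is a small sketch in which finite sample ranges stand in, by assumption, for the infinite quantification over the meadow:

```python
from fractions import Fraction

def zero(x):  # derived operator 0(x): 1 at x = 0, else 0
    return Fraction(1) if x == 0 else Fraction(0)

def t(x, y):  # t(x, y) = 0(x)·0(y) + 0(1 − x)
    return zero(x) * zero(y) + zero(1 - x)

# t(1, y) = 1 for every y, so the joint support is infinite and the
# joint FSS yields 0 by the infinite-support convention.
assert all(t(1, Fraction(y)) == 1 for y in range(-100, 100))

# The iterated sums behave differently: for fixed x the inner support
# in y is finite whenever x differs from 1.
inner = lambda x, ys: sum(t(x, v) for v in ys)
ys = [Fraction(k) for k in range(-10, 11)]
assert inner(Fraction(0), ys) == 1    # corresponds to Σ⋆_y t(0, y) = 1
assert inner(Fraction(2), ys) == 0    # Σ⋆_y t(x, y) = 0 for x not in {0, 1}
```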
Definition 1.
Given a signed meadow M, a pair (t; x) consisting of a term t and a variable x represents a PMF with finite support on M if M ⊨ t = |t| and M ⊨ Σ⋆_x |t| = 1.

When interpreted in R(s), the two requirements of Definition 1 indeed guarantee that the function represented by L x.t is a PMF with finite support according to standard terminology. The property of being a representative of a finitely supported PMF is sensitive to the meadow at hand. For instance consider the expression t given by

t = 0(x² − 2) · ((1 + s(x)) · x + (1 − s(x)) · (2 + x))/4.

In R the function description L x.t represents a finitary PMF. To see this notice that L x.t takes non-zero values only at −√2 and √2, with t(−√2) = 1 − (1/2) · √2 and t(√2) = (1/2) · √2, so that L x.t represents a finitary PMF, while in Q it is not the case that L x.t represents a finitary PMF because t(q) vanishes for all q ∈ Q, with the implication that Σ⋆_x t = 0. On the other hand, when considering t′(x) = t(x) + 0(x) it turns out that L x.t′ represents a finitely supported PMF in Q while it fails to do so in R.

Given a signed cancellation meadow M, a joint PMF with finite support of arity n is a function L x_1, . . . , x_n.F(x_1, . . . , x_n) from Mⁿ to M which satisfies these two conditions:

1. Σ⋆_{x_1,...,x_n} F(x_1, . . . , x_n) = 1, and
2. for all x_1, . . . , x_n ∈ M, F(x_1, . . . , x_n) = |F(x_1, . . . , x_n)|.

For example, assuming that information about the graph of a joint PMF with finite support, with the exception of argument vectors for which the result vanishes, is encoded in a set of triples {(y_{1,1}, y_{2,1}, z_1), . . . , (y_{1,n}, y_{2,n}, z_n)}, a corresponding function expression F for the same joint PMF with key variables x_1 and x_2 is as follows:

F(x_1, x_2) = Σ_{i=1}^{n} 0((x_1 − y_{1,i})² + (x_2 − y_{2,i})²) · z_i.

Given a finitely supported joint PMF G with n variables x_1, . . . , x_n, marginalisation can be defined for each subset x_{i_1}, . . . , x_{i_k} with 1 ≤ i_1 < . . . < i_k ≤ n. Let x_{j_1}, . . . , x_{j_{n−k}} be an enumeration without repetition of the variables in x_1, . . . , x_n that are not listed in x_{i_1}, . . . , x_{i_k}; then G^{(i_1,...,i_k)} represents a joint PMF with k variables x_{i_1}, . . . , x_{i_k} as follows:

G^{(i_1,...,i_k)}(x_{i_1}, . . . , x_{i_k}) = Σ⋆_{x_{j_1},...,x_{j_{n−k}}} G(x_1, . . . , x_n).
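Marginalisation as just defined can be made concrete with a hypothetical dict-based encoding of a finitely supported bivariate PMF; the representation and helper name are illustrative choices, not notation from the paper:

```python
from fractions import Fraction

# A bivariate PMF with finite support, stored as {(x, y): G(x, y)}
G = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 6), (1, 1): Fraction(1, 3),
}
assert sum(G.values()) == 1

def marginal(G, keep):
    # Sum out every key position except `keep` (0 for x, 1 for y),
    # i.e. G^(1)(x) = Σ⋆_y G(x, y) and G^(2)(y) = Σ⋆_x G(x, y).
    out = {}
    for point, p in G.items():
        k = point[keep]
        out[k] = out.get(k, Fraction(0)) + p
    return out

G1 = marginal(G, 0)
G2 = marginal(G, 1)
assert G1 == {0: Fraction(1, 2), 1: Fraction(1, 2)}
assert G2 == {0: Fraction(5, 12), 1: Fraction(7, 12)}
```

Both marginalisations again sum to 1, as the definition requires.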
E_pmf(F) = Σ⋆_x (x · F(x))  (expected value of F)
VAR_pmf(F) = Σ⋆_x (x² · F(x)) − E_pmf(F)²  (variance of F)
COV_pmf(G) = Σ⋆_{x,y} (x · y · G(x, y)) − E_pmf(G^{(1)}) · E_pmf(G^{(2)})  (covariance of G)
CORR²_pmf(G) = COV_pmf(G)² / (VAR_pmf(G^{(1)}) · VAR_pmf(G^{(2)}))  (correlation of G squared)

Table 6: expected value, (co)variance, and correlation

For a bivariate PMF G(x, y) independence is defined as independence of its two marginalisations:

IND(G) ≡_def ∀x, y ∈ V. G(x, y) = G^{(1)}(x) · G^{(2)}(y).

Now F(x) is assumed to be a term representing a finite support PMF with x as the key variable, while G(x, y) represents a joint PMF with finite support with x as the first and y as the second key variable. Two PMFs G^{(1)} and G^{(2)} are derived from G by marginalisation: G^{(1)}(x) = Σ⋆_y G(x, y) and G^{(2)}(y) = Σ⋆_x G(x, y). The expected value E_pmf(F) of F and related operations are given in Table 6. The square of the correlation is included in order not to burden the present exposition with the equational specification of a square root operator. In the context of meadows the square root can be made total, and then equationally specified, by writing √(−x) = −√x (see [2]). The completeness result of Theorem 1 carries over in the presence of the square root function.

These definitions admit a justification on the basis of the conventional use of the defined terminology, the details of which are worth mentioning. Given a PMF F with finite support, its support, say S, may be viewed as a sample space so that in conventional terminology id_S, the identity function of type S → R, qualifies as a random variable, say X. The powerset of S serves as an event space, say E_S. Let the probability function P be generated by P({s}) = F(s) for s ∈ S. Now P(X = x) = F(x) and E_pmf(F) = E_P(X) = Σ_{s∈S} (X(s) · P(X = s)) = Σ⋆_x (x · F(x)).
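The operators of Table 6 can be sketched directly on the dict encoding used above; the representation and the helper names (`E`, `VAR`, `cov`, `corr_sq`) are illustrative assumptions, not notation from the paper:

```python
from fractions import Fraction

# A bivariate PMF with finite support; independent by construction,
# since G(x, y) = F1(x)·F2(y).
F1 = {0: Fraction(1, 4), 2: Fraction(3, 4)}
F2 = {1: Fraction(1, 2), 3: Fraction(1, 2)}
G  = {(x, y): F1[x] * F2[y] for x in F1 for y in F2}

E   = lambda F: sum(x * p for x, p in F.items())
VAR = lambda F: sum(x * x * p for x, p in F.items()) - E(F) ** 2

def cov(G, G1, G2):
    # COV(G) = Σ⋆_{x,y} x·y·G(x,y) − E(G^(1))·E(G^(2))
    return sum(x * y * p for (x, y), p in G.items()) - E(G1) * E(G2)

def corr_sq(G, G1, G2):
    # CORR²(G) = COV(G)² / (VAR(G^(1))·VAR(G^(2))), with meadow
    # division: a zero denominator yields 0.
    d = VAR(G1) * VAR(G2)
    return Fraction(0) if d == 0 else cov(G, G1, G2) ** 2 / d

assert E(F1) == Fraction(3, 2) and E(F2) == 2
assert VAR(F1) == Fraction(3, 4) and VAR(F2) == 1
assert cov(G, F1, F2) == 0        # independence forces zero covariance
assert corr_sq(G, F1, F2) == 0
```

Since G factors as F1 · F2, IND(G) holds and both the covariance and the squared correlation vanish, as expected.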
From [4] I will recall equations for Boolean algebras, (signed) meadows, and probability functions.

A Boolean algebra (B, +, ·, −, 0, 1) may be defined as a system with at least two elements such that, for all x, y, z ∈ B, the well-known postulates of Boolean algebra are valid. In order to avoid overlap with the operations of a meadow, Boolean algebras are here equipped with notation from propositional logic: thus consider (B, ∨, ∧, ¬, ⊤, ⊥) and adopt the axioms as presented in Table 7. In [14] it was shown that the axioms in Table 7 constitute an equational basis for the equational theory of Boolean algebras.

(x ∨ y) ∧ y = y  (23)
(x ∧ y) ∨ y = y  (24)
x ∧ (y ∨ z) = (y ∧ x) ∨ (z ∧ x)  (25)
x ∨ (y ∧ z) = (y ∨ x) ∧ (z ∨ x)  (26)
x ∧ ¬x = ⊥  (27)
x ∨ ¬x = ⊤  (28)

Table 7: BA: a self-dual equational basis for Boolean algebras

P(⊤) = 1  (29)
P(⊥) = 0  (30)
P(x) = |P(x)|  (31)

Table 8: PFBC_P: boundary conditions for a named probability function

In the setting of probability functions the elements of the underlying Boolean algebra are referred to as events. Events are closed under − ∨ −, which represents alternative occurrence, and − ∧ −, which represents simultaneous occurrence. The term "value" will refer to an element of a cancellation meadow, mainly the meadow of reals and the meadow of rationals. A probability function maps events to the values in a signed meadow. An expression of sort E is an event expression or an event term; an expression of type V is a value expression or, equivalently, a value term. In this paper considerations are limited to structures involving a single name for a probability function only, the function symbol P, at least in the 1-dimensional case. Table 8 provides axioms that determine generally agreed boundary conditions for a probability function. Table 9 contains the axiom for additivity that is included in the axiomatisation of [4].
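For a finite event space the boundary conditions of Table 8 and the additivity axiom of Table 9 can be checked exhaustively. A sketch over the powerset of a three-element outcome set, with assumed point weights; set union and intersection play the roles of ∨ and ∧:

```python
from fractions import Fraction
from itertools import chain, combinations

S = frozenset({'a', 'b', 'c'})                      # ⊤; the empty set is ⊥
events = [frozenset(c) for c in
          chain.from_iterable(combinations(sorted(S), r)
                              for r in range(len(S) + 1))]

w = {'a': Fraction(1, 2), 'b': Fraction(1, 3), 'c': Fraction(1, 6)}

def P(e):
    # probability function generated by the point weights w
    return sum((w[s] for s in e), Fraction(0))

assert P(S) == 1 and P(frozenset()) == 0            # (29) and (30)
assert all(P(e) == abs(P(e)) for e in events)       # P(x) = |P(x)|, (31)
# additivity (Table 9): P(x ∨ y) = P(x) + P(y) − P(x ∧ y)
assert all(P(x | y) == P(x) + P(y) - P(x & y)
           for x in events for y in events)
```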
Together with the axioms for signed meadows and for Boolean algebras we find the following set of axioms: BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P. Table 10 provides explicit definitions of some useful conditional probability operators, made total by choosing a value in case the condition has probability 0.

P(x ∨ y) = P(x) + P(y) − P(x ∧ y)  (32)

Table 9: PFA_P: additivity axiom for a named probability function

P(x | y) = P(x ∧ y)/P(y)  (33)
P′(x | y) = P(x | y) ⊳ P(y) ⊲ 1  (34)
P_s(x | y) = P(x | y) ⊳ P(y) ⊲ P(x)  (35)

Table 10: conditional probability operators

The reader is assumed to be familiar with the concept of a probability function, say P with name P, on an event space E, where P is supposed to comply with the informal Kolmogorov axioms of probability theory. Being based on the availability of real numbers, sets, and measures on sets, the Kolmogorov axioms are more easily understood as providing a mathematical definition, that is a set of requirements governing which functions are considered probability functions, than as constituting a formal system of axioms. The axiom system BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P provides a formalisation of the Kolmogorov axioms for probability functions.

A probability function structure over an event space E is a two-sorted structure having E (events) and V (values) as sorts, with E interpreted by a Boolean algebra and V interpreted as the real numbers, enriched with a probability function P from E to V. The Kolmogorov axioms specify precisely which functions are probability functions. I will assume that V is the domain of the meadow of reals, i.e. that the meadow version of real numbers is used. With EPV(E, R(s), P) the class of probability function structures over a fixed event structure E is denoted, with values taken in ‖R(s)‖. For a specific probability function P the pertinent structure is denoted by EPV(E, R(s), P).
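The totalised conditional probability of Table 10, with meadow division yielding 0 for a zero denominator, can likewise be checked exhaustively on a finite event space; this also covers the equations EQ1 and BR discussed below. The helper names `div` and `Pc` are illustrative assumptions:

```python
from fractions import Fraction
from itertools import chain, combinations

S = ['a', 'b', 'c']
w = {'a': Fraction(1, 2), 'b': Fraction(1, 3), 'c': Fraction(1, 6)}
events = [frozenset(c) for c in
          chain.from_iterable(combinations(S, r) for r in range(4))]

P = lambda e: sum((w[s] for s in e), Fraction(0))
div = lambda a, b: Fraction(0) if b == 0 else a / b   # meadow division: x/0 = 0

Pc = lambda x, y: div(P(x & y), P(y))                 # P(x | y), equation (33)

for x in events:
    for y in events:
        # EQ1: P(x ∧ y)·P(y)·P(y)⁻¹ = P(x ∧ y)
        assert P(x & y) * P(y) * div(1, P(y)) == P(x & y)
        # BR:  P(x | y) = P(y | x)·P(x)/P(y)
        assert Pc(x, y) == div(Pc(y, x) * P(x), P(y))
```

The zero-denominator branch is exercised by the empty event, which is the only event of probability 0 here.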
EPV(BA, R(s), P) denotes the union of all collections EPV(E, R(s), P) for all E with E ⊨ BA. It is apparent from the construction that EPV(E, R(s), P) ⊨ BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P. A completeness result for BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P is taken from [4].

Theorem 2. BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P is sound and complete for the equational theory of EPV(BA, R(s), P).

It is a corollary of the completeness proof in [4] that the same axioms are complete for the class
EPV(BA_f, R(s), P) containing those probability function structures which are expansions of a finite event structure. In [10] first order axioms are provided for probability calculus, and corresponding completeness is shown making use of the completeness result for the first order theory of real numbers, a fact which also underlies the result in [4].

As a comment to the specification of probability functions an excursion to Bayes' rule is worthwhile. First consider the following equation:

P(x ∧ y) · P(y) · P(y)⁻¹ = P(x ∧ y)  (EQ1)

Equation EQ1 follows from BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P. This fact is a consequence of Theorem 2 above. A direct proof reads as follows. Let φ(u, v) ≡ 0(|u| + |v|) · u. Now (R, s) ⊨ φ(u, v) = 0, and using the completeness theorem of [3] one obtains that BA + Mb + Sign ⊢ φ(u, v) = 0. Substituting P(y ∧ x) for u and P(y ∧ ¬x) for v one derives ⊢ φ(P(y ∧ x), P(y ∧ ¬x)) = 0, i.e. 0(|P(y ∧ x)| + |P(y ∧ ¬x)|) · P(y ∧ x) = 0. Moreover 0(|P(y ∧ x)| + |P(y ∧ ¬x)|) · P(y ∧ x) = 0(P(y ∧ x) + P(y ∧ ¬x)) · P(y ∧ x) = 0(P(y)) · P(y ∧ x), from which the required result follows by expanding 0(P(y)).

Bayes' rule, also known as Bayes' theorem, occurs in different forms. The conditional operator P of Table 10 is used for its presentation below. The simplest form of Bayes' rule is an equation here referred to as BR:

P(x | y) = P(y | x) · P(x)/P(y)  (BR)

In [4] it is shown that BR follows from the specification BA + Mb + DO + Sign + ABS + PFBC_P + EQ
1. As it turns out, BR implies equation EQ1. This fact is shown as follows: by substituting x ∧ y for y in BR one obtains P(x | x ∧ y) = (P(x ∧ y | x) · P(x))/P(x ∧ y). Multiplying both sides with P(x ∧ y) gives L = R with L = P(x | x ∧ y) · P(x ∧ y) and R = ((P(x ∧ y | x) · P(x))/P(x ∧ y)) · P(x ∧ y). Now L = (P(x ∧ (x ∧ y))/P(x ∧ y)) · P(x ∧ y) = (P(x ∧ y) · P(x ∧ y))/P(x ∧ y) = P(x ∧ y), and R = (((P((x ∧ y) ∧ x)/P(x)) · P(x))/P(x ∧ y)) · P(x ∧ y) = (P(x ∧ y)/P(x ∧ y)) · P(x ∧ y) · (P(x)/P(x)) = P(x ∧ y) · P(x) · P(x)⁻¹.

Proposition 2.
The axiom system BA + Mb + DO + Sign + ABS + PFBC_P + EQ1 is strictly weaker than BA + Mb + DO + ABS + Sign + PFBC_P + PFA_P.

Proof. Consider a four element event space generated by an atomic event e and choose P as follows: P(⊥) = P(e) = P(¬e) = 0 and P(⊤) = 1. The equations of PFBC_P and EQ1 are satisfied while PFA_P is not.

This weakness persists if EQ1 is replaced by BR. A second and equally well-known form of Bayes' rule is BRs from Table 11. BR follows from BA + PFBC_P + BRs by taking z = ⊤.

Proposition 3. BA + PFBC_P + BRs implies
PFA_P.

Proof. It suffices to derive the following equation EQ2:

P(y) = P(y ∧ z) + P(y ∧ ¬z)  (EQ2)

This suffices because, according to [4], EQ2 in combination with BA + Mb + DO + Sign + ABS + PFBC_P entails PFA_P. To this end set x = y in BRs, thereby obtaining

P(y | y) = (P(y | y) · P(y))/(P(y | z) · P(z) + P(y | ¬z) · P(¬z)).

To derive EQ2, notice P(y | y) = P(y ∧ y)/P(y) = P(y)/P(y), and take the inverse on both sides, thus obtaining L = R with L = P(y)/P(y) and R = (P(y | z) · P(z) + P(y | ¬z) · P(¬z))/P(y). Then multiplying L and R with P(y) yields L · P(y) = R · P(y). Now L · P(y) = (P(y)/P(y)) · P(y) = P(y) and R · P(y) = ((P(y | z) · P(z) + P(y | ¬z) · P(¬z))/P(y)) · P(y) = ((P(y ∧ z)/P(z)) · P(z) + (P(y ∧ ¬z)/P(¬z)) · P(¬z)) · (P(y)/P(y)) = (P(y ∧ z) + P(y ∧ ¬z)) · (P(y)/P(y)) = P(y ∧ z) · (P(y)/P(y)) + P(y ∧ ¬z) · (P(y)/P(y)) = P(y ∧ z) + P(y ∧ ¬z).

P(x | y) = P(y | x) · P(x)/(P(y | z) · P(z) + P(y | ¬z) · P(¬z))  (BRs)

Table 11: PFA′_P: alternative axiom for additivity

v(−x) = −v(x)  (36)
v(x⁻¹) = v(x)⁻¹  (37)
v(x + y) = v(x) + v(y)  (38)
v(x · y) = v(x) · v(y)  (39)
v(s(x)) = s(v(x))  (40)

Table 12: UCV: axioms for unconditional values; x, y range over V.

It may be concluded that BRs as given in Table 11 provides an adequate substitute for PFA_P. This observation suggests an alternative axiomatisation BA + Mb + DO + Sign + ABS + PFBC_P + PFA′_P based on BRs. For BR, however, there seems to be no role as an axiom in the axiomatic framework of this paper. For instance one may wonder if BR provides an implicit definition of conditional probability.

Proposition 4.
It is not the case that, in the presence of BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P, though in the absence of the definitions of Table 10, BR serves as an implicit definition of P.

Proof. Let Q(x, y) = 1(P(y)) · P(x). Then Q(x, y) differs from P(x | y) in all but exceptional cases. However, Q(−, −) satisfies BR considered as a requirement on P(− | −): Q(y, x) · P(x)/P(y) = 1(P(x)) · P(y) · P(x)/P(y) = 1(P(x)) · 1(P(y)) · P(x) = 1(P(y)) · P(x) = Q(x, y).

A third sort named CV containing so-called conditional values will be introduced. CV is generated by an embedding v : V → CV and a conditional operator − :→ − : E × CV → CV. CV is equipped with all meadow operations, while v(0) serves as 0 and v(1) serves as 1. A specification is given by combining (i) the axioms UCV of Table 12 with (ii) Mb_cv = Mb[v(0)/0, v(1)/1], i.e. the equations of Table 1, however with variables X, Y, Z now ranging over CV, and with v(0) substituted for 0 and v(1) substituted for 1, (iii) Sign_cv, the equations of Table 3, but now with its variables ranging over CV, and (iv) the specification Cond of the conditional operator − :→ − : E × CV → CV as specified in Table 13.

⊤ :→ X = X  (41)
⊥ :→ X = v(0)  (42)
e :→ (X + Y) = (e :→ X) + (e :→ Y)  (43)
e :→ (X · Y) = (e :→ X) · Y  (44)
e :→ (−X) = −(e :→ X)  (45)
e :→ (X⁻¹) = (e :→ X)⁻¹  (46)
(e ∨ f) :→ X = (e :→ X) + (f :→ X) − (e ∧ f :→ X)  (47)
e ∧ f :→ X = e :→ (f :→ X)  (48)
s(e :→ X) = e :→ s(X)  (49)

Table 13: Cond: axioms for the conditional operator

Given a Boolean algebra E and a signed meadow M(s) there is a free term algebra CV(E, M) of elements for CV generated from E and M(s). The three sorted expansion ECV(E, M(s), CV(E, M(s))) of M(s) and E includes a sort CV, the conditional operator on E × CV, and the embedding v from V into CV. For a Boolean algebra E the subset E_at
consists of the atomic elements of ||E||, where a ∈ ||E|| is atomic if a ≠ ⊥ and whenever, for b and c in ||E||, E |= ¬a ∨ (b ∨ c) = ⊤, then E |= ¬a ∨ b = ⊤ or E |= ¬a ∨ c = ⊤ (that is, a ≤ b ∨ c implies a ≤ b or a ≤ c). E_at contains the maximally consistent elements of the Boolean algebra.

To each closed term X of type CV of the extended signature a mapping ⟦X⟧ : E_at → V is assigned, with the rules of Table 14. The equivalence relation ≡_at on closed CV terms is given by X ≡_at Y ⟺ ∀a ∈ E_at (⟦X⟧(a) = ⟦Y⟧(a)). ≡_at is a congruence relation which meets all requirements imposed by UCV + Sign_cv + Mb_cv + Cond, and CV(E, M) can be defined as the free term algebra for sort CV in the extended signature modulo ≡_at. This construction guarantees the consistency of the given construction of the structure for CV, as for arbitrary a ∈ E_at: ⟦v(0)⟧(a) = 0 ≠ 1 = ⟦v(1)⟧(a).

Proposition 5. If M is nontrivial (that is, M |= 0 ≠ 1) and ||E|| has more than two elements then CV(E, M) is not a cancellation meadow (that is, CV(E, M) ⊭ X ≠ 0 → X · X⁻ = 1).

Proof. The proof works by finding an X which differs from v(0) modulo ≡_at and such that X · X⁻ differs from v(1) modulo ≡_at. Indeed, if ||E|| has more than two elements then E_at contains an atom a with a ≠ ⊤. Now a :→ v(1) violates IL (the inverse law). First notice that ⟦⊥ :→ v(1)⟧(a) = 0 ≠ 1 = ⟦a :→ v(1)⟧(a), so that ⊥ :→ v(1) ≢_at a :→ v(1), and similarly, by application to an atom below ¬a, that a :→ v(1) ≢_at ⊤ :→ v(1). Now (a :→ v(1))⁻ = a :→ v(1)⁻ = a :→ v(1⁻) = a :→ v(1), whence (a :→ v(1)) · (a :→ v(1))⁻ = (a :→ v(1)) · (a :→ v(1)) = a :→ v(1) ≢_at ⊤ :→ v(1) (= v(1)).

Definition 2.
An expression X = e_1 :→ v(t_1) + ... + e_n :→ v(t_n) of type CV is a flat CV expression.

  ⟦v(m)⟧(a) = m
  ⟦−t⟧(a) = −(⟦t⟧(a))
  ⟦t⁻⟧(a) = (⟦t⟧(a))⁻
  ⟦t + r⟧(a) = ⟦t⟧(a) + ⟦r⟧(a)
  ⟦t · r⟧(a) = ⟦t⟧(a) · ⟦r⟧(a)
  ⟦e :→ t⟧(a) = ⟦t⟧(a), if E |= ¬a ∨ e = ⊤
  ⟦e :→ t⟧(a) = 0, if E |= a ∧ e = ⊥

Table 14: Definition of ⟦t⟧(a) for a ∈ E_at

Definition 3.
A flat CV expression X = e_1 :→ v(t_1) + ... + e_n :→ v(t_n) is non-overlapping if for all 1 ≤ i, j ≤ n with i ≠ j, it is the case that provably e_i ∧ e_j = ⊥.

Definition 4.
Two non-overlapping flat CV expressions are similar if both involve the same collection of conditions, used in the same order.

Proposition 6.
For each closed CV expression X there is a non-overlapping flat CV expression Y such that Mb + DO + Sign + ABS + UCV + Sign_cv + Mb_cv + Cond ⊢ X = Y.

Proposition 7.
For closed CV expressions X and Y, similar non-overlapping flat expressions X′ and Y′ can be found such that Mb + DO + Sign + ABS + UCV + Sign_cv + Mb_cv + Cond ⊢ X = X′ and Y = Y′.

Proposition 8.
If we fix E as some finite minimal event space with E |= ⊤ ≠ ⊥, then the CV expressions generated from E and R constitute a signed vector meadow with dimension #(E_at). If #(E_at) ≥ 2 then the meadow of conditional values is not a cancellation meadow. Instead it is a vector space meadow (see Paragraph 2.1). Elements of the form e :→ v(1), with e ∈ E, are the idempotent elements of CV. CVs e :→ v(1) and f :→ v(1) are orthogonal if and only if e ∧ f = ⊥ in E. If a_1, ..., a_n enumerates E_at without repetition then CV ≅ R(s)⟨e_1, ..., e_n⟩.

Proposition 9.
Given closed C V expressions in flat form X = P ni =1 e i : → v ( t i ) and Y = P mj =1 f j : → v ( r j ) , a flat form representation for X · Y is: P ni =1 P mj =1 ( e i ∧ f j ) : → v ( t i · r j ) . Moreover, if X and Y are non-overlapping then so is the given expression for X · Y . A C V expression, say X , denotes a value which is conditional on an event, that is it dependson the actual event e chosen from E . Therefore CVs are well-suited suited for defining anexpected value, denoted with E P ( X ). The concept of an expectation lies at the basis offurther definitions of probabilistic quantities such as variance, covariance, and correlation.Defining the expected value for a conditional value can be done if a besides a probability16 P ( X + Y ) = E P ( X ) + E P ( Y ) (50) E P ( x : → v ( y )) = P ( x ) · y (51)Table 15: EV P , axioms for the expected value operator, x ranges over E , y over V function, say P , C V expression in flat form is available, say P ni =1 e i : → v ( t i ). E P ( n X i =1 e i : → v ( t i )) = n X i =1 ( P ( e i )) · t i ) . These identities provide an axiom scheme for the function E P : CV → V .Given a probability function structure EPV ( E , R ( s ) , P ) and a CV structure involving thesame event space, say ECV ( E , R ( s ) , CV ( E , R ( s ))) a joint expansion exists. Denoting thejoint expansion with EPCV ( E , R ( s ) , CV ( E , R ( s )) , P ) it can be further expanded with anexpected value operator named E P , interpreted in compliance with the mentioned scheme, toa structure EPCV ( E , R ( s ) , CV ( E , R ( s )) , P , E P ). Together the latter structures constitute aclass of probability structures K ( BA ).Instead of using an axiom scheme, a finite axiomatisation of E P ( − ) is given in Table 15,from which each instance of the scheme can be derived. 
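The axiom scheme for E_P admits a direct computational reading. The following Python sketch is a toy model, not part of the formal development: the atom names, the rational weights, and the encoding of a flat CV expression as a list of (event, value) pairs are all illustrative assumptions, with exact rational arithmetic standing in for the signed meadow of reals.

```python
from fractions import Fraction

# Toy model (illustrative assumptions): atoms {a, b, c} with rational
# weights play the sample points, an event is a set of atoms, and a flat
# CV expression  e_1 :-> v(t_1) + ... + e_n :-> v(t_n)  is encoded as a
# list of (event, value) pairs.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def prob(event):
    """P(e): the sum of the weights of the atoms below e."""
    return sum(weights[atom] for atom in event)

def expected_value(flat_cv):
    """The scheme E_P(sum_i e_i :-> v(t_i)) = sum_i P(e_i) * t_i."""
    return sum(prob(e) * t for e, t in flat_cv)

# X = {a,b} :-> v(3) + {c} :-> v(9)
X = [(frozenset({"a", "b"}), Fraction(3)),
     (frozenset({"c"}), Fraction(9))]
print(expected_value(X))  # → 4, i.e. (1/2 + 1/3)·3 + (1/6)·9
```

Note that additivity (50) holds trivially in this encoding, since the sum of two flat expressions is the concatenation of their lists of pairs.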
The equations (named EV_P) of Table 15 determine E_P(−) on all CV expressions not involving variables of sort CV.

Grouping together the axioms collected thus far one finds an equational theory (meadow based probability calculus):

  MBPC_P = BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P + UCV + Sign_cv + Mb_cv + Cond + EV_P.

A plausible class of models for MBPC_P is K(BA). With a proof similar to that of Theorem 2, it follows that MBPC_P is complete for such equations w.r.t. validity in K(BA).

E_P can be eliminated from expressions of sort V without free variables of sort CV. Therefore an expression of sort V without free variables of sort CV is provably equal within MBPC_P to an expression not involving subterms of sort CV.

On the basis of a definition of expectation, variance, covariance, and correlation on conditional values can be introduced as derived operators as in Table 16. Let X and Y be CV expressions with flat forms X = Σ_{i=1}^{n} e_i :→ v(t_i) and Y = Σ_{i=1}^{m} f_i :→ v(r_i). The equations in Table 16 provide explicit definitions of variance, covariance, and correlation (squared) for X, resp. Y.

  VAR_P(X) = E_P(X · X) − (E_P(X))²                            (52)
  COV_P(X, Y) = E_P(X · Y) − E_P(X) · E_P(Y)                   (53)
  CORR_sq_P(X, Y) = (COV_P(X, Y))² / (VAR_P(X) · VAR_P(Y))     (54)

Table 16: axioms for variance, covariance, and correlation squared for conditional values

There is no novelty to these definitions except for the effort made to make each definition fit a framework that has been set up on the basis of an algebraic specification. By proceeding in this manner an axiomatic framework is obtained for equational reasoning about each of these technical notions.

Forgetting the subscript, that is, using E(X) instead of E_P(X), and similarly for the other operators, is common practice in probability theory. Doing so, however, requires that it is apparent from the context which probability function is used. Moreover it must be assumed that for X and for Y the same probability function applies.

Given a conditional value X = Σ_{i=1}^{n} e_i :→ v(t_i) in non-overlapping flat form, and a probability function P, the probability mass function λx.P(X = x) for X is supposed to yield for each value x the probability that X takes value x. An explicit definition for the PMF of X is as follows:

  Pmf_P(X) = λx ∈ V. Σ_{i=1}^{n} (0(t_i − x) · P(e_i)).

This specification of
Pmf_P is schematic and for that reason does not achieve the simplicity found for the expected value operator.

Problem 6.
Can Pmf_P be specified by means of a fixed and finite number of equations rather than with an axiom scheme involving an equation for each non-overlapping closed CV expression?

Proposition 10.
Equivalence of the definitions of expectation and variance for CV expressions in non-overlapping flat form via (joint) PMF extraction:

1. E_P(X) = E_pmf(Pmf_P(X)),
2. VAR_P(X) = VAR_pmf(Pmf_P(X)).

Proof.
Let X = Σ_{i=1}^{n} e_i :→ v(t_i) be a non-overlapping flat CV expression. Making use of the facts listed in Paragraph 2.3, one obtains:

  E_pmf(λx.P(X = x)) = Σ*_x (x · Σ_{i=1}^{n} (0(t_i − x) · P(e_i)))
                     = Σ_{i=1}^{n} Σ*_x (x · 0(t_i − x) · P(e_i))
                     = Σ_{i=1}^{n} (t_i · P(e_i))
                     = E_P(X).

Two conditional values are event sharing if both have conditions over the same domain. Extraction of a joint PMF from event sharing conditional values works as follows. Given two CVs X and Y with similar non-overlapping flat forms Σ_{i=1}^{n} (e_i :→ v(t_i)) and Σ_{i=1}^{n} (e_i :→ v(r_i)), the joint PMF for these conditional values, denoted by P(X = x, Y = y), is defined by

  P(X = x, Y = y) = Σ_{i=1}^{n} (0(t_i − x) · 0(r_i − y) · P(e_i)).

Extending Proposition 10, the following connections between definitions involving a conditional value and definitions involving a PMF or a joint PMF can be found.
Proposition 11.
Equivalence of the definitions of covariance and correlation (squared) via CVs and via (joint) PMFs:

1. COV_P(X, Y) = COV_pmf(λx, y. P(X = x, Y = y)),

2. CORR_sq_P(X, Y) = CORR_sq_pmf(λx, y. P(X = x, Y = y)).

In the multi-dimensional case the event space is considered a product of event spaces. CVs occurring in a vector of CVs are by default supposed not to be event space sharing, and the notion of a joint probability function working over a tuple of event spaces enters the picture.

The multi-dimensional case becomes relevant once tuples (vectors) of CVs are considered in combination with a plurality of joint probability functions for product spaces of higher-dimensional event spaces corresponding to various vectors of CVs, such that there may not exist a joint probability function for the full product space.
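Before turning to the multi-dimensional setting, the PMF-based equivalences of Propositions 10 and 11 can be checked on a small example. The sketch below uses a hypothetical encoding (atoms with rational weights; two similar non-overlapping flat CV expressions over the same partition) and confirms that covariance computed directly from the CVs agrees with covariance computed from the extracted joint PMF.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical toy model: X = sum_i e_i :-> v(t_i) and Y = sum_i e_i :-> v(r_i)
# are similar non-overlapping flat CV expressions over the partition e_1, e_2, e_3.
weights = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

def prob(event):
    return sum(weights[atom] for atom in event)

parts = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"})]
t = [Fraction(1), Fraction(2), Fraction(2)]   # values of X
r = [Fraction(0), Fraction(1), Fraction(3)]   # values of Y

def joint_pmf(parts, t, r):
    """P(X=x, Y=y) = sum_i 0(t_i - x) * 0(r_i - y) * P(e_i)."""
    p = defaultdict(Fraction)
    for e, ti, ri in zip(parts, t, r):
        p[(ti, ri)] += prob(e)
    return p

def cov_from_cv(parts, t, r):
    """COV_P(X, Y) = E_P(X·Y) - E_P(X)·E_P(Y), computed from the flat forms."""
    ex = sum(prob(e) * ti for e, ti in zip(parts, t))
    ey = sum(prob(e) * ri for e, ri in zip(parts, r))
    exy = sum(prob(e) * ti * ri for e, ti, ri in zip(parts, t, r))
    return exy - ex * ey

def cov_from_pmf(pmf):
    """COV_pmf of a joint PMF given as a dict {(x, y): probability}."""
    ex = sum(x * p for (x, _), p in pmf.items())
    ey = sum(y * p for (_, y), p in pmf.items())
    exy = sum(x * y * p for (x, y), p in pmf.items())
    return exy - ex * ey

print(cov_from_cv(parts, t, r), cov_from_pmf(joint_pmf(parts, t, r)))  # → 7/16 7/16
```

The agreement of the two computations is an instance of Proposition 11.1; correlation squared can be treated in the same way.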
Let D = {a_1, ..., a_n} be a finite set. The elements of D will be called dimensions. D is called a dimension set, and it is assumed that n = #(D).

Definition 5. (Arities over D) ar_D, the collection of arities over dimension set D, denotes the set of finite non-empty sequences of elements of D without repetition.

Elements of ar_D will serve as arities of probability functions on multi-dimensional event spaces. l(w) denotes the length of w ∈ ar_D.

Definition 6. (Arity family) Given an event space E, and a name P for a probability function, an arity family (for E and P) is a finite subset W of ar_D which is (i) closed under permutation, (ii) closed under taking non-empty subsequences, and (iii) which contains for each d ∈ D the arity (d), that is, the one-dimensional arity consisting of dimension d only.

For each dimension d ∈ D the presence of a sort E_d of events for dimension d is assumed. For simplicity of notation it is assumed that these sorts are identical, so that only a sort E is required.

  P_{(d,u,e,u′)}(y_1, x_1, ..., x_l, y_2, z_1, ..., z_{l′}) = P_{(e,u,d,u′)}(y_2, x_1, ..., x_l, y_1, z_1, ..., z_{l′})   (55)
  P_{(d)}(⊤) = 1                                       (56)
  P_{(d)}(⊥) = 0                                       (57)
  P_{(d,w)}(⊤, x_1, ..., x_n) = P_w(x_1, ..., x_n)     (58)
  P_{(d,w)}(⊥, x_1, ..., x_n) = 0                      (59)
  P_w(x_1, ..., x_n) = |P_w(x_1, ..., x_n)|            (60)
  P_{(d,u)}(x ∨ y, x_1, ..., x_l) = P_{(d,u)}(x, x_1, ..., x_l) + P_{(d,u)}(y, x_1, ..., x_l) − P_{(d,u)}(x ∧ y, x_1, ..., x_l)   (61)

Table 17: PFF_{W,P}: axioms for a probability function family with name P (with d, e ∈ D; w, (d,u), (e,u,d,u′) ∈ W; n = l(w); u, u′ ∈ ar_D ∪ {ε}; l = l(u), l′ = l(u′))

  e :→_a (f :→_b X) = f :→_b (e :→_a X)    (62)

Table 18: Cond_mv: commuting multivariate condition constructors

Definition 7.
A probability function family (denoted PFF_W) for an arity family W ⊆ ar_D consists of a probability function P_w : E^{l(w)} → V for each w ∈ W, such that for all w ∈ W the axioms in Table 17 (taken from [4]) are satisfied.

The axioms of Table 17 correspond to the axioms for a probability function of Table 9 in the one-dimensional case. Because repetition of dimensions within an arity is disallowed, these axioms reduce to what we had already in the case of a single dimension.
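A toy model can make the family axioms of Table 17 concrete. In the sketch below (all names, sample points and weights are illustrative assumptions), a joint weight on pairs of sample points induces P_(d,e) together with its permuted version P_(e,d) and the one-dimensional marginals, so the permutation and marginalisation axioms can be checked directly.

```python
from fractions import Fraction
from itertools import product

# Hypothetical two-dimensional model: dimensions d and e with sample points
# {d1, d2} and {e1, e2}; a joint weight on pairs of sample points induces
# the probability function family of Table 17.
w = {("d1", "e1"): Fraction(1, 2), ("d1", "e2"): Fraction(1, 4),
     ("d2", "e1"): Fraction(1, 8), ("d2", "e2"): Fraction(1, 8)}

D_top = frozenset({"d1", "d2"})   # the top event ⊤ of dimension d
E_top = frozenset({"e1", "e2"})   # the top event ⊤ of dimension e

def P_de(x, y):
    """P_(d,e)(x, y): total weight of the pairs below x and y."""
    return sum(w[(s, u)] for s, u in product(x, y))

def P_ed(y, x):
    """P_(e,d)(y, x): the permuted function carries the same weight (axiom 55)."""
    return P_de(x, y)

def P_d(x):
    """One-dimensional marginal P_d, via axiom (58) with the e-argument set to ⊤."""
    return P_de(x, E_top)

x, y = frozenset({"d1"}), frozenset({"e2"})
assert P_de(x, y) == P_ed(y, x)      # permutation axiom (55)
assert P_de(x, E_top) == P_d(x)      # marginalisation (58), up to permutation
assert P_de(D_top, E_top) == 1       # P_(d,e)(⊤, ⊤) = 1, by (56) and (58)
```

Additivity (61) and non-negativity (60) hold in this model as well, since each P_w is a sum of non-negative weights over disjoint pairs.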
Just as in the one-dimensional case, multivariate conditional values are the elements of sort CV. CV has, besides the embedding v from V into CV (which must meet the requirements of Table 12), for each d ∈ D a constructor − :→_d − of type E × CV → CV. − :→_d − must satisfy the requirements Cond_d which result from Cond in Table 13 by replacing the operator − :→ − by − :→_d − in all equations. In addition to these requirements the equations Cond_mv of Table 18 must be satisfied for all pairs of different a, b ∈ D.

For a specification of the expected value operator it is assumed that d_1, ..., d_n is an enumeration without repetition of D. For each w ∈ W a separate expected value operator E^w_P arises.

  E^w_P(X + Y) = E^w_P(X) + E^w_P(Y)                                    (63)
  E^w_P(x_1 :→_{d_1} (... (x_n :→_{d_n} v(y)) ...)) = P_w(x_1, ..., x_n) · y   (64)

Table 19: EV_{P,w}: axioms for the expected value operator for arity w

Each operator is specified by means of two equations as displayed in Table 19. Given the multi-dimensional expected value operator, corresponding operators for variance, covariance, and correlation can be derived in the usual manner.
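The two equations of Table 19 again admit a computational reading. In the following sketch (a hypothetical encoding for the arity w = (d, e); the weights and names are assumptions), a multivariate conditional value in flat form is a list of ((x, y), value) entries and E^w_P is computed by the scheme E^w_P(Σ_i x_i :→_d (y_i :→_e v(t_i))) = Σ_i P_w(x_i, y_i) · t_i.

```python
from fractions import Fraction
from itertools import product

# Hypothetical encoding for arity w = (d, e): a joint weight on pairs of
# sample points gives P_w, and a multivariate conditional value in flat
# form is a list of ((x, y), value) entries.
w = {("d1", "e1"): Fraction(1, 3), ("d1", "e2"): Fraction(1, 6),
     ("d2", "e1"): Fraction(1, 4), ("d2", "e2"): Fraction(1, 4)}

def P_w(x, y):
    """P_w(x, y): total weight of the pairs below the events x and y."""
    return sum(w[(s, u)] for s, u in product(x, y))

def E_w(flat_mcv):
    """The scheme E^w_P(sum_i x_i :->_d (y_i :->_e v(t_i))) = sum_i P_w(x_i, y_i) * t_i."""
    return sum(P_w(x, y) * t for (x, y), t in flat_mcv)

# X = {d1} :->_d ({e1,e2} :->_e v(2)) + {d2} :->_d ({e1} :->_e v(4))
X = [((frozenset({"d1"}), frozenset({"e1", "e2"})), Fraction(2)),
     ((frozenset({"d2"}), frozenset({"e1"})), Fraction(4))]
print(E_w(X))  # → 2
```

Additivity (63) is immediate in this encoding, since the sum of two flat multivariate conditional values is the concatenation of their entry lists.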
Collecting the equations mentioned thus far for the multi-dimensional setting, the axiom system

  MBPC^W_P = BA + Mb + DO + Sign + ABS + UCV + Sign_cv + Mb_cv + Cond_d (d ∈ D) + PFF_{W,P} + EV_{P,w} (w ∈ W)

is obtained. Completeness of these axiomatisations can be shown with the same methods as for the one-dimensional case.

The design of these structures can be somewhat simplified if for each subset of D at most a single probability function is admitted, having the arguments for the different dimensions in a fixed order. When adopting this alternative, Table 17 needs to be redesigned as follows: permutation axioms are dropped and axioms involving the first argument must be replicated for each argument position.

This paper is a sequel to [4], where a meadow based approach to the equational specification of probability functions was proposed. In [6] probabilistic choice is formalised with the meadow of reals as a number system. The equations in that paper demonstrate, just as well as the equations in Table 9, an attractive compatibility between the requirements of probability calculus and the treatment of division in a meadow.

In [15] an extensive survey is presented of the history leading up to Kolmogorov's choice of axioms, and to Kolmogorov's claim that these axioms are what probability is about. The equations in PFBC_P + PFA_P do not take the sixth axiom into account, however, which asserts that if (e_i)_{i ∈ N} is an infinite descending chain of events such that only ⊥ is below each element of the chain, then lim_{i→∞} P(e_i) = 0. A closer resemblance with Kolmogorov's original axioms is found if the equation in Table 9 is replaced by the conditional equation e ∧ f = ⊥ → P(e ∨ f) = P(e) + P(f). This replacement produces a logically equivalent axiom system. The equation of Table 9 is preferred because it is logically simpler than a conditional equation.

Conditional values play the role of discrete random variables with finite range. By working with conditional values the use of a sample space underlying the event space is avoided, which helps to maintain the style and simplicity of the axiomatisation of probability functions of [4]. Instead of including an additional sort CV, the conditional values might be viewed as an extension of the sort V. A reason for not doing so, however, is to prevent P from taking values of the form, say, P(e) = f :→ v(1/2).

For 1(−) and 0(−) of Table 2 the original notation from [2, 3] is 1(x) = 1_x, resp. 0(x) = 0_x, which notations may still be used as alternatives. The chosen notation is preferable if a sizeable expression is substituted for x. Table 1 makes use of inversive notation. The phrase "inversive notation" was coined in [5], where it stands in contrast with "divisive notation", which involves a two-place division operator symbol. In [5] the equivalence of both notations is discussed. Two-place division is provided as a derived operation in Table 2. Division commonly appears in a plurality of syntactical forms: x : y, x/y, x ÷ y, and the fraction x over y. These diverse forms are not in need of a separate defining equation, just as in the specification of a meadow no mention is made of the existing notational variation for multiplication (viz. x × y, x · y, x.y and xy).

Acknowledgement
Yoram Hirschfeld, Kees Middelburg and Alban Ponse gave useful comments on a previous version of the paper.
References

[1] D. Barber. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012. ISBN 0521518148, 9780521518147. On-line version available at http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.Online (consulted version: 18 June 2013).

[2] J.A. Bergstra, I. Bethke, and A. Ponse. Cancellation meadows: a generic basis theorem and some applications. The Computer Journal, 56(1):3–14, 2013.

[3] J.A. Bergstra, I. Bethke, and A. Ponse. Equations for formally real meadows. Journal of Applied Logic, 13(2) part B:1–23, 2015.

[4] J.A. Bergstra and A. Ponse. Probability functions in the context of signed involutive meadows. In: Recent Trends in Algebraic Development Techniques (Eds. Philip James and Markus Roggenbach), Proc. 23rd IFIP WG 1.2 International Workshop WADT, Springer LNCS 10644, 73–87 (also https://arxiv.org/pdf/1307.5173.pdf), 2017.

[5] J.A. Bergstra and C.A. Middelburg. Inversive meadows and divisive meadows. Journal of Applied Logic, 9(3):203–220, 2011.

[6] J.A. Bergstra and C.A. Middelburg. Probabilistic thread algebra. SACS, 25(2):211–243, 2015.

[7] J.A. Bergstra and J.V. Tucker. The rational numbers as an abstract data type. Journal of the ACM, 54(2), Article 7, 25 pages, April 2007.

[8] D.P. Bertsekas and J.N. Tsitsiklis. Introduction to Probability. Athena Scientific, Nashua, USA, ISBN 978-1-886529-23-6, 2008.

[9] D. Davidson and P. Suppes. A finitistic axiomatization of subjective probability and utility. Econometrica, 24(3):264–275, 1956.

[10] J.Y. Halpern. An analysis of first-order logics of probability. Artificial Intelligence, 46:311–350, 1990.

[11] Khan Academy. Random variables and probability distributions. (Consulted July 9, 2016.)

[12] C.P.J. Koymans and J.L.M. Vrancken. Extending process algebra with the empty process. Electronic report LGPS 1, Dept. of Philosophy, State University of Utrecht, The Netherlands, 1985.

[13] T. Matsuura and S. Saitoh. Matrices and division by zero. Advances in Linear Algebra & Matrix Theory, 6:51–58 (http://dx.doi.org/10.4236/alamt.2016.62007), 2016.

[14] H. Padmanabhan. A self-dual equational basis for Boolean algebras. Canad. Math. Bull., 26(1):9–12, 1983.

[15] G. Shafer and V. Vovk. The sources of Kolmogorov's Grundbegriffe. Statistical Science, 21(1):70–98, 2006.

[16] Wikipedia. https://en.wikipedia.org/wiki/Random_variable (consulted July 9, 2016).
A Random variables
The notion of a random variable plays a central role in many presentations of probability theory. In the presentation of the current paper the role of random variables is played by conditional values (CVs) instead. In this appendix it will be outlined how to view a CV as a random variable, provided that the event space is finite.
A.1 From implicit sample space to explicit sample space
Given event space E, the subset of its domain E_at consisting of atoms, as defined in Paragraph 4.3, can be taken for the corresponding sample space, and then a random variable is supposed to be a function from sample space to values. Viewing E_at as a sample space, for each closed conditional value expression X, the function ⟦X⟧, as specified in Table 14, qualifies as a random variable.

We prefer not to have E_at as a sort because the resulting setting, with E_at as a subsort of E, is not easily reconciled with equational logic. Logical difficulties with the equational logic of subsorts persist in spite of the many works that have been devoted to that particular complication.

Now summation over the sample space E_at is specified as follows. For an event space E and a term t of sort V, Σ*_{α ∈ E_at} t = 0 if there are either none or infinitely many atomic events in ||E||, and otherwise

  Σ*_{α ∈ E_at} t = [a_1/α]t + ... + [a_k/α]t

with a_1, ..., a_k an enumeration without repetition of the atomic events of E. Provided E is finite, the expectation of ⟦X⟧ can be defined by summation over the sample space, using an identity that lies outside first-order equational logic:

  E_P(⟦X⟧) = Σ*_{α ∈ E_at} (⟦X⟧(α) · P(α))

A.2 Random variables in colloquial language
Random variables play a key role in many accounts of probability theory. However, the concept of a random variable seems to be rather informal, and its use is often cast in colloquial language. A common wording states that "a random variable is the outcome of a stochastic process". Complicating an understanding of a random variable, however, is the fact that its mathematical definition, which reads "a function from sample space to reals", makes no reference to any variable or variable name, or to a probability function, or to a stochastic mechanism. In [16] it is asserted about a random variable that it is:

  ... a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense). ... A random variable can take on a set of possible different values (similarly to other mathematical variables), each with an associated probability, in contrast to other mathematical variables.

In [11] a random variable is explained as a mapping from "outcomes" to values which provides quantification, while the main argument put forward for the introduction of a random variable is about the use of its name, and at the same time the suggestion is made that a random variable is linked to a probability function. In [8] it is stated that

  A discrete random variable has an associated probability mass function ...

In the introductory probability refresher of [1] the domain of a variable is said to be the set of states it can take, while the relation between (random) variables and events is explained as follows:

  For our purposes, events are expressions about random variables, such as