Equational Axioms for Expected Value Operators
Jan A. Bergstra
Informatics Institute, University of Amsterdam∗

December 25, 2018
Abstract
An equational axiomatisation of probability functions for one-dimensional event spaces in the language of signed meadows is expanded with conditional values. Conditional values constitute a so-called signed vector meadow. In the presence of a probability function, equational axioms are provided for expected value, variance, covariance, and correlation squared, each defined for conditional values. Finite support summation is introduced as a binding operator on meadows which simplifies formulating requirements on probability mass functions with finite support. Conditional values are related to probability mass functions and to random variables. The definitions are reconsidered in a finite dimensional setting.
Keywords and phrases:
Boolean algebra, signed meadow, vector meadow, probability function, probability mass function, conditional value.
In [4] a proposal is made for a loose algebraic specification of probability functions in the context of signed meadows. The objective of this paper is to proceed on the basis of the results of [4] and to provide an account of some basic elements of probability calculus including probability mass functions, probability functions, expected value operators, variance, covariance, correlation, independence, sample space, and random variable. Ample use is made of the special properties of meadows, most notably 1/0 = 0. In the conventional approach (i) one starts from a sample space S; an event space E is introduced as a subset of the power set of S. (ii) Then probability functions are defined over event spaces and (iii) discrete random variables are introduced as real functions on S with a countable support. Given these ingredients, the conventional definitions of the notions listed above can be given.

∗ Email: [email protected], [email protected]. This is a significantly rewritten and improved version under a new title of a previous report with title "Conditional values in signed meadow based probability calculus" (https://arxiv.org/abs/1609.02812v3).

In Section 2 meadows are discussed and so-called signed vector meadows are introduced. A novel binding operator, called finite support summation, is introduced and examples of its use are provided. In Section 3 the notion of a probability mass function (PMF) with finite support is introduced and its formal specification in the setting of meadows is provided with the help of finite support summation. By default a PMF is assumed to be univariate. Marginalisation is defined as a family of transformations from a PMF with more than one argument to a PMF with a smaller number of arguments.
Expected value and variance are defined as functionals on (univariate) PMFs, and covariance and correlation are defined as functionals on bivariate PMFs. Having developed an account of PMFs independently of axioms for probability functions, Section 4 proceeds with a recall from [4] of the combination of an event space (a Boolean algebra) and a value space (a meadow), and the equational specification of a probability function. Two versions of Bayes' rule are considered and the relative position of these statements w.r.t. the various axioms is examined. A conditional operator is applied to events and the results of the operator are collected in an additional sort CV of so-called conditional values (CVs), which constitutes a so-called finite dimensional vector space meadow. Thinking in terms of outcomes of a probabilistic process one may assume that the process produces as an outcome an entity of some sort. Events from an event space E represent assessments about the outcome. It is plausible that besides Boolean assessments also values, for instance rationals or reals, are considered attributes of an outcome. A CV directly relates values to events. In the presence of a probability function two equations specify the expected value of a CV. From [4] the specification of probability function families relative to an arity family is imported, and in Section 5 a corresponding axiomatisation for expected value operators is provided for the finite dimensional case.

According to [4] the equations of BA + Mb + Sign + ABS + PFBC_P + PFA_P constitute a finite equational basis for the class of Boolean algebra based, real valued probability functions, and the proof theoretic results, viz. soundness and completeness, concerning signed meadows of [2, 3] extend to the case with Boolean algebra based probability functions.
The axiom system BA + Mb + Sign + ABS + PFBC_P + PFA_P is merely a particular formalisation of Kolmogorov's axioms for probability theory phrased in the context of meadows, and the completeness result asserts the completeness of this particular formalisation w.r.t. its standard model. The main result of the paper is to provide an extension of this axiomatisation with conditional values and expected value operators E_P.

By working in first order equational logic, I intend to provide and support a new axiomatic approach to the elementary theory of probability. The objective of formalisation and axiomatisation in this paper is not inherited from an overarching intention to avoid mistakes, as is the conventional rationale for formalisation in computer science. Instead the objective is to use the axiomatic approach to obtain maximal clarity about assumptions, working hypotheses, patterns of reasoning, and patterns of calculation. In order to develop a valid presentation from the point of view of formal logic it may be practical to write all assertions in formal notation and to have only rudimentary basics explained in conventional mathematical terms. And in the presence of formalised fragments of text, adjacent fragments written in conventional mathematical style may appear to lack rigour. Nevertheless a balance with readability is required, thus leaving room for ad hoc conventions. I will not distinguish between names for constants and functions of meadows and their mathematical counterparts. Rather than writing, say, R ⊨ t = r, in cases where ordinary mathematics suggests writing t = r and provided there is no risk of confusion, "t = r" is preferred. On the other hand sort names, e.g. E for events, will be distinguished from the corresponding carriers (e.g.
‖E‖ in the case of E) and a specific probability function intended to serve as an interpretation of P will be referred to as P.

Below equational logic is applied with the following objectives in mind: (i) to demonstrate that an axiomatic approach in terms of equational logic to elementary probability calculus is both feasible and attractive, (ii) to illustrate the compatibility of an axiomatic approach to probability calculus with conventional mathematical style and notation, and (iii) to provide optimal clarity about the assumptions which underlie the various definitions, while (iv) using meadows as a tool throughout the presentation.

Numbers will be viewed as elements of a meadow rather than as elements of a field. For the introduction of meadows and elementary theory about meadows I refer to [7, 2, 3] and the papers cited there. I will copy the tables of equational axioms for meadows and for the sign function which plays a central role below. With (R, s) the expansion of the meadow R with the sign function is denoted.

(x + y) + z = x + (y + z)  (1)
x + y = y + x  (2)
x + 0 = x  (3)
x + (−x) = 0  (4)
(x · y) · z = x · (y · z)  (5)
x · y = y · x  (6)
1 · x = x  (7)
x · (y + z) = x · y + x · z  (8)
(x⁻¹)⁻¹ = x  (9)
x · (x · x⁻¹) = x  (10)

Table 1: Md: axioms for a meadow

x² = x · x  (11)
x/y = x · y⁻¹  (12)
1(x) = x/x  (13)
0(x) = 1 − x/x  (14)
x ⊳ y ⊲ z = 1(y) · x + 0(y) · z  (15)

Table 2: DO: axioms for derived operators

The following completeness result was obtained in [3].

Theorem 1.
A conditional equation in the signature of signed meadows is valid in (R, s) if and only if it is provable from the axiom system Mb + Sign.

The axioms in Table 1 specify the variety of meadows, while Table 2 introduces some function symbols by means of defining equations serving as explicit definitions for derived operations. Table 3 specifies the sign function, and Table 4 introduces the absolute value function. Following [2], a meadow that satisfies the (nonequational) implication IL from Table 5 is called a cancellation meadow.

s(1(x)) = 1(x)  (16)
s(0(x)) = 0(x)  (17)
s(−1) = −1  (18)
s(x⁻¹) = s(x)  (19)
s(x · y) = s(x) · s(y)  (20)
0(s(x) − s(y)) · s(x + y) = 0(s(x) − s(y)) · s(x)  (21)

Table 3: Sign: axioms for the sign operator

|x| = s(x) · x  (22)

Table 4: ABS: defining axiom for the absolute value operator

Let e_1, . . . , e_n be a series of pairwise distinct objects outside the meadow M. The meadow M⟨e_1, . . . , e_n⟩ is defined as a direct sum of copies of M:

M⟨e_1, . . . , e_n⟩ = e_1 M ⊕ . . . ⊕ e_n M

Here the e_i serve as new constants for orthogonal (e_i · e_j = 0 for i ≠ j) idempotents (e_i · e_i = e_i) such that the set {e_1, . . . , e_n} is complete (e_1 + . . . + e_n = 1). Moreover it is assumed that s(e_i) = e_i. Elements of this structure are given by sequences (l_1, . . . , l_n) ∈ Mⁿ representing the object e_1 · l_1 + . . . + e_n · l_n. The meadow operations and sign are performed coordinate-wise, e.g. s(e_1 · l_1 + . . . + e_n · l_n) = e_1 · s(l_1) + . . . + e_n · s(l_n), thus obtaining an n-dimensional vector space over M. For n = 1 the construction is trivial: M⟨e_1⟩ ≅ M. For n >
1, and assuming that M is non-trivial (M ⊭ 0 = 1), the resulting structures are not cancellation meadows, i.e. M⟨e_1, . . . , e_n⟩ ⊭ IL. If the number of idempotents of a meadow is finite it is even, because with the idempotency of e comes that 1 − e is also an idempotent. Thus we may assume that n is even.

x ≠ 0 → x · x⁻¹ = 1

Table 5: IL: inverse law

M(s)⟨e_1, . . . , e_n⟩ is the expansion of M⟨e_1, . . . , e_n⟩ with (new) names for the orthogonal idempotents. Now R(s)⟨e_1, . . . , e_n⟩ ⊨ Mb + Sign + E⟨e_1,...,e_n⟩, where E⟨e_1,...,e_n⟩ captures the mentioned identities involving the e_i: idempotence for the e_i, orthogonality for e_i and e_j with i ≠ j, completeness, and the equations s(e_i) = e_i.

Problem 1.
Is the axiom system Mb + Sign + E⟨e_1,...,e_n⟩ complete for the equational theory of the structure R(s)⟨e_1, . . . , e_n⟩?

The completeness result of [3] seems to carry over without complications, but there are significant details to be adapted. Σ_m denotes the signature for meadows and Σ_{m+s} denotes its extension with name and arity of the sign function. For a structure A and a signature Γ, A|Γ is the reduct of A to Γ. Soundness for Mb + Sign + E⟨e_1,...,e_n⟩ plus completeness in the case of n = 0 implies that the equational theory of the reduced vector meadows (R(s)⟨e_1, . . . , e_n⟩)|Σ_{m+s} is the same for all n. The situation might be different for conditional equations, however:

Problem 2.
Are the conditional equational theories of the structures (R(s)⟨e_1, . . . , e_n⟩)|Σ_{m+s} the same for all n?

With disjunctive assertions (among which IL, written as 1(x) = 0 ∨ 1(x) = 1) discrimination between vector meadows of different dimension is possible. Let φ ≡_def x · x = x ∧ y · y = y ∧ x + y = 1 ∧ x · y = 0 → (x = 0 ∨ y = 0). Then for n ≥ 2: R(s)⟨e_1, . . . , e_n⟩ ⊭ φ, while for n = 0: R(s)⟨⟩ ⊨ φ.

The expression language may be extended with lambda abstraction, thereby introducing λx.t as an expression denoting the function which maps v ∈ V to [v/x]t, i.e. the result of substituting v for x in t. A disadvantage of this approach is that it imports typed λ-calculus, definitely a non-trivial subject. Another option is to use L x.t to represent the same function. Now if y does not occur freely in t, then L y.([y/x]t) constitutes a different representation of the same function, i.e. unlike in the λ-calculus, alpha conversion does not apply to L x.t. In statistical theory Jeffrey's notation t[•], with t[−] a context with zero or more "holes", stands for λx.t[x], with x a fresh variable. Finally, function abstraction may be left implicit when a specific binding mechanism is employed. When FSS is to be applied "to a term t" these four options lead to the different notations Σ⋆(λx.t), Σ⋆(L x.t), Σ⋆ t[•] and Σ⋆_x t, respectively. There is no need to choose a single convention from these four options and below it is supposed to be clear from the context which one of these conventions is used in each particular case.
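The derived operators of Table 2 are easy to experiment with. The following is a minimal sketch in Python, using exact rational arithmetic; the only assumption beyond the tables is that the inverse is totalised by letting 0 be its own inverse, as in meadows:

```python
from fractions import Fraction

def inv(x):
    # Totalised meadow inverse: 0 is its own inverse, so that
    # axiom (10), x·(x·x⁻¹) = x, holds for every x.
    return Fraction(0) if x == 0 else 1 / Fraction(x)

def one(x):   # derived operator 1(x) = x/x: 1 for nonzero x, 0 at 0
    return x * inv(x)

def zero(x):  # derived operator 0(x) = 1 − x/x: 1 at 0, 0 elsewhere
    return 1 - one(x)

def cond(x, y, z):  # conditional x ⊳ y ⊲ z = 1(y)·x + 0(y)·z, axiom (15)
    return one(y) * x + zero(y) * z

# spot-check the inverse axioms (9) and (10) and the conditional (15)
for v in [Fraction(0), Fraction(2), Fraction(-3, 4)]:
    assert inv(inv(v)) == v          # (x⁻¹)⁻¹ = x
    assert v * (v * inv(v)) == v     # x·(x·x⁻¹) = x
assert cond(5, 0, 7) == 7 and cond(5, 3, 7) == 5
```

The helper names `inv`, `one`, `zero` and `cond` are illustrative choices, not notation from the paper.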
Given a meadow M and a term t in which the variable x may or may not occur, it may be useful to determine the sum of all substitution instances (or rather interpretations) [v/x]t with v ∈ ‖M‖. This sum is unambiguously defined, however, if the support in M of L x.t is finite, that is, if there are only finitely many values v ∈ ‖M‖ such that [v/x]t is nonzero. The expression Σ⋆_x t denotes in M the sum of all [v/x]t if at most finitely many of these substitutions [v/x]t yield a non-zero value, and 0 otherwise.

The Σ⋆_x operator will be referred to as finite support summation (FSS). At this stage we have little information about the logical properties of this binding mechanism on terms, but it is semantically unproblematic, being well-defined in each meadow, and it will be used below for presenting several definitions. We first notice some technical facts concerning FSS, assuming the interpretation of equations is performed in an arbitrary cancellation meadow M.

1. L x.t has finite support iff L x.(t/t) has finite support.
2. Σ⋆_x 0(x) = 1.
3. Σ⋆_x 1(x) sums 1 over the non-zero elements of M and therefore vanishes modulo p in a meadow of characteristic p.
4. Σ⋆_x 1(x) = 0 if and only if M is infinite.
5. Σ⋆_x 1(x) = −1 if M is finite.
6. Σ⋆_x (t + 0(x)) = (Σ⋆_x t) + 1 if and only if L x.t has finite support.
7. Σ⋆_x (t + 0(x)) = Σ⋆_x t if and only if L x.t has infinite support.
8. If x ∉ FV(t) then Σ⋆_x (r · t) = (Σ⋆_x r) · t.
9. Σ⋆_x (0(x) · t) = [0/x]t, so in particular Σ⋆_x (0(x) · t) = t when x ∉ FV(t); moreover Σ⋆_x (1(x) · t) = (Σ⋆_x t) − [0/x]t whenever L x.t has finite support.
10. If both L x.t and
L x.r have finite support then (Σ⋆_x t) + (Σ⋆_x r) = Σ⋆_x (t + r).

If, moreover, M is signed:

1. Σ⋆_x 1(x) = 0, because a signed meadow is infinite.
2. Consider the context C[−] with C[X] = 1(Σ⋆_x |X|) ⊳ (Σ⋆_x (X + 0(x)) − Σ⋆_x X) ⊲ 1; then C[t] = 1 if and only if the support of L x.t is nonempty, and otherwise C[t] = 0.
3. C[t] · (Σ⋆_x (t + 0(x)) − (Σ⋆_x t)) · (1 − Σ⋆_x s(t)²) = 0 if and only if the support of L x.t is a singleton.
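Genuine FSS quantifies over all of ‖M‖ and is therefore not computable in general. Still, for terms whose support is known to lie inside a given finite candidate set, the operator can be sketched as follows; the candidate set is an assumption supplied by the caller, not part of the definition, and the helper names are illustrative:

```python
from fractions import Fraction

def zero(x):          # derived operator 0(x): 1 at x = 0, else 0
    return Fraction(1) if x == 0 else Fraction(0)

def fss(term, candidates):
    # Finite support summation, assuming the support of `term`
    # is contained in the finite candidate set; genuine FSS over an
    # infinite meadow is not computable.
    return sum(term(v) for v in candidates)

t = lambda x: zero(x - 1) + zero(x - 3)   # support {1, 3}
r = lambda x: 2 * zero(x - 3)             # support {3}
C = [Fraction(k) for k in range(-5, 6)]   # assumed candidate support

assert fss(t, C) == 2
# fact 6: adding 0(x) raises a finite-support sum by 1
assert fss(lambda x: t(x) + zero(x), C) == fss(t, C) + 1
# fact 10: FSS is additive on finite supports
assert fss(lambda x: t(x) + r(x), C) == fss(t, C) + fss(r, C)
```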
Proposition 1.
L x.t has finite support in Q if and only if it has finite support in R.

Proof. Because Q is a substructure of R, the number of non-zero values of λx.t in Q cannot exceed the number of non-zero values in R, so the "if" part is immediate. Now for "only if" suppose that λx.t has infinitely many non-zero values in R. In [2] it is shown that a non-zero term t(x) is provably equal to a sum of simple fractions, i.e. fractions for which numerator and denominator are each non-zero polynomials. This implies that λx.t is discontinuous at at most finitely many arguments, so that it must be non-zero at some real argument r where it is continuous at the same time. This implies that λx.t is non-zero in some neighbourhood (r − ε, r + ε) of r, so that it is non-zero on the infinitely many rational arguments in this same neighbourhood. It follows that λx.t has infinite support in Q.

Problem 3.
Is there a context C[−] (not involving s) such that for all meadow expressions t without sign and for all cancellation meadows (in particular those with non-zero characteristic) C[t] = 0 if t has empty support and C[t] = 1 otherwise?

Problem 4.
Consider the meadow R enriched with FSS. Is equality between closed terms for this structure computably enumerable, and if so, is it decidable?

Problem 5.
Consider the meadow Q enriched with FSS. Is equality between closed terms for this structure decidable?

The multivariate case of FSS operations requires separate definitions for each number of variables because a stepwise reduction to the definition for the univariate case is unfeasible. To demonstrate this difficulty we consider the bivariate case only, the case with three or more variables following the same pattern. In a meadow M, Σ⋆_{x,y} t produces 0 if for infinitely many pairs of values a, b ∈ ‖M‖ the value of [a/x][b/y]t is nonzero; otherwise it produces the sum of the finitely many nonzero values thus obtained.

The need for expressions of the form Σ⋆_{x,y} t transpires from an elementary example, which demonstrates that a 2-dimensional FSS cannot simply be reduced to a composition of 2 occurrences of a 1-dimensional FSS. Let t(x, y) = 0(x) · 0(y) + 0(1 − x). Because t(1, y) = 1 for all y, t(x, y) is nonzero on infinitely many pairs of values, so that

Σ⋆_{x,y} t(x, y) = 0.

Now notice that Σ⋆_y t(0, y) = 1, Σ⋆_y t(1, y) = 0, and, if x ≠ 0 and x ≠ 1, Σ⋆_y t(x, y) = 0. It follows that

Σ⋆_x Σ⋆_y t(x, y) = 1.

The main application of FSS in this paper is to enable the following definition of what it means for a term to represent a finitely supported probability mass function. Probability mass function will be abbreviated as PMF. Finitely supported PMFs constitute a special case of "arbitrary" PMFs, a more general notion which cannot easily be defined on an arbitrary signed meadow, and which will not be used in the sequel.
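The bivariate counterexample just given can be checked numerically. The following is a small sketch in which finite sample ranges stand in, by assumption, for the infinite quantification over the meadow:

```python
from fractions import Fraction

def zero(x):  # derived operator 0(x): 1 at x = 0, else 0
    return Fraction(1) if x == 0 else Fraction(0)

def t(x, y):  # t(x, y) = 0(x)·0(y) + 0(1 − x)
    return zero(x) * zero(y) + zero(1 - x)

# t(1, y) = 1 for every y, so the joint support is infinite and the
# joint FSS yields 0 by the infinite-support convention.
assert all(t(1, Fraction(y)) == 1 for y in range(-100, 100))

# The iterated sums behave differently: for fixed x the inner support
# in y is finite whenever x differs from 1.
inner = lambda x, ys: sum(t(x, v) for v in ys)
ys = [Fraction(k) for k in range(-10, 11)]
assert inner(Fraction(0), ys) == 1    # corresponds to Σ⋆_y t(0, y) = 1
assert inner(Fraction(2), ys) == 0    # Σ⋆_y t(x, y) = 0 for x not in {0, 1}
```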
Definition 1.
Given a signed meadow M, a pair (t; x) consisting of a term t and a variable x represents a PMF with finite support on M if M ⊨ t = |t| and M ⊨ Σ⋆_x |t| = 1.

When interpreted in R(s), the two requirements of Definition 1 indeed guarantee that the function represented by L x.t is a PMF with finite support according to standard terminology. The property of being a representative of a finitely supported PMF is sensitive to the meadow at hand. For instance consider the expression t given by

t = 0(x² − 2) · ((1 + s(x)) · x + (1 − s(x)) · (2 + x))/4.

In R the function description L x.t represents a finitary PMF. To see this notice that L x.t takes non-zero values only at −√2 and √2, with t(−√2) = 1 − (1/2) · √2 and t(√2) = (1/2) · √2, so that L x.t represents a finitary PMF, while in Q it is not the case that L x.t represents a finitary PMF because t(q) vanishes for all q ∈ Q, with the implication that Σ⋆_x t = 0. On the other hand, when considering t′(x) = t(x) + 0(x) it turns out that L x.t′ represents a finitely supported PMF in Q while it fails to do so in R.

Given a signed cancellation meadow M, a joint PMF with finite support of arity n is a function L x_1, . . . , x_n.F(x_1, . . . , x_n) from Mⁿ to M which satisfies these two conditions:

1. Σ⋆_{x_1,...,x_n} F(x_1, . . . , x_n) = 1, and
2. for all x_1, . . . , x_n ∈ M, F(x_1, . . . , x_n) = |F(x_1, . . . , x_n)|.

For example, assuming that information about the graph of a joint PMF with finite support, with the exception of argument vectors for which the result vanishes, is encoded in a set of triples {(y_{1,1}, y_{2,1}, z_1), . . . , (y_{1,n}, y_{2,n}, z_n)}, a corresponding function expression F for the same joint PMF with key variables x_1 and x_2 is as follows:

F(x_1, x_2) = Σ_{i=1}^{n} 0((x_1 − y_{1,i})² + (x_2 − y_{2,i})²) · z_i.

Given a finitely supported joint PMF G with n variables x_1, . . . , x_n, marginalisation can be defined for each subset x_{i_1}, . . . , x_{i_k} with 1 ≤ i_1 < . . . < i_k ≤ n. Let x_{j_1}, . . . , x_{j_{n−k}} be an enumeration without repetition of the variables in x_1, . . . , x_n that are not listed in x_{i_1}, . . . , x_{i_k}; then G^{(i_1,...,i_k)} represents a joint PMF with k variables x_{i_1}, . . . , x_{i_k} as follows:

G^{(i_1,...,i_k)}(x_{i_1}, . . . , x_{i_k}) = Σ⋆_{x_{j_1},...,x_{j_{n−k}}} G(x_1, . . . , x_n).
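Marginalisation as just defined can be made concrete with a hypothetical dict-based encoding of a finitely supported bivariate PMF; the representation and helper name are illustrative choices, not notation from the paper:

```python
from fractions import Fraction

# A bivariate PMF with finite support, stored as {(x, y): G(x, y)}
G = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 6), (1, 1): Fraction(1, 3),
}
assert sum(G.values()) == 1

def marginal(G, keep):
    # Sum out every key position except `keep` (0 for x, 1 for y),
    # i.e. G^(1)(x) = Σ⋆_y G(x, y) and G^(2)(y) = Σ⋆_x G(x, y).
    out = {}
    for point, p in G.items():
        k = point[keep]
        out[k] = out.get(k, Fraction(0)) + p
    return out

G1 = marginal(G, 0)
G2 = marginal(G, 1)
assert G1 == {0: Fraction(1, 2), 1: Fraction(1, 2)}
assert G2 == {0: Fraction(5, 12), 1: Fraction(7, 12)}
```

Both marginalisations again sum to 1, as the definition requires.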
E_pmf(F) = Σ⋆_x (x · F(x))  (expected value of F)
VAR_pmf(F) = Σ⋆_x (x² · F(x)) − E_pmf(F)²  (variance of F)
COV_pmf(G) = Σ⋆_{x,y} (x · y · G(x, y)) − E_pmf(G^{(1)}) · E_pmf(G^{(2)})  (covariance of G)
CORR²_pmf(G) = COV_pmf(G)² / (VAR_pmf(G^{(1)}) · VAR_pmf(G^{(2)}))  (correlation of G squared)

Table 6: expected value, (co)variance, and correlation

For a bivariate PMF G(x, y) independence is defined as independence of its two marginalisations:

IND(G) ≡_def ∀x, y ∈ V. G(x, y) = G^{(1)}(x) · G^{(2)}(y).

Now F(x) is assumed to be a term representing a finite support PMF with x as the key variable, while G(x, y) represents a joint PMF with finite support with x as the first and y as the second key variable. Two PMFs G^{(1)} and G^{(2)} are derived from G by marginalisation: G^{(1)}(x) = Σ⋆_y G(x, y) and G^{(2)}(y) = Σ⋆_x G(x, y). The expected value E_pmf(F) of F and related operations are given in Table 6. The square of the correlation is included in order not to burden the present exposition with the equational specification of a square root operator. In the context of meadows the square root can be made total, and then equationally specified, by writing √(−x) = −√x (see [2]). The completeness result of Theorem 1 carries over in the presence of the square root function.

These definitions admit a justification on the basis of the conventional use of the defined terminology, the details of which are worth mentioning. Given a PMF F with finite support, its support, say S, may be viewed as a sample space so that in conventional terminology id_S, the identity function of type S → R, qualifies as a random variable, say X. The powerset of S serves as an event space, say E_S. Let the probability function P be generated by P({s}) = F(s) for s ∈ S. Now P(X = x) = F(x) and E_pmf(F) = E_P(X) = Σ_{s∈S} (X(s) · P(X = s)) = Σ⋆_x (x · F(x)).
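The operators of Table 6 can be sketched directly on the dict encoding used above; the representation and the helper names (`E`, `VAR`, `cov`, `corr_sq`) are illustrative assumptions, not notation from the paper:

```python
from fractions import Fraction

# A bivariate PMF with finite support; independent by construction,
# since G(x, y) = F1(x)·F2(y).
F1 = {0: Fraction(1, 4), 2: Fraction(3, 4)}
F2 = {1: Fraction(1, 2), 3: Fraction(1, 2)}
G  = {(x, y): F1[x] * F2[y] for x in F1 for y in F2}

E   = lambda F: sum(x * p for x, p in F.items())
VAR = lambda F: sum(x * x * p for x, p in F.items()) - E(F) ** 2

def cov(G, G1, G2):
    # COV(G) = Σ⋆_{x,y} x·y·G(x,y) − E(G^(1))·E(G^(2))
    return sum(x * y * p for (x, y), p in G.items()) - E(G1) * E(G2)

def corr_sq(G, G1, G2):
    # CORR²(G) = COV(G)² / (VAR(G^(1))·VAR(G^(2))), with meadow
    # division: a zero denominator yields 0.
    d = VAR(G1) * VAR(G2)
    return Fraction(0) if d == 0 else cov(G, G1, G2) ** 2 / d

assert E(F1) == Fraction(3, 2) and E(F2) == 2
assert VAR(F1) == Fraction(3, 4) and VAR(F2) == 1
assert cov(G, F1, F2) == 0        # independence forces zero covariance
assert corr_sq(G, F1, F2) == 0
```

Since G factors as F1 · F2, IND(G) holds and both the covariance and the squared correlation vanish, as expected.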
From [4] I will recall equations for Boolean algebras, (signed) meadows, and probability functions.

A Boolean algebra (B, +, ·, −, 0, 1) may be defined as a system with at least two elements such that, for all x, y, z ∈ B, the well-known postulates of Boolean algebra are valid. In order to avoid overlap with the operations of a meadow, Boolean algebras are here equipped with notation from propositional logic: thus consider (B, ∨, ∧, ¬, ⊤, ⊥) and adopt the axioms as presented in Table 7. In [14] it was shown that the axioms in Table 7 constitute an equational basis for the equational theory of Boolean algebras.

(x ∨ y) ∧ y = y  (23)
(x ∧ y) ∨ y = y  (24)
x ∧ (y ∨ z) = (y ∧ x) ∨ (z ∧ x)  (25)
x ∨ (y ∧ z) = (y ∨ x) ∧ (z ∨ x)  (26)
x ∧ ¬x = ⊥  (27)
x ∨ ¬x = ⊤  (28)

Table 7: BA: a self-dual equational basis for Boolean algebras

P(⊤) = 1  (29)
P(⊥) = 0  (30)
P(x) = |P(x)|  (31)

Table 8: PFBC_P: boundary conditions for a named probability function

In the setting of probability functions the elements of the underlying Boolean algebra are referred to as events. Events are closed under − ∨ −, which represents alternative occurrence, and − ∧ −, which represents simultaneous occurrence. The term "value" will refer to an element of a cancellation meadow, mainly the meadow of reals and the meadow of rationals. A probability function maps events to the values in a signed meadow. An expression of sort E is an event expression or an event term; an expression of type V is a value expression or, equivalently, a value term. In this paper considerations are limited to structures involving a single name for a probability function only, the function symbol P, at least in the 1-dimensional case. Table 8 provides axioms that determine generally agreed boundary conditions for a probability function. Table 9 contains the axiom for additivity that is included in the axiomatisation of [4].
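For a finite event space the boundary conditions of Table 8 and the additivity axiom of Table 9 can be checked exhaustively. A sketch over the powerset of a three-element outcome set, with assumed point weights; set union and intersection play the roles of ∨ and ∧:

```python
from fractions import Fraction
from itertools import chain, combinations

S = frozenset({'a', 'b', 'c'})                      # ⊤; the empty set is ⊥
events = [frozenset(c) for c in
          chain.from_iterable(combinations(sorted(S), r)
                              for r in range(len(S) + 1))]

w = {'a': Fraction(1, 2), 'b': Fraction(1, 3), 'c': Fraction(1, 6)}

def P(e):
    # probability function generated by the point weights w
    return sum((w[s] for s in e), Fraction(0))

assert P(S) == 1 and P(frozenset()) == 0            # (29) and (30)
assert all(P(e) == abs(P(e)) for e in events)       # P(x) = |P(x)|, (31)
# additivity (Table 9): P(x ∨ y) = P(x) + P(y) − P(x ∧ y)
assert all(P(x | y) == P(x) + P(y) - P(x & y)
           for x in events for y in events)
```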
Together with the axioms for signed meadows and for Boolean algebras we find the following set of axioms: BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P. Table 10 provides explicit definitions of some useful conditional probability operators, made total by choosing a value in case the condition has probability 0.

P(x ∨ y) = P(x) + P(y) − P(x ∧ y)  (32)

Table 9: PFA_P: additivity axiom for a named probability function

P(x | y) = P(x ∧ y)/P(y)  (33)
P′(x | y) = P(x | y) ⊳ P(y) ⊲ 1  (34)
P_s(x | y) = P(x | y) ⊳ P(y) ⊲ P(x)  (35)

Table 10: conditional probability operators

The reader is assumed to be familiar with the concept of a probability function, say P with name P, on an event space E, where P is supposed to comply with the informal Kolmogorov axioms of probability theory. Being based on the availability of real numbers, sets, and measures on sets, the Kolmogorov axioms are more easily understood as providing a mathematical definition, that is a set of requirements governing which functions are considered probability functions, than as constituting a formal system of axioms. The axiom system BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P provides a formalisation of the Kolmogorov axioms for probability functions.

A probability function structure over an event space E is a two-sorted structure having E (events) and V (values) as sorts, with E interpreted by a Boolean algebra and V interpreted as the real numbers, enriched with a probability function P from E to V. The Kolmogorov axioms specify precisely which functions are probability functions. I will assume that V is the domain of the meadow of reals, i.e. that the meadow version of real numbers is used. With EPV(E, R(s), P) the class of probability function structures over a fixed event structure E is denoted, with values taken in ‖R(s)‖. For a specific probability function P the pertinent structure is denoted by EPV(E, R(s), P).
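The totalised conditional probability of Table 10, with meadow division yielding 0 for a zero denominator, can likewise be checked exhaustively on a finite event space; this also covers the equations EQ1 and BR discussed below. The helper names `div` and `Pc` are illustrative assumptions:

```python
from fractions import Fraction
from itertools import chain, combinations

S = ['a', 'b', 'c']
w = {'a': Fraction(1, 2), 'b': Fraction(1, 3), 'c': Fraction(1, 6)}
events = [frozenset(c) for c in
          chain.from_iterable(combinations(S, r) for r in range(4))]

P = lambda e: sum((w[s] for s in e), Fraction(0))
div = lambda a, b: Fraction(0) if b == 0 else a / b   # meadow division: x/0 = 0

Pc = lambda x, y: div(P(x & y), P(y))                 # P(x | y), equation (33)

for x in events:
    for y in events:
        # EQ1: P(x ∧ y)·P(y)·P(y)⁻¹ = P(x ∧ y)
        assert P(x & y) * P(y) * div(1, P(y)) == P(x & y)
        # BR:  P(x | y) = P(y | x)·P(x)/P(y)
        assert Pc(x, y) == div(Pc(y, x) * P(x), P(y))
```

The zero-denominator branch is exercised by the empty event, which is the only event of probability 0 here.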
EPV(BA, R(s), P) denotes the union of all collections EPV(E, R(s), P) for all E with E ⊨ BA. It is apparent from the construction that EPV(E, R(s), P) ⊨ BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P. A completeness result for BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P is taken from [4].

Theorem 2. BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P is sound and complete for the equational theory of EPV(BA, R(s), P).

It is a corollary of the completeness proof in [4] that the same axioms are complete for the class
EPV(BA_f, R(s), P) containing those probability function structures which are expansions of a finite event structure. In [10] first order axioms are provided for probability calculus, and corresponding completeness is shown making use of the completeness result for the first order theory of real numbers, a fact which also underlies the result in [4].

As a comment to the specification of probability functions an excursion to Bayes' rule is worthwhile. First consider the following equation:

P(x ∧ y) · P(y) · P(y)⁻¹ = P(x ∧ y)  (EQ1)

Equation EQ1 follows from BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P. This fact is a consequence of Theorem 2 above. A direct proof reads as follows. Let φ(u, v) ≡ 0(|u| + |v|) · u. Now (R, s) ⊨ φ(u, v) = 0, and using the completeness theorem of [3] one obtains that BA + Mb + Sign ⊢ φ(u, v) = 0. Substituting P(y ∧ x) for u and P(y ∧ ¬x) for v one derives ⊢ φ(P(y ∧ x), P(y ∧ ¬x)) = 0, i.e. 0(|P(y ∧ x)| + |P(y ∧ ¬x)|) · P(y ∧ x) = 0. Moreover 0(|P(y ∧ x)| + |P(y ∧ ¬x)|) · P(y ∧ x) = 0(P(y ∧ x) + P(y ∧ ¬x)) · P(y ∧ x) = 0(P(y)) · P(y ∧ x), from which the required result follows by expanding 0(P(y)).

Bayes' rule, also known as Bayes' theorem, occurs in different forms. The conditional operator P of Table 10 is used for its presentation below. The simplest form of Bayes' rule is an equation here referred to as BR:

P(x | y) = P(y | x) · P(x)/P(y)  (BR)

In [4] it is shown that BR follows from the specification BA + Mb + DO + Sign + ABS + PFBC_P + EQ
1. As it turns out, BR implies equation EQ1. This fact is shown as follows: by substituting x ∧ y for y in BR one obtains P(x | x ∧ y) = (P(x ∧ y | x) · P(x))/P(x ∧ y). Multiplying both sides with P(x ∧ y) gives L = R with L = P(x | x ∧ y) · P(x ∧ y) and R = ((P(x ∧ y | x) · P(x))/P(x ∧ y)) · P(x ∧ y). Now L = (P(x ∧ (x ∧ y))/P(x ∧ y)) · P(x ∧ y) = (P(x ∧ y) · P(x ∧ y))/P(x ∧ y) = P(x ∧ y), and R = (((P((x ∧ y) ∧ x)/P(x)) · P(x))/P(x ∧ y)) · P(x ∧ y) = (P(x ∧ y)/P(x ∧ y)) · P(x ∧ y) · (P(x)/P(x)) = P(x ∧ y) · P(x) · P(x)⁻¹.

Proposition 2.
The axiom system BA + Mb + DO + Sign + ABS + PFBC_P + EQ1 is strictly weaker than BA + Mb + DO + ABS + Sign + PFBC_P + PFA_P.

Proof. Consider a four element event space generated by an atomic event e and choose P as follows: P(⊥) = P(e) = P(¬e) = 0 and P(⊤) = 1. The equations of PFBC_P and EQ1 are satisfied while PFA_P is not.

This weakness persists if EQ1 is replaced by BR. A second and equally well-known form of Bayes' rule is BRs from Table 11. BR follows from BA + PFBC_P + BRs by taking z = ⊤.

Proposition 3. BA + PFBC_P + BRs implies
PFA_P.

Proof. It suffices to derive the following equation EQ2:

P(y) = P(y ∧ z) + P(y ∧ ¬z)  (EQ2)

This suffices because, according to [4], EQ2 in combination with BA + Mb + DO + Sign + ABS + PFBC_P entails PFA_P. To this end set x = y in BRs, thereby obtaining

P(y | y) = (P(y | y) · P(y))/(P(y | z) · P(z) + P(y | ¬z) · P(¬z)).

To derive EQ2, notice P(y | y) = P(y ∧ y)/P(y) = P(y)/P(y), and take the inverse on both sides, thus obtaining L = R with L = P(y)/P(y) and R = (P(y | z) · P(z) + P(y | ¬z) · P(¬z))/P(y). Then multiplying L and R with P(y) yields L · P(y) = R · P(y). Now L · P(y) = (P(y)/P(y)) · P(y) = P(y) and R · P(y) = ((P(y | z) · P(z) + P(y | ¬z) · P(¬z))/P(y)) · P(y) = ((P(y ∧ z)/P(z)) · P(z) + (P(y ∧ ¬z)/P(¬z)) · P(¬z)) · (P(y)/P(y)) = (P(y ∧ z) + P(y ∧ ¬z)) · (P(y)/P(y)) = P(y ∧ z) · (P(y)/P(y)) + P(y ∧ ¬z) · (P(y)/P(y)) = P(y ∧ z) + P(y ∧ ¬z).

P(x | y) = P(y | x) · P(x)/(P(y | z) · P(z) + P(y | ¬z) · P(¬z))  (BRs)

Table 11: PFA′_P: alternative axiom for additivity

v(−x) = −v(x)  (36)
v(x⁻¹) = v(x)⁻¹  (37)
v(x + y) = v(x) + v(y)  (38)
v(x · y) = v(x) · v(y)  (39)
v(s(x)) = s(v(x))  (40)

Table 12: UCV: axioms for unconditional values; x, y range over V.

It may be concluded that BRs as given in Table 11 provides an adequate substitute for PFA_P. This observation suggests an alternative axiomatisation BA + Mb + DO + Sign + ABS + PFBC_P + PFA′_P based on BRs. For BR, however, there seems to be no role as an axiom in the axiomatic framework of this paper. For instance one may wonder if BR provides an implicit definition of conditional probability.

Proposition 4.
It is not the case that, in the presence of BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P, though in the absence of the definitions of Table 10, BR serves as an implicit definition of P.

Proof. Let Q(x, y) = 1(P(y)) · P(x). Then Q(x, y) differs from P(x | y) in all but exceptional cases. However, Q(−, −) satisfies BR considered as a requirement on P(− | −): Q(y, x) · P(x)/P(y) = 1(P(x)) · P(y) · P(x)/P(y) = 1(P(x)) · 1(P(y)) · P(x) = 1(P(y)) · P(x) = Q(x, y).

A third sort named CV containing so-called conditional values will be introduced. CV is generated by an embedding v : V → CV and a conditional operator − :→ − : E × CV → CV. CV is equipped with all meadow operations, while v(0) serves as 0 and v(1) serves as 1. A specification is given by combining (i) the axioms UCV of Table 12 with (ii) Mb_cv = Mb[v(0)/0, v(1)/1], i.e. the equations of Table 1, however with variables X, Y, Z now ranging over CV, and with v(0) substituted for 0 and v(1) substituted for 1, (iii) Sign_cv, the equations of Table 3, but now with its variables ranging over CV, and (iv) the specification Cond of the conditional operator − :→ − : E × CV → CV as specified in Table 13.

⊤ :→ X = X  (41)
⊥ :→ X = v(0)  (42)
e :→ (X + Y) = (e :→ X) + (e :→ Y)  (43)
e :→ (X · Y) = (e :→ X) · Y  (44)
e :→ (−X) = −(e :→ X)  (45)
e :→ (X⁻¹) = (e :→ X)⁻¹  (46)
(e ∨ f) :→ X = (e :→ X) + (f :→ X) − (e ∧ f :→ X)  (47)
e ∧ f :→ X = e :→ (f :→ X)  (48)
s(e :→ X) = e :→ s(X)  (49)

Table 13: Cond: axioms for the conditional operator

Given a Boolean algebra E and a signed meadow M(s) there is a free term algebra CV(E, M) of elements for CV generated from E and M(s). The three sorted expansion ECV(E, M(s), CV(E, M(s))) of M(s) and E includes a sort CV, the conditional operator on E × CV, and the embedding v from V into CV. For a Boolean algebra E the subset E_at
consists of the atomic elements of ||E||, where a ∈ ||E|| is atomic if a ≠ ⊥ and whenever, for b and c in ||E||, E |= ¬a ∨ (b ∨ c) = ⊤, then E |= ¬a ∨ b = ⊤ or E |= ¬a ∨ c = ⊤ (that is, a ≤ b ∨ c implies a ≤ b or a ≤ c). E_at contains the maximally consistent elements of the Boolean algebra.

To each closed term X of type CV of the extended signature a mapping ⟦X⟧ : E_at → V is assigned, with the rules of Table 14. The equivalence relation ≡_at on closed CV terms is given by X ≡_at Y ⟺ ∀a ∈ E_at (⟦X⟧(a) = ⟦Y⟧(a)). ≡_at is a congruence relation which meets all requirements imposed by UCV + Sign_cv + Mb_cv + Cond, and CV(E, M) can be defined as the free term algebra for sort CV in the extended signature modulo ≡_at. This construction guarantees the consistency of the given construction of the structure for CV, as for arbitrary a ∈ E_at: ⟦v(0)⟧(a) = 0 ≠ 1 = ⟦v(1)⟧(a).

Proposition 5. If M is nontrivial (that is, M |= 0 ≠ 1) and ||E|| has more than two elements then CV(E, M) is not a cancellation meadow (that is, CV(E, M) ⊭ X ≠ 0 → X · X⁻ = 1).

Proof. The proof works by finding an X which differs from v(0) modulo ≡_at and such that X · X⁻ differs from v(1) modulo ≡_at. Indeed, if ||E|| has more than two elements then E_at contains an atom a with a ≠ ⊤. Now a :→ v(1) violates IL (the inverse law). First notice that ⟦⊥ :→ v(1)⟧(a) = 0 ≠ 1 = ⟦a :→ v(1)⟧(a), so that ⊥ :→ v(1) ≢_at a :→ v(1), and similarly, by application to an atom below ¬a, that a :→ v(1) ≢_at ⊤ :→ v(1). Now (a :→ v(1))⁻ = a :→ v(1)⁻ = a :→ v(1⁻) = a :→ v(1), whence (a :→ v(1)) · (a :→ v(1))⁻ = (a :→ v(1)) · (a :→ v(1)) = a :→ v(1) ≢_at ⊤ :→ v(1) (= v(1)).

Definition 2.
An expression X = e_1 :→ v(t_1) + ... + e_n :→ v(t_n) of type CV is a flat CV expression.

  ⟦v(m)⟧(a) = m
  ⟦−t⟧(a) = −(⟦t⟧(a))
  ⟦t⁻⟧(a) = (⟦t⟧(a))⁻
  ⟦t + r⟧(a) = ⟦t⟧(a) + ⟦r⟧(a)
  ⟦t · r⟧(a) = ⟦t⟧(a) · ⟦r⟧(a)
  ⟦e :→ t⟧(a) = ⟦t⟧(a), if E |= ¬a ∨ e = ⊤
  ⟦e :→ t⟧(a) = 0, if E |= a ∧ e = ⊥

Table 14: Definition of ⟦t⟧(a) for a ∈ E_at

Definition 3.
A flat CV expression X = e_1 :→ v(t_1) + ... + e_n :→ v(t_n) is non-overlapping if for all 1 ≤ i, j ≤ n with i ≠ j, it is the case that provably e_i ∧ e_j = ⊥.

Definition 4.
Two non-overlapping flat CV expressions are similar if both involve the same collection of conditions, used in the same order.

Proposition 6.
For each closed CV expression X there is a non-overlapping flat CV expression Y such that Mb + DO + Sign + ABS + UCV + Sign_cv + Mb_cv + Cond ⊢ X = Y.

Proposition 7.
For closed CV expressions X and Y, similar non-overlapping flat expressions X′ and Y′ can be found such that Mb + DO + Sign + ABS + UCV + Sign_cv + Mb_cv + Cond ⊢ X = X′ and Y = Y′.

Proposition 8.
If we fix E as some finite minimal event space with E |= ⊤ ≠ ⊥, then the CV expressions generated from E and R constitute a signed vector meadow with dimension #(E_at). If #(E_at) ≥ 2 then the meadow of conditional values is not a cancellation meadow. Instead it is a vector space meadow (see Paragraph 2.1). Elements of the form e :→ v(1), with e ∈ E, are the idempotent elements of CV. CVs e :→ v(1) and f :→ v(1) are orthogonal if and only if e ∧ f = ⊥ in E. If a_1, ..., a_n enumerates E_at without repetition then CV ≅ R(s)⟨e_1, ..., e_n⟩.

Proposition 9.
Given closed C V expressions in flat form X = P ni =1 e i : → v ( t i ) and Y = P mj =1 f j : → v ( r j ) , a flat form representation for X · Y is: P ni =1 P mj =1 ( e i ∧ f j ) : → v ( t i · r j ) . Moreover, if X and Y are non-overlapping then so is the given expression for X · Y . A C V expression, say X , denotes a value which is conditional on an event, that is it dependson the actual event e chosen from E . Therefore CVs are well-suited suited for defining anexpected value, denoted with E P ( X ). The concept of an expectation lies at the basis offurther definitions of probabilistic quantities such as variance, covariance, and correlation.Defining the expected value for a conditional value can be done if a besides a probability16 P ( X + Y ) = E P ( X ) + E P ( Y ) (50) E P ( x : → v ( y )) = P ( x ) · y (51)Table 15: EV P , axioms for the expected value operator, x ranges over E , y over V function, say P , C V expression in flat form is available, say P ni =1 e i : → v ( t i ). E P ( n X i =1 e i : → v ( t i )) = n X i =1 ( P ( e i )) · t i ) . These identities provide an axiom scheme for the function E P : CV → V .Given a probability function structure EPV ( E , R ( s ) , P ) and a CV structure involving thesame event space, say ECV ( E , R ( s ) , CV ( E , R ( s ))) a joint expansion exists. Denoting thejoint expansion with EPCV ( E , R ( s ) , CV ( E , R ( s )) , P ) it can be further expanded with anexpected value operator named E P , interpreted in compliance with the mentioned scheme, toa structure EPCV ( E , R ( s ) , CV ( E , R ( s )) , P , E P ). Together the latter structures constitute aclass of probability structures K ( BA ).Instead of using an axiom scheme, a finite axiomatisation of E P ( − ) is given in Table 15,from which each instance of the scheme can be derived. 
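The axiom scheme for E_P admits a direct computational reading. The following Python sketch is a toy model, not part of the formal development: the atom names, the rational weights, and the encoding of a flat CV expression as a list of (event, value) pairs are all illustrative assumptions, with exact rational arithmetic standing in for the signed meadow of reals.

```python
from fractions import Fraction

# Toy model (illustrative assumptions): atoms {a, b, c} with rational
# weights play the sample points, an event is a set of atoms, and a flat
# CV expression  e_1 :-> v(t_1) + ... + e_n :-> v(t_n)  is encoded as a
# list of (event, value) pairs.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def prob(event):
    """P(e): the sum of the weights of the atoms below e."""
    return sum(weights[atom] for atom in event)

def expected_value(flat_cv):
    """The scheme E_P(sum_i e_i :-> v(t_i)) = sum_i P(e_i) * t_i."""
    return sum(prob(e) * t for e, t in flat_cv)

# X = {a,b} :-> v(3) + {c} :-> v(9)
X = [(frozenset({"a", "b"}), Fraction(3)),
     (frozenset({"c"}), Fraction(9))]
print(expected_value(X))  # → 4, i.e. (1/2 + 1/3)·3 + (1/6)·9
```

Note that additivity (50) holds trivially in this encoding, since the sum of two flat expressions is the concatenation of their lists of pairs.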
The equations (named EV_P) of Table 15 determine E_P(−) on all CV expressions not involving variables of sort CV.

Grouping together the axioms collected thus far one finds an equational theory (meadow based probability calculus):

  MBPC_P = BA + Mb + DO + Sign + ABS + PFBC_P + PFA_P + UCV + Sign_cv + Mb_cv + Cond + EV_P.

A plausible class of models for MBPC_P is K(BA). With a proof similar to that of Theorem 2, it follows that MBPC_P is complete for such equations w.r.t. validity in K(BA).

E_P can be eliminated from expressions of sort V without free variables of sort CV. Therefore an expression of sort V without free variables of sort CV is provably equal within MBPC_P to an expression not involving subterms of sort CV.

On the basis of a definition of expectation, variance, covariance, and correlation on conditional values can be introduced as derived operators as in Table 16. Let X and Y be CV expressions with flat forms X = Σ_{i=1}^{n} e_i :→ v(t_i) and Y = Σ_{i=1}^{m} f_i :→ v(r_i). The equations in Table 16 provide explicit definitions of variance, covariance, and correlation (squared) for X, resp. Y.

  VAR_P(X) = E_P(X · X) − (E_P(X))²                            (52)
  COV_P(X, Y) = E_P(X · Y) − E_P(X) · E_P(Y)                   (53)
  CORR_sq_P(X, Y) = (COV_P(X, Y))² / (VAR_P(X) · VAR_P(Y))     (54)

Table 16: axioms for variance, covariance, and correlation squared for conditional values

There is no novelty to these definitions except for the effort made to make each definition fit a framework that has been set up on the basis of an algebraic specification. By proceeding in this manner an axiomatic framework is obtained for equational reasoning about each of these technical notions.

Forgetting the subscript, that is, using E(X) instead of E_P(X), and similarly for the other operators, is common practice in probability theory. Doing so, however, requires that it is apparent from the context which probability function is used. Moreover it must be assumed that for X and for Y the same probability function applies.

Given a conditional value X = Σ_{i=1}^{n} e_i :→ v(t_i) in non-overlapping flat form, and a probability function P, the probability mass function λx.P(X = x) for X is supposed to yield for each value x the probability that X takes value x. An explicit definition for the PMF of X is as follows:

  Pmf_P(X) = λx ∈ V. Σ_{i=1}^{n} (0(t_i − x) · P(e_i)).

This specification of
Pmf_P is schematic and for that reason does not achieve the simplicity found for the expected value operator.

Problem 6.
Can Pmf_P be specified by means of a fixed and finite number of equations rather than with an axiom scheme involving an equation for each non-overlapping closed CV expression?

Proposition 10.
Equivalence of the definitions of expectation and variance for CV expressions in non-overlapping flat form via (joint) PMF extraction:

1. E_P(X) = E_pmf(Pmf_P(X)),
2. VAR_P(X) = VAR_pmf(Pmf_P(X)).

Proof.
Let X = Σ_{i=1}^{n} e_i :→ v(t_i) be a non-overlapping flat CV expression. Making use of the facts listed in Paragraph 2.3, one obtains:

  E_pmf(λx.P(X = x)) = Σ*_x (x · Σ_{i=1}^{n} (0(t_i − x) · P(e_i)))
                     = Σ_{i=1}^{n} Σ*_x (x · 0(t_i − x) · P(e_i))
                     = Σ_{i=1}^{n} (t_i · P(e_i))
                     = E_P(X).

Two conditional values are event sharing if both have conditions over the same domain. Extraction of a joint PMF from event sharing conditional values works as follows. Given two CVs X and Y with similar non-overlapping flat forms Σ_{i=1}^{n} (e_i :→ v(t_i)) and Σ_{i=1}^{n} (e_i :→ v(r_i)), the joint PMF for these conditional values, denoted by P(X = x, Y = y), is defined by

  P(X = x, Y = y) = Σ_{i=1}^{n} (0(t_i − x) · 0(r_i − y) · P(e_i)).

Extending Proposition 10, the following connections between definitions involving a conditional value and definitions involving a PMF or a joint PMF can be found.
Proposition 11.
Equivalence of the definitions of covariance and correlation (squared) via CVs and via (joint) PMFs:

1. COV_P(X, Y) = COV_pmf(λx, y. P(X = x, Y = y)),

2. CORR_sq_P(X, Y) = CORR_sq_pmf(λx, y. P(X = x, Y = y)).

In the multi-dimensional case the event space is considered a product of event spaces. CVs occurring in a vector of CVs are by default supposed not to be event space sharing, and the notion of a joint probability function working over a tuple of event spaces enters the picture.

The multi-dimensional case becomes relevant once tuples (vectors) of CVs are considered in combination with a plurality of joint probability functions for product spaces of higher-dimensional event spaces corresponding to various vectors of CVs, such that there may not exist a joint probability function for the full product space.
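Before turning to the multi-dimensional setting, the PMF-based equivalences of Propositions 10 and 11 can be checked on a small example. The sketch below uses a hypothetical encoding (atoms with rational weights; two similar non-overlapping flat CV expressions over the same partition) and confirms that covariance computed directly from the CVs agrees with covariance computed from the extracted joint PMF.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical toy model: X = sum_i e_i :-> v(t_i) and Y = sum_i e_i :-> v(r_i)
# are similar non-overlapping flat CV expressions over the partition e_1, e_2, e_3.
weights = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

def prob(event):
    return sum(weights[atom] for atom in event)

parts = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"})]
t = [Fraction(1), Fraction(2), Fraction(2)]   # values of X
r = [Fraction(0), Fraction(1), Fraction(3)]   # values of Y

def joint_pmf(parts, t, r):
    """P(X=x, Y=y) = sum_i 0(t_i - x) * 0(r_i - y) * P(e_i)."""
    p = defaultdict(Fraction)
    for e, ti, ri in zip(parts, t, r):
        p[(ti, ri)] += prob(e)
    return p

def cov_from_cv(parts, t, r):
    """COV_P(X, Y) = E_P(X·Y) - E_P(X)·E_P(Y), computed from the flat forms."""
    ex = sum(prob(e) * ti for e, ti in zip(parts, t))
    ey = sum(prob(e) * ri for e, ri in zip(parts, r))
    exy = sum(prob(e) * ti * ri for e, ti, ri in zip(parts, t, r))
    return exy - ex * ey

def cov_from_pmf(pmf):
    """COV_pmf of a joint PMF given as a dict {(x, y): probability}."""
    ex = sum(x * p for (x, _), p in pmf.items())
    ey = sum(y * p for (_, y), p in pmf.items())
    exy = sum(x * y * p for (x, y), p in pmf.items())
    return exy - ex * ey

print(cov_from_cv(parts, t, r), cov_from_pmf(joint_pmf(parts, t, r)))  # → 7/16 7/16
```

The agreement of the two computations is an instance of Proposition 11.1; correlation squared can be treated in the same way.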
Let D = {a_1, ..., a_n} be a finite set. The elements of D will be called dimensions. D is called a dimension set, and it is assumed that n = #(D).

Definition 5. (Arities over D) ar_D, the collection of arities over dimension set D, denotes the set of finite non-empty sequences of elements of D without repetition.

Elements of ar_D will serve as arities of probability functions on multi-dimensional event spaces. l(w) denotes the length of w ∈ ar_D.

Definition 6. (Arity family) Given an event space E, and a name P for a probability function, an arity family (for E and P) is a finite subset W of ar_D which is (i) closed under permutation, (ii) closed under taking non-empty subsequences, and (iii) which contains for each d ∈ D the arity (d), that is, the one-dimensional arity consisting of dimension d only.

For each dimension d ∈ D the presence of a sort E_d of events for dimension d is assumed. For simplicity of notation it is assumed that these sorts are identical, so that only a sort E is required.

  P_{(d,u,e,u′)}(y_1, x_1, ..., x_l, y_2, z_1, ..., z_{l′}) = P_{(e,u,d,u′)}(y_2, x_1, ..., x_l, y_1, z_1, ..., z_{l′})   (55)
  P_{(d)}(⊤) = 1                                       (56)
  P_{(d)}(⊥) = 0                                       (57)
  P_{(d,w)}(⊤, x_1, ..., x_n) = P_w(x_1, ..., x_n)     (58)
  P_{(d,w)}(⊥, x_1, ..., x_n) = 0                      (59)
  P_w(x_1, ..., x_n) = |P_w(x_1, ..., x_n)|            (60)
  P_{(d,u)}(x ∨ y, x_1, ..., x_l) = P_{(d,u)}(x, x_1, ..., x_l) + P_{(d,u)}(y, x_1, ..., x_l) − P_{(d,u)}(x ∧ y, x_1, ..., x_l)   (61)

Table 17: PFF_{W,P}: axioms for a probability function family with name P (with d, e ∈ D; w, (d,u), (e,u,d,u′) ∈ W; n = l(w); u, u′ ∈ ar_D ∪ {ε}; l = l(u), l′ = l(u′))

  e :→_a (f :→_b X) = f :→_b (e :→_a X)    (62)

Table 18: Cond_mv: commuting multivariate condition constructors

Definition 7.
A probability function family (denoted PFF_W) for an arity family W ⊆ ar_D consists of a probability function P_w : E^{l(w)} → V for each w ∈ W, such that for all w ∈ W the axioms in Table 17 (taken from [4]) are satisfied.

The axioms of Table 17 correspond to the axioms for a probability function of Table 9 in the one-dimensional case. Because repetition of dimensions within an arity is disallowed, these axioms reduce to what we had already in the case of a single dimension.
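A toy model can make the family axioms of Table 17 concrete. In the sketch below (all names, sample points and weights are illustrative assumptions), a joint weight on pairs of sample points induces P_(d,e) together with its permuted version P_(e,d) and the one-dimensional marginals, so the permutation and marginalisation axioms can be checked directly.

```python
from fractions import Fraction
from itertools import product

# Hypothetical two-dimensional model: dimensions d and e with sample points
# {d1, d2} and {e1, e2}; a joint weight on pairs of sample points induces
# the probability function family of Table 17.
w = {("d1", "e1"): Fraction(1, 2), ("d1", "e2"): Fraction(1, 4),
     ("d2", "e1"): Fraction(1, 8), ("d2", "e2"): Fraction(1, 8)}

D_top = frozenset({"d1", "d2"})   # the top event ⊤ of dimension d
E_top = frozenset({"e1", "e2"})   # the top event ⊤ of dimension e

def P_de(x, y):
    """P_(d,e)(x, y): total weight of the pairs below x and y."""
    return sum(w[(s, u)] for s, u in product(x, y))

def P_ed(y, x):
    """P_(e,d)(y, x): the permuted function carries the same weight (axiom 55)."""
    return P_de(x, y)

def P_d(x):
    """One-dimensional marginal P_d, via axiom (58) with the e-argument set to ⊤."""
    return P_de(x, E_top)

x, y = frozenset({"d1"}), frozenset({"e2"})
assert P_de(x, y) == P_ed(y, x)      # permutation axiom (55)
assert P_de(x, E_top) == P_d(x)      # marginalisation (58), up to permutation
assert P_de(D_top, E_top) == 1       # P_(d,e)(⊤, ⊤) = 1, by (56) and (58)
```

Additivity (61) and non-negativity (60) hold in this model as well, since each P_w is a sum of non-negative weights over disjoint pairs.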
Just as in the one-dimensional case, multivariate conditional values are the elements of sort CV. CV has, besides the embedding v from V into CV (which must meet the requirements of Table 12), for each d ∈ D a constructor − :→_d − of type E × CV → CV. − :→_d − must satisfy the requirements Cond_d which result from Cond in Table 13 by replacing the operator − :→ − by − :→_d − in all equations. In addition to these requirements the equations Cond_mv of Table 18 must be satisfied for all pairs of different a, b ∈ D.

For a specification of the expected value operator it is assumed that d_1, ..., d_n is an enumeration without repetition of D. For each w ∈ W a separate expected value operator E^w_P arises.

  E^w_P(X + Y) = E^w_P(X) + E^w_P(Y)                                    (63)
  E^w_P(x_1 :→_{d_1} (... (x_n :→_{d_n} v(y)) ...)) = P_w(x_1, ..., x_n) · y   (64)

Table 19: EV_{P,w}: axioms for the expected value operator for arity w

Each operator is specified by means of two equations as displayed in Table 19. Given the multi-dimensional expected value operator, corresponding operators for variance, covariance, and correlation can be derived in the usual manner.
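The two equations of Table 19 again admit a computational reading. In the following sketch (a hypothetical encoding for the arity w = (d, e); the weights and names are assumptions), a multivariate conditional value in flat form is a list of ((x, y), value) entries and E^w_P is computed by the scheme E^w_P(Σ_i x_i :→_d (y_i :→_e v(t_i))) = Σ_i P_w(x_i, y_i) · t_i.

```python
from fractions import Fraction
from itertools import product

# Hypothetical encoding for arity w = (d, e): a joint weight on pairs of
# sample points gives P_w, and a multivariate conditional value in flat
# form is a list of ((x, y), value) entries.
w = {("d1", "e1"): Fraction(1, 3), ("d1", "e2"): Fraction(1, 6),
     ("d2", "e1"): Fraction(1, 4), ("d2", "e2"): Fraction(1, 4)}

def P_w(x, y):
    """P_w(x, y): total weight of the pairs below the events x and y."""
    return sum(w[(s, u)] for s, u in product(x, y))

def E_w(flat_mcv):
    """The scheme E^w_P(sum_i x_i :->_d (y_i :->_e v(t_i))) = sum_i P_w(x_i, y_i) * t_i."""
    return sum(P_w(x, y) * t for (x, y), t in flat_mcv)

# X = {d1} :->_d ({e1,e2} :->_e v(2)) + {d2} :->_d ({e1} :->_e v(4))
X = [((frozenset({"d1"}), frozenset({"e1", "e2"})), Fraction(2)),
     ((frozenset({"d2"}), frozenset({"e1"})), Fraction(4))]
print(E_w(X))  # → 2
```

Additivity (63) is immediate in this encoding, since the sum of two flat multivariate conditional values is the concatenation of their entry lists.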
Collecting the equations mentioned thus far for the multi-dimensional setting, the axiom system

  MBPC^W_P = BA + Mb + DO + Sign + ABS + UCV + Sign_cv + Mb_cv + Cond_d (d ∈ D) + PFF_{W,P} + EV_{P,w} (w ∈ W)

is obtained. Completeness of these axiomatisations can be shown with the same methods as for the one-dimensional case.

The design of these structures can be somewhat simplified if for each subset of D at most a single probability function is admitted, having the arguments for the different dimensions in a fixed order. When adopting this alternative, Table 17 needs to be redesigned as follows: permutation axioms are dropped and axioms involving the first argument must be replicated for each argument position.

This paper is a sequel to [4], where a meadow based approach to the equational specification of probability functions was proposed. In [6] probabilistic choice is formalised with the meadow of reals as a number system. The equations in that paper demonstrate, just as well as the equations in Table 9, an attractive compatibility between the requirements of probability calculus and the treatment of division in a meadow.

In [15] an extensive survey is presented of the history leading up to Kolmogorov's choice of axioms, and to Kolmogorov's claim that these axioms are what probability is about. The equations in PFBC_P + PFA_P do not take the sixth axiom into account, however, which asserts that if (e_i)_{i ∈ N} is an infinite descending chain of events such that only ⊥ is below each element of the chain, then lim_{i→∞} P(e_i) = 0. A closer resemblance with Kolmogorov's original axioms is found if the equation in Table 9 is replaced by the conditional equation e ∧ f = ⊥ → P(e ∨ f) = P(e) + P(f). This replacement produces a logically equivalent axiom system. The equation of Table 9 is preferred because it is logically simpler than a conditional equation.

Conditional values play the role of discrete random variables with finite range. By working with conditional values the use of a sample space underlying the event space is avoided, which helps to maintain the style and simplicity of the axiomatisation of probability functions of [4]. Instead of including an additional sort CV, the conditional values might be viewed as an extension of the sort V. A reason for not doing so, however, is to prevent P from taking values of the form, say, P(e) = f :→ v(1/2).

For 1(−) and 0(−) of Table 2 the original notation from [2, 3] is 1(x) = 1_x, resp. 0(x) = 0_x, which notations may still be used as alternatives. The chosen notation is preferable if a sizeable expression is substituted for x. Table 1 makes use of inversive notation. The phrase "inversive notation" was coined in [5], where it stands in contrast with "divisive notation", which involves a two-place division operator symbol. In [5] the equivalence of both notations is discussed. Two-place division is provided as a derived operation in Table 2. Division commonly appears in a plurality of syntactical forms: x : y, x/y, x ÷ y, and the fraction x over y. These diverse forms are not in need of a separate defining equation, just as in the specification of a meadow no mention is made of the existing notational variation for multiplication (viz. x × y, x · y, x.y and xy).

Acknowledgement
Yoram Hirschfeld, Kees Middelburg and Alban Ponse gave useful comments on a previous version of the paper.
References

[1] D. Barber. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012. ISBN 0521518148, 9780521518147. On-line version available at http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.Online (consulted version: 18 June 2013).

[2] J.A. Bergstra, I. Bethke, and A. Ponse. Cancellation meadows: a generic basis theorem and some applications. The Computer Journal, 56(1):3–14, 2013.

[3] J.A. Bergstra, I. Bethke, and A. Ponse. Equations for formally real meadows. Journal of Applied Logic, 13(2) part B:1–23, 2015.

[4] J.A. Bergstra and A. Ponse. Probability functions in the context of signed involutive meadows. In: Recent Trends in Algebraic Development Techniques (Eds. Philip James and Markus Roggenbach), Proc. 23rd IFIP WG 1.2 International Workshop WADT, Springer LNCS 10644, 73–87 (also https://arxiv.org/pdf/1307.5173.pdf), 2017.

[5] J.A. Bergstra and C.A. Middelburg. Inversive meadows and divisive meadows. Journal of Applied Logic, 9(3):203–220, 2011.

[6] J.A. Bergstra and C.A. Middelburg. Probabilistic thread algebra. SACS, 25(2):211–243, 2015.

[7] J.A. Bergstra and J.V. Tucker. The rational numbers as an abstract data type. Journal of the ACM, 54(2), Article 7, 25 pages, April 2007.

[8] D.P. Bertsekas and J.N. Tsitsiklis. Introduction to Probability. Athena Scientific, Nashua, USA, ISBN 978-1-886529-23-6, 2008.

[9] D. Davidson and P. Suppes. A finitistic axiomatization of subjective probability and utility. Econometrica, 24(3):264–275, 1956.

[10] J.Y. Halpern. An analysis of first-order logics of probability. Artificial Intelligence, 46:311–350, 1990.

[11] Khan Academy. Random variables and probability distributions. (Consulted July 9, 2016.)

[12] C.P.J. Koymans and J.L.M. Vrancken. Extending process algebra with the empty process. Electronic report LGPS 1, Dept. of Philosophy, State University of Utrecht, The Netherlands, 1985.

[13] T. Matsuura and S. Saitoh. Matrices and division by zero. Advances in Linear Algebra & Matrix Theory, 6:51–58 (http://dx.doi.org/10.4236/alamt.2016.62007), 2016.

[14] H. Padmanabhan. A self-dual equational basis for Boolean algebras. Canad. Math. Bull., 26(1):9–12, 1983.

[15] G. Shafer and V. Vovk. The sources of Kolmogorov's Grundbegriffe. Statistical Science, 21(1):70–98, 2006.

[16] Wikipedia. https://en.wikipedia.org/wiki/Random_variable (consulted July 9, 2016).
A Random variables
The notion of a random variable plays a central role in many presentations of probability theory. In the presentation of the current paper the role of random variables is played by conditional values (CVs) instead. In this appendix it will be outlined how to view a CV as a random variable, provided that the event space is finite.
A.1 From implicit sample space to explicit sample space
Given event space E, the subset of its domain E_at consisting of atoms, as defined in Paragraph 4.3, can be taken for the corresponding sample space, and then a random variable is supposed to be a function from sample space to values. Viewing E_at as a sample space, for each closed conditional value expression X, the function ⟦X⟧, as specified in Table 14, qualifies as a random variable.

We prefer not to have E_at as a sort because the resulting setting, with E_at as a subsort of E, is not easily reconciled with equational logic. Logical difficulties with the equational logic of subsorts persist in spite of the many works that have been devoted to that particular complication.

Now summation over the sample space E_at is specified as follows. For an event space E and a term t of sort V, Σ*_{α ∈ E_at} t = 0 if there are either none or infinitely many atomic events in ||E||, and otherwise

  Σ*_{α ∈ E_at} t = [a_1/α]t + ... + [a_k/α]t

with a_1, ..., a_k an enumeration without repetition of the atomic events of E. Provided E is finite, the expectation of ⟦X⟧ can be defined by summation over the sample space, using an identity that lies outside first-order equational logic:

  E_P(⟦X⟧) = Σ*_{α ∈ E_at} (⟦X⟧(α) · P(α))

A.2 Random variables in colloquial language
Random variables play a key role in many accounts of probability theory. However, the concept of a random variable seems to be rather informal, and its use is often cast in colloquial language. A common wording states that "a random variable is the outcome of a stochastic process". Complicating an understanding of a random variable, however, is the fact that its mathematical definition, which reads "a function from sample space to reals", makes no reference to any variable or variable name, or to a probability function, or to a stochastic mechanism. In [16] it is asserted about a random variable that it is:

  ... a variable whose value is subject to variations due to chance (i.e. randomness, in a mathematical sense). ... A random variable can take on a set of possible different values (similarly to other mathematical variables), each with an associated probability, in contrast to other mathematical variables.

In [11] a random variable is explained as a mapping from "outcomes" to values which provides quantification, while the main argument put forward for the introduction of a random variable is about the use of its name, and at the same time the suggestion is made that a random variable is linked to a probability function. In [8] it is stated that

  A discrete random variable has an associated probability mass function ...

In the introductory probability refresher of [1] the domain of a variable is said to be the set of states it can take, while the relation between (random) variables and events is explained as follows:

  For our purposes, events are expressions about random variables, such as