Semantic information and artificial intelligence
Anderson de Araújo
Abstract
For a computational system to be intelligent, it should be able to perform, at least, basic deductions. Nonetheless, since deductions are, in some sense, equivalent to tautologies, it seems that they do not provide new information. The present article proposes a measure of the degree of semantic informativity of valid deductions in a dynamic setting. Concepts of coherency and relevancy, displayed in terms of insertions and deletions on databases, are used to define semantic informativity. In this way, the article shows that a solution to the problem about the informativity of deductions provides a heuristic principle to improve the deductive power of computational systems.
Anderson de Araújo, Federal University of ABC (UFABC), Center for Natural and Human Sciences (CCNH), São Bernardo do Campo, SP, Brazil, e-mail: [email protected]. arXiv [cs.LO].

For Aristotle, “every belief comes either through syllogism or from induction” (Aristotle (1989)). From that, we can infer that every computational system that aspires to exhibit characteristics of intelligence needs to have deductive as well as inductive abilities. With respect to the latter, there are theories that explain why induction is important for artificial intelligence; for instance, Valiant’s probably approximately correct semantics of learning (Valiant (1984, 2008)). Nonetheless, as far as the former is concerned, we have a problem first observed by Hintikka (Hintikka (1973)), which can be stated in the following way:

1. A deduction is valid if, and only if, the conjunction of its premisses, say $\phi_1, \ldots, \phi_n$, implies its conclusion, $\psi$.
2. In this case, $\phi_1 \wedge \cdots \wedge \phi_n \to \psi$ is a tautology, i.e., valid deductions are equivalent to propositions without information.
3. Therefore, deductions are uninformative.

This was called by Hintikka the scandal of deduction. It is a scandal not only because it contradicts Aristotle’s maxim that deductions are important for obtaining beliefs, but, mainly, because we actually obtain information via deductions. For this and other reasons, Floridi has proposed a theory of strong semantic information in which semantic information is true well-formed data (Cf. Floridi (2004)). From this standpoint, Floridi is capable of explaining why some logical formulas are more informative than others. If we want to explain why deductions, not only propositions, are important for knowledge acquisition and intelligent processing, we cannot, however, apply Floridi’s theory. The main reason is that it was designed to measure the static semantic information of the data expressed by propositions.
In contrast, knowledge acquisition and intelligence are dynamic phenomena, associated in some way with the flow of information.

In the present work, we propose to overcome that limitation by defining a measure of semantic information in Floridi’s sense, but in the context of a dynamic perspective about the logical features of databases associated with valid deductions (Section 2). We will restrict ourselves to first-order deductions and will adopt a semantic perspective about them, which means that deductions will be analyzed in terms of structures. There are two reasons for that choice. The first is that the scandal of deduction is usually conceived in terms of structures associated with valid deductions (Cf. Sequoiah-Grayson (2008)). In other words, we have a problem with the semantic informativity of deductions. The second reason is technical: databases are finite structures and so there is, in general, no complete deductive first-order logical system for finite structures (Cf. Ebbinghaus and Flum (1999)).

Besides that, we impose on ourselves the methodological constraint that a good approach to semantic informativity should be robust enough to be applicable to real computational systems. More specifically, we will look for a solution that enables us to link semantic information and artificial intelligence. Because of that, we propose to measure the degree of semantic informativity of deductions as a dynamic phenomenon, based on certain explicit definitions of insertions and deletions on databases. In that context, the concepts of coherency and relevancy will be explained by the operations of insertion and deletion (Section 3), and so semantic informativity will be conceived in terms of relevancy and coherency (Section 4). This approach leads us to a solution to the scandal of deduction (Section 4). Moreover, using straightforward definitions, we show that our definition of semantic informativity provides an immediate heuristic principle to improve the deductive power of computational systems in semantic terms.
We intend to analyze the semantic informativity obtained via deductions. According to Floridi, semantic information is true well-defined data (Floridi (2011)). As far as logic is concerned, we can say that data is in some way expressed by propositions. In general, deductions are composed of two or more propositions. Thus, we do need to consider databases, because they are just organized collections of data (Cf. Kroenke and Auer (2007)). From a logical point of view, the usual notion of database can, in turn, be understood in terms of the mathematical concept of structure.
Definition 2.1 A database is a pair $D = (A, T)$ where $A$ is a finite first-order structure over a signature $S$ and $T$ is a correct (all propositions in $T$ are true in $A$) finite first-order theory about $A$.

Example 2.1
Let $D = (A, T)$ be a database with signature $S = (\{s, l, a\}, \{C, E, H\})$, for $s = \ulcorner \text{São Paulo} \urcorner$, $l = \ulcorner \text{London} \urcorner$, $a = \ulcorner \text{Avenida Paulista} \urcorner$, $C = \ulcorner \text{City} \urcorner$, $E = \ulcorner \text{Street} \urcorner$ and $H = \ulcorner \text{To have} \urcorner$, such that $A = (\{\bar{s}, \bar{l}, \bar{a}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$ and $T = \{\forall x (Cx \to \exists y Hxy), \forall x (Cx \vee Ex), \neg El, Cs\}$.

Remark 2.1
In example 2.1 we have used $\beta = \ulcorner \alpha \urcorner$ to mean that the symbol $\beta$ is a formal representation of the expression $\alpha$. Besides, $X^{\beta}$ is the interpretation of $\beta$ in the structure $A$, and we use a bar above letters to indicate individuals of the domain of $A$.

The fact that the theory $T$ is correct with respect to $A$ does not exclude, however, the possibility that our database does not correspond to reality. In example 2.1, it is true in $A$ that $Hla \wedge Ea$; in words, it is true in $A$ that London has a street called Avenida Paulista, which, as of the date of the present paper, is not true. The theory $T$ represents the fundamental facts of the database that are taken as true, that is to say, they are the beliefs of the database. It is important to observe that $T$ may not be complete about $A$: it is possible that not all true propositions about $A$ are in $T$; example 2.1 shows this.

We turn now to the dynamics of changes in databases that will permit us to measure semantic informativity. We propose that these changes in the structure of databases should preserve the truth of the propositions of their theories through operations that we call structural operations. The first structural operation consists in putting possibly new objects in the structure of the given database and interpreting a possibly new symbol in terms of these objects.

Definition 2.2
Let $D = (A, T)$ be a database over a signature $S$. An insertion of the $n$-ary symbol $\sigma \in S'$ in $D$ is a database $D' = (A', T)$ where $A'$ is a structure over $S' = S \cup \{\sigma\}$ with the following properties:

1. $A'(\tau) = A(\tau)$ for all $\tau \neq \sigma$ such that $\tau \in S$;
2. If $n = 0$, then $A' = A \cup \{a\}$ and $A'(\sigma) = a$, provided that, for all $\phi \in T$, $A' \vDash \phi$;
3. If $n > 0$, then $A' = A \cup \{a_1, \ldots, a_n\}$ and $A'(\sigma) = A(\sigma) \cup \{(a_1, \ldots, a_n)\}$, provided that, for all $\phi \in T$, $A' \vDash \phi$.

Example 2.2
Let $D = (A, T)$ be the database of example 2.1. The database $D_1 = (A_1, T)$ with signature $S' = S \cup \{b\}$, where $b = \ulcorner \text{Shaftesbury Avenue} \urcorner$, and $A_1 = (\{\bar{s}, \bar{l}, \bar{a}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \bar{a}^{b}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$ is an insertion of $b$ in $D$. On the other hand, $D_2 = (A_2, T)$ is an insertion of $E$ in $D_1$, where $A_2 = (\{\bar{s}, \bar{l}, \bar{a}, \bar{b}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \bar{a}^{b}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}, \bar{b}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$ is an $S'$-structure. Nonetheless, for $A^{*} = (\{\bar{s}, \bar{l}, \bar{a}, \bar{b}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \bar{b}^{b}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$, an $S'$-structure, we have that $D^{*} = (A^{*}, T)$ is not an insertion of $b$ in $D$ because in this case $A^{*} \nvDash \forall x (Cx \vee Ex)$.
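The database of example 2.1 and the insertions of example 2.2 can be sketched in Python. This is only an illustrative sketch under stated assumptions: the names `Structure`, `theory`, `is_correct`, `insert_constant` and `insert_tuple` are hypothetical, the barred individuals are written as strings with a trailing underscore, and the four beliefs of $T$ are hand-coded as Boolean checks rather than parsed from first-order formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    domain: frozenset   # individuals of the structure
    constants: dict     # constant symbol -> individual
    relations: dict     # relation symbol -> set of tuples of individuals


def theory(A):
    """Truth values in A of the four beliefs of example 2.1 (hand-coded)."""
    C = {x for (x,) in A.relations["C"]}
    E = {x for (x,) in A.relations["E"]}
    H = A.relations["H"]
    return [
        all(any((x, y) in H for y in A.domain) for x in C),  # forall x (Cx -> exists y Hxy)
        all(x in C or x in E for x in A.domain),              # forall x (Cx or Ex)
        A.constants["l"] not in E,                            # not El
        A.constants["s"] in C,                                # Cs
    ]


def is_correct(A):
    """T is correct with respect to A when every belief is true in A."""
    return all(theory(A))


def insert_constant(A, symbol, element):
    """Insertion of a 0-ary symbol; returns None when some belief fails."""
    B = Structure(A.domain | {element}, {**A.constants, symbol: element}, A.relations)
    return B if is_correct(B) else None


def insert_tuple(A, symbol, tup):
    """Insertion of an n-ary symbol (n > 0); returns None when some belief fails."""
    extended = A.relations.get(symbol, set()) | {tup}
    B = Structure(A.domain | set(tup), A.constants, {**A.relations, symbol: extended})
    return B if is_correct(B) else None


A = Structure(
    frozenset({"s_", "l_", "a_"}),
    {"s": "s_", "l": "l_", "a": "a_"},
    {"C": {("s_",), ("l_",)}, "E": {("a_",)}, "H": {("s_", "a_"), ("l_", "a_")}},
)
A1 = insert_constant(A, "b", "a_")      # b names the existing street
A2 = insert_tuple(A1, "E", ("b_",))     # a new individual is added as a street
A_star = insert_constant(A, "b", "b_")  # rejected: b_ is neither a city nor a street
```

Returning `None` for a rejected insertion is a design choice of this sketch; it mirrors the proviso that every belief in $T$ must remain true in the new structure.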
Example 2.2 shows that it is not necessary to introduce a new object in the structure of the database to make an insertion (Cf. database $D_1$); it is sufficient to add a possibly new element to the interpretation of some symbol. On the other hand, it also shows that it is not sufficient to introduce a new object in the structure of the database to make an insertion (Cf. database $D^{*}$); it is necessary to guarantee that the beliefs of the database are still true in the new structure.

The second structural operation consists in removing possibly old objects from the structure of the database and interpreting a possibly new symbol in terms of the remaining objects in the database.

Definition 2.3
Let $D = (A, T)$ be a database over a signature $S$. A deletion of the $n$-ary symbol $\sigma \in S'$, $S - \{\sigma\} \subseteq S' \subseteq S$, from $D$ is a database $D' = (A', T)$ where $A'$ is a structure over $S'$ with the following properties:

1. $A'(\tau) = A(\tau)$ for all $\tau \neq \sigma$ such that $\tau \in S$;
2. If $n = 0$, $A - \{A(\sigma)\} \subseteq A' \subseteq A$ and $A'(\sigma) \in A'$, provided that, for all $\phi \in T$, $A' \vDash \phi$;
3. If $n > 0$, $A - \{a_1, \ldots, a_n\} \subseteq A' \subseteq A$ and $A'(\sigma) = A(\sigma) - \{(a_1, \ldots, a_n)\}$, provided that, for all $\phi \in T$, $A' \vDash \phi$.

Example 2.3
Let $D = (A, T)$ be the database of example 2.1. The database $D'_1 = (A'_1, T)$ with signature $S$ and $A'_1 = (\{\bar{s}, \bar{l}, \bar{a}\}, \bar{a}^{s}, \bar{l}^{l}, \bar{a}^{a}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$ is a deletion of $s$ in $D$. On the other hand, $D'_2 = (A'_2, T)$ is a deletion of $H$ in $D'_1$, where $A'_2 = (\{\bar{s}, \bar{l}, \bar{a}\}, \bar{a}^{s}, \bar{l}^{l}, \bar{a}^{a}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{l}, \bar{a})\}^{H})$ is an $S$-structure. Nonetheless, for $A'_3 = (\{\bar{l}, \bar{a}\}, \bar{a}^{s}, \bar{l}^{l}, \bar{a}^{a}, \{\bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{l}, \bar{a})\}^{H})$, a structure over the signature $S$, we have that $D'_3 = (A'_3, T)$ is not a deletion of $C$ in $D'_1$ because in this case, in spite of $A'_3 \vDash \phi$ for $\phi \in T$, we have that $D'_3(H) \neq D'_1(H)$. Note, however, that $D'_3$ is a deletion of $C$ in $D'_2$.

Example 2.3 illustrates that the restriction $A - \{a_1, \ldots, a_n\} \subseteq A' \subseteq A$ means that we can delete at most the elements of the domain that we remove in some way from the interpretation of the symbol considered in the deletion.

Insertions and deletions on databases are well-known primitive operations (Cf. Kroenke and Auer (2007)). Nevertheless, to the best of our knowledge, they have been conceived as undefined notions. Here we have proposed, however, a logical conception of databases and we have explicitly defined operations of insertion and deletion sufficient to analyze the importance of semantic information to artificial intelligence. In Araújo (2014), a stricter notion of structural operation is given.

In this section, we propose a dynamic perspective about coherency and relevancy. This approach will permit us to evaluate how many structural operations a proposition requires to become true. We will use these concepts to define semantic informativity in the next section.
Definition 3.1 An update $\bar{D}$ of an $S$-database $D$ is a finite or infinite sequence $\bar{D} = (D_i : 0 < i \leq \omega)$ where $D_1 = D$ and each $D_{i+1}$ is an insertion or deletion in $D_i$. An update $\bar{D}$ of $D$ is coherent with a proposition $\phi$ if $\bar{D} = (D_1, D_2, \ldots, D_n)$ and $A_n \vDash \phi$; otherwise, $\bar{D}$ is said to be incoherent with $\phi$.

Example 3.1
Let $D$ be the database of example 2.1 and $D_1$ be the database of example 2.2. The sequence $\bar{D} = (D, D_1)$ is an update of $D$ coherent with $Eb$ and $Hlb$. Let $D$ be the database of example 2.1 and $D'_1$, $D'_2$ and $D'_3$ be the databases of example 2.3. The sequence $\bar{D}' = (D, D'_1, D'_2, D'_3)$ is an update of $D$ coherent with $Es \wedge \neg Hsa$ but not with $s \neq a$, because the last proposition is false in $A'_3 = (\{\bar{l}, \bar{a}\}, \bar{a}^{s}, \bar{l}^{l}, \bar{a}^{a}, \{\bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{l}, \bar{a})\}^{H})$.

In other words, an update coherent with a proposition $\phi$ is a sequence of changes in a given database that produces a structure in which $\phi$ is true. In this way, we can measure the amount of coherency of propositions.

Definition 3.2
Let $\bar{D} = (D_1, D_2, \ldots, D_n)$ be an update of the database $D$. If $\bar{D}$ is coherent with $\phi$, we define the coherency of $\phi$ with $\bar{D}$ by

$$H_{\bar{D}}(\phi) = \frac{\min\{m \leq n : A_m \vDash \phi\}}{\sum_{i=1}^{m} i},$$

where $m$ in the denominator is that minimum, but if $\bar{D}$ is incoherent with $\phi$, then

$$H_{\bar{D}}(\phi) = 0.$$

A proposition $\phi$ is said to be coherent with the database $D$ if $H_{\bar{D}}(\phi) > 0$ for some update $\bar{D}$; otherwise, $\phi$ is incoherent with $D$.

Remark 3.1
In the definition of coherency, the denominator $\sum_{i=1}^{m} i$ is used in order to normalize the definition (the coherency is a non-negative real number smaller than or equal to 1).

Example 3.2
The coherency of $Eb$ and $Hlb$ with the update $\bar{D}$ of example 3.1 is the same, $2/3$, i.e., $H_{\bar{D}}(Eb) = H_{\bar{D}}(Hlb) \approx 0.66$ and so $H_{\bar{D}}(Eb \wedge Hlb) = H_{\bar{D}}(Eb \vee Hlb) \approx 0.66$. On the other hand, with respect to the coherency of $Es$, $\neg Hsa$ and $s \neq a$ with the update $\bar{D}'$ of example 3.1, we have $H_{\bar{D}'}(Es) \approx 0.66$, $H_{\bar{D}'}(\neg Hsa) = 0.5$, $H_{\bar{D}'}(s \neq a) = 0$ and so $H_{\bar{D}'}(Es \wedge \neg Hsa) = 0.5$ but $H_{\bar{D}'}(Es \wedge s \neq a) = 0$.

Example 3.2 shows that, given an update, we can have different propositions with different coherency, but we can have different propositions with the same coherency as well. The fact that $H_{\bar{D}}(Eb) = H_{\bar{D}}(Hlb) = H_{\bar{D}}(Eb \wedge Hlb) \approx 0.66$ shows that coherency is not a measure of the complexity of propositions. It seems natural to think that $Eb \wedge Hlb$ is in some sense more complex than $Eb$ and $Hlb$. Here we do not have this phenomenon. Moreover, the fact that $H_{\bar{D}}(Eb \wedge Hlb) = H_{\bar{D}}(Eb \vee Hlb) \approx 0.66$ makes clear that, once some propositions have a given coherency, many others will have the same coherency. Another interesting point is that $H_{\bar{D}'}(Es) > H_{\bar{D}'}(\neg Hsa)$ but $H_{\bar{D}'}(\neg Hsa) = H_{\bar{D}'}(Es \wedge \neg Hsa)$: first $Es$ was made coherent with $\bar{D}'$, and only later $\neg Hsa$. When $\neg Hsa$ is coherent with $\bar{D}'$ there is nothing more to be done, as far as the conjunction $Es \wedge \neg Hsa$ is concerned.

These remarks show that our approach is very different from the one given in D’Agostino and Floridi (2009). It is not an analysis of some concept of complexity associated with semantic information. In Araújo (2014), we do an analysis of informational complexity similar to the one presented here about coherency, but these two concepts are different. In further works, we will examine the relation between them. For now, we are interested in artificial intelligence. With respect to that, we can obtain an important result in the direction of a solution to the scandal of deduction.
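The coherency of Definition 3.2 depends only on the position at which a proposition first becomes true in a finite update and on whether it is true in the last structure. A minimal Python sketch of that computation follows; the function name `coherency` and the encoding of an update as a list of truth values (one per database, in order) are assumptions made for illustration.

```python
from fractions import Fraction


def coherency(truth_values):
    """Coherency of a proposition with a finite update.

    truth_values[k] is True when the proposition is true in the
    (k + 1)-th database of the update (the first one is D itself).
    """
    if not truth_values or not truth_values[-1]:
        # Incoherent update: the proposition is false in the last structure.
        return Fraction(0)
    m = truth_values.index(True) + 1        # least m with A_m |= phi (1-based)
    return Fraction(m, m * (m + 1) // 2)    # m / (1 + 2 + ... + m)


first_step = coherency([True])                   # already true in D
second_step = coherency([False, True])           # true after one structural operation
third_step = coherency([False, False, True])     # true after two structural operations
lost_again = coherency([True, False])            # true in D but false at the end
```

Exact fractions are used so that the values $1$, $2/3$ and $1/2$ appear without rounding; the quotient always simplifies to $2/(m+1)$, which makes the normalization of Remark 3.1 visible.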
Proposition 3.1
For every database $D = (A, T)$ and update $\bar{D}$ coherent with $\phi$, $H_{\bar{D}}(\phi) = 1$ for every $\phi$ such that $A \vDash \phi$. In particular, for $\phi$ a tautology in the language of $D$, $H_{\bar{D}}(\phi) = 1$, but if $\phi$ is not in the language of $D$, $0 < H_{\bar{D}}(\phi) < 1$. In contrast, for every contradiction $\psi$ in any language, $H_{\bar{D}}(\psi) = 0$ for every update $\bar{D}$ of $D$.
Since our focus in this paper is conceptual, we will not provide proofs here (Cf. Araújo (2014)). For now, we only observe that if a tautology has symbols different from the ones in the language of the database, it will be necessary to make some changes in order to make that tautology become true. In contrast, a proposition is incoherent with a database when there is no way to change the database so that the proposition becomes true and, for this reason, contradictions are never coherent.

We turn now to the concept of relevancy. For that, let us introduce a notation. Consider $(\phi_1, \phi_2, \ldots, \phi_n)$ a valid deduction of formulas over the signature $S$ whose premisses are the set $\Gamma = \{\phi_1, \phi_2, \ldots, \phi_m\}$ and whose conclusion is $\phi = \phi_n$. We represent this deduction by $\Gamma\{\phi\}$.

Definition 3.3
Let $\bar{D} = (D_1, \ldots, D_n)$ be an update of the $S$-database $D = (A, T)$ coherent with $\phi$. The relevant premises of the deduction $\Gamma\{\phi\}$ with respect to $\bar{D}$ are the premises that are true in $D_n$ but are not logical consequences of $T$, i.e., the propositions in the set $\bar{D}(\Gamma)$ of all $\psi \in \Gamma$ for which $D_n \vDash \psi$ but $T \nvDash \psi$.

Example 3.3
Let $\bar{D}'' = (D)$. Then, $\bar{D}''(\{Ea\}\{\exists x Ex\}) = \{Ea\}$. Now let us consider a more complex example. Let $\bar{D} = (D, D_1)$ be the update of example 3.1. In this case, $\bar{D}(\{\forall x (Cx \to \neg Ex), Cb\}\{\neg Eb\})$ is not defined because $\neg Eb$ is false in $A_1 = (\{\bar{s}, \bar{l}, \bar{a}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \bar{a}^{b}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$. Nonetheless, consider the new update $\bar{D}''' = (D, D_1, D_2, D_3, D_4)$ such that $D_2$ is the insertion in example 2.2, $A_3 = (\{\bar{s}, \bar{l}, \bar{a}, \bar{b}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \bar{b}^{b}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}, \bar{b}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$ and $A_4 = (\{\bar{s}, \bar{l}, \bar{a}, \bar{b}\}, \bar{s}^{s}, \bar{l}^{l}, \bar{a}^{a}, \bar{b}^{b}, \{\bar{s}, \bar{l}\}^{C}, \{\bar{a}\}^{E}, \{(\bar{s}, \bar{a}), (\bar{l}, \bar{a})\}^{H})$. Then, $\bar{D}'''(\{\forall x (Cx \to \neg Ex), Cb\}\{\neg Eb\}) = \{\forall x (Cx \to \neg Ex)\}$.

In our definition of relevant premises, we have adopted a semantic perspective oriented to the conclusion of deductions: the relevancy of the premises of a deduction is determined according to an update in which its conclusion is true. Example 3.3 illustrates that point, because it is only possible to evaluate the relevancy of $\{\forall x (Cx \to \neg Ex), Cb\}\{\neg Eb\}$ in an update like $\bar{D}'''$ in which the conclusion $\neg Eb$ is true. Another point to be noted is that we have chosen a strong requirement about what kind of premises could be relevant: the relevant premises are just those that are not logical consequences of our beliefs.

Definition 3.4
Let $D$ be an $S$-database. If $\bar{D}$ is an update of $D$ coherent with $\phi$, the relevancy $R_{\bar{D}}(\Gamma)$ of the deduction $\Gamma\{\phi\}$ in $\bar{D}$ is the cardinality of $\bar{D}(\Gamma)$ divided by the cardinality of $\Gamma$, i.e.,

$$R_{\bar{D}}(\Gamma) = \frac{|\bar{D}(\Gamma)|}{|\Gamma|},$$

but, if $\bar{D}$ is incoherent with $\phi$, then $R_{\bar{D}}(\Gamma) = 0$.

Example 3.4
We have shown in example 3.3 that $R_{\bar{D}''}(\{Ea\}\{\exists x Ex\}) = 1$ and $R_{\bar{D}'''}(\{\forall x (Cx \to \neg Ex), Cb\}\{\neg Eb\}) = 0.5$.

In the example above, only one of the two premises of the second deduction is relevant, so $R_{\bar{D}'''}(\{\forall x (Cx \to \neg Ex), Cb\}\{\neg Eb\}) = 0.5$, whereas the single premise of the first deduction is relevant, so $R_{\bar{D}''}(\{Ea\}\{\exists x Ex\}) = 1$.

Proposition 3.2
For every database $D = (A, T)$, update $\bar{D}$ of $D$ and deduction $\Gamma\{\phi\}$, if $T$ is a complete theory of $A$ or $\Gamma = \emptyset$, then $R_{\bar{D}}(\Gamma) = 0$. In particular, tautologies and contradictions have null relevancy.

Therefore, deductions can be relevant only when we do not have a complete theory of the structure of the database. Moreover, as deductions, isolated logical facts (tautologies and contradictions) have no relevance. This means that we have at hand a deductive notion of relevancy.
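The relevancy of Definition 3.4 can be sketched in Python as a simple ratio. Since first-order consequence is not decidable in general, this sketch does not attempt to compute $T \vDash \psi$: truth in the last structure and consequence from $T$ are passed in as callables. The names `relevancy`, `holds_in_last` and `follows_from_theory` are hypothetical, and premises are represented as plain strings.

```python
from fractions import Fraction


def relevancy(premises, holds_in_last, follows_from_theory, coherent=True):
    """Share of premises true in the last structure but not consequences of T.

    holds_in_last(p) and follows_from_theory(p) are oracles supplied by the
    caller; `coherent` says whether the update is coherent with the conclusion.
    """
    if not coherent or not premises:
        return Fraction(0)  # incoherent update or empty set of premises
    relevant = [p for p in premises
                if holds_in_last(p) and not follows_from_theory(p)]
    return Fraction(len(relevant), len(premises))


# The second deduction of example 3.3: only the universal premise is relevant.
premises = ["forall x (Cx -> not Ex)", "Cb"]
truth = {"forall x (Cx -> not Ex)": True, "Cb": False}

half = relevancy(premises, truth.get, lambda p: False)
# With a complete theory, every true premise already follows from T.
complete = relevancy(premises, truth.get, truth.get)
empty = relevancy([], truth.get, lambda p: False)
```

The second call illustrates Proposition 3.2: when the theory is complete, truth in the structure and consequence from the theory coincide, so no premise can be relevant.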
Having at hand the dynamic concepts of coherence and relevance, now it seemsreasonable to say that the more coherent the conclusion of a valid deduction is themore informative it is, but the more relevant its premises are the more informationthey provide. We use this intuition to define the semantic informativity of validdeductions.
Definition 4.1
The semantic informativity $I_{\bar{D}}(\Gamma\{\phi\})$ of a valid deduction $\Gamma\{\phi\}$ in the update $\bar{D}$ of the database $D$ is defined by

$$I_{\bar{D}}(\Gamma\{\phi\}) = R_{\bar{D}}(\Gamma) \cdot H_{\bar{D}}(\phi).$$
The idea behind the definition of semantic informativity of a valid deduction $\Gamma\{\phi\}$ is that $I_{\bar{D}}(\Gamma\{\phi\})$ is directly proportional to the relevancy of its premises $\Gamma$ and to the coherency of its conclusion $\phi$. Given $\Gamma\{\phi\}$ and an update $\bar{D}$ of $D$, if we have $R_{\bar{D}}(\Gamma) = 0$ or $H_{\bar{D}}(\phi) = 0$, then the semantic informativity of $\Gamma\{\phi\}$ is zero, no matter what $\Gamma\{\phi\}$ is. Now, if $H_{\bar{D}}(\phi) = 0$, then, by definition, $R_{\bar{D}}(\Gamma) = 0$. Thus, if the computational system whose database is $D$ intends to evaluate $I_{\bar{D}}(\Gamma\{\phi\})$ for some update $\bar{D}$, it should look for a $\bar{D}$ coherent with $\phi$, i.e., a $\bar{D}$ for which $H_{\bar{D}}(\phi) > 0$. In other words, our analysis of semantic informativity is oriented to the conclusion of valid deductions, as was our analysis of relevancy.
Example 4.1
Consider the updates $\bar{D}''$ and $\bar{D}'''$ of example 3.3. Then, $I_{\bar{D}''}(\{Ea\}\{\exists x Ex\}) = 1 \cdot 1 = 1$ and $I_{\bar{D}'''}(\{\forall x (Cx \to \neg Ex), Cb\}\{\neg Eb\}) = 0.5 \cdot 5/15 \approx 0.17$.

In the definition of $I_{\bar{D}}(\Gamma\{\phi\})$ the relevancy of the premisses, $R_{\bar{D}}(\Gamma)$, is a factor of the coherency of the conclusion, $H_{\bar{D}}(\phi)$. For that reason, if a computational system intends to evaluate the semantic informativity of a proposition $\phi$, it should measure $H_{\bar{D}}(\phi)$ and, then, multiply it by its relevancy $R_{\bar{D}}(\{\phi\})$. Hence, the semantic informativity of a proposition $\phi$ can be conceived as a special case of the informativity of the valid deduction $\{\phi\}\{\phi\}$.

Definition 4.2
The semantic informativity $I_{\bar{D}}(\phi)$ of a proposition $\phi$ in the update $\bar{D}$ of the database $D$ is defined by

$$I_{\bar{D}}(\phi) = I_{\bar{D}}(\{\phi\}\{\phi\}).$$

Example 4.2
Considering the update $\bar{D}''$ of example 3.3, we have that $I_{\bar{D}''}(Ea) = I_{\bar{D}''}(\exists x Ex) = 1$ but $I_{\bar{D}''}(Ea \to \exists x Ex) = 0$. If we consider the update $\bar{D}'''$ of example 3.3, we also have that $I_{\bar{D}'''}((\forall x (Cx \to \neg Ex) \wedge Cb) \to \neg Eb) = 0$, but $I_{\bar{D}'''}(\forall x (Cx \to \neg Ex)) = 1$, $I_{\bar{D}'''}(Cb) = 0$ and $I_{\bar{D}'''}(\neg Eb) \approx 0.33$.

Example 4.2 shows that semantic informativity measures how many structural operations we do in order to obtain the semantic information of a proposition. It is for that reason that $I_{\bar{D}'''}(Cb) = 0$: false well-defined data is not semantically informative; it should be true. In other words, it is a measure of semantic information in Floridi’s sense (Cf. Floridi (2011)). From this, we can solve Hintikka’s scandal of deduction.
Proposition 4.1
For every valid deduction $\psi_1, \ldots, \psi_n \vDash \phi$ in the language of $D$, $I_{\bar{D}}((\psi_1 \wedge \cdots \wedge \psi_n) \to \phi) = 0$ for every update $\bar{D}$. Nonetheless, if $\psi_1, \ldots, \psi_n \vDash \phi$ is not in the language of $D$, $I_{\bar{D}}((\psi_1 \wedge \cdots \wedge \psi_n) \to \phi) > 0$ for $\bar{D}$ coherent with $(\psi_1 \wedge \cdots \wedge \psi_n) \to \phi$.

This proposition is a solution to the scandal of deduction in two different senses. First, it shows that we can have an informative valid deduction $\{\psi_1, \ldots, \psi_n\}\{\phi\}$ whose associated conditional $\psi_1 \wedge \cdots \wedge \psi_n \to \phi$ is uninformative, for example, the one given in example 4.1. Second, it shows that it is not completely true that tautologies are always uninformative. When we need to interpret new symbols, we have some semantic information, notably, the one sufficient to perceive that we have a true proposition; this is a natural consequence of our approach.

From this standpoint, we are going to make a simple, but important, remark to establish a relationship between semantic informativity and artificial intelligence. Given an update $\bar{D} = (D_1, \ldots, D_n)$ of $D = (A, T)$ and a deduction $\{\phi\}\{\phi\}$, either $R_{\bar{D}}(\{\phi\}) = 0$ or $R_{\bar{D}}(\{\phi\}) = 1$. If $R_{\bar{D}}(\{\phi\}) = 0$, then either $T \vDash \phi$ or $D_n \nvDash \phi$. If $T \vDash \phi$, then there is an update $\bar{D}'$ of $D$ such that $H_{\bar{D}'}(\phi) = 1$, notably, $\bar{D}' = (D)$. If $D_n \nvDash \phi$, then $H_{\bar{D}}(\phi) = 0$. Finally, if $R_{\bar{D}}(\{\phi\}) = 1$, then $T \nvDash \phi$ as well as $D_n \vDash \phi$ and so $I_{\bar{D}}(\phi) = H_{\bar{D}}(\phi) > 0$. Therefore, we conclude that the relevancy of a proposition does not determine its coherency. On the other hand, if $H_{\bar{D}}(\phi) = 0$, then $R_{\bar{D}}(\{\phi\}) = 0$, but if $H_{\bar{D}}(\phi) > 0$, this does not by itself settle whether $R_{\bar{D}}(\{\phi\}) = 1$ or $R_{\bar{D}}(\{\phi\}) = 0$, because this depends on whether $T \vDash \phi$. Hence, we also conclude that the coherency of a proposition does not determine its relevancy either. Combining these two conclusions we obtain a general conclusion: the semantic informativity of a proposition cannot be determined by its coherency or relevancy alone. This reinforces our definition of semantic informativity. It is not an arbitrary definition; in fact, semantic informativity is a relationship between both coherency and relevancy. What is the moral for artificial intelligence?

In the studies of pragmatics (an area of research in linguistics), Wilson and Sperber formulated two principles about relevant information in human linguistic practice:

“Relevance may be assessed in terms of cognitive effects and processing effort: (a) other things being equal, the greater the positive cognitive effects achieved by processing an input, the greater the relevance of the input to the individual at that time; (b) other things being equal, the greater the processing effort expended, the lower the relevance of the input to the individual at that time.” (Wilson and Sperber (2004), p. 608)

Our general conclusion that the semantic informativity of propositions cannot be determined by their coherency or relevancy alone shows that Wilson and Sperber’s two principles (a) and (b) are in fact parts of one general principle associated with semantic information. Let us put that in precise terms.
Definition 4.3
The changes that a proposition $\phi$ requires are the structural operations, insertions and deletions, that generate an update $\bar{D}$ of a given database $D = (A, T)$ coherent with $\phi$. A proposition $\phi$ is new if $\phi$ is not false in $A$ and is not a consequence of the theory $T$ of the database $D = (A, T)$.

Proposition 4.2
The fewer changes a new proposition requires, the more informative it is.
Proposition 4.2 is a direct consequence of definition 4.2. As we have shown that our definition 4.2 is not, in turn, arbitrary, this means that proposition 4.2 is not arbitrary either. Proposition 4.2 is a reformulation of Wilson and Sperber’s principle (b) above, but it is important to note the differences between them. Wilson and Sperber’s principle (b) is an empirical matter under discussion among linguists (Cf. Wilson and Sperber (2004)). Proposition 4.2, by contrast, follows from our definition of semantic informativity. Using the same strategy, we can also obtain a formal version of Wilson and Sperber’s principle (a).
Definition 4.4
If a valid deduction $\Gamma\{\phi\}$ has non-null relevancy in a given update $\bar{D}$ and its conclusion $\phi$ is new, then the results that it produces are its relevant premisses and its conclusion, i.e., $\bar{D}(\Gamma) \cup \{\phi\}$, but if $\phi$ is not new, then the results that it produces are just its relevant premisses $\bar{D}(\Gamma)$.

Proposition 4.3
The more results a valid deduction produces, the more informativeit is.
We can, then, combine these two propositions in a schematic one.
Proposition 4.4 (Principle of semantic informativity)
To increase semantic informativity, an intelligent agent, with respect to its database, should perform few changes and produce big results.
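Read as a selection heuristic, the principle suggests that an agent comparing several candidate deductions could prefer the one with the highest product of relevancy and coherency. The following Python sketch illustrates that reading only; the `Candidate` record, its field names and the `best` function are hypothetical, and each candidate is summarized by numbers that the agent is assumed to have computed beforehand.

```python
from fractions import Fraction
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    first_true_at: Optional[int]  # least m with A_m |= conclusion; None if incoherent
    relevant_premises: int        # premises true at the end and not consequences of T
    premises: int                 # total number of premises


def informativity(c):
    """Relevancy times coherency for one candidate deduction."""
    if c.first_true_at is None or c.premises == 0:
        return Fraction(0)
    m = c.first_true_at
    coherency = Fraction(m, m * (m + 1) // 2)   # m / (1 + ... + m)
    relevancy = Fraction(c.relevant_premises, c.premises)
    return relevancy * coherency


def best(candidates):
    """Candidate with the highest informativity (few changes, big results)."""
    return max(candidates, key=informativity)


candidates = [
    Candidate("{Ea}{exists x Ex}", 1, 1, 1),
    Candidate("{forall x (Cx -> not Ex), Cb}{not Eb}", 5, 1, 2),
    Candidate("incoherent deduction", None, 0, 1),
]
choice = best(candidates)
```

The first two candidates use the figures of example 4.1; the third is an invented placeholder that shows how an incoherent update drops to zero.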
In recent works (Cf. Valiant (2008)), Valiant has argued that one of the most important challenges in artificial intelligence is that of understanding how computational systems that acquire and manipulate commonsense knowledge can be created. With respect to that point, he explains that one of the lessons from his PAC theory is this:

“We note that an actual system will attempt to learn many concepts simultaneously. It will succeed for those for which it has enough data, and that are simple enough when expressed in terms of the previously reliably learned concepts that they lie in the learnable class.” (Valiant (2008), p. 6)
We can read Valiant’s perspective in terms of the principle of semantic informativity. The simple propositions are the more coherent propositions, the ones that require few changes in the database. To have enough data is to have propositions sufficient to deduce other propositions, and this means that deductions with more results are preferable. Of course, Valiant’s remark relies on PAC, a theory about learnability, not on deductivity. Further work is necessary to make clear the relationship between these two concepts. Since we have designed a concept of semantic informativity implementable in real systems, it seems, however, that the possibility of realizing that is open.
We have proposed to measure the degree of semantic informativity of deductions by means of dynamic concepts of relevancy and coherency. In a schematic form, we can express our approach in the following way:

Semantic informativity = Relevancy × Coherency.

In accordance with this conception, we showed how the scandal of deduction can be solved. Our solution is that valid deductions are not always equivalent to propositions without information. It is important to note, however, that this problem is not solved in its totality, because here we have analyzed semantic information only from the point of view of relevancy and coherency. Another crucial concept associated with semantic information is the notion of complexity. In Araújo (2014), we treat this subject, but a complete analysis of the relation among semantic information, relevancy, coherency and complexity is necessary.

In this respect, we have derived a principle of semantic informativity that, when applied to computational intelligent systems, means that an intelligent agent should make few changes in its database and obtain big results. This seems an obvious observation, but it is not. The expressions “few changes” and “big results” here have a technical sense which opens the possibility of relating semantic information and artificial intelligence in a precise way. Indeed, there are many possible developments to be explored; we would like to indicate three.

The first one is to investigate the connections between semantic informativity and machine learning, especially with respect to Valiant’s semantic theory of learning (PAC). As in Valiant’s PAC, we could establish probability distributions on the possible updates and delineate goals for them; the principle of semantic informativity could play an important role at this point. Moreover, we can also introduce computational complexity constraints on the semantic informativity of agents. Thus, it will be possible to analyze how many efficient updates (time and space requirements bounded by a function of the proposition size) are necessary for a given proposition to be coherent with the database. In this way, we will be able, for example, to compare the learnability of the concepts which occur in propositions, in Valiant’s sense (Cf. Valiant (1984)), with respect to their semantic information.

The second possible line of research is to develop a complete dynamic theory of semantic informativity by incorporating belief revision in the line of AGM theory (Cf. Alchourrón et al (1985)). In the present paper, the beliefs of the database have been kept fixed, but a more realistic approach should incorporate revision of beliefs. For example, if we consider distributed systems, the agents will probably have some different beliefs. In this case, it will be necessary to analyze the changes of semantic information, conflicting data and so on.

The last, but no less important, point to be explored is to analyze the relationship between our dynamic perspective about semantic information and other static approaches, mainly with respect to Floridi’s theory of strong semantic information (Floridi (2004)). It is important to observe that we have proposed a kind of Hegelian conception of semantic information, according to which semantic informativity is analyzed in semantic terms, whereas, for example, Floridi’s conception is Kantian, in the sense that it analyzes the relationship between propositions and the world in order to understand the transcendental conditions of semantic information (Cf. Floridi (2011)). Our approach seems to be a Hegelian turn in the philosophy of information similar to what Brandom did with respect to the philosophy of language (Cf. Brandom (1989)).
Acknowledgements
I would like to thank Viviane Beraldo for her encouragement, Luciano Floridi for his comments on my talk given at PT-AI2013, Pedro Carrasqueira and William Steinle for their comments, and an anonymous referee for his (her) criticism of a previous version of this paper. This work was supported by São Paulo Research Foundation (FAPESP) [2011/07781-2].