arXiv [econ.TH]

Measuring knowledge for recognition and knowledge entropy
Fujun Hou*
School of Management and Economics, Beijing Institute of Technology, Beijing, China, 100081

November 16, 2018
Abstract
People employ their knowledge to recognize things. This paper is concerned with how to measure people's knowledge for recognition and how it changes. The discussion is based on three assumptions. First, we construct two evolution-process equations, one for uncertainty and knowledge, and the other for uncertainty and ignorance. Second, by solving the equations, formulas for measuring the level of knowledge and the level of ignorance are obtained in two particular cases. Third, a new concept, knowledge entropy, is introduced; its similarity to Boltzmann's entropy and its difference from Shannon's entropy are examined. Finally, it is pointed out that the obtained formulas for knowledge and knowledge entropy reflect two fundamental principles: (1) the knowledge level of a group is not necessarily a simple sum of the individuals' knowledge levels; and (2) an individual's knowledge entropy never increases if the individual's thirst for knowledge never decreases.
Keywords: knowledge, ignorance, recognition, uncertainty, entropy
Introduction

It is widely agreed that people employ knowledge to recognize things. The term 'knowledge' can be defined in many ways. In this paper, knowledge refers to the understanding of someone or something, and it is related to the human capacity for acknowledgement [1]. When people have full knowledge about the objects of their concern, they can differentiate the objects well from each other. If one has no knowledge at all, he/she will not make any distinction among the objects. In other cases (with only partial knowledge), one may show some uncertainty, meaning that he/she cannot distinguish some objects but considers them similar. Although knowledge is an abstract concept, people have always depended on their knowledge to make distinctions among the objects under consideration. A question thus arises: how can people's knowledge for recognition be measured mathematically, and how does it change? This paper aims to answer this question.

Our discussion is based on some assumptions, whose rationality is illustrated by some famous quotes; this is presented in Section 2. We investigate the influence of knowledge on uncertainty and the influence of ignorance on uncertainty in Section 3. The model consists of two first-order ordinary differential equations and their general solution formulas. Particular boundary conditions for two particular cases, and the corresponding formulas for measuring knowledge and ignorance, are discussed in Section 4. In Section 5, we introduce the expression of knowledge entropy by comparing the obtained ignorance formula with Boltzmann's entropy formula. The properties of knowledge entropy are examined and two principles are deduced in Section 6. In Section 7, we point out a main difference between the knowledge entropy and the Shannon entropy. Section 8 concludes the paper.

* Tel.: +86 10 6891 8960; Fax: +86 10 6891 2483; email: houfj@bit.edu.cn

Notations and assumptions
Knowledge is a great help for recognition, as a motto puts it: "Knowledge enables you to add a pair of eyes". Two other terms are related to knowledge: one is ignorance and the other is uncertainty. Ignorance means a lack of knowledge. Good knowledge results in good recognition; on the contrary, absence of knowledge means ignorance and often leads to uncertainty in recognition. In our opinion, knowledge, ignorance and uncertainty are all variables that can be measured mathematically.

We will use the following notations:

⋄ K: the level of knowledge for recognition.
⋄ I: the level of ignorance.
⋄ U: the level of uncertainty as a result of the absence of knowledge.
⋄ |A|: the cardinal number of set A.
⋄ X: a set of objects, X = {x_1, x_2, ..., x_n}, 1 < n < +∞.
⋄ ∼: an equivalence relation on X.
⋄ ⪯: a weak order relation on X.

Since we assume that K, I and U are measurable variables, these three variables are bounded. Without loss of generality, we assume 0 ≤ K ≤ 1 and 0 ≤ I ≤ 1. We also write U_min ≤ U ≤ U_max.

Moreover, to conduct our discussion, the following assumptions are needed:

Assumption I:
People are enthusiastic about acquiring knowledge to get to know the world around them, and the process of acquiring knowledge is a continuous process.
Assumption II:
The more knowledge people possess, the less uncertainty their recognitions will have.
Assumption III:
The more uncertainty people's recognition has, the more likely people are to seek knowledge so as to make a better recognition.

The above assumptions are rational for ordinary people because they are compatible with some famous quotes:

(1) "All men by nature desire knowledge" (Aristotle (384 BC - 322 BC), Metaphysics), and "The desire of knowledge, like the thirst of riches, increases ever with the acquisition of it" (Laurence Sterne, 1713 - 1768).
(2) "Knowledge is power" (Sir Francis Bacon (1561 - 1626)).
(3) "To be conscious that you are ignorant is a great step to knowledge" (Benjamin Disraeli (1804 - 1881)), "Perplexity is the beginning of knowledge" (Kahlil Gibran (1883 - 1931)), and "The beginning of knowledge is the discovery of something we do not understand" (Frank Herbert (1920 - 1986)).
Models
According to Assumptions I and II, the influence of knowledge on uncertainty can be described by

dU/dK < 0.

Similarly, the influence of ignorance on uncertainty can be described by

dU/dI > 0.

According to Assumption III, the relation of knowledge and uncertainty, as well as the relation of ignorance and uncertainty, can (but need not) be further written as

dU/dK = aU, (1)

and

dU/dI = bU, (2)

respectively, where a and b are two non-zero constants.

Formulas (1) and (2) are two first-order ordinary differential equations. Their general solutions are

ln U = aK + c_1, (3)

and

ln U = bI + c_2, (4)

respectively, where c_1 and c_2 are two constants.

In particular cases, if the boundary conditions are known, then the particular solutions can be determined as well. We will discuss this in the next section. We remark that Eqs. (3) and (4) represent continuous processes. In the following sections, however, we will consider some particular cases where U takes integer values.

Formulas for measuring knowledge and ignorance

Equivalence relations and weak order relations are two basic relations closely related to human cognition. Living in society, people have to make out things and make optimal choices. When making out things, one may say "This object is equal to that one" or "These two objects are different", and so on. Such statements indicate that people make distinctions under an equivalence relation, even if they may not realize it. When making optimal choices, people also use their knowledge to express preferences, such as "These alternatives are indifferent to me" or "I prefer this alternative to that one", and so on. Even though we cannot say that one person's preference is good while another's is bad, people do use their knowledge to discern objects in their preferences.
Frequently, people use equivalence relations to differentiate things, and weak order relations to express preferences. In this section, we consider how to measure the knowledge for recognition (particularly, for classification) with respect to an equivalence relation and a weak order relation, both constructed on sets with finitely many elements.

When making distinctions under equivalence relations
An equivalence relation on an object set corresponds to a partition of the object set, and vice versa. The partition is composed of disjoint equivalence classes. Two objects are equivalent to each other if and only if they belong to the same equivalence class [2].

Consider an equivalence relation ∼ defined on the object set X = {x_1, x_2, ..., x_n}. x_i ∼ x_j means that x_i and x_j are equivalent to each other. Let [x_i] be the equivalence class corresponding to x_i, where [x_i] = {x_j | x_j ∈ X, x_j ∼ x_i}. Clearly, if we have full knowledge about the objects, then we can differentiate any one from the others; in this case, each equivalence class includes only one object. On the contrary, if we have no knowledge at all, then we can differentiate none of them from each other; in this case, all the objects are included in one equivalence class. Therefore, the cardinal numbers of the equivalence classes reflect a relationship between the knowledge for recognition and the uncertainty. Mathematically, if we define

U = W = Σ_{i=1}^{n} |[x_i]|, (5)

then we have W_min = n and W_max = n², where the minimum value and the maximum value correspond to the 'full knowledge (hence no uncertainty)' case and the 'no knowledge (hence full uncertainty)' case, respectively. In this case the boundary conditions of formula (3) can be

K = 1, if U = W = n;  K = 0, if U = W = n². (6)

Thus formula (3) becomes (−ln n) K + ln n² = ln W, namely,

K = (1/ln n) ln(n²/W). (7)

Similarly, the boundary conditions of formula (4) can be

I = 0, if U = W = n;  I = 1, if U = W = n². (8)

Under these conditions, formula (4) becomes (ln n) I + ln n = ln W, namely,

I = (1/ln n) ln(W/n). (9)

By combining Eqs. (7) and (9) we have

K + I = 1. (10)

Because n ≤ W ≤ n², we have K ∈ [0, 1] and I ∈ [0, 1], and a larger K corresponds to a smaller I, which indicates "the more knowledge, the less ignorance".

For illustrative purposes, we consider an example.

Example 1
Suppose that there are 5 eggs, X = {egg1, egg2, egg3, egg4, egg5}, and that two persons, {John, Jack}, are asked to use their individual scales to check the weights of the eggs. They report their weighing results as shown in Table 1.

Table 1: Weighing results (in grams)

egg     John    Jack
egg1    60      60.2
egg2    63      62.8
egg3    63      63.1
egg4    61      61.1
egg5    61      61.1
We use formula (7) to measure the two persons' knowledge levels in their recognitions of the eggs' weights.

The equivalence classes deduced from John's weighing results are:

[egg1] = {egg1}, [egg2] = {egg2, egg3}, [egg3] = {egg2, egg3}, [egg4] = {egg4, egg5}, and [egg5] = {egg4, egg5}.

Thus W = 1 + 2 + 2 + 2 + 2 = 9, and by formula (7) John's knowledge level is measured as

K(John) = (1/ln 5) ln(25/9) ≈ 0.635.

The equivalence classes deduced from Jack's weighing results are:

[egg1] = {egg1}, [egg2] = {egg2}, [egg3] = {egg3}, [egg4] = {egg4, egg5}, and [egg5] = {egg4, egg5}.

Thus W = 1 + 1 + 1 + 2 + 2 = 7, and by formula (7) Jack's knowledge level is measured as

K(Jack) = (1/ln 5) ln(25/7) ≈ 0.791.

One can see that Jack has a higher knowledge level than John. This result is consistent with our intuitive perception of the data shown in Table 1.
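As an illustrative aside, not part of the original derivation, the computation in Example 1 can be reproduced in a few lines of Python; the function name and the encoding of weighing results as lists of measured values are our own assumptions:

```python
import math
from collections import Counter

def knowledge_level(measurements):
    """Formula (7): K = (1/ln n) * ln(n^2 / W), where W is the sum of
    the equivalence-class sizes over all objects (formula (5)) and two
    objects are equivalent when their measured values coincide."""
    n = len(measurements)
    class_size = Counter(measurements)
    # each object contributes the size of its own equivalence class
    W = sum(class_size[v] for v in measurements)
    return math.log(n * n / W) / math.log(n)

john = [60, 63, 63, 61, 61]             # classes of sizes 1, 2, 2 -> W = 9
jack = [60.2, 62.8, 63.1, 61.1, 61.1]   # classes of sizes 1, 1, 1, 2 -> W = 7

print(round(knowledge_level(john), 3))  # 0.635
print(round(knowledge_level(jack), 3))  # 0.791
```

The sketch recovers the values above: a finer partition (smaller W) yields a higher knowledge level.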
When expressing preferences under weak order relations

Consider a weak order relation ⪯ defined on the object set X = {x_1, x_2, ..., x_n}. On the one hand, if one thinks that x_i and x_j are indifferent, and thus a tie arises, we write x_i ∼ x_j. On the other hand, if one thinks that x_i is preferred to x_j, we write x_i ≻ x_j [3]. We assume that people use ties-permitted ordinal rankings to express their preferences.

A ties-permitted ordinal ranking of a set can be represented by a preference sequence whose entries are sets containing the possible ranking positions of objects [4, 5]. A preference sequence PS = (PS_i)_{n×1} is defined by

PS_i = {|ξ_i| + 1, |ξ_i| + 2, ..., |ξ_i| + |η_i|}, (11)

where ξ_i = {x_k | x_k ∈ X, x_k ≻ x_i} and η_i = {x_k | x_k ∈ X, x_i ∼ x_k}.

For instance, let X = {x_1, x_2, x_3, x_4} and a weak ordering on the set be x_1 ≻ x_2 ∼ x_3 ≻ x_4; then the corresponding preference sequence is

PS = (PS_i)_{4×1} = ({1}, {2, 3}, {2, 3}, {4})^T.

Evidently, the distinct entries of a preference sequence constitute a partition of the set {1, 2, ..., n} [4]. This is understandable because the elements in an entry are deduced from the relation x_i ∼ x_k, as shown by formula (11), which is in nature an equivalence relation. Therefore, the results obtained in the preceding subsection can be directly applied to the discussion in this subsection.

We define the cardinal number of a preference sequence PS = (PS_i)_{n×1} as

|PS| = Σ_{i=1}^{n} |PS_i|. (12)

Clearly, we have |PS|_min = n and |PS|_max = n², and the larger the cardinal number of a person's preference sequence is, the more distinct the objects are to that person. Therefore, the knowledge for recognition in a person's preference sequence can also be measured by formula (7), where we need only substitute |PS| for W. That is,

K_PS = (1/ln n) ln(n²/|PS|). (13)

Similar to formula (9), the ignorance in a person's preference sequence can be measured by

I_PS = (1/ln n) ln(|PS|/n). (14)

Because n ≤ |PS| ≤ n², we also have K_PS ∈ [0, 1], I_PS ∈ [0, 1] and K_PS + I_PS = 1.

Next we consider an example.
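As an illustrative aside (a sketch of ours, not the paper's; the tier-list encoding of a ties-permitted ranking is an assumption), formula (11) and the measure (13) can be implemented as follows:

```python
import math

def preference_sequence(tiers):
    """Formula (11): each object's entry is the set of ranking positions
    it can occupy. `tiers` lists the indifference classes from best to
    worst, e.g. [[1], [2, 3], [4]] encodes x1 > x2 ~ x3 > x4."""
    entries = {}
    ahead = 0  # |xi_i|: number of objects strictly preferred to this tier
    for tier in tiers:
        positions = set(range(ahead + 1, ahead + len(tier) + 1))
        for obj in tier:
            entries[obj] = positions
        ahead += len(tier)
    return [entries[obj] for obj in sorted(entries)]

def knowledge_level_ps(tiers):
    """Formula (13): K_PS = (1/ln n) * ln(n^2 / |PS|)."""
    ps = preference_sequence(tiers)
    n = len(ps)
    size = sum(len(entry) for entry in ps)  # |PS|, formula (12)
    return math.log(n * n / size) / math.log(n)

# the weak ordering x1 > x2 ~ x3 > x4 discussed above
print(preference_sequence([[1], [2, 3], [4]]))  # [{1}, {2, 3}, {2, 3}, {4}]
print(round(knowledge_level_ps([[1], [2, 3], [4]]), 3))  # 0.708
```

For that ordering, |PS| = 1 + 2 + 2 + 1 = 6, so K_PS = (1/ln 4) ln(16/6) ≈ 0.708.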
Example 2
Suppose that there is an alternative set {x_1, x_2, x_3, x_4, x_5} and a group of experts {Expert1, Expert2, Expert3}. Assume that the experts provide their weak orderings on the alternative set as follows:

Expert1: x_1 ≻ x_2 ∼ x_3 ≻ x_4 ∼ x_5.
Expert2: x_1 ≻ x_2 ∼ x_3 ∼ x_4 ≻ x_5.
Expert3: x_1 ∼ x_2 ≻ x_3 ≻ x_4 ≻ x_5.

We use formula (13) to measure the experts' knowledge levels for recognition in their preferences. We first write out their preference sequences according to formula (11):

Expert1: ({1}, {2, 3}, {2, 3}, {4, 5}, {4, 5})^T, so |PS| = 9;
Expert2: ({1}, {2, 3, 4}, {2, 3, 4}, {2, 3, 4}, {5})^T, so |PS| = 11;
Expert3: ({1, 2}, {1, 2}, {3}, {4}, {5})^T, so |PS| = 7.

By using formula (13), the experts' knowledge levels are measured as

Expert1: K_PS(Expert1) = (1/ln 5) ln(25/9) ≈ 0.635;
Expert2: K_PS(Expert2) = (1/ln 5) ln(25/11) ≈ 0.510;
Expert3: K_PS(Expert3) = (1/ln 5) ln(25/7) ≈ 0.791.

Knowledge entropy
In thermodynamics and statistical mechanics there is a well-known formula called Boltzmann's entropy formula,

S = k_B ln Ω. (15)

In this formula, S is the entropy, k_B is the Boltzmann constant, and Ω is the number of microstates consistent with the given equilibrium macrostate.

If we rewrite the formula of ignorance, namely formula (9), as

I = (1/ln n) ln(W/n) = (1/ln n) ln W − 1, that is, I + 1 = (1/ln n) ln W,

then we have an expression that is quite similar to Boltzmann's entropy formula. Thus we introduce a concept, knowledge entropy, denoted S_K, as follows:

S_K = (1/ln n) ln W, (16)

and we have

S_K = I + 1. (17)

Similarly, we can define another kind of knowledge entropy based on formula (14):

S_Kps = (1/ln n) ln |PS|, (18)

and we also have

S_Kps = I_PS + 1. (19)

From formulas (10) and (17), we have K = 1 − I = 2 − S_K. Taking into account that W_max = n², so that by formulas (7) and (16) K(W_max) = (1/ln n) ln(n²/n²) = 0 and S_K(W_max) = (1/ln n) ln n² = 2, the identity K = 2 − S_K can be rewritten as

K(W) − K(W_max) = S_K(W_max) − S_K(W). (20)

Formula (20) indicates that "acquiring knowledge means a decrease of knowledge entropy".

Further, from formula (18) we know that "the lower the knowledge entropy, the more distinctive the person's ranking order", since a lower entropy indicates a lower cardinal number of the preference sequence and hence a more distinctive ranking.

We illustrate the usage of the knowledge entropy formula with the data of Example 2. From K_PS + I_PS = 1 and S_Kps = I_PS + 1, we have S_Kps = 2 − K_PS. Having obtained the knowledge levels in Example 2, we can calculate the knowledge entropies of the experts' preferences:

Expert1: S_Kps(Expert1) = 2 − K_PS(Expert1) ≈ 1.365;
Expert2: S_Kps(Expert2) = 2 − K_PS(Expert2) ≈ 1.490;
Expert3: S_Kps(Expert3) = 2 − K_PS(Expert3) ≈ 1.209.

Two deduced principles
Taking into account Assumption I, "People are enthusiastic about acquiring knowledge to get to know the world around them", and the implication of formula (20), "Acquiring knowledge means a decrease of knowledge entropy", we have the following deduction:
Principle 1
An individual's knowledge entropy never increases if the individual's thirst for knowledge never decreases.

Next we introduce another principle by examining the additivity of the knowledge measure.
Proposition 1
The knowledge measure expressed by formula (7) (or formula (13)) does not fulfil the property of sub-additivity.
Proof
The above proposition has two implications:

• The overall knowledge of two persons is not necessarily the sum of their individual knowledge.
• For one person, his/her knowledge of a set of objects is not necessarily the sum of his/her knowledge of its sub-sets.

We use counterexamples to illustrate.

- Suppose that the object set is {x_1, x_2, x_3}. Thus we have n = 3, and in this case 3 ≤ W ≤ 9. Suppose that two persons, Alan and Barbara, are asked to discern the objects.

Assume that Alan says "x_1 is different from the others, while x_2 and x_3 are equivalent". Hence the equivalence classes deduced from Alan's knowledge for recognition are

[x_1] = {x_1}, [x_2] = {x_2, x_3}, and [x_3] = {x_2, x_3},

so that W = 5. By using formula (7), Alan's knowledge is measured as

K(A) = (1/ln 3) ln(9/5).

Assume that Barbara says "the objects are all different from each other". Hence the equivalence classes deduced from Barbara's knowledge for recognition are

[x_1] = {x_1}, [x_2] = {x_2}, and [x_3] = {x_3},

so that W = 3. By using formula (7), Barbara's knowledge is measured as

K(B) = (1/ln 3) ln(9/3) = 1.

Let K(A + B) = K(A) + K(B), namely, (1/ln 3) ln(9/W) = (1/ln 3) ln(9/5) + 1. However, we cannot find a W satisfying both 3 ≤ W ≤ 9 and this equation. Indeed, the only solution of the equation is W = 5/3, which lies outside 3 ≤ W ≤ 9. Therefore, the first implication is verified.

- Suppose that the object set is {x_1, x_2, x_3, x_4}. Thus we have n = 4, and in this case 4 ≤ W ≤ 16. Suppose that a person, Cassie, is asked to discern the objects. Assume that Cassie says "x_1 is different from the others, and so is x_4, while x_2 and x_3 are equivalent". Hence the equivalence classes deduced from Cassie's knowledge for recognition are

[x_1] = {x_1}, [x_2] = {x_2, x_3}, [x_3] = {x_2, x_3}, and [x_4] = {x_4},

so that W = 6. By using formula (7), Cassie's knowledge is measured as

K(C) = (1/ln 4) ln(16/6) ≈ 0.708.

Now we divide X into two sub-sets: X_1 = {x_1, x_2} and X_2 = {x_3, x_4}. Within each sub-set the two objects are distinct to Cassie, so by using formula (7) her knowledge of these two sub-sets is measured as

K(X_1) = (1/ln 2) ln(4/2) = 1, and K(X_2) = (1/ln 2) ln(4/2) = 1,

respectively. Evidently, K(C) ≠ K(X_1) + K(X_2). Therefore, the second implication is verified, and the proof is finished. □

From Proposition 1 we can deduce another principle as follows.
Principle 2
The knowledge level of a group is not necessarily a simple sum of the individuals' knowledge levels.
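As an aside, the counterexamples in the proof of Proposition 1 can be checked numerically with a short sketch (ours, not the paper's; the label-list encoding of a partition is an assumption):

```python
import math
from collections import Counter

def knowledge_K(labels):
    """Formula (7): K = (1/ln n) * ln(n^2 / W); objects sharing a label
    belong to the same equivalence class."""
    n = len(labels)
    counts = Counter(labels)
    W = sum(counts[v] for v in labels)
    return math.log(n * n / W) / math.log(n)

# First counterexample (n = 3): Alan merges x2 and x3; Barbara separates all.
k_sum = knowledge_K(['a', 'b', 'b']) + knowledge_K(['a', 'b', 'c'])  # K(A) + K(B)
# Solving (1/ln 3) * ln(9 / W) = k_sum for W gives W = 5/3, which lies
# outside the feasible range 3 <= W <= 9.
W_needed = 9 / math.exp(k_sum * math.log(3))
print(round(W_needed, 4))          # 1.6667, i.e. 5/3 < W_min = 3

# Second counterexample (n = 4): Cassie merges x2 and x3 only.
whole = knowledge_K(['a', 'b', 'b', 'c'])               # K(C) on {x1..x4}
parts = knowledge_K(['a', 'b']) + knowledge_K(['b', 'c'])  # K(X1) + K(X2)
print(round(whole, 3), parts)      # 0.708 2.0 -- not additive
```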
Because the knowledge measure (formula (7)) and the knowledge entropy measure (formula (16)) satisfy the relationship K = 2 − S_K, namely K + S_K = 2, we obtain the following result from Proposition 1.
The knowledge entropy does not fulfil the property of sub-additivity.

Difference between knowledge entropy and Shannon entropy

Shannon entropy, also called information entropy, was introduced by Claude Shannon [6]. It is expressed as

S = − Σ_i P_i ln P_i.

It is known that Shannon entropy satisfies an additivity property: if a system is composed of two independent subsystems, the entropy of the whole is the sum of the entropies of the subsystems. However, as shown by Proposition 2, the knowledge entropy introduced in this paper does not have the sub-additivity property. This is a main difference between the knowledge entropy and the Shannon entropy.
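The contrast can be made concrete with a small sketch (ours; the helper names are assumptions): the Shannon entropy of a product of independent distributions equals the sum of the component entropies, whereas the knowledge entropy (16) over a whole set generally differs from the sum over its sub-sets (here, Cassie's partition from the proof of Proposition 1):

```python
import math
from collections import Counter

def shannon(probs):
    """Shannon entropy S = -sum(p * ln p), in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def knowledge_entropy(labels):
    """Formula (16): S_K = (1/ln n) * ln W, with W as in formula (5)."""
    n = len(labels)
    counts = Counter(labels)
    W = sum(counts[v] for v in labels)
    return math.log(W) / math.log(n)

# Shannon entropy is additive over independent subsystems.
p, q = [0.5, 0.5], [0.25, 0.75]
joint = [pi * qj for pi in p for qj in q]   # product distribution
assert abs(shannon(joint) - (shannon(p) + shannon(q))) < 1e-12

# Knowledge entropy: the whole is not the sum of the parts.
whole = knowledge_entropy(['a', 'b', 'b', 'c'])              # ln 6 / ln 4
parts = knowledge_entropy(['a', 'b']) + knowledge_entropy(['b', 'c'])
print(round(whole, 3), parts)   # 1.292 2.0 -- unequal
```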
Conclusions

We have discussed how to measure the knowledge for recognition (particularly, for classification) and how it changes. Our discussion was based on three assumptions. We proposed an equation for describing the influence of knowledge on uncertainty, and obtained two particular formulas for measuring the level of knowledge for recognition in two particular cases. Moreover, by investigating the evolution of ignorance with uncertainty, the concept of knowledge entropy was introduced and its formula was presented. Its similarity to Boltzmann's entropy and its difference from Shannon's entropy were examined. Furthermore, based on a mathematical analysis, we obtained or evidenced the following results:

(1) Acquiring knowledge means a decrease of knowledge entropy.
(2) The lower the knowledge entropy, the more distinctive the people's ranking order.
(3) The knowledge level of a group is not necessarily a simple sum of the individuals' knowledge levels.
(4) An individual's knowledge entropy never increases if the individual's thirst for knowledge never decreases.
Acknowledgments
The work was supported by the National Natural Science Foundation of China (No. 71571019).
References

[1] Cavell S, Knowing and Acknowledging, in Must We Mean What We Say? Cambridge University Press, 238-266 (2002).
[2] Raymond W, Introduction to the Foundations of Mathematics, 2nd edition. John Wiley & Sons, Chapter 2-8: Axioms defining equivalence, 48-50 (1965).
[3] Roberts F, Tesman B, Applied Combinatorics, 2nd edition. CRC Press, Section 4.2.4: Weak Orders, 254-256 (2011).
[4] Hou F, A consensus gap indicator and its application to group decision making. Group Decis. Negot. 24, 415-428 (2015). doi:10.1007/s10726-014-9396-4
[5] Hou F, The prametric-based GDM selection procedure under linguistic assessments. In: 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 1-8 (2015).
[6] Shannon CE, A mathematical theory of communication. Bell System Technical Journal 27, 379-423 (1948). doi:10.1002/j.1538-7305.1948.tb01338.x