Analysing Fuzzy Sets Through Combining Measures of Similarity and Distance
Josie McCulloch, Student Member, IEEE, Christian Wagner, Senior Member, IEEE, and Uwe Aickelin
Abstract: Reasoning with fuzzy sets can be achieved through measures such as similarity and distance. However, these measures can often give misleading results when considered independently, for example giving the same value for two different pairs of fuzzy sets. This is particularly a problem where many fuzzy sets are generated from real data, and while two different measures may be used to automatically compare such fuzzy sets, it is difficult to interpret two different results. This is especially true where a large number of fuzzy sets are being compared as part of a reasoning system. This paper introduces a method for combining the results of multiple measures into a single measure for the purpose of analysing and comparing fuzzy sets. The combined measure alleviates ambiguous results and aids in the automatic comparison of fuzzy sets. The properties of the combined measure are given, and demonstrations are presented with discussions on the advantages over using a single measure.
I. INTRODUCTION

To compare two fuzzy sets (FSs) one may consider their similarity or distance. To assess their similarity, we measure the similarity of the membership values for each element in each set. The result is given within the interval $[0, 1]$, where 0 indicates that there are no elements shared between both FSs and 1 indicates that the sets are identical. Alternatively, to assess the distance between FSs, given as a value in $\mathbb{R}$, we measure the distance between the elements which belong to each set; typically the distance between elements is also weighted by their membership values.

Measures of similarity and distance have frequently been applied to a variety of different applications. For example, similarity has often been used to measure the similarity between different word models [1], [2], or to find similar patterns in classification [3] and clustering [3]. Distance Measures (DMs), though less commonly researched, have been used to compare FSs, for example, in the ranking of fuzzy numbers [4].

Measures of similarity and distance evaluate two fundamentally different aspects of FSs, and it is due to the unique properties of these measures, or more directly, through the nature of what the measures actually measure, that their applicability to a given problem setting is determined. For example, there are cases in which a similarity measure (SM) may not be useful, such as when the FSs are disjoint. In this case, the result of the SM is always zero. This does not tell us how far apart the FSs are placed in the universe of discourse (UoD); they may be far apart or right next to each other. Where this is of concern, a DM may be beneficial. However, likewise, a DM is also not always a useful measure, for example when one FS is a subset of another. In this case the results become ambiguous as DMs are not ideal for detecting overlap between FSs.

Current research within the literature has generally made a choice between using either measures of similarity or distance; however, in many cases it is not trivial to make this choice, in particular when FSs are dynamically created from data such as for approaches like [2] and [5]. This paper proposes the fusion of both measures into a single measure which can be applied in the comparison of FSs and produces meaningful results regardless of the exact nature of the FSs to be measured. The fusion is achieved by an ordered weighted average (OWA) operator, and is applied to data-driven FSs to demonstrate the benefits of the measure.

Section II introduces FSs, SMs, DMs, and OWA operators, followed by an examination of what exactly the measures measure in Section III. In Section IV, a combined measure is presented which utilises the unique properties of both similarity and distance, and demonstrations of the combined measure are shown in Section V. Finally, conclusions are given in Section VI.

Josie McCulloch, Christian Wagner and Uwe Aickelin are with the School of Computer Science, University of Nottingham, Nottingham, United Kingdom (email: {psxjm5, christian.wagner, uwe.aickelin}@nottingham.ac.uk). This work was partially funded by the EPSRC's Towards Data-Driven Environmental Policy Design grant, EP/K012479/1, and the RCUK's Horizon Digital Economy Research Hub grant, EP/G065802/1.

II. BACKGROUND
A. Fuzzy Sets
Fuzzy sets have been applied to many applications in which uncertainty is present; some examples of which include data mining [6] and Computing with Words [7]. Unlike traditional logic, for which the membership of each element to the set is a Boolean value, the elements of a FS have a membership value that lies anywhere in the interval $[0, 1]$. A FS $F$ may be represented as a set of ordered pairs as follows [8]:

$$F = \{(x, \mu_F(x)) \mid x \in X\} \qquad (1)$$

where $\mu_F(x)$ indicates the membership value of the element $x$ in the FS $F$. For a discrete UoD, the FS $F$ may be written as [8]

$$F = \sum_{x} \mu_F(x) / x \qquad (2)$$

where $\sum$ denotes the collection of all points $x \in X$ with associated membership value $\mu_F(x)$.

B. Similarity Measures
SMs are a common tool used within fuzzy logic. A SM $s(A, B) \to [0, 1]$ calculates how similar two FSs are to each other through a comparison of the degrees of membership within each set. Common properties of a SM $s$ for FSs $A$, $B$ and $C$ are as follows:

Reflexivity: $s(A, B) = 1 \iff A = B$
Symmetry: $s(A, B) = s(B, A)$
Transitivity: If $A \subseteq B \subseteq C$, then $s(A, B) \geq s(A, C)$
Overlapping: If $A \cap B \neq \emptyset$, then $s(A, B) > 0$; otherwise, $s(A, B) = 0$
Note that it is not necessary for a SM to have all of these properties as the application for which the measure is used may not depend on all of them. However, it is typical that a SM always follows the property of reflexivity.

Throughout this paper, similarity is measured using the Jaccard SM, which supports all four of the properties listed above [1]. The Jaccard measure $s$ for FSs $A$ and $B$ is given as:

$$s(A, B) = \frac{\sum_{i=1}^{n} \min(\mu_A(x_i), \mu_B(x_i))}{\sum_{i=1}^{n} \max(\mu_A(x_i), \mu_B(x_i))} \qquad (3)$$

where $n$ is the total number of discretisations along the $x$-axis.

C. Distance Measures
A DM $d(A, B) \to \mathbb{R}^+$ is used to assess the distance between FSs by calculating the distances between the elements in each set.

A DM $d$ on FSs $A$, $B$ and $C$ holds the following properties:

Self-identity: $d(A, A) = 0$
Separability: $d(A, B) \geq 0$
Symmetry: $d(A, B) = d(B, A)$
Transitivity: If $A \subseteq B \subseteq C$, then $d(A, B) \leq d(A, C)$
Triangle inequality: $d(A, C) \leq d(A, B) + d(B, C)$

The distance between two FSs is most commonly measured by taking $\alpha$-cuts of FSs and measuring the distance between the $\alpha$-cuts. The $\alpha$-cut of the FS $A$ is a non-fuzzy set comprised of all the elements whose membership grade within $A$ is greater than or equal to $\alpha$ [9]; this is written formally as $A_\alpha = \{x \mid \mu_A(x) \geq \alpha\}$.

Chaudhuri and Rosenfeld [10] proposed the following metric to measure the distance between two convex, normal FSs $A$ and $B$:

$$d(A, B) = \frac{\sum_{i=1}^{m} y_{\alpha_i} \, h(A_{\alpha_i}, B_{\alpha_i})}{\sum_{i=1}^{m} y_{\alpha_i}} \qquad (4)$$

where the $y$-axis is discretised into $m$ points ($y_1, y_2, \ldots, y_m$), $A_{\alpha_i}$ is the non-fuzzy $\alpha$-cut (given as an interval) of the FS $A$ at $y$-coordinate $y_{\alpha_i}$, and $h$ is the conventional Hausdorff metric for two continuous intervals $\bar{A}$ and $\bar{B}$ as follows [11]:

$$h(\bar{A}, \bar{B}) = \max\{|\bar{A}_l - \bar{B}_l|, |\bar{A}_r - \bar{B}_r|\} \qquad (5)$$

where $\bar{A} = [\bar{A}_l, \bar{A}_r]$ and $\bar{B} = [\bar{B}_l, \bar{B}_r]$.

In addition to the Hausdorff distance given above, a directional DM (DDM) is given as follows [12]:

$$h(\bar{A}, \bar{B}) = \begin{cases} \bar{B}_l - \bar{A}_l, & \text{if } |\bar{B}_l - \bar{A}_l| > |\bar{B}_r - \bar{A}_r| \\ \bar{B}_r - \bar{A}_r, & \text{otherwise} \end{cases} \qquad (6)$$

for which a positive distance is given where $A < B$, and a negative distance is given where $B < A$.

The DDM, however, does not hold the property of symmetry and instead follows partial symmetry, defined as $|d(A, B)| = |d(B, A)|$ and $d(A, B) \neq d(B, A)$ where $A \neq B$. Throughout this paper, (4) is used in conjunction with (6).

Having reviewed SMs and DMs, a brief overview of OWA operators is given next, which will be used to aggregate similarity and distance.

D. Ordered Weighted Average
OWA operators [13] are used to aggregate sub-components of a problem. An OWA involves assigning objects to an ordered set of weights $w = \{w_1, w_2, \ldots, w_n\}$, for which $w_i \in [0, 1]$ and $\sum_{i=1}^{n} w_i = 1$. The objects which are to be aggregated are sorted into descending order, and each object is multiplied by the corresponding weight. Thus, for a given list of objects $a_1, a_2, \ldots, a_n$ and weights $w_1, w_2, \ldots, w_n$, the OWA is calculated as follows [13]:

$$F(a_1, a_2, \ldots, a_n) = w_1 b_1 + w_2 b_2 + \cdots + w_n b_n \qquad (7)$$

where $b_i$ is the $i$th largest element in the collection $a_1, a_2, \ldots, a_n$.

OWAs have been commonly used in the literature to solve a variety of problems. For example, [14] uses an OWA in decision making applied to the personnel selection problem. In [15], an OWA is used to aggregate different performance indicators to assess the performance of small drinking water utilities, and [15] uses an OWA to aid in the selection of financial products.

III. COMPARISON OF MEASURES ON FUZZY SETS
In this Section, SMs and DMs are compared on a series of real-data driven FSs. This is in order to clarify their respective outputs in an applied context, and to demonstrate the proposition that it can be more beneficial to use a combination of both measures.

As previously discussed, SMs and DMs have unique properties which lead to them measuring fundamentally different concepts. To demonstrate the nature of the measures, and the strengths of using both similarity and distance together to analyse FSs, consider the FSs shown in Fig. 1. These FSs have been constructed from the MovieLens data set [16], in which films are rated between 1 (poor) and 5 (great). Histograms were created to represent the distribution of ratings and each histogram was normalised by dividing the membership value at each $x$-coordinate by the peak membership value of the histogram. Linear interpolation was used to determine membership values between known points.

The SM and DDM introduced in Sections II-B and II-C were applied to each pair of movies, respectively. Their results are shown in Table I. The results of the combined measure are also shown in Table I for comparison purposes, and will be introduced in the next section. For each pair, the FS $A$ was given as the first parameter for the measure, and the FS $B$ was given as the second parameter.

TABLE I
VALUES GIVEN BY SMs AND DMs ON THE FSs IN FIG. 1

| Fig. 1 part     | a      | b     | c     | d      | e     | f     |
|-----------------|--------|-------|-------|--------|-------|-------|
| Similarity (3)  | 0.050  | 0.067 | 0.170 | 0.242  | 0.0   | 0.892 |
| Distance (4)    | -3.628 | 2.936 | 2.723 | -1.999 | 3.258 | 0.169 |
| Comparative (8) | -0.915 | 0.864 | 0.857 | -0.806 | 0.938 | 0.072 |

The following is a discussion of the results for similarity (3) and distance (4) in Table I for the FSs in Fig. 1. For each case, a brief discussion highlights where both measures contribute information that is particularly helpful when considered together.
Sets a & b. For the sets in Fig. 1(a) and 1(b), the SM indicates that the FSs are almost disjoint, but there is a small degree of similarity between them. However, there is no indication of where this similarity lies and how much the FSs differ. One can, however, see that the sign of the DM may be helpful to indicate the actual region of similarity. In this case, the direction of the DM tells us that the similarity is likely towards the lower end of the UoD of the FS $A$ for Fig. 1(a) and the higher end of $A$ for Fig. 1(b).

Sets c & d. The SM indicates a small difference in similarity between the sets of c and the sets of d, but it tells us very little else; in both cases, there is a small amount of overlap but we do not know where. However, the DM reveals that this overlap is to the right of the FS $A$ for c, and to the left of $A$ for d.

Sets e. In this case, the SM indicates that there is no similarity between the FSs, i.e. they are disjoint, and the DM indicates that there is a large amount of distance between the FSs.

Sets f. Both the SM and DM are able to identify when two FSs are identical or, in this case, almost identical. For the results of Fig. 1(f), each measure indicates that the membership functions of both FSs are very close to each other.

Given the results above, it is clear that SMs and DMs are each unique functions with distinct properties. This results in the common necessity to choose between both types of measure or indeed to apply both individually. While the application of both measures individually as conducted here can provide some insight, it can be challenging to interpret the two distinct outputs simultaneously for given FSs.

In the next section, the measures are combined into a single measure resulting in a single value, which can be used to determine the similarity, distance and direction between FSs.

IV. COMBINING MEASURES
The comparative measure removes the need to choose between measures of similarity or distance, and by combining both measures it creates a more detailed comparison of FSs. Both of these aspects are particularly important in cases where a potentially large number of FSs are generated from data. In such cases, an appropriate decision between the individual measures (and/or joint result interpretation) cannot be conducted by a human expert but has to be done automatically. Thus, a single measure is proposed to provide a detailed comparison of FSs.
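Before the two measures are combined, it may help to see them in code. The following is a minimal sketch of the Jaccard SM (3) and the $\alpha$-cut distance (4), with either the Hausdorff metric (5) or the directional variant (6), for FSs sampled on a shared discrete grid. All function names, the example sets and the choice of $\alpha$ levels are illustrative assumptions rather than the authors' implementation; the `alpha_cut` helper assumes convex, normal FSs, so that every level in $(0, 1]$ has a non-empty cut.

```python
from typing import Callable, Sequence, Tuple

Interval = Tuple[float, float]


def jaccard(mu_a: Sequence[float], mu_b: Sequence[float]) -> float:
    """Jaccard SM (3) for two FSs sampled on the same x-grid."""
    num = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    den = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    return num / den if den > 0 else 0.0  # assumption: 0 when both sets are empty


def alpha_cut(xs: Sequence[float], mu: Sequence[float], alpha: float) -> Interval:
    """Interval spanned by the grid points whose membership is >= alpha.

    Assumes a convex, normal FS, so the cut is non-empty and contiguous.
    """
    members = [x for x, m in zip(xs, mu) if m >= alpha]
    return min(members), max(members)


def hausdorff(a: Interval, b: Interval) -> float:
    """Hausdorff metric (5) between two intervals."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def directional(a: Interval, b: Interval) -> float:
    """Directional interval distance (6); positive when b lies to the right of a."""
    left, right = b[0] - a[0], b[1] - a[1]
    return left if abs(left) > abs(right) else right


def alpha_cut_distance(
    xs: Sequence[float],
    mu_a: Sequence[float],
    mu_b: Sequence[float],
    levels: Sequence[float],
    h: Callable[[Interval, Interval], float] = directional,
) -> float:
    """Distance (4): average of h over the alpha-cuts, weighted by the level."""
    num = sum(y * h(alpha_cut(xs, mu_a, y), alpha_cut(xs, mu_b, y)) for y in levels)
    return num / sum(levels)


# Illustrative triangular FSs on the grid 1..10; B is A shifted right by 2.
xs = list(range(1, 11))
mu_a = [0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mu_b = [0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
levels = [0.5, 1.0]

similarity = jaccard(mu_a, mu_b)                    # small overlap at x = 4 only
d_ab = alpha_cut_distance(xs, mu_a, mu_b, levels)   # B is to the right of A
d_ba = alpha_cut_distance(xs, mu_b, mu_a, levels)   # same magnitude, opposite sign
```

In this toy case the similarity is low while the directional distance reports both how far and in which direction $B$ lies from $A$, which is the complementary information the combined measure is meant to capture.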
A. A Single Comparative Measure
Note that both measures commonly yield results in different domains; SMs within $[0, 1]$ and DMs within $\mathbb{R}$ (or $\mathbb{R}^+$ if it is non-directional). A decision must therefore be made as to which domain will be used for the results of the combined measure. The following presents a measure which yields results in $[0, 1]$, for which the value 0 indicates minimum distance/maximum similarity, and the value 1 indicates maximum distance/minimum similarity.

[Fig. 1, panels (a) to (f). Fuzzy sets used to demonstrate the attributes of SMs and DMs in Table I.]

To fuse the measures, it is important to consider that similarity and distance represent two fundamentally different comparisons of FSs; i.e. both measures measure "opposite" concepts. The SM indicates how similar or how close two FSs are placed, and the DM indicates how far apart they are positioned. To fuse these measures they must both represent the same concept, either both similarity/closeness or both dissimilarity/distance. The following considers the latter case.

As similarity is within the domain $[0, 1]$, to achieve a measure of dissimilarity for the combined measure the complement of the SM may be used (i.e. $1 - s(A, B)$) [17]. This can then be used in conjunction with the DM. Considering the DM is within $\mathbb{R}$, it should be changed such that the result falls within $[0, 1]$ to enable a meaningful fusion of both measures. To alter the result, it is necessary to take into account the UoD in which the measure has been applied. For example, if the UoD is in $\{1, 2, 3, 4, 5\}$ then the maximum distance that can be achieved is 4. The result of the DM may be given as a ratio to the maximum possible distance. Taking the above into account, $\frac{d(A, B)}{\lambda}$ is used to obtain a ratio of closeness from the DM, where $\lambda$ is the largest possible distance within the UoD. For a finite UoD $X$ described as $\{x_1, x_2, \ldots, x_n\}$, $\lambda$ will be $x_n - x_1$.

Given the above, the OWA provides a reasonable approach to fusing both measures ((3) and (4)).
Equation (8) presents the comparative measure as an OWA based aggregation of the measures of similarity and distance for FSs $A$ and $B$:

$$c(A, B) = \begin{cases} F\left(1 - s(A, B),\ \frac{d(A, B)}{\lambda}\right), & d(A, B) \geq 0 \\ F\left(-(1 - s(A, B)),\ \frac{d(A, B)}{\lambda}\right), & \text{otherwise} \end{cases} \qquad (8)$$

where $F$ is an OWA as shown in (7) with weights $w = \{0.7, 0.3\}$ and $d$ is the DDM (4) with (6).

The weights $w = \{0.7, 0.3\}$ are chosen such that the largest of the dissimilarity measure and normalised DM within (8) is assigned the weight 0.7, and the smallest is assigned the weight 0.3. Note that the absolute values of the measures are used when assigning the weights, thus a measure of -0.45 is considered larger than a measure of 0.3. These weights have been determined heuristically as outlined in Section IV-B, and in the future other ways of determining such weights may be investigated.

Note that if the result from the DDM gives a negative value then the result of (8) will also be a negative value. Likewise, if the DDM gives a positive value then the result of the combined measure will also be positive.

In (8), a value of 0 represents identical FSs, as proven in Theorem 1 below, and a value of 1 (or -1) represents the maximum distance possible of two disjoint FSs. If one wishes to have the value 1 to represent identical FSs, the complement of (8) may be used as

$$c'(A, B) = \begin{cases} 1 - c(A, B), & \text{if } c(A, B) \geq 0 \\ -1 - c(A, B), & \text{otherwise} \end{cases} \qquad (9)$$

Note that (9) maintains the direction according to the comparative measure (8).

Within (8), the measure of similarity is altered such that it reflects dissimilarity or distance. Note that this is just one method of combining the measures, proposed because of both its simplicity and its ability to represent both similarity and distance as demonstrated in the examples within the next section.
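As a minimal sketch, (8) and (9) can be written directly from their definitions. The function names are illustrative, and the check at the end assumes $\lambda = 10 - 1 = 9$ for the 1 to 10 ambience scale used later in Section V-B; under that assumption the sketch agrees with the Comparative (9) row of Table III to within about 0.001.

```python
def owa(values, weights):
    """OWA (7), ordering the inputs by absolute value as described for (8)."""
    ordered = sorted(values, key=abs, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


def comparative(s, d, lam, weights=(0.7, 0.3)):
    """Comparative measure (8); the result carries the sign of the DDM value d."""
    dissimilarity = 1.0 - s
    if d < 0:
        dissimilarity = -dissimilarity
    return owa([dissimilarity, d / lam], weights)


def complement(c):
    """Complement (9): identical FSs map to 1 and the direction is preserved."""
    return 1.0 - c if c >= 0 else -1.0 - c


# (similarity, distance, reported complement) taken from Table III.
TABLE_III = {
    "Poor": (0.081, 5.573, 0.171),
    "OK": (0.493, 1.064, 0.609),
    "Great": (0.469, -3.360, -0.516),
}
LAMBDA = 9.0  # assumption: x_n - x_1 for a UoD running from 1 to 10

recomputed = {
    label: complement(comparative(s, d, LAMBDA))
    for label, (s, d, _) in TABLE_III.items()
}
```

Sorting by absolute value keeps the weight 0.7 on whichever input is larger in magnitude, so a negative distance produces a negative result of the same size as its positive counterpart.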
Another method, for example, is the special case where the weights are both 0.5, resulting in the standard average of both measures. It is also possible that the result may be altered to yield results in the domain $\mathbb{R}$ by multiplying the SM by the value $\lambda$ and fusing the result with the unaltered DM.

B. Choosing the OWA Weights
The following discusses how the weights of the OWA operator (7) may be chosen for the comparative measure (8). Referring to Fig. 2, the FS pairs $(A, B)$ and $(A, C)$ are compared. According to the DM (4) the distance for both pairs is 0.331. However, according to the SM (3) the similarity of $(A, B)$ is 0.182, and the similarity of $(A, C)$ is 0.0. Because the DM gives the same result for $(A, B)$ and $(A, C)$, one might assume that $B$ and $C$ are the same FS. It is only by also referring to the SM (or by viewing the FSs) that it becomes clear that the FSs are different. By using the comparative measure, however, it is possible to distinguish between different pairs of FSs which give equal values of similarity or distance. It is also important to note that, by using the comparative measure, this can be confirmed by using a single measure; a user does not have to check the results of both the SM and DM to ensure pairs of FSs are different.

The weights of the comparative measure play an important role in determining the difference between different pairs of FSs which result in equal values from a measure. Table II shows the difference between pairs $(A, B)$ and $(A, C)$ using a variety of weights. As the first weight increases in value the difference between the two pairs also increases; the results begin to signify that $B$ is closer to $A$ than $C$ is to $A$. This is because the dissimilarity measure gives 1 for disjoint sets (such as the pair $(A, C)$) and thus, in such cases, will always be given the first weight. As the first weight increases in value the overall measure is, in effect, placing more importance on the fact that the sets are disjoint.

However, it is unhelpful to have too large a value for the first weight. If the first weight equals 1 then the output of the combined measure will always equal 1 for disjoint sets. Considering this, the first weight should be low enough such that it is possible to distinguish between different pairs of disjoint sets.
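The effect of the first weight can be reproduced numerically. The sketch below recomputes $c(A, B)$ and $c(A, C)$ for the Fig. 2 pairs from the reported values (similarities 0.182 and 0.0, distance 0.331). It assumes the reported distance is already normalised, i.e. $\lambda = 1$, which is what the first row of Table II suggests; `owa` and `sweep` are illustrative names, not taken from the paper.

```python
def owa(values, weights):
    """OWA (7) for non-negative inputs: sort descending, then weighted sum."""
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


S_AB, S_AC = 0.182, 0.0  # Jaccard similarities reported for Fig. 2
D_NORM = 0.331           # reported distance; assumed already divided by lambda


def sweep(steps=10):
    """Return (first weight, c(A,B), c(A,C)) for first weights 0.0, 0.1, ..., 1.0."""
    rows = []
    for k in range(steps + 1):
        w1 = k / steps
        weights = (w1, 1.0 - w1)
        c_ab = owa([1.0 - S_AB, D_NORM], weights)
        c_ac = owa([1.0 - S_AC, D_NORM], weights)
        rows.append((w1, c_ab, c_ac))
    return rows


for w1, c_ab, c_ac in sweep():
    print(f"{w1:.1f}  {1 - w1:.1f}  {c_ab:.3f}  {c_ac:.3f}  gap={c_ac - c_ab:.3f}")
```

Under these assumptions the gap between the two pairs is the first weight multiplied by the difference in dissimilarity (0.182), so it grows linearly from 0 when the first weight is 0 to 0.182 when it is 1.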
However, the first weight must also be large enough such that it is possible to make a distinction between FSs which give equivalent values of similarity or distance, such as those in Fig. 2.

Ideally, when the FSs are known beforehand, the weights should be tuned such that the widest range of values is given by the measure. This decreases possible confusion over pairs of FSs which would give close or identical values from a single measure. However, if the weights cannot be tuned, the weights $\{0.7, 0.3\}$ are ideal and are used throughout this paper. This is because tests showed that these weights are useful for preventing disjoint FSs from resulting in a lower distance/dissimilarity than non-disjoint FSs.

[Fig. 2. Three FSs, $A$, $B$ and $C$.]

TABLE II
COMPARATIVE MEASURE ON THE FSs WITHIN FIG. 2 USING DIFFERENT WEIGHTS (* INDICATES THE CHOSEN WEIGHTS WITHIN THIS PAPER)

| Weight 0 | Weight 1 | c(A,B) | c(A,C) |
|----------|----------|--------|--------|
| 0.0      | 1.0      | 0.331  | 0.331  |
| 0.1      | 0.9      | 0.380  | 0.398  |
| 0.2      | 0.8      | 0.428  | 0.465  |
| 0.3      | 0.7      | 0.477  | 0.532  |
| 0.4      | 0.6      | 0.526  | 0.598  |
| 0.5      | 0.5      | 0.574  | 0.665  |
| 0.6      | 0.4      | 0.623  | 0.732  |
| *0.7     | 0.3      | 0.672  | 0.799  |
| 0.8      | 0.2      | 0.721  | 0.866  |
| 0.9      | 0.1      | 0.769  | 0.933  |
| 1.0      | 0.0      | 0.818  | 1.0    |

C. Properties of the combined measure

This Section introduces and proves the properties of the combined measure (8).

Theorem 1 (Self-identity). The comparative measure (8) follows the property of self-identity. That is, $c(A, B) = 0 \iff A = B$.

Proof: If $A = B$ then $s(A, B) = 1$ according to the property of reflexivity, and so $w(1 - s(A, B)) = 0$. Also, if $A = B$ then $d(A, B) = 0$ according to the property of self-identity, and so $w\left(\frac{d(A, B)}{\lambda}\right) = 0$ for any $w$. Thus $c(A, B) = 0$ if $A = B$. Alternatively, if $A \neq B$ then $s(A, B) \neq 1$ and $d(A, B) \neq 0$, thus $c(A, B) \neq 0$.

Theorem 2 (Symmetry). The comparative measure (8) follows the property of symmetry. That is, $c(A, B) = c(B, A)$.

Proof: If the SM and DM that are aggregated are both symmetrical, then the same values will be given to the comparative measure for both $c(A, B)$ and $c(B, A)$, thus the comparative measure is also symmetrical.

Theorem 3 (Partial Symmetry). Where the DDM (6) is used with the comparative measure (8), the property of partial symmetry holds. That is, $|c(A, B)| = |c(B, A)|$, and $c(A, B) \neq c(B, A)$ where $A \neq B$.

Proof: When the distance is a positive value, the values of both distance and dissimilarity given to the OWA are in the positive domain. Likewise, where the distance is negative, both inputs given to the OWA are in the negative domain. In each case, the absolute values of the positive and negative inputs are the same, thus the absolute values of the outputs are also the same, and the sign of the final value is in the same domain as the input values.
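The sign flip behind partial symmetry can be spot-checked at the level of a single $\alpha$-cut. The sketch below is only an illustration, assuming integer intervals on a small UoD $\{1, \ldots, 5\}$; the function name `directional` is hypothetical and simply restates (6).

```python
from itertools import product


def directional(a, b):
    """Directional interval distance (6) between intervals a and b."""
    left, right = b[0] - a[0], b[1] - a[1]
    return left if abs(left) > abs(right) else right


# All integer intervals [l, r] with 1 <= l <= r <= 5 (an illustrative UoD).
intervals = [(l, r) for l, r in product(range(1, 6), repeat=2) if l <= r]

# Swapping the arguments negates both end-point differences, so the same
# branch of (6) is selected and the result changes sign only.
antisymmetric = all(
    directional(a, b) == -directional(b, a)
    for a, b in product(intervals, repeat=2)
)
```

Because (4) is a weighted average of these interval distances, the same sign flip carries over to $d$, and both inputs to the OWA in (8) then change sign together while keeping their magnitudes.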
Theorem 4 (Separability). The result of the comparative measure is always greater than or equal to zero, i.e. $c(A, B) \geq 0$, when aggregating with the non-DDM.

Proof: Given that $1 - s(A, B) \in [0, 1]$, and $\frac{d(A, B)}{\lambda} \in [0, 1]$, as $d(A, B)$ never exceeds $\lambda$, it follows that $c(A, B) \in [0, 1]$ thus $c(A, B) \geq 0$.

Note, however, if the DDM is used to construct the comparative measure, then $c(A, B) \in [-1, 1]$. Thus, separability does not apply where the DDM is used.

Theorem 5 (Transitivity). The comparative measure (8) follows the property of transitivity. That is, if $A \subseteq B \subseteq C$, then $c(A, B) \leq c(A, C)$.

Proof: Given that both the dissimilarity measure and the DM follow transitivity, when they are aggregated the resulting comparative measure also follows transitivity.

Theorem 6 (Triangle Inequality). The comparative measure (8) follows the property of triangle inequality. That is, $c(A, C) \leq c(A, B) + c(B, C)$.

Proof: Given that both the dissimilarity measure [17] and the DM follow triangle inequality, when they are aggregated the resulting comparative measure also follows triangle inequality.

The complement of the comparative measure (9), likewise, follows the properties of symmetry, separability, transitivity and triangle inequality. However, the complement does not satisfy self-identity and instead follows reflexivity ($c'(A, B) = 1 \iff A = B$). This is because the complement uses 1 to indicate identical FSs, whereas the comparative measure uses 0 instead. It is trivial to see from Theorem 1 that the complement of the comparative measure satisfies reflexivity.

Note that the comparative measure does not follow the property of overlapping (i.e. if $A \cap B \neq \emptyset$, then $c(A, B) > 0$, otherwise $c(A, B) = 0$) unless the weights $w = [1.0, 0.0]$ are given, such that the maximum weight is given to the dissimilarity measure when the FSs are identified as disjoint.

V. DEMONSTRATIONS
Examples of the comparative measure (8) are given in Table I in which the measure is applied to the FSs in Fig. 1. A demonstration and discussion of the comparative measure in an applied context are presented next, and compared against using a single measure of similarity or distance.
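The data-driven FSs in Fig. 1 were built, as described in Section III, by peak-normalising rating histograms and interpolating linearly between the known points. A minimal sketch of that construction follows; the ratings list is hypothetical (it is not taken from MovieLens) and the function names are illustrative.

```python
SCALE = (1, 2, 3, 4, 5)  # rating scale: 1 (poor) to 5 (great)


def histogram_fs(ratings, scale=SCALE):
    """Membership at each rating: count divided by the peak count."""
    counts = [sum(1 for r in ratings if r == x) for x in scale]
    peak = max(counts)
    if peak == 0:
        raise ValueError("no ratings fall on the scale")
    return [c / peak for c in counts]


def membership(x, scale, mu):
    """Linearly interpolated membership at x; 0 outside the scale."""
    if x < scale[0] or x > scale[-1]:
        return 0.0
    points = list(zip(scale, mu))
    for (x0, m0), (x1, m1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return m0 + t * (m1 - m0)
    return mu[-1]  # only reached when the scale has a single point


# Hypothetical ratings for one film.
ratings = [5, 4, 4, 5, 3, 4, 5, 5, 2, 4]
mu = histogram_fs(ratings)            # memberships at x = 1, 2, 3, 4, 5
halfway = membership(3.5, SCALE, mu)  # interpolated between x = 3 and x = 4
```

Dividing by the peak count makes each FS normal (at least one element has membership 1), which is the form the $\alpha$-cut distance (4) is defined for.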
A. Demonstration - MovieLens
The following is a discussion of comparisons between the FSs in Fig. 1 according to the comparative measure (8), the results of which are shown in Table I.
Sets a & b. The results of a and b in Table I have a high degree of dissimilarity/distance. Additionally, the sign of the comparative measure shows in which direction the dissimilar regions of the FSs reside: to the left of $A$ in the case of the FSs in a, and to the right of $A$ in the case of the sets in b. The comparative measure also shows that the FSs within b are closer than those in a.

Sets c & d. The FSs within c and d both have a slightly higher degree of dissimilarity/distance than indicated by the original SM and DM. This is due to a large difference between the range of elements contained within the sets, which increases the dissimilarity according to the measure. In both cases, the FS $B$ covers the range [1, ] whereas $A$ only covers [1, ] in c, and [4, ] in d. It can also now be observed, by using the comparative measure alone, that $B$ is to the right of $A$ in c, and is to the left of $A$ in d. The ordering and direction of the distance between c and d are the same using both the DM and the comparative measure.

Sets e. The FSs within e are disjoint and were thus given the value 0 by the Jaccard SM. However, the comparative measure gives a non-zero value for e. Note that this value is still the largest dissimilarity/distance compared to the other pairs within Table I. This value also helps to identify the direction of the FSs, indicating that the FS $B$ is to the right of $A$.

Sets f. The comparative measure indicates with a high degree of certainty that the FSs of f are nearly identical.

A possible application of the comparative measure is to the problem of ranking. Comparing the comparative measure against the DM, which may also be used for ranking [4], the ordering of the FSs differs. By observing the absolute values of the measures, according to the DM the most distant pair is a and the second most distant is e; however, it is the other way round according to the comparative measure. This is because the SM indicates there is some similarity between the FSs within Fig. 1(a), which causes the comparative measure to decrease in dissimilarity/distance. However, the FSs in Fig. 1(e) are disjoint, so the dissimilarity/distance remains high. This leaves the FSs with no similarity as the most distant. One could argue that this is an expected result of the comparative measure because the FSs within e are disjoint whereas the FSs within a are not disjoint. Thus the measure may be considered more intuitive as it is natural to consider the sets in e as being more distant than the sets in a.

As stated earlier, the unique properties of similarity and distance enable the measures to be applied to a wide variety of fields, and the same can be said for the comparative measure, which, as demonstrated, can be used in terms of a measure of similarity and a measure of distance. For example, with the FSs in Fig. 1 the comparative measure may be used to find similarly rated films by choosing FSs with a low value of dissimilarity/distance, or it may be used to rank the film ratings by ordering the results of the measure.

B. Demonstration - Classification

This section presents a synthetic example of the comparative measure applied to the problem of classification. In this example, three initial FSs are given which represent three different descriptions. In this case, they each represent different levels of ambience within an establishment on a scale from 1 to 10. These levels, as shown in Fig. 3, are labelled as Poor, OK and Great. Given a FS representing the ambience of a restaurant, as shown in Fig. 3, the aim is to classify which description best fits the restaurant.

[Fig. 3. Three FSs modelling degrees of ambience, with a FS representing the ambience of a given restaurant.]

In Table III, comparisons are given between the ambience of the restaurant and the descriptions. The measures of similarity (3) and distance (4) are shown, as well as the complement of the comparative measure (9). For each measure, the word model is given as the first parameter, and the restaurant is given as the second parameter. The complement of the comparative measure is given to match the SM, such that both measures give the value 1 for identical FSs.

TABLE III
SIMILARITY AND DISTANCE BETWEEN THE RESTAURANT AND WORD MODELS IN FIG. 3

|                 | Poor  | OK    | Great  |
|-----------------|-------|-------|--------|
| Similarity (3)  | 0.081 | 0.493 | 0.469  |
| Distance (4)    | 5.573 | 1.064 | -3.360 |
| Comparative (9) | 0.171 | 0.609 | -0.516 |

According to the SM (3), the restaurant's ambience is similar to the descriptions Poor, OK and Great to the degree 0.081, 0.493 and 0.469, respectively. It is clear from these results that the restaurant's ambience cannot be described as Poor; however, it is almost equally valid that it may be described as OK or Great.

The DM (4), however, gives a clearer view of which FS the restaurant most closely matches; the restaurant has a smaller distance to OK than to Great. Thus, by fusing the distance and similarity as in (8) and (9), a more distinct match is achieved. Now, it is clear from the results of the comparative measure in Table III that the restaurant most closely matches OK ambience to the degree of 0.609, whereas it only matches Great by 0.516 and Poor by 0.171. It can now be determined with greater certainty that OK is the correct classification. It should be noted that this can be determined by observing a single measure (9), rather than viewing the SM and DM separately.

VI. CONCLUSIONS
This paper has introduced a novel measure, referred to as a comparative measure, which analyses and compares FSs by combining a SM and DM. When these measures are viewed separately the results may be difficult and time-consuming to interpret as similarity and distance each measure fundamentally different concepts. By joining the measures together, the comparison of FSs is simplified by reducing any ambiguity in the results. Additionally, compared to a single measure, the combined measure provides a richer comparison as it may be swayed towards a preference in representing similarity or distance. This is especially useful for the automatic comparison of a large number of FSs which have been constructed from data. Additionally, through using an OWA operator, it is possible to refine the weights to further alleviate ambiguous values resulting from the original measures.

Demonstrations using data-driven FSs have shown that the comparative measure may be applied in terms of both similarity and distance, and as such may be applied to applications of these measures. Though the demonstrations have been applied to type-1 FSs only, as the comparative measure uses the outputs of the SM and DM, it may also be applied to type-2 FSs, where the original measures are a type-2 SM and DM.

Future work will look at measures which indicate similarity and distance as a FS, which better reflects the uncertainty inherent in FSs.

REFERENCES

[1] D. Wu and J. M. Mendel, “A comparative study of ranking methods, similarity measures and uncertainty measures for interval type-2 fuzzy sets,” Information Sciences, vol. 179, no. 8, pp. 1169–1192, 2009.
[2] C. Wagner, S. Miller, and J. Garibaldi, “Similarity based applications for data-driven concept and word models based on type-1 and type-2 fuzzy sets,” in Fuzzy Systems (FUZZ), 2013 IEEE International Conference on, 2013, pp. 1–9.
[3] W. Wang and X. Xin, “Distance measure between intuitionistic fuzzy sets,” Pattern Recognition Letters, vol. 26, no. 13, pp. 2063–2069, 2005.
[4] C.-H. Cheng, “A new approach for ranking fuzzy numbers by distance method,” Fuzzy Sets and Systems, vol. 95, no. 3, pp. 307–317, 1998.
[5] S. Coupland, J. Mendel, and D. Wu, “Enhanced interval approach for encoding words into interval type-2 fuzzy sets and convergence of the word FOUs,” in Fuzzy Systems (FUZZ), 2010 IEEE International Conference on, 2010, pp. 1–8.
[6] K. Hirota and W. Pedrycz, “Fuzzy computing for data mining,” Proceedings of the IEEE, vol. 87, no. 9, pp. 1575–1600, Sep. 1999.
[7] L. Zadeh, “Fuzzy logic = computing with words,” IEEE Trans. Fuzzy Syst., vol. 4, no. 2, pp. 103–111, May 1996.
[8] J. Mendel, Uncertain rule-based fuzzy logic systems: introduction and new directions. Prentice Hall PTR, 2001.
[9] L. Zadeh, “The concept of a linguistic variable and its application to approximate reasoning-I,” Information Sciences, vol. 8, no. 3, pp. 199–249, 1975.
[10] B. Chaudhuri and A. Rosenfeld, “On a metric distance between fuzzy sets,” Pattern Recognition Letters, vol. 17, no. 11, pp. 1157–1160, 1996.
[11] R. Zwick, E. Carlstein, and D. V. Budescu, “Measures of similarity among fuzzy concepts: A comparative analysis,” International Journal of Approximate Reasoning, vol. 1, no. 2, pp. 221–242, 1987.
[12] J. McCulloch, C. Wagner, and U. Aickelin, “Measuring the directional distance between fuzzy sets,” in Computational Intelligence (UKCI), 2013 13th UK Workshop on. IEEE, 2013, pp. 38–45.
[13] R. R. Yager, “On ordered weighted averaging aggregation operators in multicriteria decisionmaking,” IEEE Trans. Syst., Man, Cybern., vol. 18, no. 1, pp. 183–190, 1988.
[14] L. Canós and V. Liern, “Soft computing-based aggregation methods for human resource management,” European Journal of Operational Research, vol. 189, no. 3, pp. 669–681, 2008.
[15] R. Sadiq, M. J. Rodríguez, and S. Tesfamariam, “Integrating indicators for performance assessment of small water utilities using ordered weighted averaging (OWA) operators,” Expert Systems with Applications, vol. 37, no. 7, pp. 4881–4891, 2010.
[16] “MovieLens dataset,” Last Access: 12/2013. [Online]. Available: http://grouplens.org/datasets/movielens/
[17] A. Lipkus, “A proof of the triangle inequality for the Tanimoto distance,”