Exploring Scale-Measures of Data Sets

Abstract

Measurement is a fundamental building block of numerous scientific models and their creation. This is in particular true for data driven science. Due to the high complexity and size of modern data sets, the necessity for the development of understandable and efficient scaling methods is at hand. A profound theory for scaling data is scale-measures, as developed in the field of formal concept analysis. Recent developments indicate that the set of all scale-measures for a given data set constitutes a lattice and does hence allow efficient exploring algorithms. In this work we study the properties of said lattice and propose a novel scale-measure exploration algorithm that is based on the well-known and proven attribute exploration approach. Our results motivate multiple applications in scale recommendation, most prominently (semi-)automatic scaling.


arXiv preprint, [cs.AI], Feb

Tom Hanika, Johannes Hirth

Knowledge & Data Engineering Group, University of Kassel, Germany
Interdisciplinary Research Center for Information System Design, University of Kassel, Germany

Keywords: FCA, Conceptual Measures, Data Scaling, Measurements, Formal Concept, Lattice

An inevitable step of any data-based knowledge discovery process is measurement [24] and the associated (explicit or implicit) scaling of the data [27]. The latter is particularly constrained by the underlying mathematical formulation of the data representation, e.g., real-valued vector spaces or weighted graphs, the requirements of the data procedures, e.g., the presence of a distance function, and, more recently, the need for human understanding of the results. Considering the scaling of data as part of the analysis itself, in particular formalizing it and thus making it controllable, is a salient feature of formal concept analysis (FCA) [7]. This field of research has spawned a variety of specialized scaling methods, such as logical scaling [25], and, in the form of scale-measures, links the scaling process with the study of continuous mappings between closure systems.

Recent results by the authors [13] revealed that the set of all scale-measures for a given data set constitutes a lattice. Furthermore, it was shown that any scale-measure can be expressed in simple propositional terms using disjunction, conjunction and negation. Among other things, the previous results allow a computational transition between different scale-measures, which we may call scale-measure navigation, as well as their interpretability by humans.

Despite these advances, the question of how to identify appropriate and meaningful scale-measures for a given data set with respect to a human data analyst, and how to express that meaningfulness in the first place, remains unanswered. In this paper, we propose an answer to this question by adapting the well-known attribute exploration algorithm from FCA to present a method for exploring scale-measures. Much like the original algorithm, scale-measure exploration asks a (human) scaling expert how to aggregate, separate, omit, or introduce data set features. Our efforts finally result in a (semi-)automatic scaling framework which may be applied to large and complex data sets.

In detail, after recalling scale-measure basics in Section 3 we apply theoretical results for ideals in closure systems to the lattice of all scale-measures. From this we derive notions for the relevance of scale-measures as well as the mentioned novel exploration method in Section 4, which is supported by a detailed example. Finally, in Section 4.2, we outline the (semi-)automatic scaling framework and conclude in Section 6 after revisiting related work about scaling in Section 5.

FCA Recap

Formalizing and understanding the process of measurement is, in particular in data science, an ongoing discussion, for which we refer the reader to Representational Theory of Measurement [20, 29] as well as Numerical Relational Structure [24], and algebraic (measurement) structures [26, p. 253].

Formal concept analysis (FCA) [7, 31] is well equipped to handle and comprehend data scaling tasks. In FCA the basic data structure is the formal context, as seen in the example in Figure 1 (top), i.e., a triple $(G, M, I)$ with a non-empty and finite set $G$ (called objects), a non-empty and finite set $M$ (called attributes) and a binary relation $I \subseteq G \times M$ (called incidence). We say $(g, m) \in I$ is equivalent to "$g$ has attribute $m$". We call $\mathbb{S} = (H, N, J)$ an induced sub-context of $\mathbb{K}$ iff $H \subseteq G$, $N \subseteq M$ and $J = I \cap (H \times N)$, and write $\mathbb{S} \leq \mathbb{K}$. We find two operators $\cdot' : \mathcal{P}(G) \to \mathcal{P}(M)$, $A \mapsto A' = \{m \in M \mid \forall a \in A : (a, m) \in I\}$, and $\cdot' : \mathcal{P}(M) \to \mathcal{P}(G)$, $B \mapsto B' = \{g \in G \mid \forall b \in B : (g, b) \in I\}$, called derivations. Pairs $(A, B) \in \mathcal{P}(G) \times \mathcal{P}(M)$ with $A' = B$ and $A = B'$ are called formal concepts, where $A$ is called extent and $B$ intent. Consecutive application leads to two closure spaces $\mathrm{Ext}(\mathbb{K}) := (G, '')$ and $\mathrm{Int}(\mathbb{K}) := (M, '')$. Both closure systems are represented in the (concept) lattice $\underline{\mathfrak{B}}(\mathbb{K}) = (\mathfrak{B}(\mathbb{K}), \leq)$, where $\mathfrak{B}(\mathbb{K}) := \{(A, B) \in \mathcal{P}(G) \times \mathcal{P}(M) \mid A' = B \wedge B' = A\}$ is the set of concepts of $\mathbb{K}$ and the order relation is $(A, B) \leq (C, D) :\Leftrightarrow A \subseteq C$.

A fundamental approach to comprehensible scaling, in particular for nominal and ordinal data as studied in this work, is the following.

Definition 1 (Scale-Measure (cf. Definition 91, [7])).

Let $\mathbb{K} = (G, M, I)$ and $\mathbb{S} = (G_{\mathbb{S}}, M_{\mathbb{S}}, I_{\mathbb{S}})$ be formal contexts. The map $\sigma : G \to G_{\mathbb{S}}$ is called an $\mathbb{S}$-measure of $\mathbb{K}$ into the scale $\mathbb{S}$ iff the preimage $\sigma^{-1}(A) := \{g \in G \mid \sigma(g) \in A\}$ of every extent $A \in \mathrm{Ext}(\mathbb{S})$ is an extent of $\mathbb{K}$.

Attributes: has limbs (L), breast feeds (BF), needs chlorophyll (Ch), needs water to live (W), lives on land (LL), lives in water (LW), can move (M), monocotyledon (MC), dicotyledon (DC).

dog: L, BF, W, LL, M
fish leech: W, LW, M
corn: Ch, W, LL, MC
bream: L, W, LW, M
water weeds: Ch, W, LW, MC
bean: Ch, W, LL, DC
frog: L, W, LL, LW, M
reed: Ch, W, LL, LW, MC

[Concept lattice diagram of the context.]

Figure 1. This figure shows the Living Beings and Water context at the top. Its concept lattice is displayed at the bottom and contains nineteen concepts.
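Definition 1 can be checked mechanically on small contexts. The following Python sketch is illustrative only: the incidence is an assumed transcription of the Living Beings and Water context, and the function names (`extent`, `closure_system`, `is_scale_measure`) as well as the two toy scales are hypothetical and not taken from the paper or from conexp-clj. It enumerates $\mathrm{Ext}(\mathbb{K})$ as the intersection closure of the attribute extents and tests whether every preimage of a scale extent is an extent of $\mathbb{K}$.

```python
# Living Beings and Water context: object -> attributes (assumed incidence).
CONTEXT = {
    "dog":         {"L", "BF", "W", "LL", "M"},
    "fish leech":  {"W", "LW", "M"},
    "corn":        {"Ch", "W", "LL", "MC"},
    "bream":       {"L", "W", "LW", "M"},
    "water weeds": {"Ch", "W", "LW", "MC"},
    "bean":        {"Ch", "W", "LL", "DC"},
    "frog":        {"L", "W", "LL", "LW", "M"},
    "reed":        {"Ch", "W", "LL", "LW", "MC"},
}
OBJECTS = frozenset(CONTEXT)
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """B' = all objects that have every attribute in attrs."""
    return frozenset(g for g in OBJECTS if set(attrs) <= CONTEXT[g])


def closure_system(generators, top):
    """All intersections of the generating sets, including the top set itself."""
    system = {frozenset(top)}
    for gen in generators:
        system |= {frozenset(gen) & other for other in system}
    return system


# Ext(K): every extent is an intersection of attribute extents.
EXTENTS_K = closure_system([extent({m}) for m in ATTRIBUTES], OBJECTS)


def is_scale_measure(sigma, scale_extents):
    """Definition 1: the preimage of every scale extent is an extent of K."""
    return all(
        frozenset(g for g in OBJECTS if sigma[g] in ext) in EXTENTS_K
        for ext in scale_extents
    )


# Toy scale 1 (hypothetical): sigma maps every object to "animal" or "plant".
kind = {g: ("animal" if "M" in CONTEXT[g] else "plant") for g in OBJECTS}
kind_extents = closure_system([{"animal"}, {"plant"}], {"animal", "plant"})

# Toy scale 2 (hypothetical): nominal scale on G with sigma = id.
identity = {g: g for g in OBJECTS}
nominal_extents = closure_system([{g} for g in OBJECTS], OBJECTS)

print(len(EXTENTS_K))                               # number of concepts of K
print(is_scale_measure(kind, kind_extents))         # animal/plant view
print(is_scale_measure(identity, nominal_extents))  # {corn} is no extent of K
```

Under the assumed incidence, the animal/plant map is a scale-measure because its preimages are the extents of "can move" and "needs chlorophyll", whereas the nominal scale is not, since for instance the closure of {corn} also contains reed.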

This definition resembles the idea of continuity between closure spaces $(G_1, c_1)$ and $(G_2, c_2)$. We say that the map $f : G_1 \to G_2$ is continuous if and only if for all $A \in \mathcal{P}(G_2)$ we have $c_1(f^{-1}(A)) \subseteq f^{-1}(c_2(A))$. This property is equivalent to the requirement in Definition 1 that the preimage of closed sets is closed. In the light of the definition above we understand $\sigma$ as an interpretation of the objects from $\mathbb{K}$ in $\mathbb{S}$. Therefore we view the set $\sigma^{-1}(\mathrm{Ext}(\mathbb{S})) := \{\sigma^{-1}(A) \mid A \in \mathrm{Ext}(\mathbb{S})\}$ as the set of extents that is reflected by the scale context $\mathbb{S}$.

We present in Figure 2 the scale context for some scale-measure and its concept lattice, derived from our running example context Living Beings and Water $\mathbb{K}_W$, cf. Figure 1. This scaling is based on the original object set $G$; however, the attribute set comprises nine, partially new, elements, which may reflect species taxa. We observe in this example that the concept lattice of the scale-measure context reflects twelve out of the nineteen concepts from $\mathfrak{B}(\mathbb{K}_W)$.

In our work [13] we derived a scale-hierarchy on the set of scale-measures, i.e., $\mathfrak{S}(\mathbb{K}) := \{(\sigma, \mathbb{S}) \mid \sigma \text{ is an } \mathbb{S}\text{-measure of } \mathbb{K}\}$, from a natural order of scales introduced by Ganter and Wille [7, Definition 92]. We say for two scale-measures $(\sigma, \mathbb{S}), (\psi, \mathbb{T})$ that $(\sigma, \mathbb{S})$ is finer than $(\psi, \mathbb{T})$ iff $\psi^{-1}(\mathrm{Ext}(\mathbb{T})) \subseteq \sigma^{-1}(\mathrm{Ext}(\mathbb{S}))$, from which also follows a natural equivalence relation $\sim$.

Definition 2 (Scale-Hierarchy (cf. Definition 7, [13])).

For a formal context $\mathbb{K}$ we call $\underline{\mathfrak{S}}(\mathbb{K}) = (\mathfrak{S}(\mathbb{K})/{\sim}, \leq)$ the scale-hierarchy of $\mathbb{K}$.

[Cross table of the scale context over the objects of Figure 1 with the attributes W, LW, plants, animals, land plants, water plants, land animal, water animal, mammal, and two concept lattice diagrams.]

plants := Ch
animals := M
land plants := LL ∧ plants
water plants := LW ∧ plants
land animal := LL ∧ animals
water animal := LW ∧ animals
mammal := animals ∧ BF

Figure 2. A scale context (top) and its concept lattice (bottom right), for which $\mathrm{id}_G$ is a scale-measure of the context in Figure 1. The extents $\sigma^{-1}(\mathrm{Ext}(\mathbb{S}))$ reflected by the scale-measure are indicated in gray in the context's concept lattice (bottom left).

Also in [13], we have shown that the scale-hierarchy of a context $\mathbb{K}$ is lattice ordered and isomorphic to the set of all sub-closure systems of $\mathrm{Ext}(\mathbb{K})$, i.e., $\{Q \subseteq \mathrm{Ext}(\mathbb{K}) \mid Q \text{ is a closure system on } G\}$, ordered by set inclusion $\subseteq$. To show this, we defined a canonical representation of scale-measures, using the so-called canonical scale $\mathbb{K}_{\mathcal{A}} := (G, \mathcal{A}, \in)$ for $\mathcal{A} \subseteq \mathrm{Ext}(\mathbb{K})$ with $\mathrm{Ext}(\mathbb{K}_{\mathcal{A}}) = \mathcal{A}$.

Proposition 1 (Canonical Representation (cf. Proposition 10, [13])).

Let $\mathbb{K} = (G, M, I)$ be a formal context with scale-measure $(\sigma, \mathbb{S}) \in \mathfrak{S}(\mathbb{K})$, then $(\sigma, \mathbb{S}) \sim (\mathrm{id}, \mathbb{K}_{\sigma^{-1}(\mathrm{Ext}(\mathbb{S}))})$.

We argued in [13] that the canonical representation eludes human explanation to some degree. We remedied this issue by means of logical scaling [25], which led to scales with logical attributes $M_{\mathbb{S}} \subseteq \mathcal{L}(M, \{\wedge, \vee, \neg\})$ ([13, Problem 1]).

Proposition 2 (Conjunctive Normalform (cf. Proposition 23, [13])).

Let $\mathbb{K}$ be a context and $(\sigma, \mathbb{S}) \in \mathfrak{S}(\mathbb{K})$. Then the scale-measure $(\psi, \mathbb{T}) \in \mathfrak{S}(\mathbb{K})$ given by $\psi = \mathrm{id}_G$ and $\mathbb{T} = \big|_{A \in \sigma^{-1}(\mathrm{Ext}(\mathbb{S}))} (G, \{\phi = \wedge A^{I}\}, I_{\phi})$ is equivalent to $(\sigma, \mathbb{S})$ and is called the conjunctive normalform of $(\sigma, \mathbb{S})$.

[Diagram with the equivalence classes $[(\mathrm{id}, \mathbb{K})]$, $[(\mathrm{id}, \mathbb{K}_{\{G\}})]$ and $[(\sigma, \mathbb{S})]$.]

Figure 3.

Scale-hierarchy of $\mathbb{K}$ with indicated scale-measures.

The goal for the rest of this work is to identify outstanding and particularly interesting data scalings. This quest leads to the natural question for a structural understanding of the scale-hierarchy and its elements. In order to do this we rely on the isomorphism [13, Proposition 11] between a context's scale-hierarchy $\underline{\mathfrak{S}}(\mathbb{K})$ and the lattice of all sub-closure systems of the extent set, as explained in the last section. The latter forms an order ideal in the lattice of all closure systems $\mathcal{F}_G$ on a set $G$, to which we refer by $\downarrow_{\mathcal{F}_G} \mathrm{Ext}(\mathbb{K})$. This ideal is well studied [1], and we may often omit the index $\mathcal{F}_G$ to improve readability.

Equipped with this structure we have to recall a few notions and definitions for a complete lattice $(L, \leq)$. In the following, we denote by $\prec$ the cover relation of $\leq$. Furthermore, we say $L$ is 1) lower semi-modular if and only if $\forall x, y \in L : x \prec x \vee y \implies x \wedge y \prec y$, 2) join-semidistributive iff $\forall x, y, z \in L : x \vee y = x \vee z \implies x \vee y = x \vee (y \wedge z)$, 3) meet-distributive (lower locally distributive, cf. [1]) iff $L$ is join-semidistributive and lower semi-modular, 4) join-pseudocomplemented iff for every $x \in L$ the set $\{y \in L \mid y \vee x = \top\}$ has a least element, 5) ranked iff there is a function $\rho : L \to \mathbb{N}$ with $x \prec y \implies \rho(x) + 1 = \rho(y)$, 6) atomistic iff every $x \in L$ can be written as the join of atoms in $L$.

In addition to the just introduced lattice properties, there are properties for elements in $L$ that we consider. An element $x \in L$ is 1) neutral iff every triple $\{x, y, z\} \subseteq L$ generates a distributive sublattice of $L$, 2) distributive iff the equalities $x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z)$ and $x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z)$ hold for every $y, z \in L$, 3) meet-irreducible iff $x \neq \top$ and $x = \bigwedge_{y \in Y} y$ for $Y \subseteq L$ implies $x \in Y$, 4) join-irreducible iff $x \neq \bot$ and $x = \bigvee_{y \in Y} y$ for $Y \subseteq L$ implies $x \in Y$. Throughout the rest of this work, we denote by $\mathcal{M}(L)$ the set of all meet-irreducible elements of $L$.

We can derive from literature [1, Proposition 19] the following statement.

Corollary 1.

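For $\mathbb{K} = (G, M, I)$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\mathcal{R}, \mathcal{R}' \in \downarrow \mathrm{Ext}(\mathbb{K})$ we find the equivalence: $\mathcal{R}' \prec \mathcal{R} \iff \mathcal{R}' \cup \{A\} = \mathcal{R}$ with $A$ meet-irreducible in $\mathcal{R}$.

Of special interest in lattices are the meet- and join-irreducibles, since every element of a lattice can be represented as a join or meet of these elements.

Proposition 3.

For $\mathbb{K}$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$ we find the equivalence: $\mathcal{R}$ is join-irreducible in $\downarrow \mathrm{Ext}(\mathbb{K}) \iff \exists A \in \mathrm{Ext}(\mathbb{K}) \setminus \{G\} : \mathcal{R} = \{G, A\}$.

Proof. $\Leftarrow$: For $A \in \mathrm{Ext}(\mathbb{K}) \setminus \{G\}$, the set $\{A, G\}$ is a closure system on $G$ and thereby in $\downarrow \mathrm{Ext}(\mathbb{K})$. Further, the set $\{A, G\}$ is of cardinality two and thereby an atom of $\downarrow \mathrm{Ext}(\mathbb{K})$ and thus join-irreducible. $\Rightarrow$: By contradiction, assume that $\nexists A \in \mathrm{Ext}(\mathbb{K}) \setminus \{G\} : \mathcal{R} = \{G, A\}$. Then for every $D \in \mathcal{R} \setminus \{G\}$ the set $\{D, G\}$ is an atom of $\downarrow \mathrm{Ext}(\mathbb{K})$, hence $\mathcal{R} = \bigvee_{D \in \mathcal{R} \setminus \{G\}} \{D, G\}$, i.e., $\mathcal{R}$ is not join-irreducible.

Next, we investigate the meet-irreducibles of $\downarrow \mathrm{Ext}(\mathbb{K})$ using a similar approach as done for $\mathcal{F}_G$ [1], based on propositional logic. We recall that an (object) implication for some context $\mathbb{K}$ is a pair $(A, B) \in \mathcal{P}(G) \times \mathcal{P}(G)$, shortly denoted by $A \to B$. We say $A \to B$ is valid in $\mathbb{K}$ iff $A' \subseteq B'$. The set $\mathcal{F}_{A,B} := \{D \subseteq G \mid A \not\subseteq D \vee B \subseteq D\}$ contains all models of $A \to B$. Additionally, $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} := \mathcal{F}_{A,B} \cap \mathrm{Ext}(\mathbb{K})$ is the set of all extents $D \in \mathrm{Ext}(\mathbb{K})$ that are models of $A \to B$. The set $\mathcal{F}_{A,B}$ is a closure system [1] and therefore $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})}$, too. Furthermore, we can deduce that $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \in \downarrow \mathrm{Ext}(\mathbb{K})$.

Lemma 1.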

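For context $\mathbb{K}$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$ with closure operator $\phi_{\mathcal{R}}$ we find $\mathcal{R} = \bigcap \{\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\}$.

Proof. We know that $\mathcal{R} = \bigcap \{\mathcal{F}_{A,B} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\}$ [1, Proposition 22]. Since $\mathcal{R} \subseteq \mathrm{Ext}(\mathbb{K})$ it holds that $\mathcal{R} = \bigcap \{\mathcal{F}_{A,B} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\} \cap \mathrm{Ext}(\mathbb{K})$, and thus $\mathcal{R}$ is equal to $\bigcap \{\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\}$.

Note that for any $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$ the set $\{\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\}$ contains only closure systems in $\downarrow \mathrm{Ext}(\mathbb{K})$ and thus possibly meet-irreducible elements of $\downarrow \mathrm{Ext}(\mathbb{K})$.

Proposition 4.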

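For context $\mathbb{K}$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$, the following are equivalent:
1. $\mathcal{R}$ is meet-irreducible in $\downarrow \mathrm{Ext}(\mathbb{K})$.
2. $\exists A \in \mathrm{Ext}(\mathbb{K}), i \in G$ with $A \prec_{\mathrm{Ext}(\mathbb{K})} (A \cup \{i\})''$ such that $\mathcal{R} = \mathcal{F}_{A,\{i\}}|_{\mathrm{Ext}(\mathbb{K})}$.

Proof. [1. $\Rightarrow$ 2.] Due to Lemma 1 we can represent $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$ by the equation $\mathcal{R} = \bigcap \{\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\}$. Moreover, since $\mathcal{R}$ is meet-irreducible in $\downarrow \mathrm{Ext}(\mathbb{K})$, we can infer that $\mathcal{R} \in \{\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \mid A, B \subseteq G \wedge B \subseteq \phi_{\mathcal{R}}(A)\}$. In particular, there exist $A, B \subseteq G$ with $B \subseteq \phi_{\mathcal{R}}(A)$ such that $\mathcal{R} = \mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})}$, and thus $\mathcal{R} = \mathcal{F}_{A'',B}|_{\mathrm{Ext}(\mathbb{K})}$. Hence, we identify $A''$ by $A$ for the rest of this proof. Using the fact that $\mathcal{F}_{A,\{i\}} \cap \mathcal{F}_{A,\{j\}} = \mathcal{F}_{A,\{i,j\}}$ we can infer that $\mathcal{F}_{A,\{i\}}|_{\mathrm{Ext}(\mathbb{K})} \cap \mathcal{F}_{A,\{j\}}|_{\mathrm{Ext}(\mathbb{K})} = \mathcal{F}_{A,\{i,j\}}|_{\mathrm{Ext}(\mathbb{K})}$. Therefore, there must exist $A, \{i\} \subseteq G$ with $\mathcal{R} = \mathcal{F}_{A,\{i\}}|_{\mathrm{Ext}(\mathbb{K})}$ ($\ast$).

In the case that $A = (A \cup \{i\})''$ the set $\mathcal{F}_{A,\{i\}}|_{\mathrm{Ext}(\mathbb{K})}$ equals $\mathrm{Ext}(\mathbb{K})$, and $\mathcal{R}$ is thereby not meet-irreducible. Assume that $A \not\prec_{\mathrm{Ext}(\mathbb{K})} (A \cup \{i\})''$; then there is a $D \in \mathrm{Ext}(\mathbb{K})$ with $A \prec_{\mathrm{Ext}(\mathbb{K})} D \subseteq (A \cup \{i\})''$ and $i \notin D$. Hence $A, D \not\models A \to \{i\}$ (see $\ast$) and thus $A, D \notin \mathcal{R}$. Using this, we construct two sets $\mathcal{R} \cup \{A\}$ and $\mathcal{R} \cup \{D\}$. The set $\mathcal{R} \cup \{D\}$ is closed under intersection, since an intersection of $D$ with an element in $\mathcal{R}$ is a model of $A \to i$; thus $\mathcal{R} \cup \{D\} \in \downarrow \mathrm{Ext}(\mathbb{K})$. The same holds for $\mathcal{R} \cup \{A\}$, respectively. The intersection of $\mathcal{R} \cup \{A\}$ and $\mathcal{R} \cup \{D\}$ is equal to $\mathcal{R}$, which is thereby not meet-irreducible, a contradiction.

[1. $\Leftarrow$ 2.] Consider a closure system $\hat{\mathcal{F}} \in \downarrow \mathrm{Ext}(\mathbb{K})$ that covers $\mathcal{R}$ in $\downarrow \mathrm{Ext}(\mathbb{K})$. By Corollary 1, we can represent $\hat{\mathcal{F}} = \mathcal{R} \cup \{D\}$ ($\ast$) with $D \notin \mathcal{R}$ and $D$ meet-irreducible in $\hat{\mathcal{F}}$ (and therefore $D \in \mathrm{Ext}(\mathbb{K})$). Due to $\mathcal{R} \subseteq \hat{\mathcal{F}}$ the set $(A \cup \{i\})''$ is an element of $\hat{\mathcal{F}}$ and thereby the intersection $(A \cup \{i\})'' \cap D \in \hat{\mathcal{F}}$. Since $D \notin \mathcal{R}$, we can deduce that $D \not\models A \to i$ and therefore $A \subseteq D$ and $i \notin D$. From $A \prec_{\mathrm{Ext}(\mathbb{K})} (A \cup \{i\})''$ we know that $(A \cup \{i\})'' \cap D = A$. Finally, $D \in \hat{\mathcal{F}} \implies A \in \hat{\mathcal{F}}$, and using ($\ast$), we can infer that $D = A$. Hence, $\mathcal{R} \cup \{A\}$ is the sole upper neighbour of $\mathcal{R}$ in $\downarrow \mathrm{Ext}(\mathbb{K})$ and thereby $\mathcal{R}$ is meet-irreducible.

Propositions 3 and 4 provide a characterization of irreducible elements in $\downarrow \mathrm{Ext}(\mathbb{K})$ and thereby in the scale-hierarchy of $\mathbb{K}$. Those may be of particular interest, since any element of $\downarrow \mathrm{Ext}(\mathbb{K})$ is representable by irreducible elements.

Proposition 5.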

For context $\mathbb{K}$ and $A, B \in \mathrm{Ext}(\mathbb{K})$ with $A \prec_{\mathrm{Ext}(\mathbb{K})} B$: if $A$ is meet-irreducible in $\mathrm{Ext}(\mathbb{K})$, then $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})}$ is a maximal meet-irreducible element in $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$.

Proof. For $A \prec_{\mathrm{Ext}(\mathbb{K})} B$, $A$ is the only extent that is not a model of the implication $A \to B$, since every other superset of $A$ in $\mathrm{Ext}(\mathbb{K})$ is also a superset of $B$. Hence $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})}$ is equal to $\mathrm{Ext}(\mathbb{K}) \setminus \{A\}$. The only superset in $\downarrow \mathrm{Ext}(\mathbb{K})$ is $\mathrm{Ext}(\mathbb{K})$, which is not meet-irreducible.

Equipped with this characterization we look into counting the irreducibles.

Proposition 6.

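For context $\mathbb{K}$, the number of meet-irreducible elements in the lattice $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ is equal to $|{\prec_{\mathrm{Ext}(\mathbb{K})}}|$, i.e., the number of covering pairs in $\mathrm{Ext}(\mathbb{K})$.

Proof. According to Proposition 4, an element $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$ is meet-irreducible iff it can be represented as $\mathcal{F}_{A,\{i\}}|_{\mathrm{Ext}(\mathbb{K})}$ for some $A \in \mathrm{Ext}(\mathbb{K})$ with $A \prec_{\mathrm{Ext}(\mathbb{K})} (A \cup \{i\})''$. Hence the number of meet-irreducible elements is bounded by the number of covering pairs $A \prec_{\mathrm{Ext}(\mathbb{K})} B$ in $\mathrm{Ext}(\mathbb{K})$. It remains to be shown that for $\mathcal{R}$ there is only one pair $(A, B) \in {\prec_{\mathrm{Ext}(\mathbb{K})}}$ with $B = (A \cup \{i\})''$ for some $i \in B \setminus A$ such that $\mathcal{R} = \mathcal{F}_{A,\{i\}}|_{\mathrm{Ext}(\mathbb{K})}$. Assume there are $(A, B), (C, D) \in {\prec_{\mathrm{Ext}(\mathbb{K})}}$ with $(A, B) \neq (C, D)$ and $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} = \mathcal{F}_{C,D}|_{\mathrm{Ext}(\mathbb{K})}$. First, consider the case $A \neq C$. Without loss of generality let $A \not\subseteq C$; then we have $C \models A \to B$, but $C \not\models C \to D$. Therefore $C \in \mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})}$ but $C \notin \mathcal{F}_{C,D}|_{\mathrm{Ext}(\mathbb{K})}$. In the second case, $A = C$, we have $B \neq D$ and thus $B \not\models C \to D$, but $B \models A \to B$. This implies that $B \in \mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})}$ but $B \notin \mathcal{F}_{C,D}|_{\mathrm{Ext}(\mathbb{K})}$. Thus, $\mathcal{F}_{A,B}|_{\mathrm{Ext}(\mathbb{K})} \neq \mathcal{F}_{C,D}|_{\mathrm{Ext}(\mathbb{K})}$, contradicting the assumption.

Next, we turn to other lattice properties of $\downarrow \mathrm{Ext}(\mathbb{K})$ and its elements.

Lemma 2 (Join Complement).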

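For $\mathbb{K}$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$, the set $\hat{\mathcal{R}} = \bigvee_{A \in \mathcal{M}(\mathrm{Ext}(\mathbb{K})) \setminus \mathcal{M}(\mathcal{R})} \{A, G\}$ is the inclusion-minimum closure system for which $\mathcal{R} \vee \hat{\mathcal{R}} = \mathrm{Ext}(\mathbb{K})$.

Proof. A set $\mathcal{A} \subseteq \mathrm{Ext}(\mathbb{K})$ is a generator of $\mathrm{Ext}(\mathbb{K})$ iff all meet-irreducible elements of $\mathrm{Ext}(\mathbb{K})$ are in $\mathcal{A}$. Hence, for every $\mathcal{D} \in \downarrow \mathrm{Ext}(\mathbb{K})$ with $\mathcal{R} \vee \mathcal{D} = \mathrm{Ext}(\mathbb{K})$, $\mathcal{D}$ is a superset of $\mathcal{M}(\mathrm{Ext}(\mathbb{K})) \setminus \mathcal{M}(\mathcal{R})$ and thus of $\hat{\mathcal{R}}$, since $\hat{\mathcal{R}}$ is the closure of $\mathcal{M}(\mathrm{Ext}(\mathbb{K})) \setminus \mathcal{M}(\mathcal{R})$ in $\downarrow \mathrm{Ext}(\mathbb{K})$.

All the above results in the following statement about $\downarrow \mathrm{Ext}(\mathbb{K})$:

Proposition 7.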

For context $\mathbb{K}$, the lattice $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ i) is join-semidistributive, ii) is lower semi-modular, iii) is meet-distributive, iv) is join-pseudocomplemented, v) is ranked, vi) is atomistic.

Proof. i) According to [1, Corollary 30] $\mathcal{F}_G$ is join-semidistributive and therefore $\downarrow \mathrm{Ext}(\mathbb{K})$ too, since $\downarrow \mathrm{Ext}(\mathbb{K})$ is closed under the meet and join operations of $\mathcal{F}_G$. ii) Analogous to i). iii) Follows from i) and ii) (cf. Definition 15 (5) [1]). iv) The join-complement of any $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$ is given by $\hat{\mathcal{R}}$ according to Lemma 2. v) The lattice $\mathcal{F}_G$ is ranked by the cardinality function (cf. [1, Corollary 30]). Since $\downarrow \mathrm{Ext}(\mathbb{K})$ is an order ideal in $\mathcal{F}_G$, it is ranked by the same function. vi) Follows directly from the characterization of join-irreducibles in Proposition 3.

This result can be employed for the recommendation of scale-measures, in particular with respect to Libkin's decomposition theorem [19, Theorem 1]. This would allow for a divide-and-conquer procedure within the scale-hierarchy, based on the fact: for context $\mathbb{K}$ the lattice $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ is decomposable into the direct product of two lattices $\downarrow \mathrm{Ext}(\mathbb{K}) \cong L_1 \times L_2$ iff $L_1 = (n]$, $L_2 = (\bar{n}]$ and $n$ is neutral in $\downarrow \mathrm{Ext}(\mathbb{K})$. Here $\bar{n}$ indicates the complement of $n$ with respect to $\downarrow \mathrm{Ext}(\mathbb{K})$, which can be computed using Lemma 2. That this approach is reasonable can be drawn from the fact that $\downarrow \mathrm{Ext}(\mathbb{K})$ fulfils all requirements of Lemma 2 and Theorem 1 from Libkin's work [18, 19], by considering Proposition 7.

In the rest of this section we investigate distributive and neutral elements in $\downarrow \mathrm{Ext}(\mathbb{K})$ more deeply. For this, let $\psi, \phi \in \Phi(L)$, i.e., the set of all closure operators on lattice $L$. We say that $\phi \leq_{\Phi} \psi$ iff for all $x \in L$: $\phi(x) \leq \psi(x)$.

Lemma 3.

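For context $\mathbb{K}$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\Phi(\mathrm{Ext}(\mathbb{K}))$, we find that the map $i : \downarrow \mathrm{Ext}(\mathbb{K}) \to \Phi(\mathrm{Ext}(\mathbb{K}))$ with $\mathcal{A} \mapsto \phi_{\mathcal{A}}|_{\mathrm{Ext}(\mathbb{K})}$ is a dual-isomorphism.

Proof. For $\mathcal{A}, \mathcal{D} \in \downarrow \mathrm{Ext}(\mathbb{K})$ with $A \in \mathcal{A}$, $A \notin \mathcal{D}$ we have $i(\mathcal{A})(A) = A$ but $i(\mathcal{D})(A) \neq A$. Thus $i(\mathcal{A}) \neq i(\mathcal{D})$ and $i$ is injective. For $\phi \in \Phi(\mathrm{Ext}(\mathbb{K}))$, $\phi[\mathrm{Ext}(\mathbb{K})] \subseteq \mathrm{Ext}(\mathbb{K})$ is a closure system with $G \in \phi[\mathrm{Ext}(\mathbb{K})]$ and therefore $\phi[\mathrm{Ext}(\mathbb{K})] \in \downarrow \mathrm{Ext}(\mathbb{K})$ with $i(\phi[\mathrm{Ext}(\mathbb{K})]) = \phi$. Hence $i$ is bijective. For $\mathcal{A}, \mathcal{D} \in \downarrow \mathrm{Ext}(\mathbb{K})$ with $\mathcal{A} \prec_{\downarrow \mathrm{Ext}(\mathbb{K})} \mathcal{D}$ we have $\mathcal{A} \cup \{D\} = \mathcal{D}$ for $D$ meet-irreducible in $\mathcal{D}$ (Corollary 1). Thus for all $A \in \mathrm{Ext}(\mathbb{K})$ we have $i(\mathcal{A})(A) = i(\mathcal{D})(A)$ except for the pre-images of $D$, i.e., $i(\mathcal{D})^{-1}(D)$. For $A \in i(\mathcal{D})^{-1}(D)$ we have $i(\mathcal{D})(A) = D \subseteq i(\mathcal{A})(A)$ and thus $i(\mathcal{D}) \leq i(\mathcal{A})$, as required.

Corollary 2.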

For a context $\mathbb{K}$, $\downarrow \mathrm{Ext}(\mathbb{K}) \subseteq \mathcal{F}_G$ and $\mathcal{R} \in \downarrow \mathrm{Ext}(\mathbb{K})$, the following are equivalent: i) $\mathcal{R}$ is distributive; ii) $\mathcal{R}$ is neutral; iii) for $A, B, C \in \mathcal{R}$ with $C = A \wedge B$ and $A, B$ incomparable in $\mathrm{Ext}(\mathbb{K})$, we have that $A \in \mathcal{R} \vee B \in \mathcal{R} \vee C \in \mathcal{R}$ implies $A, B, C \in \mathcal{R}$.

Proof. Using Lemma 3, i) $\Leftrightarrow$ ii) due to Thm. 2 [23] and i) $\Leftrightarrow$ iii) due to Thm. 1 [23].

An additional accompanying property is that the set of neutral elements of $\downarrow \mathrm{Ext}(\mathbb{K})$ is a complete lattice [22]. Thus, the iterative procedure that results from Corollary 2 iii) yields a closure operator on $\downarrow \mathrm{Ext}(\mathbb{K})$ to compute the neutral elements. To nourish our understanding of the neutral elements, take the following example: in the lattice $\mathcal{F}_G$ only the top and bottom elements are neutral [1, Proposition 33 (5)]. In contrast, take the chain $\mathcal{C} \subseteq \mathcal{F}_G$ with $G \in \mathcal{C}$, for which $\downarrow \mathrm{Ext}(\mathbb{K}_{\mathcal{C}})$ is a distributive lattice; hence, every element is neutral.

Our theoretical findings unveil several possibilities to recommend scale-measures. First, there are meet- and join-irreducible elements of the scale-hierarchy (Propositions 3 and 4). These elements are a minimum representation from which every other scale-measure can be retrieved. However, the number of meet- and join-irreducible elements is in the size of the concept lattice $\mathfrak{B}(\mathbb{K})$ (Proposition 3) and thereby potentially exponentially large. Hence, it is necessary to narrow down the set of join-irreducible scale-measures, for example, by constraining the selection to irreducible elements in $\mathfrak{B}(\mathbb{K})$ or by applying conceptual importance measures.

Other scale-measures of interest can be selected based on their structural placement in the scale-hierarchy, i.e., element-wise modularity, distributivity, or neutrality. A further advantage of the latter two selection methods is that they allow a decomposition of the scale-hierarchy using divide-and-conquer strategies. The existence of such neutral elements, however, cannot be guaranteed, as can be observed in $\mathcal{F}_G$. When a starting scale-measure $(\sigma, \mathbb{S})$ is selected, an obvious choice is to recommend the join-complemented scale-measure (Proposition 7), i.e., the minimum scale-measure such that the join with $(\sigma, \mathbb{S})$ yields $\mathrm{Ext}(\mathbb{K})$.
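To make the join-complement recommendation concrete, the following Python sketch applies Lemma 2 to the running example. It is illustrative only: the incidence is an assumed transcription of the Living Beings and Water context, the closure system `R` is rebuilt from the logical attributes listed for Figure 2, and all helper names are hypothetical.

```python
# Living Beings and Water context: object -> attributes (assumed incidence).
CONTEXT = {
    "D":  {"L", "BF", "W", "LL", "M"},    # dog
    "FL": {"W", "LW", "M"},               # fish leech
    "Co": {"Ch", "W", "LL", "MC"},        # corn
    "Br": {"L", "W", "LW", "M"},          # bream
    "WW": {"Ch", "W", "LW", "MC"},        # water weeds
    "Be": {"Ch", "W", "LL", "DC"},        # bean
    "F":  {"L", "W", "LL", "LW", "M"},    # frog
    "R":  {"Ch", "W", "LL", "LW", "MC"},  # reed
}
OBJECTS = frozenset(CONTEXT)
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """B' = all objects that have every attribute in attrs."""
    return frozenset(g for g in OBJECTS if set(attrs) <= CONTEXT[g])


def closure_system(generators):
    """Smallest closure system on G containing the generators (intersection closure)."""
    system = {OBJECTS}
    for gen in generators:
        system |= {frozenset(gen) & other for other in system}
    return system


def meet_irreducibles(system):
    """Members that differ from the intersection of their proper supersets."""
    result = set()
    for x in system:
        proper_supersets = [y for y in system if x < y]
        if not proper_supersets:
            continue  # x is the top element G
        meet = OBJECTS
        for y in proper_supersets:
            meet &= y
        if meet != x:
            result.add(x)
    return result


EXT_K = closure_system(extent({m}) for m in ATTRIBUTES)

# Closure system of the scale in Figure 2, generated by its attribute extents.
plants, animals = extent({"Ch"}), extent({"M"})
R = closure_system([
    extent({"W"}), extent({"LW"}), plants, animals,
    extent({"LL"}) & plants,   # land plants
    extent({"LW"}) & plants,   # water plants
    extent({"LL"}) & animals,  # land animal
    extent({"LW"}) & animals,  # water animal
    extent({"BF"}) & animals,  # mammal
])

# Lemma 2: the join-complement is generated by M(Ext(K)) \ M(R).
missing = meet_irreducibles(EXT_K) - meet_irreducibles(R)
R_hat = closure_system(missing)
join = closure_system(R | R_hat)  # join in the lattice of closure systems

print(len(EXT_K), len(R), len(R_hat), join == EXT_K)
```

With the assumed incidence, the join-complement is generated by the extents of lives on land, dicotyledon, monocotyledon and has limbs, which are exactly the meet-irreducible extents that the scale of Figure 2 does not reflect.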
The said join-complemented scale-measure can then be used as additional information or be the starting point for a thorough search.

In general, whenever multiple scale-measures of interest $\{(\sigma_j, \mathbb{S}_j)\}_{j \in J}$ are selected, we are able to combine all those by the apposition of scale-measures ([13, Proposition 19]) to combine their conceptual views on the data set.

For the task of efficiently determining a scale-measure based on human preferences, we propose the following approach. Motivated by the representation of meet-irreducible elements in the scale-hierarchy through object implications of the context (Proposition 4), we employ the dual of the attribute exploration algorithm [8] by Ganter. We modified said algorithm toward exploring scale-measures and present its pseudo-code in Algorithm 1. In this depiction we highlighted our modifications with respect to the original exploration algorithm (Algorithm 19, [9]) with darker print. This algorithm semi-automatically computes a scale context $\mathbb{S}$ and its canonical base.

Algorithm 1: Scale-measure Exploration: A modified Exploration with Background Knowledge

    Input:  Context K = (G, M, I)
    Output: (id_G, S) ∈ S(K) and optionally L_S

    Init scale S = (G, ∅, ∈)
    Init A = ∅, L_S = CanonicalBase(K) (or L_S = {} for larger contexts)
    while A ≠ G do
        while A ≠ A^{I_S I_S} do
            if "Can A^{I_K} \ (A^{I_S I_S})^{I_K} for objects having (A^{I_S I_S})^{I_K} be neglected?" then
                L_S = L_S ∪ {A → A^{I_S I_S}}
                exit while
            else
                Enter B ⊆ A^{I_K} \ (A^{I_S I_S})^{I_K} that should be considered
                Add attribute B^{I_K} to S
        A = NextClosure(A, G, L_S)
    return (id_G, S) and optionally L_S

In each iteration of the inner loop of our exploration algorithm, the query stated to the scaling expert is whether an object implication $A \implies B$ is true in the closure system of preferences. If the implication holds, it is added to the implicational base of $\mathbb{S}$ and the algorithm continues with the next implication query. Otherwise, a counterexample in the form of a closed set $C \in \mathrm{Ext}(\mathbb{K})$ with $A \subseteq C$ but $B \not\subseteq C$ has to be provided. This closed set is then added as an attribute to the scale context $\mathbb{S}$ with the incidences given by $\in$. If $C \notin \mathrm{Ext}(\mathbb{K})$, the scale $\mathbb{S}$ would contradict the scale-measure property (Proposition 20, [13]).

The object implicational theory $L_{\mathbb{S}}$ is initialized to the object canonical base of $\mathbb{K}$, which is an instance of attribute exploration with background knowledge [8]. This initialization can be neglected for larger contexts; however, it may reduce the number of queries. The algorithm terminates when the implication premise of the query is equal to $G$. The returned scale-measure is in canonical form, i.e., the canonical representation $(\mathrm{id}_G, (G, \mathrm{Ext}(\mathbb{S}), \in))$ (Proposition 1). The motivation behind attribute exploration queries is to determine if an implication holds in the unknown representational context of the learning domain. In contrast, the exploration of scale-measures determines if a given $\mathrm{Ext}(\mathbb{K})$ can be coarsened by implications $A \implies B$, resulting in a smaller and thus more human-comprehensible concept lattice $\mathfrak{B}(\mathbb{S})$, adjusted to the preferences (or view) of the scaling expert.

Querying object implications may be less intuitive compared to attribute implications. Hence, we suggest not to test for $A \implies A^{I_{\mathbb{S}} I_{\mathbb{S}}}$ for $A \subseteq G$ directly, but to test whether the difference of the intents $A^{I_{\mathbb{K}}}$ and $(A^{I_{\mathbb{S}} I_{\mathbb{S}}})^{I_{\mathbb{K}}}$ in $\mathbb{K}$ is of relevance to the scaling expert.
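The query loop of Algorithm 1 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the conexp-clj implementation: the incidence is an assumed transcription of the Living Beings and Water context, the implication set starts empty (the variant Algorithm 1 permits for larger contexts), plain implication closure is used inside NextClosure, a query without any distinguishing attribute is accepted automatically, and the counterexample extent is computed from the kept intent together with the expert's answer, as in the worked example of Section 4.2. The `expert` callback and all function names are hypothetical.

```python
# Living Beings and Water context (assumed incidence); the key order is the
# object order g_1 < ... < g_n used by NextClosure.
CONTEXT = {
    "Be": {"W", "LL", "Ch", "DC"},
    "Co": {"W", "LL", "Ch", "MC"},
    "D":  {"W", "LL", "M", "L", "BF"},
    "WW": {"W", "LW", "Ch", "MC"},
    "FL": {"W", "LW", "M"},
    "Br": {"W", "LW", "M", "L"},
    "F":  {"W", "LW", "LL", "M", "L"},
    "R":  {"W", "LW", "LL", "Ch", "MC"},
}
OBJECTS = list(CONTEXT)
G = frozenset(OBJECTS)
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def intent(objs):
    """A^{I_K}: attributes shared by all objects in objs."""
    return frozenset(m for m in ATTRIBUTES if all(m in CONTEXT[g] for g in objs))


def extent(attrs):
    """B^{I_K}: objects that have all attributes in attrs."""
    return frozenset(g for g in OBJECTS if attrs <= CONTEXT[g])


def scale_closure(objs, scale_attrs):
    """A^{I_S I_S}: intersection of all scale attributes (extents of K) containing objs."""
    closed = G
    for ext in scale_attrs:
        if objs <= ext:
            closed = closed & ext
    return closed


def implication_closure(objs, implications):
    """Smallest superset of objs that respects every implication in the list."""
    closed, changed = set(objs), True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return frozenset(closed)


def next_closure(current, implications):
    """Lectically next implication-closed set after current (None if there is none)."""
    for i in reversed(range(len(OBJECTS))):
        if OBJECTS[i] in current:
            continue
        smaller = frozenset(OBJECTS[:i])
        candidate = implication_closure((current & smaller) | {OBJECTS[i]}, implications)
        if not (candidate - current) & smaller:
            return candidate
    return None


def explore(expert):
    """Simplified exploration; returns (scale attributes, accepted implications)."""
    scale_attrs, implications = set(), []
    current = frozenset()
    while current is not None and current != G:
        while current != scale_closure(current, scale_attrs):
            conclusion = scale_closure(current, scale_attrs)
            kept = intent(conclusion)
            extra = intent(current) - kept
            answer = frozenset()
            if extra:  # ask only if some attribute could refine the conclusion
                answer = frozenset(expert(current, conclusion, kept, extra) or ()) & extra
            if not answer:  # accept: current -> conclusion holds in the scale
                implications.append((current, conclusion))
                break
            scale_attrs.add(extent(kept | answer))  # counterexample extent
        current = next_closure(current, implications)
    return scale_attrs, implications


def extents_of(scale_attrs):
    """Closure system generated by the scale attributes (always contains G)."""
    system = {G}
    for ext in scale_attrs:
        system |= {ext & other for other in system}
    return system


# Scripted expert: replays the first two answers of the worked example, then accepts.
log, answers = [], iter([{"M"}, {"LL"}])


def scripted_expert(premise, conclusion, kept, extra):
    log.append((premise, conclusion, kept))
    return next(answers, None)


scale, accepted = explore(scripted_expert)
print(sorted(sorted(ext) for ext in scale))
```

The scripted expert answers the first query with can move and the second with lives on land, so the scale receives the extents {D, FL, Br, F} and {D, F}, mirroring the first two rows of the example run. An expert that accepts every query yields the trivial scale with the single extent $G$, while an expert that always requests all distinguishing attributes recovers every extent of the context.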
Finally, as a post-processing step, one may apply the conjunctive normalform [13, Proposition 23] of scale-measures to further increase human comprehension. Yet, deriving other human-comprehensible representations of scale-measures is deemed future work.

4.2 (Semi-)Automatic Large Data Set Scaling

To demonstrate the applicability of the presented exploration algorithm, we have implemented it in the conexp-clj [11] software suite for formal concept analysis. For this, we apply the scale-measure exploration Algorithm 1 on our running example $\mathbb{K}_W$, see Figure 1. In Figure 4 (left) we depict the evaluation steps of the algorithm: the first two columns represent the object implication that is queried, and the third column contains the query translated in terms of attributes. For example, in row two the implication $\{\} \implies \{D, FL, Br, F\}$ is true in the so far generated scale $\mathbb{S}$, and it is queried whether it should hold. All objects of the implication have at least the attributes can move and needs water to live, as indicated in the third column (left). In the same column (right) we find attributes from $(1)^{I_{\mathbb{K}}} \setminus (2)^{I_{\mathbb{K}}} \subseteq M_W$ that can be considered by the scaling expert to narrow the object implication, i.e., to shrink the size of the conclusion. The answer we envision from the scaling expert is given in column four: the attribute lives on land. Thus, the object counterexample is the derivation of the attribute union, $\{M, W, LL\}^{I_W} = \{D, F\}$. In our example of the scale-measure exploration, the algorithm terminates after the scaling expert provided in total nine counterexamples and four accepts. The output is a scale context in canonical representation with twelve concepts, as depicted in Figure 4 (right).

The just demonstrated application of the scale-measure exploration can be supported in every step by conceptual importance measures [16]. Furthermore, these measures can also be used to automate the exploration algorithm by randomly selecting the counterexample from the top-k of the list of outstanding concepts with respect to one or more of said conceptual measures. We illustrate this idea for the spices planer data set [12, 13, 21] and depict the resulting scale-measure in Figure 5. This data set comprises 56 dishes (objects) and 37 spices (attributes), resulting in the context $\mathbb{K}_{\text{Spices}}$. The dishes are picked from multiple categories, such as vegetables, meats, or fish dishes. The incidence $I_{\mathbb{K}_{\text{Spices}}}$ indicates that a spice $m$ is necessary to cook dish $g$. The concept lattice of $\mathbb{K}_{\text{Spices}}$ has 421 concepts and is therefore too large for a meaningful human comprehension. Thus, using our automatic approach for scale-measure recommendation, we are able to generate a small-scaled view of readable size.

For this example of automatic scale-measure exploration, we considered the importance measure separation index [15, 16] on the set of objects. We consider the maximum number of concepts that are human readable to be thirty and therefore restricted the number of counterexamples to be computed accordingly. We depict the concept lattice of the resulting scale-measure in Figure 5 using the conjunctive normalform. To improve readability, we only annotated meet-irreducible attribute concepts in the lattice diagram and omitted redundant attribute conjunctions, e.g., for Anis ∧ Vanilla ∧ Cinnamon ∧ Pastry we annotate ... ∧ Pastry, since Anis ∧ Vanilla ∧ Cinnamon is already given by an upper neighbor. The resulting scale-measure concept lattice seems empirically more human readable and displays extensive information with respect to the original data set $\mathbb{K}_{\text{Spices}}$ and the employed importance measure.

Figure. Scale-measure exploration results (left) for the Living Beings and Water context, the resulting context (bottom right), and its concept lattice (top right). The employed object order is: Be > Co > D > WW > FL > Br > F > R.

Based on this approach, we propose a comprehensive study that specifically examines the use of the different importance measures in relation to the data domains used. Such a study would, of course, go beyond the scope of this paper. Another approach to improve the automatic scaling process could be the removal of irrelevant attributes [14]. Other selection criteria could regard the distributivity of concepts, since distributive lattices are known to have easily readable drawings. Another line of research with respect to improving the automatic scaling with our algorithm regards the logical representation of the scale-measure attributes. In the presented work, we use the conjunctive normal form, but future work may investigate new and additional logical representations.

Figure 5. Automatically generated scale-measure of the spices context using the most outstanding concepts by the separation index importance measure. The scale consists of 30 of the original 421 concepts and is in conjunctive normal form.
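How a handful of selected concepts turns into a consistent small view can be sketched as follows. The sketch assumes, following the scale-measure literature cited here, that the extents of a scale-measure form an intersection-closed family containing the full object set, and that every such extent is an extent of the original context. The toy extents and the function names are hypothetical.

```python
from itertools import combinations

OBJECTS = frozenset({"curry", "lamb", "goulash", "pudding"})

# Extents of a toy original context (hypothetical data).
ORIGINAL_EXTENTS = {
    OBJECTS,
    frozenset({"curry", "lamb", "goulash"}),
    frozenset({"curry", "goulash"}),
    frozenset({"lamb", "goulash"}),
    frozenset({"curry"}),
    frozenset({"lamb"}),
    frozenset({"goulash"}),
    frozenset({"pudding"}),
    frozenset(),
}


def close_under_intersection(extents, all_objects):
    """Smallest intersection-closed family containing the given extents and the top."""
    closed = {frozenset(all_objects)} | {frozenset(e) for e in extents}
    changed = True
    while changed:
        changed = False
        # combinations() copies its input first, so adding to `closed` here is safe.
        for a, b in combinations(list(closed), 2):
            c = a & b
            if c not in closed:
                closed.add(c)
                changed = True
    return closed


def is_coarser_view(extents):
    """True if every extent of the view is also an extent of the original context."""
    return extents <= ORIGINAL_EXTENTS


if __name__ == "__main__":
    selected = [{"curry", "goulash"}, {"lamb", "goulash"}]
    view = close_under_intersection(selected, OBJECTS)
    for e in sorted(view, key=lambda s: (-len(s), sorted(s))):
        print(sorted(e))
    print("consistent with original:", is_coarser_view(view))
```

In this toy run two selected extents yield a view with four extents, because their intersection and the full object set have to be present as well. The size of the view is therefore driven by the number of selected counterexamples rather than by the size of the original lattice.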

Related Work

Measurement is an important field of study in many (scientific) disciplines that involve the collection and analysis of data. According to Stevens [27], there are four feature categories that can be measured, i.e., nominal, ordinal, interval, and ratio features. Although there are multiple extensions and re-categorizations of the original four categories, e.g., most recently Chrisman introduced ten [2], for the purpose of our work the original four suffice. Each of these categories describes which operations are supported per feature category. In the realm of formal concept analysis we often work with nominal and ordinal features, supporting value comparisons by = and <, >. Hence, grades of detail/membership cannot be expressed. A framework to describe and analyze the measurement for Boolean data sets, called scale-measures, has been introduced in [10] and [6]. It characterizes the measurement based on object clusters that are formed according to common feature (attribute) value combinations. An accompanying notion of dependency has been studied [30], which led to attribute-selection-based measurements of Boolean data. The formalism includes a notion of consistency enabling the determination of different views and abstractions, called scales, of the data set. This approach is comparable to OLAP [3] for databases, but on a conceptual level. Similar to the feature dependency study is an approach for selecting relevant attributes in contexts based on a mix of lattice structural features and entropy maximization [14]. All discussed abstractions reduce the complexity of the data, making it easier for humans to understand.

Despite the expressiveness of the scale-measure framework demonstrated in this work, it is so far insufficiently studied in the literature. In particular, algorithmic and practical calculation approaches are missing. Comparable and popular machine learning approaches, such as feature compression techniques, e.g., Latent Semantic Analysis [4, 5], have the disadvantage that the newly compressed features are not interpretable by means of the original data and are not guaranteed to be consistent with said original data. The methods presented in this paper do not have these disadvantages, as they are based on meaningful and interpretable features with respect to the original features using propositional expressions. In particular, preserving consistency, as we did, is not a given, which was explicitly investigated in the realm of scaling many-valued formal contexts [25] and implicitly studied for generalized attributes [17].

Earlier approaches to use scale contexts for complexity reduction in data used constructs such as $(G_N \subseteq \mathcal{P}(N), N, \ni)$ for a formal context $\mathbb{K} = (G, M, I)$ with $N \subseteq M$ and the restriction that at least all intents of $\mathbb{K}$ restricted to $N$ are also intents in the scale [28]. Hence, the size of the scale context concept lattice depends directly on the size of the concept lattice of $\mathbb{K}$. This is particularly infeasible if the number of intents is exponential, leading to incomprehensible scale lattices. This is in contrast to the notion of scale-measures, which cover at most the extents of the original context, and can thereby display selected and interesting object dependencies of scalable size.

Conclusion

With this work we have shed light on the hierarchy of scale-measures. By applying multiple results from lattice theory, especially concerning ideals, to said hierarchy, we were able to give a more thorough structural description of $\downarrow_{\mathfrak{F}_G} \operatorname{Ext}(\mathbb{K})$. Our main theoretical result is Proposition 7, which in turn leads to our practical applications. In particular, based on this deeper understanding we were able to present an algorithm for exploring the scale-hierarchy of a contextual data set $\mathbb{K}$. Equipped with this algorithm a data scaling expert may explore the lattice of scale-measures for a given data set with respect to her preferences and the requirements of the data analysis task. The practical evaluation and optimization of this algorithm is a promising goal for future investigations. Even more important, however, is the implementation and further development of the automatic scaling framework, as outlined in Section 4.2. This opens the door to empirical scale recommendation studies and a novel approach for data preprocessing.

References

[1] N. Caspard and B. Monjardet. “The lattices of closure systems, closure operators, and implicational systems on a finite set: a survey.” In: Discrete Applied Mathematics [...]
[2] [...] Cartography and Geographic Information Systems [...]
[3] [...] Providing OLAP (On-Line Analytical Processing) to User-Analysts: An IT Mandate. E. F. Codd and Associates. 1993.
[4] V. Codocedo, C. Taramasco, and H. Astudillo. “Cheating to achieve Formal Concept Analysis over a Large Formal Context.” In: CLA. Ed. by A. Napoli and V. Vychodil. Vol. 959. CEUR-WS.org, 2011, pp. 349–362.
[5] S. T. Dumais. “Latent semantic analysis.” In: Annu. Rev. Inf. Sci. Technol. [...]
[6] [...] Applications of combinatorics and graph theory to the biological and social sciences. Ed. by F. Roberts. Springer-Verlag, 1989, pp. 139–167.
[7] B. Ganter and R. Wille. Formal Concept Analysis: Mathematical Foundations. Springer-Verlag, Berlin, 1999, pp. x+284.
[8] B. Ganter. “Attribute Exploration with Background Knowledge.” In: Theor. Comput. Sci. [...]
[9] [...] Conceptual Exploration. Springer, 2016, pp. 1–315. ISBN: 978-3-662-49291-8.
[10] B. Ganter, J. Stahl, and R. Wille. “Conceptual measurement and many-valued contexts.” In: Classification as a tool of research. Ed. by W. Gaul and M. Schader. North-Holland, 1986, pp. 169–176.
[11] T. Hanika and J. Hirth. “Conexp-Clj - A Research Tool for FCA.” In: Supplementary Proceedings of ICFCA 2019 Conference and Workshops, Frankfurt, Germany, June 25-28, 2019. Ed. by D. Cristea et al. Vol. 2378. CEUR Workshop Proceedings. CEUR-WS.org, 2019, pp. 70–75.
[12] T. Hanika and J. Hirth. “Knowledge Cores in Large Formal Contexts.” In: arXiv preprint arXiv:2002.11776 (2020).
[13] T. Hanika and J. Hirth. “On the Lattice of Conceptual Measurements.” In: arXiv preprint arXiv:2012.05287 (2020).
[14] T. Hanika, M. Koyda, and G. Stumme. “Relevant Attributes in Formal Contexts.” In: Graph-Based Representation and Reasoning - 24th International Conference on Conceptual Structures, ICCS 2019, Marburg, Germany, July 1-4, 2019, Proceedings. Ed. by D. Endres, M. Alam, and D. Sotropa. Vol. 11530. Lecture Notes in Computer Science. Springer, 2019, pp. 102–116.
[15] M. Klimushkin, S. Obiedkov, and C. Roth. “Approaches to the Selection of Relevant Concepts in the Case of Noisy Data.” In: ICFCA 2010. Ed. by L. Kwuida and B. Sertkaya. Springer Berlin, 2010, pp. 255–266.
[16] S. O. Kuznetsov and T. P. Makhalova. “On interestingness measures of formal concepts.” In: Inf. Sci. [...]
[17] [...] Ann. Math. Artif. Intell. [...]
[18] [...] Algebra Universalis [...]
[19] [...] Discret. Math. [...]
[20] [...] Foundations of Measurement – Representation, Axiomatization, and Invariance. Vol. 3. Academic Press, 1990.
[21] M. Mahn. Gewürze: das Standardwerk. Ed. by G. Müller-Wallraf. München: Christian, 2014, p. 319. ISBN: 9783862446773.
[22] J. Morgado. “Note on the central closure operators of complete lattices.” In: Proc. Koninkl. nederl. akad. wet. A. Vol. 67. 1964, pp. 467–476.
[23] J. Morgado. “Note on the distributive closure operators of a complete lattice.” In: Portugaliae mathematica [...]
[24] [...] Theory of Measurement. Heidelberg: Physica, 1971.
[25] S. Prediger and G. Stumme. “Theory-driven Logical Scaling: Conceptual Information Systems meet Description Logics.” In: Proc. KRDB’99. Ed. by E. Franconi and M. Kifer. Vol. 21. CEUR-WS.org, 1999, pp. 46–49.
[26] F. S. Roberts. Measurement Theory. Cambridge University Press, 1984.
[27] S. S. Stevens. “On the Theory of Scales of Measurement.” In: Science [...] ISSN: 0036-8075.
[28] G. Stumme. “Hierarchies of Conceptual Scales.” In: Proc. Workshop on Knowledge Acquisition, Modeling and Management (KAW’99). Ed. by T. B. Gaines, R. Kremer, and M. Musen. Vol. 2. Banff, Aug. 1999, pp. 78–95.
[29] P. Suppes et al. Foundations of Measurement – Geometrical, Threshold, and Probabilistic Representations. Vol. 2. Academic Press, 1989.
[30] R. Wille. “Dependencies of many valued attributes.” In: Classification and related methods of data analysis. Ed. by H.-H. Bock. North-Holland, 1988, pp. 581–586.
[31] R. Wille. “Restructuring Lattice Theory: An Approach Based on Hierarchies of Concepts.” In: Ordered Sets: Proc. of the NATO Advanced Study Institute. Ed. by I. Rival. Dordrecht: Springer, 1982, pp. 445–470.