Contextuality Analysis of Impossible Figures

Abstract

This paper has two purposes. One is to demonstrate contextuality analysis of systems of epistemic random variables. The other is to evaluate the performance of a new, hierarchical version of the measure of (non)contextuality introduced in earlier publications. As objects of analysis we use impossible figures of the kind created by the Penroses and Escher. We make no assumptions as to how an impossible figure is perceived, taking it instead as a fixed physical object allowing one of several deterministic descriptions. Systems of epistemic random variables are obtained by probabilistically mixing these deterministic systems. This probabilistic mixture reflects our uncertainty or lack of knowledge rather than random variability in the frequentist sense.

Víctor H. Cervantes & Ehtibar N. Dzhafarov, Purdue University


Our main purpose is to illustrate the use of epistemic random variables using objects that are naturally described in a deterministic way, but not uniquely. That is, each of these objects is described by one of several deterministic systems. The first application of contextuality analysis to such systems is presented in [1], using various deterministic representations of the Liar's paradox. In the present paper the objects of the analysis are the so-called impossible figures: the Penrose triangle and several similar figures, as well as the Ascending and Descending staircase lithograph by M. C. Escher. The triangle and the staircase figures were famously discussed by Penrose and Penrose [2]. In Appendix B to this paper we report several measures of contextuality computed for our systems of epistemic random variables, but in the main text we focus on one measure only, introduced here for the first time: a hierarchical version of the (non)contextuality measure CNT-NCNT described in [3, 4].

The Contextuality-by-Default (CbD) theory [1, 3–5] has been developed to apply to abstract systems of random variables, irrespective of one's interpretation of the probabilistic notions involved. However, most of its applications have dealt with empirical data and random variables understood in the frequentist sense. In this paper we use the term epistemic random variable to denote a variable for which the probabilities with which it falls in different sets of possible values reflect our uncertainty or lack of knowledge rather than random variability in the frequentist sense.

A system $\mathcal{R}$ of random variables is a set of double-indexed random variables $R^c_q$, where $q \in Q$ denotes their content, which can be defined as a question to which the random variable responds, and $c \in C$ is their context, encompassing the conditions under which it is recorded. A system can be presented as

\[ \mathcal{R} = \{ R^c_q : c \in C,\ q \in Q,\ q \prec c \}, \tag{1} \]

where $q \prec c$ indicates that content $q$ is responded to in context $c$. The variables of the subset

\[ R^c = \{ R^c_q : q \in Q,\ q \prec c \} \tag{2} \]

are jointly distributed, whereas any two random variables in the subset

\[ R_q = \{ R^c_q : c \in C,\ q \prec c \} \tag{3} \]

are stochastically unrelated. (More generally, any two $R^c_q, R^{c'}_{q'} \in \mathcal{R}$ with $c \neq c'$ are stochastically unrelated. That is, they do not possess a joint distribution.) The set $R^c$ is called the bunch corresponding to context $c$, and the set $R_q$ is referred to as the connection for content $q$.

We will limit our discussion and applications to finite systems of binary random variables, with $n = |Q|$ and $m = |C|$. Without loss of generality we may assume all variables $R^c_q$ take values 0/1. These systems can be represented by three vectors. The first one is

\[ \mathbf{l} = (1,\ p^c_q : q \in Q,\ c \in C,\ q \prec c)^\intercal, \tag{4} \]

the vector of the low-level marginals, $p^c_q = \langle R^c_q \rangle = \Pr(R^c_q = 1)$, with the leading 1 formally equal to $\langle \cdot \rangle$. The second vector is

\[ \mathbf{c} = (\min(p^{c_1}_q, p^{c_2}_q) : q \in Q,\ c_1, c_2 \in C,\ q \prec c_1, c_2,\ c_1 \neq c_2), \tag{5} \]

the vector of maximal probabilities with which two random variables in a connection could both equal 1 (if they possessed a joint distribution). It is uniquely determined by vector $\mathbf{l}$. The third vector is

\[ \mathbf{b} = (\mathbf{b}_2, \ldots, \mathbf{b}_r)^\intercal, \tag{6} \]

where

\[ \mathbf{b}_s = (p^c_{q_1,\ldots,q_s} : q_1,\ldots,q_s \in Q,\ c \in C,\ q_1,\ldots,q_s \prec c,\ q_1,\ldots,q_s \text{ are distinct}). \tag{7} \]

Here, $s = 2,\ldots,r$, with $p^c_{q_1,\ldots,q_s} = \langle R^c_{q_1} \cdots R^c_{q_s} \rangle = \Pr(R^c_{q_1} = 1, \ldots, R^c_{q_s} = 1)$, and $2 \le r \le n$ is the largest number of distinct $q$'s within a bunch. (We could allow $r = 1$, in which case $\mathbf{b}$ is empty and the system is trivially noncontextual.) Thus, $\mathbf{b}$ is a minimal set of probabilities that completely describes the joint distributions of the bunches in the system (for a given vector $\mathbf{l}$). The coordinates of a specific system of random variables are then given by

\[ \mathbf{p}^* = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*)^\intercal. \tag{8} \]

This is the reduced vectorial representation of the system, as discussed in [3]. The system is noncontextual if and only if there is a vector $\mathbf{h} \ge 0$ (component-wise) such that

\[ \mathbf{M}\mathbf{h} = \mathbf{p}^*. \tag{9} \]

Here, the elements of $\mathbf{h}$ are probabilities assigned to all possible combinations of values (0's and 1's) assigned to all random variables $R^c_q$ in the system, and $\mathbf{M}$ is an incidence (Boolean) matrix [3, 5], constructed as follows.

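In the row of $\mathbf{M}$ corresponding to a given element of $\mathbf{p}^*$, say, $\Pr(R^c_{q_1} = \ldots = R^c_{q_s} = 1)$, we put 1 in the columns corresponding to the elements of $\mathbf{h}$ in which $R^c_{q_1}, \ldots, R^c_{q_s}$ are assigned the value 1; other columns in this row are filled with 0. Denoting the rows of $\mathbf{M}$ that correspond to $\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*$ by, respectively, $\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_b$, we can rewrite (9) in extenso:

\[ (\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_b)^\intercal \mathbf{h} = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*)^\intercal. \tag{10} \]

Ref. [3] also introduces a measure of contextuality, CNT, with its natural extension into a measure of noncontextuality, NCNT. In [4], the behavior of both measures was characterized for a special class of systems, known as cyclic. These measures can be computed with the aid of linear programming. To compute CNT, one solves the following task:

\[
\begin{array}{ll}
\text{find} & \mathbf{x}, \mathbf{d} \\
\text{minimizing} & \mathbf{1}^\intercal \mathbf{d} \\
\text{subject to} & -\mathbf{d} \le \mathbf{b}^* - \mathbf{M}_b \mathbf{x} \le \mathbf{d}, \\
 & \mathbf{x}, \mathbf{d} \ge 0, \\
 & \mathbf{M}_l \mathbf{x} = \mathbf{l}^*, \\
 & \mathbf{M}_c \mathbf{x} = \mathbf{c}^*,
\end{array}
\]

and, for any solution $\mathbf{x}^*$, one finds $\mathrm{CNT} = \| \mathbf{b}^* - \mathbf{M}_b \mathbf{x}^* \|_1$ ($L_1$-norm). By enumerating the elements of $\mathbf{b}^*$ as $1, \ldots, K$, the NCNT measure of noncontextuality is computed as

\[ \mathrm{NCNT} = \min_{i = 1, \ldots, K} \{ \min(d^{*-}_i, d^{*+}_i) \}, \tag{11} \]

where $d^{*-}_i, d^{*+}_i$ are solutions of the following linear programming tasks (denoting by $\mathbf{e}_i$ the unit vector with unity in its $i$-th element):

\[
\begin{array}{ll}
\text{find} & \mathbf{x}, d^-_i \\
\text{maximizing} & d^-_i \\
\text{subject to} & \mathbf{b}^* - d^-_i \mathbf{e}_i = \mathbf{M}_b \mathbf{x}, \\
 & \mathbf{x}, d^-_i \ge 0, \\
 & \mathbf{M}_l \mathbf{x} = \mathbf{l}^*, \\
 & \mathbf{M}_c \mathbf{x} = \mathbf{c}^*
\end{array}
\qquad
\begin{array}{ll}
\text{find} & \mathbf{x}, d^+_i \\
\text{maximizing} & d^+_i \\
\text{subject to} & \mathbf{b}^* + d^+_i \mathbf{e}_i = \mathbf{M}_b \mathbf{x}, \\
 & \mathbf{x}, d^+_i \ge 0, \\
 & \mathbf{M}_l \mathbf{x} = \mathbf{l}^*, \\
 & \mathbf{M}_c \mathbf{x} = \mathbf{c}^*.
\end{array}
\]

Clearly, for a contextual system $\mathrm{CNT} > 0$ and $\mathrm{NCNT} = 0$, whereas for a noncontextual one, $\mathrm{CNT} = 0$ and $\mathrm{NCNT} \ge 0$.

(The development of the hierarchical measure presented in this paper has greatly benefited from discussions with and critical analysis by Janne V. Kujala; see Acknowledgments.)

For a given vector $\mathbf{l}^*$ (hence also the vector $\mathbf{c}^*$), we call the convex polytope

\[ \mathbb{K} = \{ \mathbf{b} \mid \exists\, \mathbf{h} \ge 0 : (\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_b)^\intercal \mathbf{h} = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b})^\intercal \} \tag{12} \]

the noncontextuality polytope.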
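The row-by-row construction of the incidence matrix can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the authors' code: the toy system (two contexts, two contents, every content answered in every context), the helper names `row` and `apply`, and the candidate vector `h` are all hypothetical. The sketch builds the rows for $\mathbf{l}$, $\mathbf{c}$ and $\mathbf{b}$ and computes $\mathbf{M}\mathbf{h}$ for an $\mathbf{h}$ that puts probability 1/2 on "all zeros" and 1/2 on "all ones".

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy system: contexts 1, 2 and contents 1, 2,
# with every content answered in every context.
variables = [(1, 1), (1, 2), (2, 1), (2, 2)]  # (context c, content q)

# Columns of M: all 0/1 assignments to the four variables.
columns = list(product((0, 1), repeat=len(variables)))

def row(selected):
    """Incidence row: 1 in the columns where every selected variable is 1."""
    idx = [variables.index(v) for v in selected]
    return [int(all(col[i] == 1 for i in idx)) for col in columns]

# Rows for l (leading 1, then each p^c_q), c (connections), b (bunches, s = 2).
M_l = [row([])] + [row([v]) for v in variables]
M_c = [row([(1, q), (2, q)]) for q in (1, 2)]
M_b = [row([(c, 1), (c, 2)]) for c in (1, 2)]
M = M_l + M_c + M_b

def apply(matrix, h):
    """Matrix-vector product M h with exact rational arithmetic."""
    return [sum(m * x for m, x in zip(r, h)) for r in matrix]

# Candidate h: mass 1/2 on (0,0,0,0) and 1/2 on (1,1,1,1).
half = Fraction(1, 2)
h = [half if col in ((0, 0, 0, 0), (1, 1, 1, 1)) else Fraction(0)
     for col in columns]

p_star = apply(M, h)  # reduced vectorial representation reproduced by h
```

Because a nonnegative `h` reproduces this `p_star`, the toy system (both bunches perfectly correlated, all marginals 1/2) is noncontextual by criterion (9).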
A system represented by the point $\mathbf{p}^* = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*)^\intercal$ is noncontextual if and only if $\mathbf{b}^* \in \mathbb{K}$. The measures CNT and NCNT are the $L_1$-distance from $\mathbf{b}^*$ to the surface of $\mathbb{K}$ when the system is, respectively, contextual or noncontextual.

In [4, 6] it is noted that

\[ \max(0,\ p^c_{q_1} + p^c_{q_2} - 1) \le p^c_{q_1,q_2} \le \min(p^c_{q_1}, p^c_{q_2}). \tag{13} \]

This observation is easily generalized as

\[ \max\left(0,\ \frac{p^c_{q_1,q_2} + p^c_{q_1,q_3} + p^c_{q_2,q_3} - 1}{2}\right) \le p^c_{q_1,q_2,q_3} \le \min(p^c_{q_1,q_2}, p^c_{q_1,q_3}, p^c_{q_2,q_3}), \tag{14} \]

\[ \max\left(0,\ \frac{p^c_{q_1,q_2,q_3} + p^c_{q_1,q_2,q_4} + p^c_{q_1,q_3,q_4} + p^c_{q_2,q_3,q_4} - 1}{3}\right) \le p^c_{q_1,q_2,q_3,q_4} \le \min(p^c_{q_1,q_2,q_3}, p^c_{q_1,q_2,q_4}, p^c_{q_1,q_3,q_4}, p^c_{q_2,q_3,q_4}), \tag{15} \]

and in complete generality,

\[ \max\left(0,\ \frac{1}{s-1}\left(\sum_{k=1}^{s} p^c_{\{q_1,\ldots,q_s\} \setminus \{q_k\}} - 1\right)\right) \le p^c_{q_1,\ldots,q_s} \le \min\left(p^c_{\{q_1,\ldots,q_s\} \setminus \{q_k\}} : k = 1, \ldots, s\right), \tag{16} \]

for $s = 2, \ldots, r$. That is, the elements of $\mathbf{b}^*$ are hierarchically bounded, with $\mathbf{l}^*$ providing the bounds for $\mathbf{b}^*_2$ and $\mathbf{b}^*_{s-1}$ determining the bounds for $\mathbf{b}^*_s$ if $2 < s \le r$.

These hierarchical restrictions suggest a hierarchical way of approaching the measurement of the (non)contextuality of a system $\mathcal{R}$ represented by $\mathbf{p}^*$. Consider the systems of equations

\[
\begin{array}{c}
(\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2)^\intercal \mathbf{h}_2 = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2)^\intercal \\
\vdots \\
(\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2, \ldots, \mathbf{M}_s)^\intercal \mathbf{h}_s = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2, \ldots, \mathbf{b}^*_s)^\intercal \\
\vdots \\
(\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2, \ldots, \mathbf{M}_r)^\intercal \mathbf{h}_r = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2, \ldots, \mathbf{b}^*_r)^\intercal = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*)^\intercal = \mathbf{M}\mathbf{h},
\end{array}
\tag{17}
\]

where $\mathbf{M}_s$ is the submatrix formed by the rows of $\mathbf{M}$ corresponding to the elements of $\mathbf{b}^*_s$. Clearly, for a contextual system there is a value $2 \le s^* \le r$ such that there is no solution $\mathbf{h}_s \ge 0$ for any $s \ge s^*$ while there is a solution for each $s < s^*$. For a noncontextual system, the solution to the system (9) implies the solution to all systems (17). Therefore, if $\mathcal{R}$ is contextual, we can further qualify its contextuality and say that it is contextual at level $s^*$.
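The hierarchical bounds can be checked numerically with a small helper. The sketch below is our own illustration, with a hypothetical function name `hierarchical_bounds`; it reads the lower bound of (16) as the sum of the $s$ lower-order probabilities minus 1, divided by $s - 1$, which is the form that follows from inclusion-exclusion and reduces to (13) for $s = 2$.

```python
from fractions import Fraction

def hierarchical_bounds(lower_order):
    """Return (lower, upper) bounds on p^c_{q1..qs}.

    lower_order holds the s probabilities of order s - 1, i.e. the values
    p^c for the sets {q1..qs} minus {qk}, k = 1..s. For s = 2 these are
    the two single-variable marginals.
    """
    s = len(lower_order)
    if s < 2:
        raise ValueError("need at least two lower-order probabilities")
    lower = max(Fraction(0), (sum(lower_order) - 1) / Fraction(s - 1))
    upper = min(lower_order)
    return lower, upper

half = Fraction(1, 2)
pair_bounds = hierarchical_bounds([half, half])          # s = 2
triple_bounds = hierarchical_bounds([half, half, half])  # s = 3
```

With both marginals equal to 1/2, any joint probability between 0 and 1/2 is admissible at level 2, so both a perfectly anticorrelated pair (joint probability 0) and a perfectly correlated pair (joint probability 1/2) sit on the boundary of the allowed interval.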
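Its degree of contextuality at level $s^*$ can be computed by solving the linear programming task

\[
\begin{array}{ll}
\text{find} & \mathbf{x}, \mathbf{d} \\
\text{minimizing} & \mathbf{1}^\intercal \mathbf{d} \\
\text{subject to} & -\mathbf{d} \le \mathbf{b}^*_{s^*} - \mathbf{M}_{s^*} \mathbf{x} \le \mathbf{d}, \\
 & \mathbf{x}, \mathbf{d} \ge 0, \\
 & \mathbf{M}_l \mathbf{x} = \mathbf{l}^*, \\
 & \mathbf{M}_c \mathbf{x} = \mathbf{c}^*, \\
 & \mathbf{M}_2 \mathbf{x} = \mathbf{b}^*_2, \\
 & \quad \vdots \\
 & \mathbf{M}_{s^*-1} \mathbf{x} = \mathbf{b}^*_{s^*-1},
\end{array}
\]

and computing, for any solution $\mathbf{x}^*$,

\[ \mathrm{CNT}_{s^*} = \| \mathbf{b}^*_{s^*} - \mathbf{M}_{s^*} \mathbf{x}^* \|_1. \tag{18} \]

Moreover, for each $s$ for which $(\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2, \ldots, \mathbf{M}_s)^\intercal \mathbf{h}_s = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2, \ldots, \mathbf{b}^*_s)^\intercal$ has a solution, we can compute the noncontextuality of $\mathcal{R}$ at level $s$ as

\[ \mathrm{NCNT}_s = \min_{i = 1, \ldots, K_s} \{ \min(d^{*-}_i, d^{*+}_i) \}, \tag{19} \]

where $K_s$ is the number of elements of $\mathbf{b}^*_s$, and $d^{*-}_i, d^{*+}_i$ are solutions to the linear programming tasks

\[
\begin{array}{ll}
\text{find} & \mathbf{x}, d^-_i \\
\text{maximizing} & d^-_i \\
\text{subject to} & \mathbf{b}^*_s - d^-_i \mathbf{e}_i = \mathbf{M}_s \mathbf{x}, \\
 & \mathbf{x}, d^-_i \ge 0, \\
 & \mathbf{M}_l \mathbf{x} = \mathbf{l}^*, \\
 & \mathbf{M}_c \mathbf{x} = \mathbf{c}^*, \\
 & \mathbf{M}_2 \mathbf{x} = \mathbf{b}^*_2, \\
 & \quad \vdots \\
 & \mathbf{M}_{s-1} \mathbf{x} = \mathbf{b}^*_{s-1}
\end{array}
\qquad
\begin{array}{ll}
\text{find} & \mathbf{x}, d^+_i \\
\text{maximizing} & d^+_i \\
\text{subject to} & \mathbf{b}^*_s + d^+_i \mathbf{e}_i = \mathbf{M}_s \mathbf{x}, \\
 & \mathbf{x}, d^+_i \ge 0, \\
 & \mathbf{M}_l \mathbf{x} = \mathbf{l}^*, \\
 & \mathbf{M}_c \mathbf{x} = \mathbf{c}^*, \\
 & \mathbf{M}_2 \mathbf{x} = \mathbf{b}^*_2, \\
 & \quad \vdots \\
 & \mathbf{M}_{s-1} \mathbf{x} = \mathbf{b}^*_{s-1}.
\end{array}
\]

In this way, we construct the hierarchical measure of (non)contextuality, which characterizes the degree of (non)contextuality of a system $\mathcal{R}$ by a vector of size $s^* - 1$ if the system is contextual, or of size $r - 1$ if the system is noncontextual. For a contextual system, $\mathrm{CNT}_{s^*}$ gives the $L_1$-distance from $\mathbf{b}^*_{s^*}$ to the surface of the polytope

\[
\mathbb{K}_{s^*} =
\begin{cases}
\{ \mathbf{b}_2 \mid \exists\, \mathbf{h}_2 \ge 0 : (\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2)^\intercal \mathbf{h}_2 = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}_2)^\intercal \} & \text{if } s^* = 2, \\
\{ \mathbf{b}_{s^*} \mid \exists\, \mathbf{h}_{s^*} \ge 0 : (\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2, \ldots, \mathbf{M}_{s^*})^\intercal \mathbf{h}_{s^*} = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2, \ldots, \mathbf{b}^*_{s^*-1}, \mathbf{b}_{s^*})^\intercal \} & \text{if } 2 < s^* \le r.
\end{cases}
\tag{20}
\]

And, at each level $2 \le s < s^*$ ($2 \le s \le r$ for a noncontextual system), $\mathrm{NCNT}_s$ is the $L_1$-distance from $\mathbf{b}^*_s$ to the surface of the polytope

\[ \mathbb{K}_s = \{ \mathbf{b}_s \mid \exists\, \mathbf{h}_s \ge 0 : (\mathbf{M}_l, \mathbf{M}_c, \mathbf{M}_2, \ldots, \mathbf{M}_s)^\intercal \mathbf{h}_s = (\mathbf{l}^*, \mathbf{c}^*, \mathbf{b}^*_2, \ldots, \mathbf{b}^*_{s-1}, \mathbf{b}_s)^\intercal \}. \tag{21} \]

Analogously to CNT and NCNT, we have that for a system contextual at level $s^*$, $\mathrm{CNT}_{s^*} > 0$ and $\mathrm{NCNT}_{s^*} = 0$, and in addition, for $s < s^*$, $\mathrm{CNT}_s = 0$.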
We will now apply the measures just constructed to some drawings known as impossible figures. The general idea underlying their contextuality analysis is that the epistemic random variables representing an impossible figure should always be compared to those representing a realizable (i.e., "normal") figure, and the degree of contextuality be derived from the difference between the two.

We begin with the well-known Penrose triangle, depicted in Figure 1a. Observe that one can see precisely two faces of each of the three bars forming the figure. So each corner of the triangle is formed by four of these faces, one of which ends in the inner fold of the corner (invisible in Figure 1a; in other figures, e.g. in Figure 2a, if the inner fold is visible, then two faces, of two different bars, end in it). Looking at the two faces of a given bar near one of the corners, either one of the faces ends in the inner fold while the other does not, or neither of them does. These two cases are encoded by the two binary labels, 0 and 1. In Figure 1b this is shown by interrupted and solid lines of the cuts made in each bar at the two corners it connects: one endpoint label corresponds to the case when both cut lines are solid, and the other to the case when one of them is interrupted.

Figure 1: (a) The Penrose triangle. (b) The Penrose triangle with superimposed cuts, shown with the corresponding labels (0 or 1). (c) Look through the right bar of the triangle. (d) Right bar removed by its cuts.

We do not claim that this encoding describes how the figure or its elements are perceived. The latter is a question for an empirical investigation, such as the one presented in [7], where contextuality analysis was applied to perception of an ambiguous figure (Schröder's stair). We are merely selecting a possible description of the figure as a fixed physical object. Other mathematical descriptions of the Penrose triangle and other impossible figures (some of them very different from those considered in this paper) can be found in [8–10]. Perception may make use of such descriptions, but it is most likely a complex process with descriptions changing in time.

We can construct a system representation of the Penrose triangle in two ways. The first one consists in looking at each of the three bars separately, and arbitrarily picking one of its two ends as the first one in an ordered pair. We see in Figure 1 that whenever the end we pick is coded 0, the other end of the bar is coded 1, and vice versa.
Thus, we get two deterministic descriptions,

\[
\begin{array}{c|cc} & q & q' \\ \hline c & 0 & 1 \end{array}
\quad \text{and} \quad
\begin{array}{c|cc} & q & q' \\ \hline c & 1 & 0 \end{array},
\tag{22}
\]

where $c$ designates a specific bar (here, left, right, or bottom one), and $(q, q')$ is an ordered pair of its endpoints. Since these deterministic descriptions are equally plausible (i.e., there is no preferred ordering of the endpoints of an isolated bar), we can assign the epistemic probability 1/2 to each of them, obtaining thereby two perfectly anticorrelated uniformly distributed epistemic random variables,

\[
\begin{array}{c|cc|c}
 & R^c_{q'} = 0 & R^c_{q'} = 1 & \\ \hline
R^c_q = 0 & 0 & 1/2 & 1/2 \\
R^c_q = 1 & 1/2 & 0 & 1/2 \\ \hline
 & 1/2 & 1/2 &
\end{array}
\tag{23}
\]

The stand-alone numbers in the table are joint and marginal probabilities.

Figure 2: Realizable triangles with superimposed cuts, shown with the corresponding labels (0 or 1). (a) Triangle at 30 deg oblique projection. (b) Triangle at 90 deg oblique projection.

We now have to add for comparison epistemic random variables describing a bar of a realizable (i.e., "normal") triangle, such as the ones shown in Figure 2. We define a realizable figure as one that can be viewed as an oblique projection of a physical object of relatively small thickness (so that perspective can be ignored). Each of the contexts representing a realizable bar has two identically labelled ends, and depending on the oblique projection angle, each bar (left, right, or bottom) can be labeled (1, 1) or (0, 0). Using the same uniform epistemic mixing as before, we obtain two perfectly correlated uniformly distributed epistemic random variables,

\[
\begin{array}{c|cc|c}
 & R^c_{q'} = 0 & R^c_{q'} = 1 & \\ \hline
R^c_q = 0 & 1/2 & 0 & 1/2 \\
R^c_q = 1 & 0 & 1/2 & 1/2 \\ \hline
 & 1/2 & 1/2 &
\end{array}
\tag{24}
\]

Repeating this reasoning for each of the three bars, we get the context-content matrix

\[
\begin{array}{cccccc|c}
\multicolumn{7}{l}{\text{Option 1}} \\ \hline
R^1_1 & R^1_2 & \cdot & \cdot & \cdot & \cdot & c_1 \\
R^2_1 & R^2_2 & \cdot & \cdot & \cdot & \cdot & c_2 \\
\cdot & \cdot & R^3_3 & R^3_4 & \cdot & \cdot & c_3 \\
\cdot & \cdot & R^4_3 & R^4_4 & \cdot & \cdot & c_4 \\
\cdot & \cdot & \cdot & \cdot & R^5_5 & R^5_6 & c_5 \\
\cdot & \cdot & \cdot & \cdot & R^6_5 & R^6_6 & c_6 \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & \mathcal{P}
\end{array}
\tag{25}
\]

The contexts and contents in $\mathcal{P}$ have been numbered so that the odd contexts represent the bars of the Penrose triangle (with the perfectly anticorrelated variables), and the even ones represent the corresponding realizable bars (with perfectly correlated variables). This ordering highlights the fact that the system is composed of three disjoint $2 \times 2$ subsystems (formally, cyclic systems of rank 2). We call this way of representing the impossible figure Option 1.

Another way (Option 2) to approach the construction of a system of epistemic random variables describing the Penrose triangle is to look at the sequence of the labels for the cuts in the entire triangle. We arbitrarily choose one of the six cuts in the figure as a starting point, and then proceed to the second cut in the same bar, then to the nearest cut in the adjacent bar, etc. This produces one of the two patterns shown below, representing three starting points each:

\[
\begin{array}{c|cccccc} & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ \hline c_1 & 1 & 0 & 1 & 0 & 1 & 0 \end{array}
\quad \text{or} \quad
\begin{array}{c|cccccc} & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ \hline c_1 & 0 & 1 & 0 & 1 & 0 & 1 \end{array}.
\tag{26}
\]

By uniformly mixing these patterns, we obtain a vector of six jointly distributed epistemic random variables that represent the Penrose triangle in a single context:

\[
\left(R^1_1, R^1_2, R^1_3, R^1_4, R^1_5, R^1_6\right) \overset{\mathrm{dist}}{=}
\begin{array}{ccc}
(1, 0, 1, 0, 1, 0) & (0, 1, 0, 1, 0, 1) & \text{any other pattern} \\ \hline
1/2 & 1/2 & 0
\end{array}.
\tag{27}
\]

The numbers in the second row are probabilities of the values in the first row, and $\overset{\mathrm{dist}}{=}$ stands for "distributed as."

To complete Option 2, we have to add for comparison the epistemic random variables describing a realizable triangle in the same fashion. We select the triangle depictions that may be obtained by oblique projection at arbitrary angles excluding multiples of 60 deg. (We exclude the multiples of 60 deg because at these angles one of the bars is drawn with a single visible side instead of two.) Except for rotations, this process produces two distinct figures with respect to the patterns of 0 and 1 that describe them in a similar way to the Penrose triangle above. The three left patterns in (28) below are the possible patterns that describe the triangle in Figure 2a, and the three patterns on the right describe the one in Figure 2b.

\[
\begin{array}{c|cccccc}
 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ \hline
c_2 & 0 & 0 & 1 & 1 & 0 & 0 \\
c_2 & 1 & 1 & 0 & 0 & 0 & 0 \\
c_2 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}
\qquad
\begin{array}{c|cccccc}
 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ \hline
c_2 & 1 & 1 & 0 & 0 & 1 & 1 \\
c_2 & 0 & 0 & 1 & 1 & 1 & 1 \\
c_2 & 1 & 1 & 1 & 1 & 0 & 0
\end{array}
\tag{28}
\]

By taking the uniform mixture of these deterministic patterns, we produce a joint distribution that represents a realizable triangle as the second context in the context-content matrix $\mathcal{P}$ below:

\[
\begin{array}{cccccc|c}
\multicolumn{7}{l}{\text{Option 2}} \\ \hline
R^1_1 & R^1_2 & R^1_3 & R^1_4 & R^1_5 & R^1_6 & c_1 = \text{Penrose triangle} \\
R^2_1 & R^2_2 & R^2_3 & R^2_4 & R^2_5 & R^2_6 & c_2 = \text{Regular triangle} \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & \mathcal{P}
\end{array}
\tag{29}
\]

Whichever of the two options we choose, the pairs of random variables that represent two cuts on the same bar are perfectly negatively correlated in the case of the Penrose triangle, and are perfectly positively correlated in a realizable triangle. Because of this the systems in both Options 1 and 2 are contextual. In both cases the contextuality is achieved at level 2. Therefore their hierarchical representation is a one-component vector. The values are $\mathrm{CNT}_2 = 1.5$ in Option 1 and $\mathrm{CNT}_2 = 4.5$ in Option 2.

The value of $\mathrm{CNT}_2$ for Option 1 is clearly the sum of the contextuality values of the three disjoint subsystems.
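The Option 2 comparison of the two contexts can be explored with a short script. This is a sketch under stated assumptions rather than the authors' procedure: the pattern lists are our reading of the alternating Penrose patterns and of the six realizable patterns, and the function names are hypothetical. It also uses a shortcut instead of the linear program: when two contexts share all contents and all marginals equal 1/2, maximal connections force the two copies of each content to coincide, so the level-2 $L_1$ distance reduces to the sum, over pairs of contents, of the absolute difference between the two contexts' pairwise probabilities.

```python
from fractions import Fraction
from itertools import combinations

def marginals(patterns):
    """Pr(R_i = 1) for each position under a uniform mixture of patterns."""
    n = len(patterns[0])
    return [Fraction(sum(p[i] for p in patterns), len(patterns))
            for i in range(n)]

def pair_probabilities(patterns):
    """Pr(R_i = R_j = 1) for all i < j under a uniform mixture of patterns."""
    n = len(patterns[0])
    weight = Fraction(1, len(patterns))
    return {
        (i, j): sum((weight for p in patterns if p[i] == 1 and p[j] == 1),
                    Fraction(0))
        for i, j in combinations(range(n), 2)
    }

# Assumed patterns (our reading of the Penrose and realizable triangles).
penrose = [(1, 0, 1, 0, 1, 0), (0, 1, 0, 1, 0, 1)]
realizable = [(0, 0, 1, 1, 0, 0), (1, 1, 0, 0, 0, 0), (0, 0, 0, 0, 1, 1),
              (1, 1, 0, 0, 1, 1), (0, 0, 1, 1, 1, 1), (1, 1, 1, 1, 0, 0)]

b_penrose = pair_probabilities(penrose)
b_realizable = pair_probabilities(realizable)

# Level-2 L1 gap between the two contexts (shortcut described above).
gap = sum(abs(b_penrose[k] - b_realizable[k]) for k in b_penrose)
```

Under these assumptions the gap evaluates to 9/2, in line with the level-2 value quoted for the Penrose triangle under Option 2. The three same-bar pairs contribute 1/2 each, and each of the three pairs of distinct bars contributes 1.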
Generally, for a system $\mathcal{R}$ which is composed of $N$ disjoint systems $\mathcal{R}_i$, the noncontextuality polytope is

\[ \mathbb{K} = \prod_{i=1}^{N} \mathbb{K}_i, \tag{30} \]

the Cartesian product of the corresponding noncontextuality polytopes $\mathbb{K}_i$ of each system. Consequently,

\[ \mathrm{CNT}_{s^*}(\mathcal{R}) = \sum_{i=1}^{N} \mathrm{CNT}_{s^*}(\mathcal{R}_i), \tag{31} \]

and

\[ \mathrm{NCNT}_s(\mathcal{R}) = \min_{i = 1, \ldots, N} \{ \mathrm{NCNT}_s(\mathcal{R}_i) \}, \tag{32} \]

for $2 \le s \le s^*$.

(At this introductory stage we have not discussed the issue of normalization of the (non)contextuality values, because of which we should avoid comparing the values computed for systems of different format, such as our Option 1 and Option 2 systems.)

Figure 3: Alternative impossible triangle with cross-sections and corresponding labels (0 or 1).
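As a check on the additivity rule (31), the level-2 value of a single Option 1 subsystem can be worked out by hand. The derivation below is our own sketch, not taken from the source; $t$ is a hypothetical symbol for the common value that both bunch rows of $\mathbf{M}_2 \mathbf{x}$ must take once the connection constraints are imposed.

```latex
% One Option 1 subsystem: Penrose bar (context 1) and realizable bar (context 2).
\begin{align*}
\mathbf{l}^* &= (1, \tfrac12, \tfrac12, \tfrac12, \tfrac12)^\intercal, \qquad
\mathbf{c}^* = (\tfrac12, \tfrac12)^\intercal, \qquad
\mathbf{b}^*_2 = (p^1_{q,q'},\, p^2_{q,q'})^\intercal = (0, \tfrac12)^\intercal. \\
% M_c x = c^* forces the two copies of each content to coincide, so both
% bunch rows of M_2 x share one value t, with 0 <= t <= 1/2.
\| \mathbf{b}^*_2 - \mathbf{M}_2 \mathbf{x} \|_1 &= |0 - t| + |\tfrac12 - t| = \tfrac12
\quad \text{for every } t \in [0, \tfrac12], \\
\mathrm{CNT}_2(\mathcal{R}_i) &= \tfrac12, \qquad
\mathrm{CNT}_2(\mathcal{R}) = \sum_{i=1}^{3} \mathrm{CNT}_2(\mathcal{R}_i) = \tfrac32.
\end{align*}
```

The same count gives $4 \cdot \tfrac12 = 2$ for four such subsystems and $2 \cdot \tfrac12 = 1$ for two, matching the Option 1 values reported in Appendix A for the Penrose-like square and the impossible circle.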

To further explore how our Option 1 and Option 2 representations capture the impossibility of a figure by the degree of contextuality of the resulting systems, let us consider an alternative impossible triangle, depicted in Figure 3. Unlike the Penrose triangle, this one has only one bar (the left one) with different labels at its ends. For the bottom bar both ends carry the same label, and so do the two ends of the right bar. Intuitively, this triangle seems "less impossible" than the Penrose one. Following the same Option 1 procedure as before, with the same variables describing a realizable triangle, we obtain a system of the same format as the Option 1 system $\mathcal{P}$ in (25); however, in two of the disjoint $2 \times 2$ subsystems, in the odd-numbered contexts representing the bottom and right bars, the two random variables become deterministic, making thereby these subsystems noncontextual. (In CbD, any deterministic variable in a system can be deleted from the system without affecting its (non)contextuality, and the same is true for any variable that is alone in its column (connection). Therefore, two of the subsystems, corresponding to the lower and the right bars, can be removed without affecting the analysis.)

For Option 2, we obtain a system of the same format as the Option 2 system $\mathcal{P}$ in (29), but the distribution of the bunch corresponding to $c_1$ (impossible triangle) is given by the uniform mixture of the six deterministic patterns (33) read off the cuts of Figure 3. The deterministic patterns whose mixture produces the random variables in context $c_2$ are the same as in (28). The alternative impossible triangle is contextual under both representations, with $\mathrm{CNT}_2 = 0.5$ under Option 1 and $\mathrm{CNT}_2 = 1.5$ under Option 2, both values being lower than the corresponding ones for the Penrose triangle.

Finally, we need to check that the procedures above yield no contextuality if an impossible figure is replaced with a realizable one. We use the realizable triangle in Figure 2a. Option 1 produces three deterministic pairs of random variables for all three odd-numbered contexts corresponding to the pictured triangle. For Option 2, the realizable triangle is represented by a context with the joint distribution given by the uniform mixture of the following patterns:

\[
\begin{array}{c|cccccc}
 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 \\ \hline
c_1 & 0 & 0 & 1 & 1 & 0 & 0 \\
c_1 & 1 & 1 & 0 & 0 & 0 & 0 \\
c_1 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}
\tag{34}
\]

The realizable triangle is noncontextual under both options, and all its hierarchical $\mathrm{NCNT}_s$ measures are zero (indicating that the systems for the realizable triangle lie on the surface of the corresponding noncontextuality polytopes).

Figure 4: Square figures, shown with cuts and corresponding labels (0 or 1). (a) Impossible square. (b) Alternative impossible square. (c) Realizable "normal" square.

Contextuality analysis similar to that reported above for the impossible triangles can be extended to other impossible figures. We have conducted this analysis for the impossible square (or rectangle) and the impossible circle (also known as impossible loop). For the impossible square, in addition to the figure constructed with the same type of corners as in the Penrose triangle (Figure 4a), we have considered an alternative impossible square (Figure 4b). Figure 4c shows a realizable ("normal") square. Since the procedures and reasoning here are in all essential details the same as for the impossible triangles, we have relegated the details to Appendix A.

Figure 5 depicts an impossible circle (5a) and a realizable one (5b). To characterize the circles, we may look at them as being composed of two handles joined by curved bars, with the cut lines drawn between the handles and the joining bars. In this way, we obtain systems analogous to those we found for the two ways we are using to represent the figures. Again, we relegate the details of the analysis to Appendix A.

We summarize the results of our analysis in Table 1.
As we see, all the impossible figures we explored, under both Option 1 and Option 2, are contextual at level 2.

Table 1: Measures of contextuality for the impossible figures

Impossible figure        Option 1 CNT    Option 2 CNT
Triangle in Figure 1     1.5             4.5
Triangle in Figure 3     0.5             1.5
Square in Figure 4a      2               8
Square in Figure 4b      1               2
Circle in Figure 5a      1               2

Figure 5: Circular figures, shown with cuts and corresponding labels (0 or 1). (a) Impossible circle. (b) Realizable circle.

We approach the Ascending and Descending lithograph by M. C. Escher (Figure 6) by considering the four stair flights as four contents, $q_1, q_2, q_3, q_4$. As with the impossible figures, we have two ways of constructing the system of epistemic random variables. In Option 1, the system is represented by the following context-content matrix:

\[
\begin{array}{cccc|c}
\multicolumn{5}{l}{\text{Option 1}} \\ \hline
R^1_1 & R^1_2 & \cdot & \cdot & c_1 \\
\cdot & R^2_2 & R^2_3 & \cdot & c_2 \\
\cdot & \cdot & R^3_3 & R^3_4 & c_3 \\
R^4_1 & \cdot & \cdot & R^4_4 & c_4 \\
R^5_1 & R^5_2 & R^5_3 & R^5_4 & c_5 \\ \hline
q_1 & q_2 & q_3 & q_4 & \mathcal{E}
\end{array}
\tag{35}
\]

If we view a stair flight as ascending, then $R^c_q = 1$, otherwise $R^c_q = 0$. Every pair of consecutive stair flights in the picture is seen as either both ascending or both descending. By uniformly mixing

\[
\begin{array}{c|cc} & q & q' \\ \hline c & 1 & 1 \end{array}
\quad \text{and} \quad
\begin{array}{c|cc} & q & q' \\ \hline c & 0 & 0 \end{array},
\tag{36}
\]

we form the first four contexts in the system. Thus, in contexts $c_1$ to $c_4$, the two variables are uniformly distributed and perfectly correlated,

\[ \langle R^k_k \rangle = \langle R^k_{k \oplus 1} \rangle = 0.5, \qquad \Pr\left(R^k_k = R^k_{k \oplus 1}\right) = 1, \tag{37} \]

for $k = 1, \ldots, 4$ ($\oplus$ is cyclic addition, with $4 \oplus 1 = 1$). The fifth context in $\mathcal{E}$ includes the possible patterns of ascent and descent for the entire staircase. The strangeness (or impossibility) of the situation in the lithograph is that four stair flights forming a closed loop cannot ascend or descend indefinitely: the number of ascending stair flights should be precisely two to counterbalance the descending ones. In other words, the physically possible values of the bunch $\left(R^5_1, R^5_2, R^5_3, R^5_4\right)$ are

\[
\begin{array}{c|cccc}
 & q_1 & q_2 & q_3 & q_4 \\ \hline
c_5 & 1 & 1 & 0 & 0 \\
c_5 & 1 & 0 & 1 & 0 \\
c_5 & 1 & 0 & 0 & 1 \\
c_5 & 0 & 1 & 1 & 0 \\
c_5 & 0 & 1 & 0 & 1 \\
c_5 & 0 & 0 & 1 & 1
\end{array}
\tag{38}
\]

The epistemic distribution of the fifth bunch therefore is given by the uniform mixture of these deterministic patterns.

The second way in which we represent the Ascending-Descending staircase is by looking at the four stair flights together. This forms one ("impossible") context where either all stair flights are described as ascending ($R^1_q = 1$, for $q = 1, \ldots, 4$) or all of them are descending ($R^1_q = 0$, for $q = 1, \ldots, 4$). A second context, describing the physically realizable patterns, is formed in the same way as the fifth context above. The system we obtain in this way is

\[
\begin{array}{cccc|c}
\multicolumn{5}{l}{\text{Option 2}} \\ \hline
R^1_1 & R^1_2 & R^1_3 & R^1_4 & c_1 \\
R^2_1 & R^2_2 & R^2_3 & R^2_4 & c_2 \\ \hline
q_1 & q_2 & q_3 & q_4 & \mathcal{E}
\end{array}
\tag{39}
\]

For both these options, the resulting systems are contextual at the level $s^* = 2$, and the values are $\mathrm{CNT}_2 = 4/3$ for Option 1 and $\mathrm{CNT}_2 = 2$ for Option 2.

We have introduced a hierarchical measure of (non)contextuality of systems of random variables. It follows the same logic and is calculated similarly to the measures of contextuality CNT and noncontextuality NCNT. It is clear that in cyclic systems the hierarchical measure equals CNT when the system is contextual, and it equals NCNT when the system is noncontextual. It still remains to investigate whether some of the properties of CNT-NCNT described in [4] for cyclic systems generalize to some classes of noncyclic systems.

The analysis of the impossible figures shows that the intuitive degree of the impossibility or strangeness of those figures can be captured through the contextuality of the systems of epistemic random variables chosen to describe them. When the endpoint codes of a bar in a Penrose-like figure are anticorrelated, the adjacent bars appear to twist toward different directions. The more such "twisted" situations we see in a figure, the stranger it looks, and the greater its contextuality (if achieved at the same level).

All contextual systems constructed in this paper happen to be contextual at level 2.
This cannot be otherwise for the Option 1 systems, which contain only two random variables in each bunch. However, it is an empirical rather than mathematically deducible fact for the other systems. More work is needed to find out the scope of the impossible figures whose reasonable descriptions have the same property.

Acknowledgments. The authors are grateful to Janne V. Kujala, who critically analyzed and corrected a mistake in the earlier version of the hierarchical contextuality measure presented in this paper.

A Appendix: Impossible squares and circles

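To create the epistemic random variables depicting a realizable ("normal") square, we proceed analogously to the realizable triangles: we systematically consider all depictions that may be obtained by oblique projection at arbitrary angles excluding the multiples of 90 deg. In this case, ignoring rotations, there is only one pattern of endpoint labels produced by all realizable squares. Varying the starting point, we get the following sequences whose uniform mixture yields the epistemic random variables for realizable squares:

\[
\begin{array}{c|cccccccc}
 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & q_7 & q_8 \\ \hline
c_2 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
c_2 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 \\
c_2 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
c_2 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}
\tag{A.1}
\]

All the results for impossible squares are obtained as a straightforward expansion of those for impossible triangles. Following Option 1, we obtain the systems describing the rectangles in Fig. 4 in the format

\[
\begin{array}{cccccccc|c}
\multicolumn{9}{l}{\text{Option 1}} \\ \hline
R^1_1 & R^1_2 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & c_1 \\
R^2_1 & R^2_2 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & c_2 \\
\cdot & \cdot & R^3_3 & R^3_4 & \cdot & \cdot & \cdot & \cdot & c_3 \\
\cdot & \cdot & R^4_3 & R^4_4 & \cdot & \cdot & \cdot & \cdot & c_4 \\
\cdot & \cdot & \cdot & \cdot & R^5_5 & R^5_6 & \cdot & \cdot & c_5 \\
\cdot & \cdot & \cdot & \cdot & R^6_5 & R^6_6 & \cdot & \cdot & c_6 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & R^7_7 & R^7_8 & c_7 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & R^8_7 & R^8_8 & c_8 \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & q_7 & q_8 & \mathcal{P}
\end{array}
\tag{A.2}
\]

The systems for Option 2 have the format

\[
\begin{array}{cccccccc|c}
\multicolumn{9}{l}{\text{Option 2}} \\ \hline
R^1_1 & R^1_2 & R^1_3 & R^1_4 & R^1_5 & R^1_6 & R^1_7 & R^1_8 & c_1 \\
R^2_1 & R^2_2 & R^2_3 & R^2_4 & R^2_5 & R^2_6 & R^2_7 & R^2_8 & c_2 \\ \hline
q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & q_7 & q_8 & \mathcal{P}
\end{array}
\tag{A.3}
\]

For the first option, just as with the triangles, all random variables representing a realizable bar are perfectly correlated. The random variables representing bars of the Penrose-like square are negatively correlated, whereas for the alternative impossible square, two of the four odd-numbered bunches become deterministic. For the realizable square in Figure 4c, all four odd-numbered bunches are deterministic, making the system trivially noncontextual.

For Option 2, the patterns that generate the joint distribution for the context describing the Penrose-like impossible square are

\[
\begin{array}{c|cccccccc}
 & q_1 & q_2 & q_3 & q_4 & q_5 & q_6 & q_7 & q_8 \\ \hline
c_1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
c_1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1
\end{array}
\tag{A.4}
\]

The patterns that generate the joint distribution for the context describing the alternative impossible square are the four deterministic patterns (A.5) read off the cuts of Figure 4b.

The measures of contextuality for the impossible square (4a) are $\mathrm{CNT}_2 = 2$ and $\mathrm{CNT}_2 = 8$, under Options 1 and 2, respectively.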
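The Option 2 value for the Penrose-like square can be cross-checked by counting pairs of cuts. The arithmetic below is our own sketch and rests on two assumptions: the Penrose-like square is described by the two alternating label patterns, and the realizable square by the four patterns in which two adjacent bars are labeled 1 and the other two are labeled 0. With all marginals equal to 1/2 in both contexts, the level-2 $L_1$ distance is taken to be the sum over pairs of the absolute difference between the two contexts' probabilities that both cuts are labeled 1.

```latex
\begin{align*}
\text{same bar (4 pairs):} &\quad 4 \cdot \left|0 - \tfrac12\right| = 2, \\
\text{adjacent bars (16 pairs):} &\quad 8 \cdot \left|\tfrac12 - \tfrac14\right| + 8 \cdot \left|0 - \tfrac14\right| = 4, \\
\text{opposite bars (8 pairs):} &\quad 4 \cdot \left|\tfrac12 - 0\right| + 4 \cdot \left|0 - 0\right| = 2, \\
\text{total:} &\quad 2 + 4 + 2 = 8.
\end{align*}
```

In each line the first number inside the absolute value is the probability under the Penrose-like context and the second is the probability under the realizable context.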
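The corresponding measures for the alternative impossible square (4b) are $\mathrm{CNT}_2 = 1$ and $\mathrm{CNT}_2 = 2$. The realizable square (4c) is noncontextual and all its noncontextuality measures are zero.

For the impossible circle, the system for Option 1 has the format

\[
\begin{array}{cccc|c}
\multicolumn{5}{l}{\text{Option 1}} \\ \hline
R^1_1 & R^1_2 & \cdot & \cdot & c_1 \\
R^2_1 & R^2_2 & \cdot & \cdot & c_2 \\
\cdot & \cdot & R^3_3 & R^3_4 & c_3 \\
\cdot & \cdot & R^4_3 & R^4_4 & c_4 \\ \hline
q_1 & q_2 & q_3 & q_4 & \mathcal{P}
\end{array}
\tag{A.6}
\]

and the system for Option 2 has the format

\[
\begin{array}{cccc|c}
\multicolumn{5}{l}{\text{Option 2}} \\ \hline
R^1_1 & R^1_2 & R^1_3 & R^1_4 & c_1 \\
R^2_1 & R^2_2 & R^2_3 & R^2_4 & c_2 \\ \hline
q_1 & q_2 & q_3 & q_4 & \mathcal{P}
\end{array}
\tag{A.7}
\]

The distributions of the random variables in each context for Option 1 follow the same pattern as in the corresponding Option 1 systems for the triangles and the squares. The two variables in a bunch corresponding to the impossible loop are anticorrelated and the ones corresponding to the realizable one are perfectly correlated. For Option 2,

\[
\begin{array}{c|cccc}
 & q_1 & q_2 & q_3 & q_4 \\ \hline
c_1 & 1 & 0 & 1 & 0 \\
c_1 & 0 & 1 & 0 & 1
\end{array}
\tag{A.8}
\]

gives the patterns for the impossible circle. The patterns that generate the joint distribution for the bunch representing the realizable circle are

\[
\begin{array}{c|cccc}
 & q_1 & q_2 & q_3 & q_4 \\ \hline
c_2 & 1 & 1 & 0 & 0 \\
c_2 & 0 & 0 & 1 & 1
\end{array}
\tag{A.9}
\]

The measures of contextuality for the impossible loop are $\mathrm{CNT}_2 = 1$ and $\mathrm{CNT}_2 = 2$, under Options 1 and 2, respectively. The realizable figure is noncontextual and all its noncontextuality measures are zero.

B Appendix: Measures of contextuality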

For completeness and comparison purposes, we present the values of four contextuality measures of the CNT family and of the contextual fraction (CNTF) for each of the systems used to represent the impossible figures. One of the CNT measures is the hierarchical measure of (non)contextuality described in this paper. The contextual fraction was introduced in [11]. The other measures and the linear programming tasks that may be used to compute them are presented in [3].

Table B.1 presents the contextuality measures for the systems representing the triangles. Table B.2 presents the contextuality measures for the square figures. Table B.3 presents the contextuality measures for the systems representing the two circles. Table B.4 presents the contextuality measures of the systems representing Escher's Ascending and Descending staircase.

Table B.1: Contextuality measures of systems representing the alternative impossible triangle and a realizable triangle.

Figure             System     CNT   CNT   CNT   CNT   CNTF
Penrose triangle   Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …
Alt. Imp.          Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …
Realizable         Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …

Table B.2: Contextuality measures of systems representing the square figures.

Figure             System     CNT   CNT   CNT   CNT   CNTF
Penrose like       Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …
Alt. Imp.          Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …
Realizable         Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …

Table B.3: Contextuality measures of systems representing the circle figures.

Figure             System     CNT   CNT   CNT   CNT   CNTF
Impossible         Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …
Realizable         Option 1   …     …     …     …     …
                   Option 2   …     …     …     …     …

Table B.4: Contextuality measures of systems representing the Ascending and Descending staircase.

Representation     CNT   CNT   CNT   CNT   CNTF
Option 1           …     …     …     …     …
Option 2           …     …     …     …     …

References

[1] Dzhafarov, E.N. The Contextuality-by-Default View of the Sheaf-Theoretic Approach to Contextuality. arXiv 2019, arXiv:1906.02718.
[2] Penrose, L.S.; Penrose, R. Impossible Objects: A Special Type of Visual Illusion. Br. J. Psychol. 1958, 49, 31–33. DOI:10.1111/j.2044-8295.1958.tb00634.x
[3] Kujala, J.V.; Dzhafarov, E.N. Measures of Contextuality and Noncontextuality. Phil. Trans. Roy. Soc. A 2019, 377, 20190149. DOI:10.1098/rsta.2019.0149
[4] Dzhafarov, E.N.; Kujala, J.V.; Cervantes, V.H. Contextuality and Noncontextuality Measures and Generalized Bell Inequalities for Cyclic Systems. Phys. Rev. A 2020, 101, 042119. DOI:10.1103/PhysRevA.101.042119
[5] Dzhafarov, E.N.; Cervantes, V.H.; Kujala, J.V. Contextuality in Canonical Systems of Random Variables. Phil. Trans. Roy. Soc. A 2017, 375, 20160389. DOI:10.1098/rsta.2016.0389
[6] Kujala, J.V.; Dzhafarov, E.N. Proof of a Conjecture on Contextuality in Cyclic Systems with Binary Variables. Found. Phys. 2016, 46, 282–299. DOI:10.1007/s10701-015-9964-8
[7] Asano, M.; Hashimoto, T.; Khrennikov, A.; Ohya, M.; Tanaka, T. Violation of Contextual Generalization of the Leggett-Garg Inequality for Recognition of Ambiguous Figures. Phys. Scr. 2014, T163, 014006. DOI:10.1088/0031-8949/2014/T163/014006
[8] Cowan, T.M. Organizing the Properties of Impossible Figures. Perception 1977, 6, 41–45. DOI:10.1068/p060041
[9] Penrose, R. On the Cohomology of Impossible Figures. Leonardo 1992, 25, 245–247.
[10] Sugihara, K. Classification of Impossible Objects. Perception 1982, 11, 65–74. DOI:10.1068/p110065
[11] Abramsky, S.; Barbosa, R.S.; Mansfield, S. The Contextual Fraction as a Measure of Contextuality. Phys. Rev. Lett. 2017.