Crossing Minimization in Storyline Visualization

Abstract

A storyline visualization is a layout that represents the temporal dynamics of social interactions along time by the convergence of chronological lines. Among the criteria oriented at improving aesthetics and legibility of a representation of this type, a small number of line crossings is the hardest to achieve. We model the crossing minimization in the storyline visualization problem as a multi-layer crossing minimization problem with tree constraints. Our algorithm can compute a layout with the minimum number of crossings of the chronological lines. Computational results demonstrate that it can solve instances with more than 100 interactions and with more than 100 chronological lines to optimality.


Martin Gronemann, Michael Jünger, Frauke Liers, Francesco Mambelli
Department of Computer Science, University of Cologne, {gronemann,mjuenger,mambelli}@informatik.uni-koeln.de
Department of Mathematics, University of Erlangen-Nürnberg


arXiv preprint [cs.DS]

Visualizing time-varying relationships between entities using converging and diverging curves on a timeline has received a considerable amount of interest recently. The ability to display interactions among entities, while at the same time being able to put these in a chronological context, has found applications beyond its initial purpose which coined its name. Munroe [26] introduced the storyline visualization as hand-drawn illustrations in xkcd’s “Movie Narrative Charts”, where lines represent the characters of various popular movies and the scenes are ordered chronologically and represented by bundling the lines of the corresponding characters. This concept has been used to visualize various spatiotemporal data, like communities in time-varying graphs [25,33], software projects [28], topic analysis [9], etc.

However, hand-crafted or semi-automated methods are limited in their applicability in a world of ever-growing datasets. In order to obtain a storyline visualization automatically, Tanahashi and Ma [32] discuss various aspects of a well-designed storyline visualization and present an evolutionary algorithm that incorporates these in its objective function. They identify three important criteria that one usually wants to optimize: line crossings, whose number should be small; line wiggles, which should be avoided by drawing every chronological line as straight as possible; and space efficiency. Based on these aspects, Liu et al. [24] describe a technique which further improves the layout and runs significantly faster compared to the evolutionary algorithm in [32]. Being able to create storyline visualizations of bigger instances, Tanahashi et al. [31] take this one step further and show how to create storyline visualizations from streaming data.

In this paper, we study the crossing minimization problem in storyline visualization from a combinatorial optimization point of view.
While most approaches tackle this problem with heuristics, Kostitsyna et al. [22] recently shed some light onto its combinatorial properties. Besides noting that the decision problem is NP-complete (by reduction from bipartite crossing number), they provide a lower bound for the number of crossings in a restricted variant of the problem and show that the general problem is fixed-parameter tractable in the number of characters. But a straightforward implementation of the algorithm is impractical, even for a small number of characters.

However, the problem is also similar to a few already well-studied problems in graph drawing. It may seem that the problem is related to a special case of metro-line crossing minimization, in particular the so-called two-sided models, in which metro lines run only from left to right [5,12]. However, all metro-line crossing minimization problems have in common that they are defined on a rail network whose embedding is fixed due to its geographical context. This difference makes a straightforward transformation difficult.

As already observed by Kostitsyna et al. [22], storyline crossing minimization has a strong relationship to multi-layer crossing minimization (MLCM). Here each node of the graph is assigned to one of the layers (parallel straight lines) in such a way that each edge connects two nodes on consecutive layers. The aim is to find an ordering of the nodes on every layer such that the total number of edge crossings is minimized. Although the corresponding planarity testing problem is linear-time solvable [19], the MLCM problem itself remains NP-hard even when restricted to two layers [13]. This led to the development of various heuristics, but also to exact approaches [6,8,16,18].

In order to exploit existing techniques for solving MLCM instances, a straightforward transformation can be sketched as follows. We represent the characters as paths in an MLCM instance, in which the layers mark important points in time, e.g., when a new bundle has to be created.
Of course, a bundle of lines (paths) requires the corresponding vertices to be consecutive on the layer, a constraint which is problematic in the general MLCM setting.

But we can borrow ideas from another crossing minimization problem type, the so-called tanglegrams. The general tanglegram problem consists of two trees and a set of edges connecting the leaves of one tree with the leaves of the other, i.e., the leaves and the connecting edges form a bipartite graph. The objective is essentially to perform a bipartite (or two-layer) crossing minimization with the additional constraint that leaves of the same subtree appear consecutively on the layers. However, when consulting the literature on tanglegrams, attention must be paid to the details. Some definitions require the trees to be binary, while others restrict the edge set to be a perfect matching, or both [7,11,27]. Since the focus of the paper is not on tanglegrams, we restrict ourselves to the general case. Here two works are of interest: Baumann et al. [4] describe an ILP-based approach, whereas Wotzlaw et al. [34] employ a SAT formulation. However, not only the techniques differ: in [4] only two layers are considered, whereas the SAT approach in [34] works on multiple layers but requires that the tree constraints are $k$-ary with $k$ fixed.

The related problem of testing level planarity under tree constraints is discussed by Angelini et al. [1]. They show that if edges are restricted to run between consecutive layers, then the problem can be solved in quadratic time, whereas if this restriction does not hold, the problem is NP-complete.

In this paper we solely focus on the crossing minimization problem in storyline visualization. Therefore, we neglect other design aspects and restrict ourselves to the combinatorial problem, i.e., determining an ordering of the lines such that the number of crossings is minimum.
We model this problem as a special variant of the MLCM problem under tree constraints and provide an ILP formulation for it. Computational results show that we are able to solve instances of moderate size to optimality within a few seconds. Moreover, we provide solutions for storyline instances from the literature, some of which have been solved to optimality for the first time. These are of particular value, since they offer a reference when comparing the crossing minimization performance of heuristics.

We begin with a formal definition of the multi-layer crossing minimization problem with tree constraints (MLCM-TC). The input for MLCM-TC consists of a graph $G = (V, E, T)$, where the set of nodes $V = \bigcup_{r=1}^{p} V_r$ is partitioned into $p$ different layers. $E = \bigcup_{r=1}^{p-1} E_r$ is the set of edges such that $E_r \subseteq V_r \times V_{r+1}$ for every $r \in \{1, 2, \ldots, p-1\}$, i.e., each edge of $E_r$ has one end in $V_r$ and the other in $V_{r+1}$. $T = \{T_r \mid r = 1, 2, \ldots, p\}$ is a family of rooted trees with at least one internal node (root node), whose leaves are exactly the nodes of $V_r$. In the following, whenever we consider a graph, we implicitly assume that it is of this type, which is known in the literature as “(proper) $T$-level graph” [1].

Given an instance $G = (V, E, T)$ of MLCM-TC, the task is to determine, for each layer $r \in \{1, 2, \ldots, p\}$, permutations $\pi_r = \langle v_1, v_2, \ldots, v_{|V_r|} \rangle$ of the nodes in $V_r$ that minimize the total number of edge crossings, subject to the condition that for each internal node $\tau$ of $T_r$, all $t$ leaves in the subtree rooted at $\tau$ are adjacent in $\pi_r$, i.e., they form a sub-permutation $\langle v_i, v_{i+1}, \ldots, v_{i+t-1} \rangle$ for some $i \in \{1, 2, \ldots, |V_r| - t + 1\}$.

An easy reduction of the NP-hard MLCM problem to the MLCM-TC problem (add a trivial tree with the root as the only internal node to each layer) shows that MLCM-TC is NP-hard. This justifies the usage of integer programming techniques in the next section.
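The consecutiveness requirement on the leaves can be checked mechanically. The following Python sketch is only an illustration: the dictionary-based tree encoding, the example layer, and the names `leaves_below` and `satisfies_tree_constraints` are assumptions made here and are not taken from the implementation discussed later.

```python
from typing import Dict, Hashable, List, Sequence

# Assumed encoding: internal node -> list of children; leaves have no entry.
Tree = Dict[Hashable, List[Hashable]]


def leaves_below(tree: Tree, node: Hashable) -> List[Hashable]:
    """Return the leaves of the subtree rooted at `node`."""
    children = tree.get(node)
    if not children:              # a node without children is a leaf
        return [node]
    result: List[Hashable] = []
    for child in children:
        result.extend(leaves_below(tree, child))
    return result


def satisfies_tree_constraints(order: Sequence[Hashable], tree: Tree) -> bool:
    """Check that the leaves below every internal node are consecutive in `order`."""
    position = {v: idx for idx, v in enumerate(order)}
    for node in tree:             # every key of `tree` is an internal node
        pos = [position[leaf] for leaf in leaves_below(tree, node)]
        # distinct positions are consecutive iff they span exactly len(pos) slots
        if max(pos) - min(pos) + 1 != len(pos):
            return False
    return True


# Hypothetical layer with leaves a, b, c, d; the internal node "s" groups a and b.
layer_tree = {"root": ["s", "c", "d"], "s": ["a", "b"]}
print(satisfies_tree_constraints(["c", "a", "b", "d"], layer_tree))  # True
print(satisfies_tree_constraints(["a", "c", "b", "d"], layer_tree))  # False
```

In the second call the leaf c separates a from b, so the subtree rooted at "s" does not form a sub-permutation and the order is rejected.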
Now we give a formal description of the storyline visualization problem in order to support our hypothesis that MLCM-TC captures its core when the criteria “line wiggle avoidance” and “space efficiency” are neglected in favour of crossing minimization. A story consists of a set of characters $C = \{c_1, c_2, \ldots, c_n\}$ and a set of scenes $S \subseteq 2^C$. For each scene $s \in S$, $b_s$ and $e_s$ are the points in time when $s$ begins and ends, respectively. The time intervals $[b_s, e_s]$ and $[b_{s'}, e_{s'}]$ of two distinct scenes $s$ and $s'$ may have a non-empty intersection, but if they do, we require $s \cap s' = \emptyset$.

Fig. 1.

An example of a story with four scenes and four characters, where two characters enter late, one character leaves early, and the time intervals of two of the scenes have a non-empty intersection.

The storyline visualization problem requires depicting each character $c \in C$ by a curve in the Euclidean plane that is strictly monotone on the time axis, which we arbitrarily fix to the horizontal $x$-axis. The curve begins at the $x$-coordinate $x_c^b = \min\{b_s \mid c \in s\}$ and ends at $x_c^e = \max\{e_s \mid c \in s\}$. We call the interval $[x_c^b, x_c^e]$ the lifespan of character $c$.

The curves must be such that for every scene $s = \{c_{\sigma_1}, c_{\sigma_2}, \ldots, c_{\sigma_k}\} \in S$ the $k$ corresponding curves in the interval $[b_s, e_s]$ are horizontal parallel lines that are equally spaced at a fixed vertical distance. Furthermore, the curves of all $c \notin s$ are restricted to $y$-coordinates that keep a fixed minimum absolute difference to the $y$-coordinates of the curves $c_{\sigma_i} \in s$ in the interval $[b_s, e_s]$, and to all curves for characters that are not members of any scene that intersects with $[b_s, e_s]$. An example is given in Fig. 1.

Given a story $(C, S, \{[b_s, e_s] \mid s \in S\})$, we construct an MLCM-TC instance $G = (V, E, T)$ as follows:

1. Sort the points in time $\{b_s \mid s \in S\} \cup \{e_s \mid s \in S\}$ in non-decreasing order, and let $\langle t_1, t_2, \ldots, t_p \rangle$ be the sorted sequence.
2. Associate a layer $V_r$ with each $t_r$ ($r \in \{1, 2, \ldots, p\}$), create a node $v_{c,r}$ for each character $c$ for which $t_r$ is within its lifespan, i.e., for which $t_r \in [x_c^b, x_c^e]$, and let $V_r = \{v_{c,r} \mid t_r \in [x_c^b, x_c^e]\}$.
3. Let $V = \bigcup_{r=1}^{p} V_r$.
4. Let $E = \{\{v_{c,r}, v_{c,r+1}\}$ for all $c \in C$ such that $t_r, t_{r+1} \in [x_c^b, x_c^e]\}$.
5. For each layer $V_r$ create a tree $T_r$ as follows:
   i. For each scene $s = \{c_{\sigma_1}, c_{\sigma_2}, \ldots, c_{\sigma_k}\}$ such that $t_r \in [b_s, e_s]$ create an internal tree node $v_{s,r}$ and tree edges $\{v_{s,r}, v_{c_{\sigma_i},r}\}$ for all $i \in \{1, 2, \ldots, k\}$.
   ii. Unless the above results in a rooted tree with all nodes in $V_r$ as leaves, create a tree root $\rho_r$ and tree edges connecting $\rho_r$ to all previously created internal tree nodes of $T_r$ and to all character nodes in $V_r$ that are not joined to a previously added internal tree node.

In Fig. 2, we demonstrate the construction for our example instance from Fig. 1. Notice that the trees in $T$ are all of height up to 2, which means that storyline visualization instances yield a special subclass of MLCM-TC instances. By construction, an optimal solution of this MLCM-TC instance induces a storyline visualization with the minimum number of crossings, and, conversely, any instance of this special MLCM-TC subclass with trees of height up to 2 is the result of the given transformation for some story. Thus, both problems are equivalent. As MLCM can be reduced to this special subclass, NP-hardness is maintained.

Fig. 2. The MLCM-TC instance of the story of Fig. 1.

We present an integer linear programming (ILP) formulation of MLCM-TC. ILP formulations have already been introduced for the general MLCM problem [16,18] as well as for MLCM-TC, when restricted to the special case of two layers only [4]. Both models use quadratic ordering formulations. In this section, we will extend these formulations to an ILP model for MLCM-TC.

To this end, let $G = (V, E, T)$ be an instance of MLCM-TC, as described in Sect. 2. For every layer $r \in \{1, 2, \ldots, p\}$, let $V_r^{(2)} = \{(i, j) \in V_r \times V_r : i < j\}$ be the set of all ordered pairs of nodes on the considered layer with the first index smaller than the second. As the total number of edge crossings is the sum of all crossings in adjacent layers $r$ and $r+1$, summed up for all $r \in \{1, 2, \ldots, p-1\}$, let us consider the problem for a pair of adjacent layers $r$ and $r+1$, with $r \in \{1, 2, \ldots, p-1\}$.

A permutation of the nodes in $V_r$ is characterized by variables $x^r_{ij} \in \{0, 1\}$ associated with the pairs $(i, j) \in V_r^{(2)}$ as follows: $x^r_{ij} = 1$ if and only if $i$ is placed above $j$ on layer $r$. Then a pair of edges $(i, k), (j, \ell) \in E_r$ crosses if and only if $i$ is placed above $j$ on layer $r$ and $\ell$ is placed above $k$ on layer $r+1$, or $j$ is placed above $i$ on layer $r$ and $k$ is placed above $\ell$ on layer $r+1$; see Fig. 3.

Fig. 3.

An edge pair crosses in two of four cases.
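To make the transformation of Sect. 2 and the crossing condition of Fig. 3 concrete, the following Python sketch builds the layers and scene groups of a small story and counts the crossings of a given layout. It is a hedged illustration rather than the authors' code: the example story, the names `build_instance` and `count_crossings`, and the decision to merge equal time points into a single layer are assumptions made here. Since every edge joins the two copies of one character on consecutive layers, two edges cross exactly when the relative order of their characters flips.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Scene = Tuple[FrozenSet[str], float, float]  # (characters, begin b_s, end e_s)


def build_instance(scenes: List[Scene]):
    """Return (times, layers, groups) for a story given as a list of scenes."""
    # Step 1: sorted time points (assumption: equal time points are merged).
    times = sorted({t for _, b, e in scenes for t in (b, e)})
    # Lifespan [x_c^b, x_c^e] of every character.
    span: Dict[str, Tuple[float, float]] = {}
    for chars, b, e in scenes:
        for c in chars:
            lo, hi = span.get(c, (b, e))
            span[c] = (min(lo, b), max(hi, e))
    # Steps 2-3: one layer per time point, holding the characters alive then.
    layers: List[Set[str]] = [
        {c for c, (lo, hi) in span.items() if lo <= t <= hi} for t in times
    ]
    # Step 5: the scenes active at t_r are the internal nodes of T_r; characters
    # outside every active scene would hang directly below the root.
    groups = [[set(chars) for chars, b, e in scenes if b <= t <= e] for t in times]
    # Step 4 is implicit: a character has an edge between layers r and r+1
    # exactly when it belongs to both of them.
    return times, layers, groups


def count_crossings(layers: List[Set[str]], orders: List[List[str]]) -> int:
    """Count crossings of a layout given as one top-to-bottom order per layer."""
    total = 0
    for r in range(len(layers) - 1):
        common = layers[r] & layers[r + 1]
        pos_a = {c: i for i, c in enumerate(orders[r])}
        pos_b = {c: i for i, c in enumerate(orders[r + 1])}
        for c, d in combinations(sorted(common), 2):
            # the two character lines cross iff their relative order flips
            if (pos_a[c] < pos_a[d]) != (pos_b[c] < pos_b[d]):
                total += 1
    return total


# Hypothetical story: two pairs meet first, then the pairs are re-mixed.
story = [
    (frozenset("ab"), 0, 1), (frozenset("cd"), 0, 1),
    (frozenset("ac"), 2, 3), (frozenset("bd"), 2, 3),
]
times, layers, groups = build_instance(story)
orders = [list("abcd"), list("abcd"), list("acbd"), list("acbd")]
print(times)                            # [0, 1, 2, 3]
print(count_crossings(layers, orders))  # 1
```

The single crossing occurs between the second and third layer, where b and c exchange their positions so that each new scene becomes consecutive.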

Therefore, if $\{x^r_{ij} \mid (i, j) \in V_r^{(2)}\}$ and $\{x^{r+1}_{k\ell} \mid (k, \ell) \in V_{r+1}^{(2)}\}$ describe node permutations on layers $r$ and $r+1$, respectively, we have

$$c_{ijk\ell} := x^r_{ij}\,(1 - x^{r+1}_{k\ell}) + (1 - x^r_{ij})\,x^{r+1}_{k\ell} \in \{0, 1\}$$

and $c_{ijk\ell} = 1$ if and only if the edges $(i, k)$ and $(j, \ell)$ cross.

It is well known (see, e.g., [14]) that $\{x^r_{ij} \in \{0, 1\} \mid (i, j) \in V_r^{(2)}\}$ characterizes a node permutation on $V_r$ if and only if the transitivity conditions

$$0 \le x^r_{hi} + x^r_{ij} - x^r_{hj} \le 1 \qquad (h < i < j)$$

are satisfied for all $r \in \{1, 2, \ldots, p\}$.

It remains to model the tree conditions implied by the elements of $T$. Given a layer $r \in \{1, 2, \ldots, p\}$ and two nodes $i$ and $j$ in $V_r$, we denote by $P(i, j)$ the lowest common ancestor of $i$ and $j$ in $T_r$. Let $V_r^{(3)} = \{(h, i, j) \in V_r \times V_r \times V_r : h < i < j\}$. For every $r \in \{1, 2, \ldots, p\}$ and every triple $(h, i, j) \in V_r^{(3)}$, we impose the tree constraints

$$x^r_{hj} = x^r_{ij} \quad \text{if } P(h, i) \ne P(P(h, i), j),$$
$$x^r_{hi} = x^r_{hj} \quad \text{if } P(i, j) \ne P(h, P(i, j)).$$

The first equation forbids the placement of $j$ between $h$ and $i$ in case $j$ does not belong to the smallest subtree containing $h$ and $i$. Similarly, the second equation forbids the placement of $h$ between $i$ and $j$ in case $h$ is not contained in the smallest subtree containing $i$ and $j$.

Putting it all together, we obtain the following model for MLCM-TC based on a combination of [18] for MLCM and [4] for the special case of MLCM-TC for two layers:

$$
\begin{aligned}
\text{minimize}\quad & \sum_{r=1}^{p-1} \; \sum_{\substack{(i,j) \in V_r^{(2)},\ (k,\ell) \in V_{r+1}^{(2)} \\ (i,k),\,(j,\ell) \in E_r}} x^r_{ij}\,(1 - x^{r+1}_{k\ell}) + (1 - x^r_{ij})\,x^{r+1}_{k\ell} \\
\text{subject to}\quad & 0 \le x^r_{hi} + x^r_{ij} - x^r_{hj} \le 1 && \text{for all } r \in \{1, \ldots, p\} \text{ and } (h,i,j) \in V_r^{(3)} \\
& x^r_{hj} = x^r_{ij} && \text{for all } r \in \{1, \ldots, p\} \text{ and } (h,i,j) \in V_r^{(3)} \text{ if } P(h,i) \ne P(P(h,i), j) \\
& x^r_{hi} = x^r_{hj} && \text{for all } r \in \{1, \ldots, p\} \text{ and } (h,i,j) \in V_r^{(3)} \text{ if } P(i,j) \ne P(h, P(i,j)) \\
& x^r_{ij} \in \{0, 1\} && \text{for all } r \in \{1, \ldots, p\} \text{ and } (i,j) \in V_r^{(2)}.
\end{aligned}
$$

This is a quadratic 0-1 programming problem with linear constraints, namely, the transitivity conditions and the tree conditions. (Without the tree conditions, the problem is also called a quadratic linear ordering problem.)

When we temporarily ignore the transitivity conditions and the tree conditions, the remaining problem is known as quadratic 0-1 optimization of the form

$$\text{minimize } z^T Q z + q^T z \quad \text{s.t. } z \in \{0, 1\}^N$$

for an upper triangular matrix $Q \in \mathbb{Z}^{N \times N}$ and a vector $q \in \mathbb{Z}^N$. A well-known construction of Hammer [15], see also [2,10,23], results in an equivalent formulation as a maximum cut problem on a graph $G_{mc} = (V_{mc}, E_{mc})$ with $N+1$ nodes, all but one of which are identified with the $z_i$, $i \in \{1, 2, \ldots, N\}$. Let us call the additional node $z_0$, so $V_{mc} = \{z_0, z_1, \ldots, z_N\}$. The undirected edges $(z_i, z_j)$, $1 \le i < j \le N$, correspond to the nonzero entries of the matrix $Q$, and there are additional $N$ edges $(z_0, z_i)$ for $1 \le i \le N$, giving the edge set $E_{mc}$. The edge weights $w_e = w_{ij}$, $0 \le i < j \le N$, are easily computed from $Q$ and $q$. For $W \subseteq V_{mc}$ the edge set $\delta(W) = \{(i, j) \in E_{mc} \mid i \in W, j \in V_{mc} \setminus W\}$ is called a cut in $G_{mc}$. Then the resulting maximum cut problem has the form

$$\max \{ w(\delta(W)) \mid W \subseteq V_{mc} \}.$$

By introducing variables $y_e \in \{0, 1\}$ for each $e \in E_{mc}$, the maximum cut problem can be formulated as

$$
\begin{aligned}
\text{maximize}\quad & \sum_{e \in E_{mc}} w_e y_e \\
\text{subject to}\quad & \sum_{e \in F} y_e - \sum_{e \in C \setminus F} y_e \le |F| - 1 && \text{for all cycles } C \subseteq E_{mc} \text{ and all } F \subseteq C,\ |F| \text{ odd} \\
& y_e \in \{0, 1\} && \text{for all } e \in E_{mc},
\end{aligned}
$$

see [3]. The constraints are called odd cycle constraints.

Applying this transformation is the key to our algorithm: The edges $e \in E_{mc}$ not incident to $z_0$ correspond to edge pairs $(i, k), (j, \ell) \in E_r$, $r \in \{1, 2, \ldots, p-1\}$. The edges $e \in E_{mc}$ that are incident to $z_0$ correspond to our variables $x^r_{ij}$ for $r \in \{1, 2, \ldots, p\}$, $i < j$. In view of the latter property, we can formulate MLCM-TC as a maximum cut problem with the additional transitivity and tree constraints, and we can solve it using a branch and cut approach for the maximum cut problem like in [2] that additionally enforces these extra constraints.

Implementation
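The implementation works on the maximum cut graph obtained from the quadratic 0-1 problem. The text only states that the edge weights are easily computed from $Q$ and $q$; the Python sketch below shows one standard way to obtain them, given here as an assumption and not as the authors' code. It uses the substitutions $z_i = y_{0i}$ and $z_i z_j = (y_{0i} + y_{0j} - y_{ij})/2$, where $y_{ab} = 1$ if nodes $a$ and $b$ lie on different sides of the cut, and it checks on a small made-up instance that the minimum of the quadratic function equals the negated maximum cut weight. The function names, $Q$ and $q$ are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def qubo_to_maxcut(Q: List[List[int]], q: List[int]) -> Dict[Tuple[int, int], float]:
    """Edge weights on nodes 0..N with  min z^T Q z + q^T z = -(max cut weight).

    Node 0 is the additional node z_0; node i (1 <= i <= N) stands for z_i,
    and z_i = 1 exactly when node i is separated from node 0 by the cut.
    """
    n = len(q)
    w: Dict[Tuple[int, int], float] = {}
    for i in range(n):
        # sum of all off-diagonal entries of Q involving z_i (Q is upper triangular)
        coupling = sum(Q[i][j] for j in range(i + 1, n)) + sum(Q[j][i] for j in range(i))
        # z_i * z_i = z_i, so the diagonal entry acts like a linear term
        w[(0, i + 1)] = -(Q[i][i] + q[i]) - coupling / 2
        for j in range(i + 1, n):
            if Q[i][j] != 0:
                w[(i + 1, j + 1)] = Q[i][j] / 2
    return w


def brute_force_min(Q: List[List[int]], q: List[int]) -> int:
    """Minimum of z^T Q z + q^T z over all 0-1 vectors (tiny instances only)."""
    n = len(q)
    return min(
        sum(Q[i][j] * z[i] * z[j] for i in range(n) for j in range(i, n))
        + sum(q[i] * z[i] for i in range(n))
        for z in product((0, 1), repeat=n)
    )


def brute_force_maxcut(w: Dict[Tuple[int, int], float], n: int) -> float:
    """Maximum cut weight on nodes 0..n by enumeration (tiny instances only)."""
    return max(
        sum(weight for (a, b), weight in w.items() if side[a] != side[b])
        for side in product((0, 1), repeat=n + 1)
    )


# Made-up instance with N = 3.
Q = [[0, 2, -1],
     [0, 0, 3],
     [0, 0, 1]]
q = [-1, -2, 1]
w = qubo_to_maxcut(Q, q)
print(brute_force_min(Q, q), brute_force_maxcut(w, len(q)))  # -2 2.0
```

Enumeration is used only to validate the weights on a toy example; a branch and cut approach such as the one referred to above replaces it for real instances.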

The implementation used to determine the minimum number of crossings in a storyline visualization consists of two main phases, a preprocessing phase and a branch and cut phase. During the preprocessing, we first reduce the number of layers of the problem (if possible) by identifying two consecutive layers $r$ and $r+1$ in case the corresponding trees $T_r$ and $T_{r+1}$ are identical and every node in $V_r$ and $V_{r+1}$ is an end of one edge of $E_r$ (Fig. 2 contains such a pair of layers). Then, a variant of the barycenter heuristic proposed by Sugiyama et al. [29], in which the presence of the trees on layers is taken into account, is executed in order to obtain an initial feasible solution that defines the indexing within the layers: In this heuristic, the nodes of the trees are sorted according to their barycenters. The barycenter of a given leaf $t$ is computed by assigning to each edge that has $t$ as an end the relative position of the other end as weight. The barycenter of each internal node $\tau$ is the mean of the barycenters of all the leaves of the subtree rooted at $\tau$.

During the creation of the maximum cut graph induced by the heuristic solution, we exploit the fact that the tree constraints force many variables to assume the same value, so that we can identify them. Moreover, this procedure also reduces the number of constraints considerably once all variables have been replaced by their representatives: on the one hand, the tree constraints are no longer needed in the formulation; on the other hand, some transitivity constraints become deactivated or redundant. It is important to point out that, during this first phase, the problem is initialized without constraints; they are added as needed during the subsequent branch and cut phase.

The branch and cut phase is realized in C++ using ABACUS [20] and CPLEX [17]. The initial relaxation consists just of the objective function together with lower bounds 0 and upper bounds 1 for the variables.
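The tree-aware barycenter step described above might look as follows for a single layer whose neighbouring layer is already ordered. This is a minimal Python sketch (the actual implementation is in C++) under assumptions that are not taken from the source: positions are plain list indices, a leaf without neighbours receives the largest barycenter, ties keep their input order, and the name `barycenter_order` is hypothetical.

```python
from statistics import mean
from typing import Dict, Hashable, List

# Assumed encoding: internal node -> list of children; leaves have no entry.
Tree = Dict[Hashable, List[Hashable]]


def barycenter_order(tree: Tree, root: Hashable,
                     neighbours: Dict[Hashable, List[Hashable]],
                     fixed_order: List[Hashable]) -> List[Hashable]:
    """Order the leaves of `tree` by barycenters with respect to `fixed_order`."""
    pos = {v: i for i, v in enumerate(fixed_order)}

    def leaf_values(node: Hashable) -> List[float]:
        """Barycenters of all leaves in the subtree rooted at `node`."""
        if node not in tree:                       # leaf
            adj = [pos[u] for u in neighbours.get(node, []) if u in pos]
            # Assumption: a leaf without neighbours is pushed to the end.
            return [mean(adj) if adj else float(len(fixed_order))]
        return [x for child in tree[node] for x in leaf_values(child)]

    def visit(node: Hashable) -> List[Hashable]:
        if node not in tree:
            return [node]
        # An internal node is ranked by the mean barycenter of the leaves below it.
        children = sorted(tree[node], key=lambda ch: mean(leaf_values(ch)))
        return [leaf for ch in children for leaf in visit(ch)]

    return visit(root)


# Hypothetical layer: two scenes {a, b} and {c, d}; every character is joined
# to its own copy on the neighbouring, already ordered layer.
tree = {"root": ["s1", "s2"], "s1": ["a", "b"], "s2": ["c", "d"]}
same = {v: [v] for v in "abcd"}
print(barycenter_order(tree, "root", same, ["a", "c", "b", "d"]))  # ['a', 'b', 'c', 'd']
print(barycenter_order(tree, "root", same, ["d", "c", "b", "a"]))  # ['d', 'c', 'b', 'a']
```

Because only the children of each tree node are permuted, every order produced this way keeps the leaves of each subtree consecutive and is therefore feasible for the tree constraints.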
Odd cycle constraints and transitivity constraints are generated via separation, the former with the same strategy as described in [2], the latter by complete enumeration.

Our test-bed consists of:
– three movie instances [30], namely “Inception”, the original trilogy of “Star Wars” and “The Matrix”;
– three book instances from the Stanford GraphBase database [21], namely “Adventures of Huckleberry Finn”, “Anna Karenina” and “Les Misérables”.

These instances have been converted to MLCM-TC by using the procedure described in Sect. 2. In the conversion of the book instances, a slight change is required: Since these instances do not report time intervals, but just the list of the characters involved in each scene of each chapter, a layer has been created for each of these scenes, instead of for each beginning and ending time point.

The three movie instances have been generated using the raw data set from [30] in order to compare them with results in the literature. We obtained “Inception”, “Star Wars” and “The Matrix” following the principles described in Sect. 2. However, after having solved them, we realized that the number of crossings given by our algorithm for “Inception” did not match the numbers reported in [31] and [24]. After a careful study of the layouts provided in [24,31,32], we noticed that the storylines of “Inception” and “The Matrix” in [24,31,32] differ from the raw data set provided by [30], and therefore are not comparable with our instances.

In order to make a comparison possible, “Inception” required three major modifications. This modified instance is called “Inception-sf” and is generated by incorporating the following changes that are based on a careful study of the layouts provided in [24,31,32]. The storyline for the character “Mal” is allowed to take shortcuts, i.e., in long periods of absence it is drawn as a thin curve that may cross other storylines without accounting for these crossings (see Fig. 12 in [24]).
Moreover, the grouping at the end of the movie does not correspond to the last scene in the data set. To keep our layout comparable, we enforced the same grouping at the end in our new instance. The third discrepancy is the number of characters. In the data from [30] there are ten characters listed in the corresponding file, whereas the layouts from the literature [24,31,32] contain only eight storylines, in which “Arch” and “Asian” are missing. A major modification was also necessary in “The Matrix”, where the storylines for the characters “Brown”, “Smith” and “Jones” are allowed to take shortcuts as well. We call it “The Matrix-sf”.

Since the instances “Anna Karenina” and “Les Misérables” are very big, we have split them into chapters and sequences of chapters. The resulting test-bed is made of eight chapters, seven pairs of chapters, six triples of chapters and five quadruples of chapters from “Anna Karenina”, and five chapters, four pairs of chapters and three triples of chapters from “Les Misérables”, plus the entire “Adventures of Huckleberry Finn”, “Inception-sf”, “Inception”, “Star Wars”, “The Matrix-sf”, and “The Matrix”.

To the best of our knowledge, this is the first time that ILP techniques are applied to storyline visualizations. Thus comparisons of computational results are not possible. Runs were performed on one node of the HPC Cluster of the Computer Science Department of the University of Cologne. The node used consists of two Intel E5-2690v2 CPUs with ten cores each and 128 GB RAM.

While the book instances generated from the Stanford GraphBase database are introduced here for the first time, the literature provides crossing counts for the three movie instances (“Inception”, “Star Wars”, and “The Matrix”). Table 1 shows a comparison of the minimum number of crossings (OPT) from our approach with the numbers of crossings obtained by the streaming-oriented approach from Tanahashi et al. [31] (THM), the StoryFlow approach from Liu et al. [24] (LIU), and the evolutionary algorithm from Tanahashi and Ma [32] (TM). Crossing counts for THM, LIU and TM are taken from Table 3 in [31]. We can confirm that the best solution reported by Liu et al. [24] for the movie “Inception” is optimal. For “Star Wars” the approach from Tanahashi et al. [31] comes very close to the optimal solution, even though the instance is the biggest and has the highest crossing count. One may conclude that the heuristics in [24,31] deliver solutions with a good crossing count, especially when considering the fact that they do not optimize the crossing count alone.

Table 1.

Comparison of the solution of the movies. Columns: OPT, THM [31], LIU [24], TM [32]; rows: Inception-sf, Star Wars, The Matrix-sf. Of the table entries, only the values 39, 41, 48 and 51 remain.

In Table 2, we report the information about the solution of the considered instances: the number of layers ($p$), of nodes ($|V|$), of edges ($|E|$), the minimum number of crossings (cr) in boldface or a pair [lower bound, best known number of crossings], the number of variables ($n_{var}$), of odd cycle constraints added during the separation ($n_{oddc}$), of transitivity constraints added during the separation ($n_{trans}$), of subproblems in the branch and cut tree ($n_{sub}$), of linear programming relaxations solved ($n_{LPs}$), and the runtime in seconds (Time), where “t.l.” means that the run was aborted due to the time limit of one hour, in which case the cr column contains an interval. While 31 of the 44 instances have been solved to optimality, for the remaining 13 instances the best lower bound for the number of crossings differs from the best solution found at timeout termination.

When we analyze the behaviour of our algorithm, we have to distinguish between movie and book instances: Since the original instances from [30] allow more than one scene per layer, the trees on the layers of the movie instances considerably restrict the possible permutations of the corresponding nodes and consequently reduce the number of variables. This is not the case for the book instances, where only one scene per layer occurs. We can observe that MLCM-TC for movies tends to be much easier in comparison to a book instance with similar numbers of layers, nodes, and edges.

The difficulty of a book instance is mainly influenced by the combination of two parameters: the number of layers $p$ and the number of nodes $|V|$. If the number of nodes is fixed, the higher the number of layers is, the easier the solution is, since the distribution of the nodes on more layers reduces the number of variables of the problem.
On the other hand, if the number of layers is fixed, the difficulty increases with the number of nodes.

The hardest instance we have been able to solve to optimality is “anna2-4”, where 2637 nodes are distributed on only 158 layers, which results in 40789 variables. The biggest solved instance in terms of number of layers is “jean1-3” with 254 layers but only 2853 nodes, which results in 27720 variables.

We present crossing minimal storyline visualizations of the three movie instances in Fig. 4 and the two book instances in Fig. 5.

Table 2.

Information about the solution of the considered instances. Entries marked – are missing; the Time column gives whole seconds only.

Instance        p     |V|    |E|    cr          n_var   n_oddc  n_trans  n_sub  n_LPs  Time
anna1           58    409    368    –           –       –       –        –      –      –
anna2           58    525    489    –           –       –       –        –      –      –
anna3           48    265    219    –           951     0       0        1      1      0
anna4           49    364    334    –           –       –       –        –      –      –
anna5           71    615    565    –           –       –       –        –      –      –
anna6           56    522    495    –           –       –       –        –      –      –
anna7           62    467    420    –           –       –       –        –      –      –
anna8           28    192    175    –           –       –       –        –      –      –
anna1-2         117   1454   1397   –           16433   18284   89       5      545    196
anna2-3         108   1461   1394   –           18763   16849   29       3      469    48
anna3-4         100   1015   951    –           –       –       –        –      –      –
anna4-5         126   1808   1748   –           23742   26129   181      3      814    306
anna5-6         129   1760   1697   –           19967   23155   252      3      656    281
anna6-7         120   1445   1385   –           14464   32396   671      5      3008   1387
anna7-8         90    905    850    –           –       –       –        –      –      –
anna1-3         166   2948   2865   [100, –]    –       –       –        –      –      t.l.
anna2-4         158   2637   2557   –           40789   46600   351      3      2042   1284
anna3-5         174   3100   3012   [115, –]    –       –       –        –      –      t.l.
anna4-6         178   3115   3044   [124, –]    –       –       –        –      –      t.l.
anna5-7         191   3742   3656   [144, –]    –       –       –        –      –      t.l.
anna6-8         146   2205   2140   [117, –]    –       –       –        –      –      t.l.
anna1-4         216   4627   4534   [115, –]    –       –       –        –      –      t.l.
anna2-5         232   5366   5266   [102, –]    –       –       –        –      –      t.l.
anna3-6         226   5262   5168   [122, –]    –       –       –        –      –      t.l.
anna4-7         240   5467   5375   [117, –]    –       –       –        –      –      t.l.
anna5-8         217   4624   4534   [123, –]    –       –       –        –      –      t.l.
huck            107   1059   985    –           –       –       –        –      –      –
jean1           95    502    462    –           –       –       –        –      –      –
jean2           59    226    212    –           461     385     0        1      44     0
jean3           99    873    838    –           –       –       –        –      –      –
jean4           76    909    876    –           –       –       –        –      –      –
jean5           73    491    471    –           –       –       –        –      –      –
jean1-2         154   1102   1055   –           –       –       –        –      –      –
jean2-3         159   1808   1767   –           18882   14128   1512     3      732    48
jean3-4         176   3249   3208   [115, –]    –       –       –        –      –      t.l.
jean4-5         149   1943   1907   –           24584   32573   619      3      1037   1012
jean1-3         254   2853   2780   –           27720   20886   1991     3      1177   143
jean2-4         235   4182   4135   [130, –]    –       –       –        –      –      t.l.
jean3-5         248   4429   4386   [101, –]    –       –       –        –      –      t.l.
Inception-sf    137   798    787    –           –       –       –        –      –      –
Inception       139   925    915    –           –       –       –        –      –      –
Star Wars       100   940    926    –           –       –       –        –      –      –
The Matrix-sf   82    678    660    –           –       –       –        –      –      –
The Matrix      82    683    669    –           –       –       –        –      –      –

Fig. 4. Storyline visualizations with minimum number of crossings of the three movies from [30]: (a) the movie “Inception”, (b) the movie “Inception-sf”, (c) the original trilogy of the movie “Star Wars”, (d) the movie “The Matrix”.

In this work we have tackled the crossing minimization problem in storyline visualization via an ILP formulation. Although the problem is NP-hard, computational results show that with our approach one can handle instances of medium size within a reasonable time frame. However, our approach is of purely combinatorial nature; thus, extending it to automatically generate storyline visualizations such that other design criteria are taken into account is not straightforward.

Acknowledgments

The authors are grateful to Käte Zimmer who made her MLCM code, developed in the context of her Master’s thesis [35], available to us. Her code served as the basis for our experimental MLCM-TC implementation. Our work is supported by the EU grant FP7-PEOPLE-2012-ITN - Marie-Curie Action “Initial Training Networks” no. 316647 “Mixed-Integer Nonlinear Optimization” (MINO).

Fig. 5. Storyline visualizations of two chapters from “Anna Karenina” and “Les Misérables” [21]: (a) the third chapter of the book “Anna Karenina”, (b) the first chapter of the book “Les Misérables”.

References

1. Angelini, P., Da Lozzo, G., Di Battista, G., Frati, F., Roselli, V.: The importance of being proper: (In clustered-level planarity and T-level planarity). Theoretical Computer Science 571, 1–9 (2015)
2. Barahona, F., Jünger, M., Reinelt, G.: Experiments in quadratic 0-1 programming. Mathematical Programming 44, 127–137 (1989)
3. Barahona, F., Mahjoub, A.R.: On the cut polytope. Mathematical Programming 36(2), 157–173 (1986)
4. Baumann, F., Buchheim, C., Liers, F.: Exact Bipartite Crossing Minimization under Tree Constraints. In: Festa, P. (ed.) Proceedings of the 9th International Symposium on Experimental Algorithms [SEA 2010]. pp. 118–128. Springer (2010)
5. Bekos, M.A., Kaufmann, M., Potika, K., Symvonis, A.: Line Crossing Minimization on Metro Maps. In: Hong, S.H., Nishizeki, T., Quan, W. (eds.) Proceedings of the 15th International Symposium on Graph Drawing [GD 2007]. pp. 231–242. Springer (2008)
6. Buchheim, C., Wiegele, A., Zheng, L.: Exact Algorithms for the Quadratic Linear Ordering Problem. INFORMS Journal on Computing 22(1), 168–177 (2010)
7. Buchin, K., Buchin, M., Byrka, J., Nöllenburg, M., Okamoto, Y., Silveira, R.I., Wolff, A.: Drawing (Complete) Binary Tanglegrams. Algorithmica 62(1), 309–332 (2012)
8. Chimani, M., Hungerländer, P., Jünger, M., Mutzel, P.: An SDP Approach to Multi-level Crossing Minimization. In: Müller-Hannemann, M., Werneck, R. (eds.) Proceedings of the 13th Workshop on Algorithm Engineering and Experiments [ALENEX 2011]. pp. 116–126. Society for Industrial and Applied Mathematics (2011)
9. Cui, W., Liu, S., Tan, L., Shi, C., Song, Y., Gao, Z.J., Tong, X., Qu, H.: TextFlow: Towards Better Understanding of Evolving Topics in Text. IEEE Transactions on Visualization and Computer Graphics 17(12), 2412–2421 (2011)
10. De Simone, C.: The cut polytope and the Boolean quadric polytope. Discrete Mathematics 79(1), 71–75 (1990)
11. Fernau, H., Kaufmann, M., Poths, M.: Comparing trees via crossing minimization. Journal of Computer and System Sciences 76(7), 593–608 (2010)
12. Fink, M., Pupyrev, S.: Metro-Line Crossing Minimization: Hardness, Approximations, and Tractable Cases. In: Wismath, S., Wolff, A. (eds.) Proceedings of the 21st International Symposium on Graph Drawing [GD 2013]. pp. 328–339. Springer (2013)
13. Garey, M.R., Johnson, D.S.: Crossing Number is NP-Complete. SIAM Journal on Algebraic Discrete Methods 4(3), 312–316 (1983)
14. Grötschel, M., Jünger, M., Reinelt, G.: Facets of the Linear Ordering Polytope. Mathematical Programming 33(1), 43–60 (1985)
15. Hammer, P.: Some network flow problems solved with pseudo-Boolean programming. Operations Research 13, 388–399 (1965)
16. Healy, P., Kuusik, A.: Algorithms for multi-level graph planarity testing and layout. Theoretical Computer Science 320(2–3), 331–344 (2004)
17. IBM: IBM ILOG CPLEX Optimization Studio 12.6 (2014)
18. Jünger, M., Lee, E.K., Mutzel, P., Odenthal, T.: A polyhedral approach to the multi-layer crossing minimization problem. In: Di Battista, G. (ed.) Proceedings of the 5th International Symposium on Graph Drawing [GD 1997]. pp. 13–24. Springer (1997)
19. Jünger, M., Leipert, S., Mutzel, P.: Level Planarity Testing in Linear Time. In: Whitesides, S.H. (ed.) Proceedings of the 6th International Symposium on Graph Drawing [GD 1998]. pp. 224–237. Springer (1998)
20. Jünger, M., Thienel, S.: The ABACUS System for Branch-and-Cut-and-Price Algorithms in Integer Programming and Combinatorial Optimization. Software: Practice and Experience 30, 1325–1352 (2000)
21. Knuth, D.E.: The Stanford GraphBase source. ftp://ftp.cs.stanford.edu/pub/sgb/sgb.tar.gz (1993)
22. Kostitsyna, I., Nöllenburg, M., Polishchuk, V., Schulz, A., Strash, D.: On Minimizing Crossings in Storyline Visualizations. In: Di Giacomo, E., Lubiw, A. (eds.) Proceedings of the 23rd International Symposium on Graph Drawing and Network Visualization [GD 2015]. pp. 192–198. Springer (2015)
23. Liers, F., Jünger, M., Reinelt, G., Rinaldi, G.: Computing Exact Ground States of Hard Ising Spin Glass Problems by Branch-and-Cut. In: Hartmann, A.K., Rieger, H. (eds.) New Optimization Algorithms in Physics, pp. 47–69. Wiley-VCH (2004)
24. Liu, S., Wu, Y., Wei, E., Liu, M., Liu, Y.: StoryFlow: Tracking the Evolution of Stories. IEEE Transactions on Visualization and Computer Graphics 19(12), 2436–2445 (2013)
25. Muelder, C.W., Crnovrsanin, T., Sallaberry, A., Ma, K.L.: Egocentric Storylines for Visual Analysis of Large Dynamic Graphs. In: Proceedings of the 2013 IEEE International Conference on Big Data. pp. 56–62 (2013)
26. Munroe, R.: xkcd, http://xkcd.com/657/ (2009)
27. Nöllenburg, M., Völker, M., Wolff, A., Holten, D.: Drawing Binary Tanglegrams: An Experimental Evaluation. In: Finocchi, I., Hershberger, J. (eds.) Proceedings of the 11th Workshop on Algorithm Engineering and Experiments [ALENEX 2009]. pp. 106–119. Society for Industrial and Applied Mathematics (2009)
28. Ogawa, M., Ma, K.L.: Software Evolution Storylines. In: Telea, A.C. (ed.) Proceedings of the 5th International Symposium on Software Visualization [SOFTVIS’10]. pp. 35–42. ACM (2010)

9. Sugiyama, K., Tagawa, S., Toda, M.: Methods for Visual Understanding of Hier-archical System Structures. IEEE Transactions on Systems, Man, and Cybernetics11(2), 109–125 (1981)30. Tanahashi, Y.: Movie data set. http://vis.cs.ucdavis.edu/~tanahashi/data_downloads/storyline_visualizations/story_data.tar , 201331. Tanahashi, Y., Hsueh, C.H., Ma, K.L.: An Eﬃcient Framework for GeneratingStoryline Visualizations from Streaming Data. IEEE Transactions on Visualizationand Computer Graphics 21(6), 730–742 (2015)32. Tanahashi, Y., Ma, K.L.: Design Considerations for Optimizing Storyline Visu-alizations. IEEE Transactions on Visualization and Computer Graphics 18(12),2679–2688 (2012)33. Vehlow, C., Beck, F., Auwärter, P., Weiskopf, D.: Visualizing the Evolution of Com-munities in Dynamic Graphs. Computer Graphics Forum 34(1), 277–288 (2015)34. Wotzlaw, A., Speckenmeyer, E., Porschen, S.: Generalized k -ary tanglegrams onlevel graphs: A satisﬁability-based approach and its evaluation. Discrete AppliedMathematics 160(16–17), 2349–2363 (2012)35. Zimmer, K.: Ein Branch-and-Cut-Algorithmus für Mehrschichten-Kreuzungsmini-mierung. Master’s thesis, Institut für Informatik, Universität zu Köln (2013)-ary tanglegrams onlevel graphs: A satisﬁability-based approach and its evaluation. Discrete AppliedMathematics 160(16–17), 2349–2363 (2012)35. Zimmer, K.: Ein Branch-and-Cut-Algorithmus für Mehrschichten-Kreuzungsmini-mierung. Master’s thesis, Institut für Informatik, Universität zu Köln (2013)