A Dynamic Data Structure for Temporal Reachability with Unsorted Contact Insertions
Luiz F. Afra Brito, Marcelo Albertini, Arnaud Casteigts, Bruno A. N. Travençolo
Federal University of Uberlândia, Brazil, and University of Bordeaux, France
February 9, 2021
Abstract
Temporal graphs represent interactions between entities over time. These interactions may be direct (a contact between two nodes at some time instant), or indirect, through sequences of contacts called temporal paths (journeys). Deciding whether an entity can reach another through a journey is useful for various applications in communication networks and epidemiology, among other fields. In this paper, we present a data structure which maintains temporal reachability information under the addition of new contacts (i.e., triplets (u, v, t) indicating that node u and node v interacted at time t). In contrast to previous works, the contacts can be inserted in arbitrary order (in particular, non-chronologically), which corresponds to systems where the information is collected a posteriori, e.g., when trying to reconstruct contamination chains among people. The main component of our data structure is a generalization of transitive closure called timed transitive closure (TTC), which allows us to maintain reachability information relative to all nested time intervals, without storing all these intervals, nor the journeys themselves.
TTCs are of independent interest and we study a number of their general properties. Let n be the number of nodes and τ be the number of timestamps in the lifetime of the temporal graph. Our data structure answers reachability queries regarding the existence of a journey from a given node to another within a given time interval in time O(log τ); it has an amortized insertion time of O(n² log τ); and it can reconstruct a valid journey that witnesses reachability in time O(k log τ), where k < n is the maximum number of edges of this journey. Finally, the space complexity of our reachability data structure is O(n² τ), which remains within the worst-case size of the temporal graph itself.

1 Introduction

Temporal graphs represent interactions between entities over time. These interactions often appear in the form of contacts at specific timestamps. Moreover, entities can also interact indirectly with each other by chaining several contacts over time. For example, in a communication network, devices that are physically connected can send new messages or propagate received ones; thus, by first sending a new message and, then, repeatedly propagating messages over time, remote entities can communicate indirectly. Time-respecting paths in temporal graphs are known as temporal paths, or simply journeys, and when a journey exists from one node to another node, we say that the first can reach the second.

In a computational environment, it is often useful to check whether entities can reach each other. Investigations on temporal reachability have been used for characterizing mobile and social networks [18]; for validating protocols and better understanding communication networks [5, 20]; for checking the existence of trajectories and improving flow in transportation networks [22, 21, 3]; for assessing future states of ecological networks [14]; and for making plans for agents using automation networks [4].
Beyond the sole reachability, some applications require the ability to reconstruct a concrete journey if one exists. For example, journey reconstruction has been used for finding and visualizing detailed trajectories in transportation networks [22, 9, 27, 11]; for visualizing system dynamics [12]; and for matching temporal patterns in temporal graph databases [15].

In standard graphs, the problem of maintaining reachability information under various modifications of the graph is known as dynamic connectivity and has been extensively studied [1, 10, 7, 28, 17, 19]. Here, the adjective dynamic does not refer to the temporal nature of the network; it refers to the fact that the computed information is to be updated after the input graph is changed. This maintenance is performed by means of a dynamic data structure, which stores intermediate information to speed up the query time (and its own update time after a change). Three types of dynamic data structures are classically considered, depending on the type of change allowed, namely incremental (insertion only), decremental (deletion only), and fully-dynamic (both). Typically, the elements to be inserted and removed in the classical version are the edges of the graph.

In the case of temporal graphs, the elements to be inserted are not edges but contacts, which are edges together with a timestamp, indicating that the two corresponding nodes interacted at this particular time. An important aspect of data structures which manipulate contacts is whether the order of insertion respects the order of the interactions themselves, i.e., whether the insertions are chronological. Algorithms for updating reachability information under the assumption that the input is chronological have been proposed [2, 20]. However, this assumption does not capture important use cases where the contacts are collected in an unpredictable order and the reachability information is updated afterwards.
For instance, during epidemics, outdated information containing the interaction details among infected and non-infected individuals is reported in arbitrary order. This information is then periodically queried in order to better understand the dissemination process, and to take appropriate measures for reducing reachability and identifying sources of contamination [25, 24, 8, 16].

Motivated by such scenarios, we investigate the problem of maintaining an incremental data structure for temporal reachability, where the insertions of contacts are made in arbitrary order. A naive approach is to store and update the temporal graph itself (e.g., as a set of contacts), then run standard journey computation algorithms like [26] when a query is made. However, the goal of a data structure is to reduce the computational cost of the queries based on pre-computing intermediate information. In fact, data structures typically offer a tradeoff between query time, update time, and space. To the best of our knowledge, the only existing work supporting non-chronological contact insertions and exploiting intermediate representations for speeding up reachability queries in temporal graphs is [23]. The solution in [23] relies on maintaining a directed acyclic graph (DAG) in which every original vertex is possibly copied up to τ times (where τ is the number of timestamps), and a journey exists from u to v in the interval [t, t′] if and only if vertex u_t can reach vertex v_t′ in the DAG. Some path preprocessing is additionally considered, which results in an average speed-up for reachability queries. However, the worst-case query time corresponds to a standard path search (e.g., depth-first search) in the DAG, which takes Θ(n² τ) time in the case of dense temporal graphs (whose number of contacts is of the same order). The space complexity (size of the DAG) also corresponds essentially to the number of contacts, thus Θ(n² τ) in the worst case.
Finally, the update time upon insertion is quite efficient, because the DAG representation allows its effect to remain local. If one ignores the cost of path preprocessing in [23] (as we focus on worst-case analysis), it only takes O(1) time to update the DAG if the corresponding nodes are already known, and up to Θ(τ) otherwise, due to the creation of (up to) τ copies of the new nodes.

1.1 Contributions

In this paper, we consider the problem of maintaining reachability information through a data structure that supports the following four operations, where by convention, G is a temporal graph, u and v are vertices of G, and t, t₁, and t₂ are timestamps:

• add_contact(u, v, t): Update information based on a contact from u to v at time t
• can_reach(u, v, t₁, t₂): Return true if u can reach v within the interval [t₁, t₂]
• is_connected(t₁, t₂): Return true if G restricted to the interval [t₁, t₂] is temporally connected, i.e., all vertices can reach each other within the interval [t₁, t₂]
• reconstruct_journey(u, v, t₁, t₂): Return a journey (if one exists) from u to v occurring within the interval [t₁, t₂]

For generality, we consider directed contacts (timed arcs). Furthermore, if t₁ and t₂ are omitted in the above operations, then the entire lifetime of G is considered. The challenge in realizing these operations is to answer queries as fast as possible, while keeping space consumption and update time at reasonable levels. The worst-case complexities are as follows: the query operations, can_reach(u, v, t₁, t₂) and is_connected(t₁, t₂), run, respectively, in O(log τ) and O(n² log τ) time; the update operation, add_contact(u, v, t), runs in O(n² log τ) amortized time; and the retrieval operation, reconstruct_journey(u, v, t₁, t₂), runs in O(k log τ) time, where k < n is the length of the resulting journey. The worst-case space complexity remains within the worst-case size of the temporal graph itself, namely O(n² τ).
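To fix the intended semantics of these operations before introducing the TTC, they can be illustrated by a deliberately naive baseline that stores raw contacts and answers queries with a time-respecting search. This is precisely the kind of approach our data structure improves upon; the Python rendering, the class name, and all implementation details below are our own illustration, not the paper's algorithm.

```python
from itertools import product

class NaiveTemporalReachability:
    """Naive baseline: store raw contacts, search journeys at query time.
    A contact (u, v, t) is a directed timed arc; consecutive contacts of a
    journey must satisfy t_next >= t_prev + delta."""

    def __init__(self, delta=1):
        self.delta = delta
        self.contacts = set()
        self.vertices = set()

    def add_contact(self, u, v, t):
        self.contacts.add((u, v, t))
        self.vertices.update((u, v))

    def _earliest_arrival(self, u, v, t1, t2):
        # Fixpoint relaxation: best[x] = earliest time we can be at x.
        best = {u: t1}
        changed = True
        while changed:
            changed = False
            for (a, b, t) in self.contacts:
                if a in best and best[a] <= t and t + self.delta <= t2:
                    if t + self.delta < best.get(b, float("inf")):
                        best[b] = t + self.delta
                        changed = True
        return best.get(v)

    def can_reach(self, u, v, t1, t2):
        return u == v or self._earliest_arrival(u, v, t1, t2) is not None

    def is_connected(self, t1, t2):
        return all(self.can_reach(a, b, t1, t2)
                   for a, b in product(self.vertices, repeat=2))
```

Every query re-scans all contacts, whereas the data structure of this paper answers can_reach in O(log τ); the baseline only serves to pin down what the operations are supposed to return.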
Overall, the space complexity is comparable to that of [23], while the query time is much faster and the update time is slower.

The core of our data structure is a component called the timed transitive closure (TTC), which generalizes the classical notion of a transitive closure (TC). Classical TCs capture reachability information among vertices over the entire lifetime of the network. They are classically encoded as a static directed graph where the existence of an arc from u to v implies that there is a journey from u to v in the temporal graph. If one is not interested in querying reachability for specific subintervals, and if the contacts are inserted in chronological order, then TCs are actually sufficient for maintaining temporal reachability information (see e.g. [2]). A generalization of TC has also been considered in [20], which allows queries to be parametrized by a maximum journey duration; however, basic journey information, such as departure and arrival times, is not known, and the computation of the structure requires the information to be processed at once and chronologically (i.e., subsequent updates are not supported).

In the unsorted (i.e., non-chronological) case, TCs do not provide enough information to decide whether a new contact (possibly occurring at any point in history) can be composed with known journeys. To address this need, we introduce a generalization of TCs called timed transitive closures (TTCs), which store information regarding the availability of journeys for a well-chosen set of time intervals, without storing the journeys themselves. We study the general properties of
TTCs, and we prove, in particular, that one can restrict the number of intervals considered to O(τ) for any pair of nodes (as opposed to O(τ²)), with immediate consequences on the space complexity of a data structure based on TTCs. This information is then exploited by our data structure algorithms.
This paper is organized as follows. In Section 2, we present basic definitions. In Section 3, we introduce timed transitive closures, study their basic properties, and provide a number of low-level primitives for manipulating them. In Section 4, we describe the algorithms that perform each operation of our data structure based on TTCs, together with their running time complexities. Finally, Section 5 concludes with some remarks and open questions.
2 Definitions

Following the formalism in [6], a temporal graph can be generally represented by a tuple G = (V, E, T, ρ, ζ), where V is a set of vertices, E ⊆ V × V is a set of edges, T is the time interval over which the network exists (its lifetime), ρ : E × T → {0, 1} is a presence function that expresses whether a given edge is present at a given time instant, and ζ : E × T → 𝕋 is a latency function that expresses the duration of an interaction for a given edge at a given time, where 𝕋 is the time domain (typically ℝ or ℕ). In this paper, we consider a setting where E is a set of arcs (directed edges), the time domain 𝕋 is ℕ (time is discrete), and T = [1, τ] ⊆ 𝕋 (the lifetime contains τ timestamps). The latency function is constant, namely ζ = δ, where δ is any fixed non-negative integer (typically 0 or 1). We call (u, v, t) a contact in G if ρ((u, v), t) = 1. We use the short-hand notation G[t₁, t_k] when restricting the lifetime of G to a subinterval [t₁, t_k] ⊆ T, and call G[t₁, t_k] a temporal subgraph of G. Finally, the static graph G = (V, E) is called the underlying graph of the temporal graph.

Reachability in temporal graphs is defined in a time-respecting way, by requiring that a path travels along increasing times (resp. non-decreasing times) if δ ≥ 1 (resp. δ = 0). These paths are called temporal paths or journeys, interchangeably.

Definition 1 (Journey). A journey from u to v in G is a sequence of contacts J = ⟨c₁, c₂, . . . , c_k⟩, whose sequence of underlying arcs forms a valid (u, v)-path in the underlying graph G, and for each contact c_i = (u_i, v_i, t_i), it holds that ρ((u_i, v_i), t_i) = 1 and t_{i+1} ≥ t_i + δ for each i ∈ [1, k − 1]. Additionally, we say that departure(J) = t₁, arrival(J) = t_k + δ, and duration(J) = arrival(J) − departure(J). A journey is trivial if it consists of a single contact.
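Definition 1 translates directly into executable form. Encoding a contact as a triple (u, v, t) as in the text, a minimal sketch could look as follows; the function names are ours, and the presence condition ρ((u_i, v_i), t_i) = 1 is implicit, since the triples are assumed to be actual contacts of G.

```python
def is_journey(contacts, delta=1):
    """Definition 1: underlying arcs must chain head-to-tail, and consecutive
    timestamps must satisfy t_{i+1} >= t_i + delta (non-decreasing when
    delta = 0, strictly increasing when delta >= 1)."""
    if not contacts:
        return False
    for (u1, v1, t1), (u2, v2, t2) in zip(contacts, contacts[1:]):
        if v1 != u2 or t2 < t1 + delta:
            return False
    return True

def departure(J):
    return J[0][2]

def arrival(J, delta=1):
    return J[-1][2] + delta

def duration(J, delta=1):
    return arrival(J, delta) - departure(J)
```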
Definition 2 (Reachability). A vertex u can reach a vertex v within time interval [t₁, t₂] iff there exists a journey J from u to v in G[t₁, t₂] (i.e., such that departure(J) ≥ t₁ and arrival(J) ≤ t₂).

The (standard) transitive closure (TC) of a temporal graph G is a static directed graph G∗ = (V, E∗) such that (u, v) ∈ E∗ if and only if u can reach v in G. This notion is illustrated in Figure 1. As already explained, if the contacts are discovered in chronological order, then G∗ can be updated incrementally to compute the entire reachability information of G [2]. In the unsorted case, however, this notion is not sufficient, because it does not allow one to decide if a new contact can be composed with previously-known journeys, which motivates the definition of more powerful objects.

Figure 1: (Left) A temporal graph G on four vertices V = {a, b, c, d}, where the presence times of the arcs are depicted by labels. Whether δ = 0 or 1, this graph has only two non-trivial journeys, namely J₁ = ⟨(a, b, ·), (b, d, ·)⟩ and J₂ = ⟨(a, c, ·), (c, d, ·)⟩. (Right) Transitive closure G∗. Note that J₁ and J₂ are represented by the same arc in G∗ (the two contacts from b to d as well).

3 Timed Transitive Closures

In this section, we describe an extension of the concept of transitive closure called the timed transitive closure (TTC). The purpose of
TTCs is to encode reachability information among the vertices, parametrized by time intervals, so that one can subsequently decide if a new contact occurring anywhere in history can be composed with existing journeys. The main components of
TTCs are called reachability tuples (R-tuples). We introduce a number of operators on R-tuples, such as inclusion and concatenation, and describe their role in the construction and maintenance of a TTC.

3.1 Reachability tuples (R-tuples)

Just as the number of paths in a static graph, the number of journeys in a temporal graph could be too large to be stored explicitly (typically, factorial in n). To avoid this problem, R-tuples capture the fact that a node can reach another within a certain time interval without storing the corresponding journeys. Thus, a single R-tuple may capture the reachability information corresponding to many possible journeys. We distinguish between two versions of R-tuples, namely (existential) R-tuples and constructive R-tuples, the latter adding information for reconstructing a journey that witnesses reachability.

Existential R-tuples

The following definitions are given in the context of a temporal graph G whose vertex set is V, lifetime is T = [1, τ], and latency is δ.

Definition 3 (R-tuple). An (existential) R-tuple is a quadruplet r = (u, v, t⁻, t⁺), where u and v are vertices in G, and t⁻ and t⁺ are timestamps in T. It encodes the fact that node u can reach node v through a journey J such that departure(J) = t⁻ and arrival(J) = t⁺. If several such journeys exist, then they are all captured by the same R-tuple.

The set of journeys captured by an R-tuple r is denoted by J(r), and we say that r represents these journeys. An R-tuple is trivial when it represents a trivial journey (i.e., a single contact). Trivial R-tuples thus have the form (u, v, t, t + δ) for some t. The following relations and operations are quite natural to define.

Definition 4 (Precedence ≺). An interval I₁ = [t₁⁻, t₁⁺] precedes an interval I₂ = [t₂⁻, t₂⁺], noted I₁ ≺ I₂, if t₁⁺ ≤ t₂⁻. Given two R-tuples r₁ = (u₁, v₁, t₁⁻, t₁⁺) and r₂ = (u₂, v₂, t₂⁻, t₂⁺), r₁ precedes r₂, noted r₁ ≺ r₂, if t₁⁺ ≤ t₂⁻ and v₁ = u₂.
Intuitively, the precedence relation among R-tuples tells us that the journeys they represent can be composed, leading to another R-tuple as follows:

Definition 5 (Concatenation ·). Given two R-tuples r₁ = (u₁, v₁, t₁⁻, t₁⁺) and r₂ = (u₂, v₂, t₂⁻, t₂⁺) such that r₁ ≺ r₂, the concatenation of r₁ with r₂ is the R-tuple r₁ · r₂ = (u₁, v₂, t₁⁻, t₂⁺).

The natural inclusion among intervals extends to R-tuples as follows:

Definition 6 (Inclusion ⊆). Given two R-tuples r₁ = (u₁, v₁, t₁⁻, t₁⁺) and r₂ = (u₂, v₂, t₂⁻, t₂⁺), r₁ ⊆ r₂ if and only if u₁ = u₂, v₁ = v₂, and [t₁⁻, t₁⁺] ⊆ [t₂⁻, t₂⁺] (that is, t₂⁻ ≤ t₁⁻ ≤ t₁⁺ ≤ t₂⁺). If neither r₁ ⊆ r₂ nor r₂ ⊆ r₁ (or if the vertices are different), then r₁ and r₂ are called incomparable.

Intuitively, if r₁ ⊆ r₂, then any of the journeys represented by r₂ could be replaced by a (possibly faster) journey represented by r₁. More precisely:

Lemma 1.
Let u and v be two nodes in V. Let I₁ = [t₁⁻, t₁⁺] and I₂ = [t₂⁻, t₂⁺] be two subintervals of T such that I₁ ⊆ I₂. If u can reach v within I₁, then u can reach v within I₂.

Proof. The proof is straightforward; we give it for completeness. Let r be the R-tuple (u, v, t₁⁻, t₁⁺) and let J be any of the journeys in J(r). One can reach v from u within I₂ through the three following steps: (1) wait at u from t₂⁻ to t₁⁻, (2) travel from u to v using J, and finally (3) wait at v from t₁⁺ to t₂⁺.

The main consequence of Lemma 1 is that if r₁ ⊆ r₂, then r₂ is redundant for answering reachability queries from u to v.

Definition 7 (Redundancy). Let S be a set of R-tuples and let r ∈ S; r is called redundant in S if there exists r′ ∈ S such that r′ ⊆ r. A set with no redundant R-tuple is called irredundant.

An R-tuple that is non-redundant in a set is also called minimal (in that set). It is natural to ask what the maximum size of an irredundant set of R-tuples could be, with consequences for the space complexity of a reachability data structure based on R-tuples. It turns out that this number is always significantly smaller than the number of possible R-tuples.

Lemma 2.
The maximum size of an irredundant set of R-tuples for G is Θ(n² τ).

Proof. First, we prove that the maximum number of pairwise incomparable R-tuples is O(n² τ). Then, we show that this bound is tight, as some graphs induce Θ(n² τ) incomparable R-tuples.

(1) Upper bound: There are Θ(n²) ordered pairs of vertices. Thus, it is sufficient to show that for each pair (u, v), the number of incomparable R-tuples whose starting vertex is u and whose ending vertex is v is O(τ). Let S be an irredundant set of such R-tuples, and let r₁ = (u, v, t₁⁻, t₁⁺) and r₂ = (u, v, t₂⁻, t₂⁺) be any two R-tuples in S. If t₁⁻ = t₂⁻, then either r₁ ⊆ r₂ or r₂ ⊆ r₁, thus S is redundant (contradiction). As a result, all departure timestamps t_i⁻ belonging to the R-tuples in S are different, which implies that |S| ≤ τ.

(2) Tightness: Consider the complete temporal graph K_{n,τ} on n vertices in which every edge is present in all timestamps in [1, τ]. In such a graph, there are consequently Θ(n² τ) contacts, each of which is a trivial journey. Now, observe that either these journeys connect different vertices, or their intervals are incomparable (same duration with different starting times), thus none of them is redundant with the others.

Given a graph G and a set S of R-tuples representing all the journeys of G, the subset S′ ⊆ S of all minimal R-tuples is called the representative R-tuples of G, denoted by R(G). We also write R(u, v) for those R-tuples in R(G) whose source is u and destination is v. From the proof of Lemma 2, we extract:

Observation 1.
Every contact of G is present in R(G) in the form of a trivial R-tuple.

Observation 1 implies that R(G) is a non-lossy representation, as G itself is contained in it. The downside is that its space complexity is at least as large as the number of contacts in G. Observe that, up to a constant factor, it can however not be worse than the worst number of contacts, since there may exist up to Θ(n² τ) contacts and irredundant sets cannot exceed this size (Lemma 2). In other words, in dense temporal graphs, the reachability information offered by R-tuples is essentially free in space.

Constructive R-tuples

The data structure considered in this work has four operations, namely add_contact, can_reach, is_connected, and reconstruct_journey. The first three operations can be dealt with using only existential R-tuples. The fourth operation could benefit from storing a small amount of additional information into the R-tuple.

Definition 8 (Constructive R-tuple). A constructive R-tuple r = (u, v, t⁻, t⁺, w) contains the same information as an existential R-tuple, plus a vertex w such that at least one journey J ∈ J(r) starts with the contact (u, w, t⁻). Node w is called the successor of u in r (resp., in J).

Most of the definitions and lemmas from Section 3.1 apply unchanged to constructive R-tuples. In particular, the definition of redundant R-tuples applies without considering the successor field. Indeed, if two constructive R-tuples differ only by the successor node, then they are seen as equivalent and any of the two can be discarded. As for the concatenation of two constructive R-tuples r₁ = (u₁, v₁, t₁⁻, t₁⁺, w₁) and r₂ = (u₂, v₂, t₂⁻, t₂⁺, w₂), provided r₁ ≺ r₂, we additionally require that the resulting R-tuple adopts the successor of r₁ as its own successor; that is, r₁ · r₂ = (u₁, v₂, t₁⁻, t₂⁺, w₁). For simplicity, whenever constructive R-tuples are not needed, we describe the algorithms using existential R-tuples.
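The operators of Definitions 4 to 7 are small enough to state as code. The following sketch is our own Python rendering, with t_dep and t_arr standing for t⁻ and t⁺, and mirrors the definitions one-to-one:

```python
from typing import NamedTuple

class RTuple(NamedTuple):
    u: str      # source vertex
    v: str      # destination vertex
    t_dep: int  # departure (t-)
    t_arr: int  # arrival (t+)

def precedes(r1: RTuple, r2: RTuple) -> bool:
    # Definition 4: r1 precedes r2 iff t1+ <= t2- and v1 = u2
    return r1.t_arr <= r2.t_dep and r1.v == r2.u

def concat(r1: RTuple, r2: RTuple) -> RTuple:
    # Definition 5: compose the journeys represented by r1 and r2
    assert precedes(r1, r2)
    return RTuple(r1.u, r2.v, r1.t_dep, r2.t_arr)

def includes(r1: RTuple, r2: RTuple) -> bool:
    # Definition 6: r1 is contained in r2 (same endpoints, nested interval)
    return (r1.u == r2.u and r1.v == r2.v
            and r2.t_dep <= r1.t_dep and r1.t_arr <= r2.t_arr)

def irredundant(S):
    # Definition 7: keep only the minimal R-tuples of S
    return {r for r in S
            if not any(r2 != r and includes(r2, r) for r2 in S)}
```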
3.2 Timed transitive closure

Informally, the timed transitive closure of a temporal graph G is a multigraph that captures the existence of journeys within all possible time intervals, based on irredundant R-tuples.

Definition 9 (Timed transitive closure). Given a graph G, the timed transitive closure of G, noted TTC(G), is a (static) directed multigraph on the same set of vertices, whose arcs correspond to the representative R-tuples of G.

Figure 2: Timed transitive closure TTC(G) of the temporal graph G in Figure 1, considering δ = 1. On the left, the version with existential R-tuples, whose intervals are depicted by labels. On the right, the version with constructive R-tuples, depicting also the successors.

Figure 2 shows two examples of TTCs, both corresponding to the temporal graph of Figure 1 (one for existential R-tuples, the other for constructive R-tuples). Algorithmically, a TTC provides most of the support needed to realize the high-level operations of our data structure. For example, the operation can_reach(u, v, t₁, t₂) amounts to testing if there exists an arc whose associated R-tuple is (u, v, t⁻, t⁺) with [t⁻, t⁺] ⊆ [t₁, t₂]. The operation is_connected(t₁, t₂) can be realized by performing such a test for every pair of vertices. The operation add_contact(u, v, t) reduces to adding a new arc to TTC(G) if no smaller interval already captures this information. If the new arc is added, then some other arcs may become redundant and should be removed; some others may also be created by composition. This operation is therefore the most critical. Finally, if constructive R-tuples are used, then an actual journey may be reconstructed quite efficiently from TTC(G) when reconstruct_journey(u, v, t₁, t₂) is called, by retrieving a constructive R-tuple (u, v, t⁻, t⁺, w) such that [t⁻, t⁺] ⊆ [t₁, t₂] and unfolding the corresponding journey inductively, by replacing u with the successor vertex w and t⁻ with t⁻ + δ in each step.

All algorithms for these operations are described in Section 4. Before doing so, we present an explicit encoding of TTCs based on adjacency matrices and binary search trees (BSTs). In order for the high-level algorithms to remain independent from this particular choice, we define a set of primitives for manipulating the TTC, which are used by the high-level algorithms of Section 4.
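The inductive unfolding of a journey from constructive R-tuples can be sketched as follows. The helper lookup(u, v, t1, t2) is hypothetical: it stands for retrieving from the TTC some minimal constructive R-tuple (u, v, t⁻, t⁺, w) with [t⁻, t⁺] ⊆ [t₁, t₂], or None if none exists.

```python
def reconstruct_journey(lookup, u, v, t1, t2, delta=1):
    """Unfold a journey from constructive R-tuples: emit the first contact
    (u, w, t-) of a witnessing journey, then continue from the successor w
    with the departure bound advanced to t- + delta."""
    journey = []
    while u != v:
        found = lookup(u, v, t1, t2)
        if found is None:
            return None
        t_dep, _t_arr, w = found
        journey.append((u, w, t_dep))
        u, t1 = w, t_dep + delta
    return journey
```

Each iteration performs one retrieval, which is where the O(k log τ) bound comes from: k retrievals at O(log τ) each.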
3.2.1 Encoding and primitives

We encode the TTC by an n × n matrix, in which every entry (i, j) points to a self-balanced binary search tree (BST) denoted by T(i, j). The nodes in this tree contain all the time intervals corresponding to R-tuples in R(i, j). From Lemma 2, we know that a tree T(u, v) contains up to τ nodes. In addition, all these intervals are incomparable, thus one can use any of their boundaries (departure or arrival) as the sorting key of the BST. Note that retrieving T(u, v) within the matrix takes constant time, as the cells of a matrix are directly accessed. Also recall that finding the largest key below (resp. the smallest key above) a certain value takes O(log τ) time. Similarly, inserting a new element (in our case, an interval) takes O(log τ) time. Finally, observe that several types of BSTs (e.g., red-black trees) can self-balance without impacting the asymptotic cost of insertions.

We provide the following low-level operations for manipulating TTCs: (1) find_next(u, v, t) returns the node containing the earliest interval [t⁻, t⁺] in T(u, v) such that t⁻ ≥ t, if any, and nil otherwise; symmetrically, (2) find_previous(u, v, t) returns the node containing the latest interval [t⁻, t⁺] in T(u, v) such that t⁺ ≤ t, if any, and nil otherwise; finally, (3) insert(u, v, t⁻, t⁺) inserts a new node containing the interval [t⁻, t⁺] in T(u, v) and performs some operations for maintaining the property that all intervals in T(u, v) are minimal.

Figure 3: Basic steps to perform insert(u, v, I), where I = [t⁻, t⁺]: (a) find, (b) remove, (c) insert. First, in (a), an algorithm must find the candidate intervals that could become redundant after inserting I. These intervals are exactly the ones between I₁ = find_previous(u, v, t⁺) and I₂ = find_next(u, v, t⁻). Note that there are cases in which I₁ or I₂ do not exist. Next, in (b), all intervals I′ between (and including) I₁ and I₂ such that I ⊆ I′ must be removed. Finally, in (c), the algorithm inserts I in the correct place.

Let us now describe the algorithms that perform these operations, along with their time complexities. The algorithm for find_next(u, v, t) searches T(u, v) recursively, by comparing t with the departure t⁻ of the current node interval [t⁻, t⁺]. If t⁻ is equal to or greater than t, then the current node is a candidate answer. The algorithm then compares the current candidate with the previous one, keeps the one containing the smallest (earliest) t⁻, and descends into the left child. Otherwise, if t⁻ is smaller than t, it simply descends into the right child. As soon as a leaf is reached (and visited), the algorithm returns the current candidate as the answer. The algorithm for find_previous(u, v, t) works symmetrically. The time complexities of both algorithms correspond to the depth of the tree, which is O(log τ).

The algorithm for insert(u, v, t⁻, t⁺) finds and removes any potential node with interval I_i such that [t⁻, t⁺] ⊆ I_i, then it inserts a new node containing [t⁻, t⁺] using a standard BST insertion. Figure 3 gives a linear representation of the intervals in T(u, v) while performing this operation. A naive implementation of this operation would consist of searching and removing each corresponding node independently.
However, this would lead to a complexity of O(d log τ) time, where d is the number of nodes removed, which is up to O(τ). We use a non-standard approach that makes it feasible in O(d + log τ) time only. The strategy is to identify in T(u, v) the nodes containing, respectively, the boundary intervals I₁ = find_previous(u, v, t⁺) and I₂ = find_next(u, v, t⁻), which correspond to the first and last nodes to be removed (note that the parameters are indeed t⁺ and t⁻, not the reverse). Then, every node containing intervals in this range is removed using the technique outlined in the proof of Lemma 3.

Lemma 3.
In the worst case, the cost of the insert operation is O(d + log τ), where d is the number of elements removed from T(u, v). The amortized cost of an insertion is O(log τ).

Proof. The range of intervals to be removed is characterized by two boundary intervals I₁ and I₂, which can be found by calling both find_next and find_previous a single time, which takes O(log τ) time. The final insertion of the input interval in the BST also takes O(log τ). The difficult part is thus the removal of redundant intervals prior to this insertion (illustrated in Figure 3). Let d be the number of intervals in the deletion range. We start by recalling the main ideas of range deletion in a BST (see [13] for a pedagogical explanation), then we discuss their use in the particular case of balanced BSTs, and finally we explain why the claimed cost is correct despite the fact that balance may be lost after the deletion. Let I_A be the common ancestor of I₁ and of I₂ (possibly equal to one of these). The process can be split into two phases: the first one is to walk upward from I₁ to I_A, and the second is to walk upward (again) from I₂ to I_A. As either walk proceeds, potential deletions are performed in intermediate tree nodes. Some of these deletions remove the node itself, replacing it by one of its children in constant time. The intermediate branches are cut without being explored. The final cost of O(d + log τ) actually follows from the cumulated length of both walks. The downside of this technique (which may explain why it is not standard in balanced BSTs) is that the resulting tree may have lost its balance if most of the nodes are deleted. However, as the depth cannot increase, and as we only need that it remains O(log τ) for subsequent use of the tree, the solution is good enough for our needs.
Finally, observe that the number of intervals removed cannot exceed the number of intervals previously inserted, which is why the amortized cost of an insertion remains O(log τ).

Additionally, we define the following basic operations:

• N∗_out(u): Returns the set of vertices {v₁, v₂, . . . , v_k} such that there exists at least one arc from u to v_i in the TTC
• N∗_in(u): Returns the set of vertices {v₁, v₂, . . . , v_l} such that there exists at least one arc from v_i to u in the TTC
Both operations can be realized in O(n) time, by traversing the corresponding row (resp. column) of the matrix and testing if the corresponding tree is empty.

4 Algorithms

In this section, we describe the algorithms which perform the four operations of our data structure, previously described in Section 1.1. These operations are can_reach(u, v, t₁, t₂), is_connected(t₁, t₂), add_contact(u, v, t), and (optionally) reconstruct_journey(u, v, t₁, t₂). For simplicity, the first three algorithms are presented using existential R-tuples only (however, they are straightforwardly adaptable to constructive R-tuples). All the algorithms rely on the primitives defined in Section 3.2.1 for manipulating the TTC in an abstract way.
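As a concrete reference point for these primitives, one tree T(u, v) can be mimicked with a sorted Python list in place of the balanced BST. This is a simplification made purely for illustration: it preserves the interface and the irredundancy invariant, but not the O(d + log τ) range deletion of Lemma 3; the class and method bodies are our own sketch.

```python
import bisect

class IntervalSet:
    """Stand-in for one tree T(u, v): a sorted list of incomparable
    intervals. Since no interval contains another, sorting by departure
    also sorts by arrival, so either boundary works as the search key."""

    def __init__(self):
        self.ivs = []  # sorted list of (t_dep, t_arr)

    def find_next(self, t):
        # earliest interval whose departure is >= t, or None
        i = bisect.bisect_left(self.ivs, (t,))
        return self.ivs[i] if i < len(self.ivs) else None

    def find_previous(self, t):
        # latest interval whose arrival is <= t, or None
        arrivals = [a for (_, a) in self.ivs]
        i = bisect.bisect_right(arrivals, t)
        return self.ivs[i - 1] if i > 0 else None

    def insert(self, t_dep, t_arr):
        # keep only minimal intervals: skip a redundant newcomer, and
        # evict existing intervals that contain the newcomer
        if any(d >= t_dep and a <= t_arr for (d, a) in self.ivs):
            return  # some existing interval is inside [t_dep, t_arr]
        self.ivs = [(d, a) for (d, a) in self.ivs
                    if not (d <= t_dep and t_arr <= a)]
        bisect.insort(self.ivs, (t_dep, t_arr))
```

The full TTC would then be an n × n matrix whose entry (u, v) holds one such structure.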
4.1 Query operations

The algorithm for performing can_reach(u, v, t₁, t₂) is straightforward. It consists of testing whether T(u, v) contains at least one interval that is included in [t₁, t₂]. This can be done by retrieving [t⁻, t⁺] = find_next(u, v, t₁) and checking that t⁺ ≤ t₂. The cost of this algorithm therefore reduces essentially to that of the operation find_next(u, v, t₁), which takes O(log τ) time. We note that, if [t₁, t₂] = T, then it is sufficient to verify (in constant time) that T(u, v) is not empty. Regarding the operation is_connected(t₁, t₂), a simple way of answering it is to call can_reach(u, v, t₁, t₂) for every pair of vertices, with a resulting time complexity of O(n² log τ). It seems plausible that this strategy is not optimal and could be improved in the future.

4.2 Update operation

The algorithm for add_contact(u, v, t) manages the insertion of a new contact (u, v, t) in the data structure, where (u, v) ∈ E and t ∈ T. To start, the interval corresponding to the trivial journey from u to v over [t, t + δ] is inserted in T(u, v) using the insert primitive. (Recall that this primitive encapsulates the removal of redundant intervals in T(u, v), if any.) Then, the core of the algorithm consists of computing the indirect consequences of this insertion for the other vertices. Namely, if a vertex w⁻ could reach u before time t with latest departure t⁻, and v could reach another vertex w⁺ after time t + δ with earliest arrival t⁺, it follows that w⁻ can now reach w⁺ over the interval [t⁻, t⁺]. Our algorithm consists of enumerating these compositions and inserting them in the TTC. Interestingly, for each predecessor w⁻ of u, only the latest interval ending before t in T(w⁻, u) needs to be considered. The reason is that, in order to compose an earlier journey J with the new contact, we would need to wait at u until time t.
Thus, even though some other journey may start earlier, it would have to wait at u and would thus arrive at the same time (based on a non-minimal interval). Based on this property, our algorithm only searches for the latest interval preceding t for each predecessor of u, and the earliest interval exceeding t + δ for each successor of v. The details are given in Algorithm 1, whose behavior is as follows. In Line 1, the algorithm inserts the interval [t, t + δ] into T(u, v), which corresponds to the trivial journey induced by the new contact. In Lines 2 to 7, for every vertex w⁻ ∈ N*_in(u), it finds the latest interval [t⁻, ·] in T(w⁻, u) that arrives before time t (inclusive) and inserts the composition [t⁻, t + δ] into T(w⁻, v). For the same reasons as above, the algorithm only needs to consider inserting [t⁻, t + δ], because every other possible composition would contain it as a subinterval. In Lines 8 to 11, for every vertex w⁺ ∈ N*_out(v), the algorithm finds the earliest interval [·, t⁺] in T(v, w⁺) that leaves v after time t + δ (inclusive), and inserts the composition [t, t⁺] into T(u, w⁺). In the same way, every other possible composition would contain [t, t⁺] as a subinterval. Finally, in Lines 12 to 14, for all w⁻ ∈ N*_in(u) and w⁺ ∈ N*_out(v), it inserts the composition [t⁻, t⁺] into T(w⁻, w⁺). In order to optimize this last step, the algorithm only considers the subset of N*_in(u) whose reachability to v has been impacted by the new contact, thanks to a dedicated storage D computed in Line 7.

Theorem 4.
The update operation has amortized time complexity O(n² log τ). In the worst case, a single update operation costs O(n² τ) time.

Proof. An insert operation is performed in Line 1. The loop from Line 3 to 7 iterates over O(n) vertices and makes at most one insertion for each. The loop from Line 8 to 14 iterates over O(n) vertices and, for each one, iterates in a nested way over O(n) vertices. For each resulting pair, it performs at most one insert operation. The latter clearly dominates the overall cost of the algorithm, with a cost of O(n²) times the cost of the insert operation, the latter being of amortized time O(log τ) and otherwise of O(d + log τ) time (Lemma 3), with d = O(τ) in the worst case.

Algorithm 1 add_contact(u, v, t)
Require: t ∈ T, u, v ∈ V with u ≠ v
 1: insert(u, v, t, t + δ)
 2: D ← {}
 3: for all w⁻ ∈ N*_in(u) do
 4:   [t⁻, ·] ← find_previous(w⁻, u, t)
 5:   if t⁻ ≠ nil then
 6:     insert(w⁻, v, t⁻, t + δ)
 7:     D ← D ∪ {(w⁻, t⁻)}
 8: for all w⁺ ∈ N*_out(v) do
 9:   [·, t⁺] ← find_next(v, w⁺, t + δ)
10:   if t⁺ ≠ nil then
11:     insert(u, w⁺, t, t⁺)
12:     for all (w⁻, t⁻) ∈ D do
13:       if w⁻ ≠ w⁺ then
14:         insert(w⁻, w⁺, t⁻, t⁺)

4.3 Reconstruction operation

The algorithm for the operation reconstruct_journey(u, v, t₁, t₂) reconstructs a journey from vertex u to vertex v whose contact timestamps are contained in [t₁, t₂]. As explained in Section 3.1.2, existential R-tuples can be augmented by a successor field that indicates which vertex comes next in (at least one of) the journeys represented by the R-tuple. This information is very useful for reconstruction and has a negligible cost (asymptotically speaking). Concretely, one can make the nodes of the BST store the successor field in addition to the interval. The low-level operations for manipulating the TTC (see Section 3.2.1) are unaffected, and neither are the query and update algorithms in any significant way.
The only subtlety is that when two intervals (nodes) are composed, the successor field of the resulting node corresponds to the successor field of the first node (this was already discussed in terms of R-tuples in Section 3). The goal of the algorithm is thus to reconstruct a journey by unfolding the intervals and successor fields. Details are given in Algorithm 2.

Algorithm 2 reconstruct_journey(u, v, t₁, t₂)
Require: [t₁, t₂] ⊆ T, u, v ∈ V, u ≠ v
 1: ([t⁻, t⁺], w) ← find_next(u, v, t₁)    ▷ node augmented with successor
 2: if the return value is nil or t⁺ > t₂ then
 3:   return nil    ▷ no interval contained in [t₁, t₂] in T(u, v)
 4: J ← {(u, w, t⁻)}
 5: while w ≠ v do
 6:   ([t, ·], w′) ← find_next(w, v, t⁻ + δ)
 7:   J ← J · {(w, w′, t)}
 8:   w ← w′
 9:   t⁻ ← t
10: return J

The first step (Lines 1 to 3) is to retrieve a node in T(u, v) whose interval is contained within [t₁, t₂], if one exists. If several choices exist, the earliest is selected (through calling the find_next primitive). Then, the algorithm iteratively replaces u with the successor and searches for the next interval until the successor is v itself (Lines 5 to 9), gradually adding the corresponding contacts to a journey J (Line 4 and Line 7), which is ultimately returned in Line 10.

Theorem 5. Algorithm 2 has time complexity O(k log τ), where k is the length of the resulting journey.

Proof. The algorithm calls find_next in Line 1. After that, it is known whether a journey can be reconstructed. If so, a journey prefix J is initialized with the first contact of the reconstructed journey (indeed, such a contact must exist due to the minimality of the interval). Then, in the loop from Line 5 to Line 9, the algorithm extends J by one contact for each call to find_next until J contains the entire journey. Overall, find_next is thus called as many times as the length of the reconstructed journey, which corresponds to O(|J| log τ) time. The costs of the other operations are clearly dominated by this cost.
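Putting the update and reconstruction procedures together, the following self-contained Python sketch maintains reachability intervals augmented with successor fields and supports unsorted contact insertions. It is a simplified illustration under assumptions of ours: a uniform contact duration δ (default 1), linear scans instead of the paper's balanced BSTs (so the stated O(log τ) bounds do not hold here), and explicit guards against self-pairs that the pseudocode leaves implicit.

```python
from collections import defaultdict

class TemporalReachability:
    """Sketch of the TTC-based data structure (update + reconstruction).

    T[(u, v)] maps to a list of minimal reachability intervals
    (departure, arrival, successor), where `successor` is the vertex
    visited right after u on a witnessing journey.
    """

    def __init__(self, delta=1):
        self.delta = delta                # contact duration (assumed uniform)
        self.T = defaultdict(list)

    def _insert(self, u, v, a, b, succ):
        ivs = self.T[(u, v)]
        if any(a <= x and y <= b for x, y, _ in ivs):
            return                        # a nested interval already exists
        # drop stored intervals that contain [a, b] (now non-minimal)
        ivs[:] = [iv for iv in ivs if not (iv[0] <= a and b <= iv[1])]
        ivs.append((a, b, succ))
        ivs.sort(key=lambda iv: iv[0])

    def _find_next(self, u, v, t):        # earliest departure >= t
        cands = [iv for iv in self.T[(u, v)] if iv[0] >= t]
        return min(cands, key=lambda iv: iv[0]) if cands else None

    def _find_previous(self, u, v, t):    # latest arrival <= t
        cands = [iv for iv in self.T[(u, v)] if iv[1] <= t]
        return max(cands, key=lambda iv: iv[1]) if cands else None

    def _in_neighbors(self, u):
        return {w for (w, x), ivs in self.T.items() if x == u and ivs}

    def _out_neighbors(self, u):
        return {w for (x, w), ivs in self.T.items() if x == u and ivs}

    def add_contact(self, u, v, t):       # composition step of the update
        d = self.delta
        self._insert(u, v, t, t + d, v)
        D = []
        for wm in self._in_neighbors(u):
            if wm == v:
                continue                  # guard against self-pairs (our choice)
            prev = self._find_previous(wm, u, t)
            if prev is not None:
                tm, _, s = prev
                self._insert(wm, v, tm, t + d, s)
                D.append((wm, tm, s))
        for wp in self._out_neighbors(v):
            if wp == u:
                continue
            nxt = self._find_next(v, wp, t + d)
            if nxt is not None:
                _, tp, _ = nxt
                self._insert(u, wp, t, tp, v)
                for wm, tm, s in D:
                    if wm != wp:
                        self._insert(wm, wp, tm, tp, s)

    def can_reach(self, u, v, t1, t2):
        nxt = self._find_next(u, v, t1)
        return nxt is not None and nxt[1] <= t2

    def reconstruct_journey(self, u, v, t1, t2):  # unfold successor fields
        nxt = self._find_next(u, v, t1)
        if nxt is None or nxt[1] > t2:
            return None
        tm, _, w = nxt
        J = [(u, w, tm)]
        while w != v:
            # guaranteed to exist: the interval found above witnesses a journey
            t, _, w2 = self._find_next(w, v, tm + self.delta)
            J.append((w, w2, t))
            w, tm = w2, t
        return J
```

Note that contacts can be fed in any order: inserting (c, d, 5) before (a, b, 1) and (b, c, 2) still yields the journey a → b → c → d, because each insertion recomposes the affected reachability intervals.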
In general, several journeys may exist that satisfy the query parameters. We observe that thespecific choices made in Algorithm 2 imply additional properties.
Lemma 6.
The journey J returned by Algorithm 2 is a foremost journey within the requested interval (i.e., it arrives at v at the earliest possible time). Furthermore, among all possible foremost journeys, it is also a fastest journey (i.e., the difference between departure time and arrival time is minimized).

Proof.
The fact that J is a foremost journey follows from the call to find_next in Line 1. Indeed, the interval returned by this call corresponds to the earliest departure from u, which happens to also correspond to the earliest arrival at v because the stored intervals are incomparable. J thus achieves the earliest possible arrival time at v in the given interval. And since all the stored intervals are minimal (i.e., they do not contain smaller reachability intervals), it also follows that departure(J) is as late as possible among all the journeys arriving at v at time arrival(J), which means J is as fast as possible among all foremost journeys.

Let us stress that Lemma 6 does not imply that J is both foremost and fastest in the requested interval. It only states that J is a foremost journey, and a fastest one among the possible foremost journeys. Even faster journeys might exist in the requested interval, arriving later at v. The above property is nonetheless convenient, e.g., in communication networks, where a message would arrive at its destination as early as possible while (secondarily) traveling for as little time as possible.

5 Conclusion

We presented in this paper an incremental data structure to solve the dynamic connectivity problem in temporal graphs. Our data structure places a high priority on query time, answering reachability questions in O(log τ) time. Based on the ability to retrieve reachability information for particular time intervals, it supports the insertion of contacts in non-chronological order in O(n² log τ) amortized time (deterministic worst-case O(n² τ) time) and makes it possible to efficiently reconstruct foremost journeys within a given time interval, i.e., in time O(k log τ), where k is the size of the resulting journey. Our algorithms exploit the special features of non-redundant (minimal) reachability information, which we represent through the concept of R-tuples.
The core of our data structure, namely the timed transitive closure (TTC), is itself essentially a collection of non-redundant R-tuples, whose size (and that of the data structure itself) cannot exceed O(n² τ).

The theory of R-tuples, initiated in this paper, poses a number of further questions, some of which are of independent interest, and some leading to possible improvements of the presented algorithms. For example, do R-tuples involving different pairs of vertices possess further interdependence which may reduce the space needed to maintain TTCs? More generally, how restricted are
TTCs intrinsically? On the practical side, can we improve the insertion time for new contacts by using a low-level structure other than a balanced BST? Could the notion of contacts be generalized to contacts of arbitrary duration? Finally, designing efficient data structures for the decremental and fully-dynamic versions of this problem, with unsorted contact insertions and deletions, seems to represent both a significant challenge and a natural extension of the present work, one that would certainly further develop our common understanding of temporal reachability.
Acknowledgements
This study was financed in part by Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001 - under the "CAPES PrInt program" awarded to the Computer Science Post-graduate Program of the Federal University of Uberlândia, as well as the Agence Nationale de la Recherche through ANR project ESTATE (ANR-16-CE25-0009-03).
References

[1] R. Agrawal, A. Borgida, and H. V. Jagadish. "Efficient management of transitive relationships in large data and knowledge bases". In: Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data. SIGMOD '89. Portland, Oregon, USA: Association for Computing Machinery, 1989, pp. 253–262. ISBN: 0897913175.
[2] Matthieu Barjon et al. "Testing temporal connectivity in sparse dynamic graphs". In: CoRR abs/1404.7634 (2014).
[3] L. Bedogni, M. Fiore, and C. Glacet. "Temporal reachability in vehicular networks". In: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. 2018, pp. 81–89.
[4] Daniel Bryce and Subbarao Kambhampati. "A tutorial on planning graph based reachability heuristics". In: AI Magazine.
[5] In: Protocol Specification, Testing and Verification XV: Proceedings of the Fifteenth IFIP WG6.1 International Symposium on Protocol Specification, Testing and Verification, Warsaw, Poland, June 1995. Ed. by Piotr Dembiński and Marek Średniawa. Boston, MA: Springer US, 1996, pp. 35–49. ISBN: 978-0-387-34892-6.
[6] Arnaud Casteigts et al. "Time-varying graphs and dynamic networks". In: International Journal of Parallel, Emergent and Distributed Systems.
[7] In: SIAM Journal on Computing.
[8] Jessica Enright et al. Deleting edges to restrict the size of an epidemic in temporal networks. 2018.
[9] Betsy George, Sangho Kim, and Shashi Shekhar. "Spatio-temporal network databases and routing algorithms: a summary of results". In: Advances in Spatial and Temporal Databases. Ed. by Dimitris Papadias, Donghui Zhang, and George Kollios. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 460–477. ISBN: 978-3-540-73540-3.
[10] Haixun Wang et al. "Dual labeling: answering graph reachability queries in constant time". In: 2006, pp. 75–75.
[11] Khandaker Tabin Hasan et al. "Making sense of time: timeline visualization for public transport schedule". In: Symposium on Human-Computer Interaction and Information Retrieval (HCIR 2011), Washington. 2011.
[12] C. Hurter et al. "Bundled visualization of dynamic graph and trail data". In: IEEE Transactions on Visualization and Computer Graphics.
[13] Remove range of keys from Binary Search Tree in O(s+h). Computer Science Stack Exchange. URL: https://cs.stackexchange.com/q/123535 (version: 2020-04-02).
[14] Alexandre Camargo Martensen, Santiago Saura, and Marie-Josee Fortin. "Spatio-temporal connectivity: assessing the amount of reachable habitat in dynamic landscapes". In: Methods in Ecology and Evolution.
[15] Querying evolving graphs with portal. 2016.
[16] Polina Rozenshtein et al. "Reconstructing an epidemic over time". In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD '16. San Francisco, California, USA: Association for Computing Machinery, 2016, pp. 1835–1844. ISBN: 9781450342322.
[17] S. Seufert et al. "FERRARI: flexible and efficient reachability range assignment for graph indexing". In: 2013, pp. 1009–1020.
[18] John Tang et al. "Characterising temporal distance and reachability in mobile and online social networks". In: SIGCOMM Comput. Commun. Rev. ISSN: 0146-4833.
[19] Hao Wei et al. "Reachability querying: an independent permutation labeling approach". In: The VLDB Journal.
[20] In: Proceedings of the 18th Annual International Conference on Mobile Computing and Networking. Mobicom '12. Istanbul, Turkey: Association for Computing Machinery, 2012, pp. 377–388. ISBN: 9781450311595.
[21] Matthew J. Williams and Mirco Musolesi. "Spatio-temporal networks: reachability, centrality and robustness". In: Royal Society Open Science.
[22] In: 2017, pp. 1283–1294.
[23] H. Wu et al. "Reachability and time-based path queries in temporal graphs". In: 2016, pp. 145–156.
[24] H. Xiao, C. Aslay, and A. Gionis. "Robust cascade reconstruction by steiner tree sampling". In: 2018, pp. 637–646.
[25] Han Xiao et al. "Reconstructing a cascade from temporal observations". In: Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 666–674.
[26] B. Bui Xuan, A. Ferreira, and A. Jarry. "Computing shortest, fastest, and foremost journeys in dynamic networks". In: International Journal of Foundations of Computer Science.
[27] In: IEEE Transactions on Visualization and Computer Graphics.
[28] In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. SIGMOD '14. Snowbird, Utah, USA: Association for Computing Machinery, 2014, pp. 1323–1334.