Edge-outer graph embedding and the complexity of the DNA reporter strand problem
M. N. Ellingham a,1, , Joanna A. Ellis-Monaghan b,2 a Department of Mathematics, 1326 Stevenson Center, Vanderbilt University, Nashville, Tennessee 37240 b Department of Mathematics, Saint Michael’s College, One Winooski Park, Colchester, Vermont 05458
Abstract
In 2009, Jonoska, Seeman and Wu showed that every graph admits a route for a DNA reporter strand, that is, a closed walk covering every edge either once or twice, in opposite directions if twice, and passing through each vertex in a particular way. This corresponds to showing that every graph has an edge-outer embedding, that is, an orientable embedding with some face that is incident with every edge. In the motivating application, the objective is such a closed walk of minimum length. Here we give a short algorithmic proof of the original existence result, and also prove that finding a shortest length solution is NP-hard, even for 3-connected cubic (3-regular) planar graphs. Independent of the motivating application, this problem opens a new direction in the study of graph embeddings, and we suggest several problems emerging from it.
Keywords:
DNA origami, reporter strand, orientable graph embedding, one-face embedding, edge-outer embedding
1. Introduction
DNA self-assembly, and self-assembly in general, is a rapidly advancing field, with [15, 18] providing good overviews. In 2006, Rothemund introduced ‘DNA origami’, a new self-assembly method that increased the scale of DNA constructs and is one of the major developments in DNA nanotechnology this century [17]. DNA origami originally involved combining an M13 single-stranded cyclic viral molecule, called the scaffolding strand, with 200-250 short staple strands to produce a 90 × 90 nm tile (in 2D), but now these strands can also produce 3D constructs with the structure of graphs or graph fragments [2]. At its most basic level, the design objective for DNA origami assembly of a graph-like structure is a strategy with the scaffolding strand following a single walk that traverses every edge at least once, with any edges that are traversed more than once visited exactly twice, in opposite directions (because DNA strands in a double helix are oppositely directed), and without separating or crossing through at a vertex. See [1, 5, 7] for further work on routing scaffolding strands.

Email addresses: [email protected] (M. N. Ellingham), [email protected] (Joanna A. Ellis-Monaghan)
URL: https://math.vanderbilt.edu/ellingmn/ (M. N. Ellingham), (Joanna A. Ellis-Monaghan)
Supported by Simons Foundation award 429625
Supported by NSF grant DMS-1332411
February 15, 2019 (arXiv, math.CO)

The problem of finding a similarly prescribed walk also arises in the context of determining an efficient route for a reporter strand, that is, a strand that is recovered and read at the end of an experiment to report on one of the products of the experiment. In designing the DNA self-assembly of a molecule with the structure of a graph $G$, the boundary components of a ‘thickened’ version of $G$ identify the circular DNA strands that assemble (hybridize) into the graph $G$. For details, see [11], where the objective was to show that every graph has an associated orientable thickened graph, with a boundary component visiting every edge at least once, thus corresponding to the desired route for the reporter strand.

While motivated by a particular application, this problem of finding suitable walks is of independent intrinsic interest in topological graph theory. Thickened graphs are also known as ribbon graphs, and are equivalent to embeddings of graphs in compact surfaces. All embeddings in this paper will be cellular, in which every face is homeomorphic to an open disk. Each face of an embedding corresponds to a boundary component of a thickened graph, which corresponds to a circular strand of DNA.
Thus, showing the existence of a suitable walk for a scaffolding or reporter strand is equivalent to proving that every graph admits an orientable embedding where every edge lies on a single face. The facial walk of this face gives the corresponding desired route for the DNA strand. Prompted by this application, we define a reporter strand walk in a graph $G$ to be a walk that uses every edge of $G$ at least once and occurs as a facial boundary walk in some orientable embedding of $G$. See Figure 1 for two examples of embeddings of $K_4$ on the torus with facial walks that are reporter strand walks. Notice that the walk shown on the right is shorter (has fewer edges) than the one on the left.

Figure 1: Facial walks corresponding to reporter strands.

Having a face that includes every edge is intermediate between two well-known properties of graph embeddings. A one-face embedding is an embedding in which there is only one face, so every edge occurs exactly twice on the boundary of this single face; a one-face embedding is necessarily a maximum genus embedding. An outer embedding in a given surface is an embedding in which all vertices appear on a single face (outerplanar graphs are particularly well-studied). It therefore seems appropriate to call an embedding in which there is a face (the ‘outer’ face) that includes every edge an edge-outer embedding. For connected graphs, a one-face embedding is edge-outer, and an edge-outer embedding is outer.

Here we give a short proof of the result from [11] that reporter strand walks always exist, and hence every graph has an edge-outer embedding. Our result provides a polynomial-time algorithm that finds both the embedding and the reporter strand walk. Furthermore, we show that the problem of finding a shortest length reporter strand walk, or equivalently, an embedding with smallest degree outer face, is NP-hard, even for 3-connected cubic planar graphs. We begin with a weaker NP-hardness result in Section 3 and strengthen the result in Section 4.
2. A short proof that reporter strand walks exist
In this section we provide a short proof of the existence of reporter strand walks in all graphs. In this paper, graphs may have loops and multiple edges. A graph with neither loops nor multiple edges is simple. Sometimes we think of an edge as consisting of two distinct edge-ends or just ends; $D_G(v)$ denotes the set of ends incident with $v$ in $G$. We refer the reader to [21] for graph theory terms not defined in this paper. Most terms defined explicitly in Sections 2 to 4 are terms we introduce for our specific needs in this paper.

We assume the reader is familiar with combinatorial descriptions of orientable cellular embeddings of graphs, using rotation schemes or rotation systems, as described in [6, 9, 13]. A rotation scheme assigns to each vertex a rotation, which is a cyclic ordering of the edge-ends incident with that vertex, corresponding to their order in the globally consistent clockwise direction in the surface. Every orientable embedding is determined up to homeomorphism by its rotation scheme.

Suppose we have an embedding of a graph $G$ described by a rotation scheme. Suppose also that at some vertex $v$ there is an incident face $f$ and an incident edge-end $d$, so that $d$ is not incident with $f$. Assume that $f$ appears between consecutive edge-ends $d'$, $d''$ in the rotation at $v$ ($f$ may also appear in other places around $v$). Let $d^-$, $d^+$ be the edge-ends immediately before and after $d$ in the rotation at $v$, respectively. Then the operation of flipping $d$ into $f$ (between $d'$ and $d''$) modifies the rotation at $v$ from $(d^-, d, d^+, \dots, d', d'', \dots)$ to $(d^-, d^+, \dots, d', d, d'', \dots)$, moving $d$ into a position between $d'$ and $d''$. See Figure 2 (which is explained in more detail below).

The operation of moving one edge-end in a vertex rotation was used, for example, by Duke [3, Theorem 3.2] to show that the orientable genus range for a graph forms an interval. Jonoska, Seeman and Wu [11, Figure 5] used the special case of this operation for cubic graphs.

In our terminology, the main result of [11] is that every connected graph has a reporter strand walk. The following algorithm provides a short proof.

Algorithm 2.1.
Given a connected graph $G$ (with loops and multiple edges allowed):
Take an arbitrary orientable embedding of $G$.
Choose an arbitrary face $f$.
While some edge is not in $f$ {
    Choose an edge $e$ not in $f$, but incident with a vertex $v$ of $f$.
    Modify the embedding by flipping an end of $e$ incident with $v$ into $f$.
    There is a new face using $e$ twice; let $f$ be this face.
}

Assertion.
This algorithm runs in polynomial time. It terminates with an orientable embedding of $G$ in which $f$ is a face using every edge of $G$. Thus, the boundary walk of $f$ is a reporter strand walk in $G$.

Proof that the algorithm works.
The initial embedding exists because we may just give each vertex an arbitrary rotation. The edge $e$ always exists because $G$ is connected. When we flip the end of $e$ into $f$, we create a new face $f'$ that includes all edges of the old face $f$, and uses $e$ twice. (If $e$ belonged to two distinct old faces then $f'$ also uses all other edges from those faces. Otherwise, $e$ belonged to a single old face $g$, and the two occurrences of $e$ split the boundary of $g$ into two pieces; $f'$ includes the edges of one of those pieces, and the other piece becomes the boundary of a second new face. Also, the length of the face $f$ increases both when $e$ has only one end incident with a vertex of $f$ and when it has two, as the example below illustrates.) Since $f'$ becomes the new $f$, the edge set of $f$ strictly increases at each iteration, until it contains all edges.

Since each edge is flipped at most once, and since tracing the faces initially and updating the tracing after each flipping are fast, the operations in the algorithm are easily implemented in polynomial time using the rotation scheme representation of an embedding.

An example is shown in Figure 2, where we represent orientable embeddings of a graph as plane drawings with possible edge crossings. With this representation, the rotation scheme just corresponds to the clockwise ordering at each vertex in the drawing, and we can trace faces in the usual way for a plane graph, except that we ignore edge crossings. We show a complete run of the algorithm, which requires two iterations. The tracing of the face $f$ is shown initially and after each iteration (in red, if color appears) and the edges $e_1$, $e_2$ used in the two iterations are labeled.

Corollary 2.2. Every connected graph $G$ (loops and multiple edges allowed) has an orientable embedding such that there is a face $f$ whose boundary contains every edge, and the genus of the embedding is the maximum orientable genus of $G$.

Proof. Apply Algorithm 2.1 to $G$, beginning with a maximum genus orientable embedding of $G$. From the parenthetical note in the proof of the algorithm, at each step the number of faces stays the same or decreases, so the genus of the embedding stays the same or increases. Since we started with a maximum genus embedding and the genus does not decrease, the final embedding is a maximum genus embedding.

Figure 2: Example of Algorithm 2.1.
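Algorithm 2.1 can be sketched directly on a rotation scheme. The Python below is an illustrative sketch, not the authors' implementation. The representation is an assumption made here: an edge `e` has two ends `(e, 0)` and `(e, 1)`, `rot[v]` lists the ends at `v` in rotation order, and a face occupies the corner between an end and its successor in that list. The names `other`, `trace_face` and `edge_outer_embedding` are hypothetical. For simplicity the face is retraced from scratch after every flip rather than updated incrementally.

```python
from collections import defaultdict


def other(end):
    """Return the opposite end of the same edge."""
    edge, side = end
    return (edge, 1 - side)


def trace_face(rot, start):
    """Facial walk through `start`, as the list of ends on which it leaves vertices."""
    succ = {}
    for ends in rot.values():
        for i, d in enumerate(ends):
            succ[d] = ends[(i + 1) % len(ends)]
    walk, d = [], start
    while True:
        walk.append(d)
        d = succ[other(d)]  # arrive on other(d), leave on its successor
        if d == start:
            return walk


def edge_outer_embedding(edges):
    """Sketch of Algorithm 2.1.

    `edges` maps an edge name to its pair of endpoints. The graph is assumed
    to be connected with at least one edge (loops and multiple edges allowed).
    Returns (rotation scheme, facial walk that uses every edge).
    """
    vertex_of, rot = {}, defaultdict(list)
    for e, (u, v) in edges.items():
        for side, x in enumerate((u, v)):
            vertex_of[(e, side)] = x
            rot[x].append((e, side))  # arbitrary initial rotation
    face = trace_face(rot, next(iter(vertex_of)))  # arbitrary face f
    while True:
        covered = {e for e, _ in face}
        if len(covered) == len(edges):
            return dict(rot), face
        # For each vertex on f, remember one end on which f arrives there;
        # f occupies the corner between that end and its successor.
        arrival = {vertex_of[other(d)]: other(d) for d in face}
        # An end d of an edge not in f, at a vertex v of f.
        d = next(x for x in vertex_of
                 if x[0] not in covered and vertex_of[x] in arrival)
        v = vertex_of[d]
        rot[v].remove(d)  # flip d into f
        rot[v].insert(rot[v].index(arrival[v]) + 1, d)
        face = trace_face(rot, d)  # the new face uses this edge twice
```

Calling `edge_outer_embedding` on a dictionary such as `{0: (0, 1), 1: (0, 2), ...}` describing $K_4$ returns a rotation scheme together with one facial walk that covers all six edges; the walk is reported as a list of edge-ends, so its length is the length of the reporter strand walk.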
Remark.
Often, an important consideration in determining reporter or scaffolding strand walks is assuring that the result is not knotted. The authors of [7] observed that knotted walks can result from A-trails (non-crossing Eulerian circuits) in toroidal meshes, while [14] characterizes knotted and unknotted A-trails in toroidal meshes and [12] gives an approximation algorithm for unknotted walks in surface triangulations. In [1] the authors restrict to spheres to assure unknotted routes. We note here that Algorithm 2.1 starts with an abstract graph and outputs a walk that is a facial walk of the graph embedded in an orientable surface. If the surface is embedded in 3-dimensional space, then this walk bounds a disk, and hence is unknotted when viewed as a curve in space. This means that every graph has some embedding in 3-dimensional space, in fact an embedding in some orientable surface, with an unknotted reporter strand walk.
3. NP-completeness of short reporter strands
Given that reporter (or scaffolding) strand walks exist, for experimental efficiency it is natural to seek a shortest such walk, i.e., one with as few edges as possible, and hence a minimum number of duplicated edges. However, in this section we show that the following decision version of finding $\mathrm{srs}(G)$, the length of a shortest reporter strand walk in $G$, is NP-complete.

Shortest Reporter Strand Walk (SRS Walk).
Given $(G, k)$ where $G$ is a graph and $k$ is a nonnegative integer, is $\mathrm{srs}(G) \le k$? In other words, does $G$ have a reporter strand walk of length at most $k$?

Thus, the SRS Walk problem asks whether $G$ has an orientable embedding with a facial walk that uses all edges and has length at most $k$. A ‘yes’ instance can be certified by giving a suitable embedding of $G$, so SRS Walk is in NP. The construction used here to prove that SRS Walk is NP-complete is relatively straightforward; it forms the first step towards a stronger NP-completeness result given in Section 4.

All walks from this point onwards (including paths and cycles) are directed walks. Let $W^{-1}$ denote the reverse of a walk $W$. For two walks $W_1$ and $W_2$, $W_1 \cdot W_2$ denotes their concatenation, which is only defined if the last vertex of $W_1$ is the first vertex of $W_2$. A walk is edge-spanning if it uses every edge at least once, and edge-2-bounded if it uses every edge at most twice. In any walk an edge used exactly once is a solo edge, and an edge used exactly twice (whether in the same direction, or in opposite directions) is a double edge.

For much of this section and the next we will be working with simple graphs. Thus, we can uniquely identify an edge between $u$ and $v$ using the notation $uv$. We can also describe walks just using sequences of vertices: then $uv$ also means a one-edge walk from $u$ to $v$, in that direction. Using special notation to distinguish between the (undirected) edge $uv$ and the (directed) walk $uv$ would be unwieldy; we rely instead on context or explicit textual explanation.

We are interested in walks that can occur as face boundaries in an orientable graph embedding. Loosely speaking, if a walk $W$ can be a face boundary in some orientable embedding of the graph, then we should be able to glue a facial disk to the graph, identifying its boundary with $W$, without preventing the neighborhood of any vertex from being an open disk in some embedding, and without introducing nonorientability.

Formalizing this objective yields two properties we desire in walks. Given a walk $W$ in $G$ and $v \in V(G)$, let $\mathrm{Rot}_G(W, v)$ be the graph with vertex set $D_G(v)$, where we add one edge between edge-ends $d, d'$ for each time $W$ enters $v$ on $d$ and immediately leaves on $d'$, or vice versa.
In an embedding, for each $v$ the union of $\mathrm{Rot}_G(W, v)$ over all facial walks $W$ is a cycle on $D_G(v)$ describing the (undirected) rotation at $v$, so each $\mathrm{Rot}_G(W, v)$ is a subgraph of such a cycle. Therefore, we say a closed walk $W$ is rotation-compatible in $G$ if for every $v \in V(G)$, $\mathrm{Rot}_G(W, v)$ is either a cycle with vertex set $D_G(v)$, or a union of vertex-disjoint paths. Also, a walk is orientable if it uses each edge at most once in each direction. A walk that is rotation-compatible or orientable is edge-2-bounded.

A result of Škoviera and Širáň [19, Prop. 1] implies that a closed walk in $G$ occurs as a face boundary in some orientable embedding of $G$ if and only if it is orientable and rotation-compatible in $G$. Loosely, if $W$ is orientable and rotation-compatible then each $\mathrm{Rot}_G(W, v)$ describes a partial rotation at $v$ that can be arbitrarily completed to a full rotation, giving a rotation scheme. Thus, a reporter strand walk in $G$ is precisely a closed walk that is edge-spanning, orientable, and rotation-compatible in $G$.

Figure 3: Chinese postman walk (left) and shortest reporter strand walk (right) in a theta graph.

There is a natural lower bound on the length of a reporter strand walk. A Chinese postman walk is an edge-spanning closed walk of minimum length. A Chinese postman walk is edge-2-bounded, but need not be rotation-compatible or orientable. Such walks were first considered by Guan [10]. We let $\mathrm{cp}(G)$ denote the length of a Chinese postman walk in $G$. Since every reporter strand walk is an edge-spanning closed walk, $\mathrm{srs}(G) \ge \mathrm{cp}(G)$. Thus, we have another decision problem that may be regarded as a more specific version of the SRS Walk problem.
Chinese Postman Reporter Strand Walk (CPRS Walk).
Given a graph $G$, is $\mathrm{srs}(G) = \mathrm{cp}(G)$? In other words, does $G$ have a reporter strand walk that is also a Chinese postman walk? (Such a walk is a Chinese postman reporter strand walk or CPRS walk.)

Edmonds and Johnson [4] showed that $\mathrm{cp}(G)$ can be computed in polynomial time. Thus, we can verify a ‘yes’ instance of the CPRS Walk problem in polynomial time by checking that a given reporter strand walk (the certificate) has length $\mathrm{cp}(G)$. This shows that CPRS Walk is in NP. Moreover, every instance $G$ of CPRS Walk can be transformed in polynomial time to the instance $(G, \mathrm{cp}(G))$ of the SRS Walk problem, involving the same graph. Then $G$ is a ‘yes’ instance of CPRS Walk if and only if $(G, \mathrm{cp}(G))$ is a ‘yes’ instance of SRS Walk. Therefore, if we show that CPRS Walk is NP-complete for a class of graphs, SRS Walk is also NP-complete for that class.

While the right side of Figure 1 shows that $K_4$ has a CPRS walk (since $\mathrm{cp}(K_4) = 8$), in an arbitrary graph a CPRS walk may not exist, i.e., a shortest reporter strand walk is not generally a Chinese postman walk. For example, even the 2-vertex theta graph in Figure 3 has a shortest reporter strand walk of length 6, while a Chinese postman walk has length 4, and thus the theta graph does not have a CPRS walk.

Chinese postman walks in triangulations of the sphere are central to the strand routing algorithm of [1]. However, the walks produced by that algorithm may have retractions (defined below in Subsection 3.2), and so are not in general reporter strand walks according to our definition.

We will prove that the problem CPRS Walk is NP-complete even when restricted to 2-connected cubic planar graphs, and, in the next section, to 3-connected cubic planar graphs. We do this by reducing the hamilton cycle problem for 3-connected cubic planar graphs, which is known to be NP-complete [8], to CPRS Walk.

In this section and the following section, we work with cubic graphs and their subgraphs.
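The general characterization of reporter strand walks (closed, edge-spanning, orientable, rotation-compatible) can be checked mechanically, which is convenient for experimenting with small examples such as the theta graph. The Python below is an illustrative sketch under assumptions made here: a walk is a cyclic list of darts `(edge, +1)` or `(edge, -1)`, each edge has ends `(edge, 0)` at its first endpoint and `(edge, 1)` at its second, and the function name `is_reporter_strand_walk` is hypothetical. For each vertex it builds the graph of consecutive enter/leave ends and uses union-find to test whether that graph is a spanning cycle or a union of vertex-disjoint paths.

```python
from collections import Counter, defaultdict


def is_reporter_strand_walk(edges, walk):
    """edges: name -> (tail, head); walk: cyclic list of darts (name, +1 or -1)."""
    def tail(dart):
        e, s = dart
        return edges[e][0] if s == 1 else edges[e][1]

    def head(dart):
        e, s = dart
        return edges[e][1] if s == 1 else edges[e][0]

    n = len(walk)
    pairs = [(walk[i], walk[(i + 1) % n]) for i in range(n)]
    if n == 0 or any(head(a) != tail(b) for a, b in pairs):
        return False  # not a closed walk
    if {e for e, _ in walk} != set(edges):
        return False  # not edge-spanning
    if len(set(walk)) != n:
        return False  # a dart is repeated, so the walk is not orientable

    # Edge-ends: (e, 0) at the tail of e, (e, 1) at its head.
    ends = defaultdict(set)
    for e, (u, v) in edges.items():
        ends[u].add((e, 0))
        ends[v].add((e, 1))
    rot = defaultdict(list)  # edges of the rotation graph at each vertex
    for a, b in pairs:
        enter = (a[0], 1 if a[1] == 1 else 0)  # end on which the walk arrives
        leave = (b[0], 0 if b[1] == 1 else 1)  # end on which the walk departs
        rot[head(a)].append((enter, leave))

    for v, d_v in ends.items():
        parent = {d: d for d in d_v}

        def find(d):
            while parent[d] != d:
                d = parent[d]
            return d

        degree, cycles = Counter(), 0
        for x, y in rot[v]:
            degree[x] += 1
            degree[y] += 1
            rx, ry = find(x), find(y)
            if rx == ry:
                cycles += 1  # this edge closes a cycle (or is a loop)
            else:
                parent[rx] = ry
        if any(k > 2 for k in degree.values()):
            return False
        # With maximum degree 2, one cycle and as many edges as ends
        # means the rotation graph is a single cycle through every end.
        spanning_cycle = cycles == 1 and len(rot[v]) == len(d_v)
        if cycles and not spanning_cycle:
            return False
    return True
```

On the theta graph with three parallel edges `1, 2, 3` from `u` to `v`, the length-6 walk `[(1, 1), (2, -1), (3, 1), (1, -1), (2, 1), (3, -1)]` is accepted, while the length-4 closed walk `[(1, 1), (2, -1), (3, 1), (3, -1)]` is rejected because its retraction on edge 3 is not rotation-compatible.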
Both Chinese postman and reporter strand walks have special structures in 2-connected cubic graphs.

First we consider Chinese postman walks. Suppose $G$ is a 2-connected cubic graph. Any edge-spanning walk in $G$ uses all three edges at each vertex, so it must use each vertex at least twice, and hence it must have length at least $2|V(G)|$. Thus, its length is $2|V(G)|$ if and only if it uses each vertex exactly twice, if and only if it contains exactly two solo edges and one double edge incident with each vertex. Now, by a well-known result [16] of Petersen, $G$ has a perfect matching $M$. If we replace each edge of $M$ in $G$ by two parallel edges to obtain $G'$, then $G'$ is eulerian, and an euler tour in $G'$ gives an edge-spanning walk in $G$ of length $2|V(G)|$. Therefore, a Chinese postman walk has length $2|V(G)|$, and hence uses two solo edges and one double edge at each vertex. We summarize this as follows.

Lemma 3.1.
A closed walk in a 2-connected cubic graph $G$ is a Chinese postman walk if and only if it is edge-spanning, edge-2-bounded, and its double edges form a perfect matching of $G$.

Now we consider reporter strand walks. In cubic graphs, rotation-compatibility can be replaced by a simpler property. A retraction in a walk consists of an edge followed immediately by the same edge in the opposite direction. For example, the walk on the left in Figure 3 has a retraction. A walk with no retractions is retraction-free. If a graph has no vertices of degree 1, every rotation-compatible closed walk is retraction-free. If a graph has no vertices of degree 4 or more, every retraction-free edge-2-bounded closed walk is rotation-compatible. Therefore, in a cubic graph a closed walk is rotation-compatible if and only if it is edge-2-bounded and retraction-free, giving the following.
Lemma 3.2.
A closed walk in a cubic graph $G$ is a reporter strand walk if and only if it is edge-spanning, orientable, and retraction-free.

Corollary 3.3.
A closed walk in a 2-connected cubic graph $G$ is a Chinese postman reporter strand (CPRS) walk if and only if it is edge-spanning, orientable, retraction-free, and the double edges form a perfect matching of $G$.

The two walks in $K_4$ shown in Figure 1 illustrate these characterizations: both satisfy Lemma 3.2 and are reporter strand walks, and the one on the right satisfies Corollary 3.3 and is a CPRS walk.

Suppose $W$ is a CPRS walk in 2-connected cubic $G$, and $v \in V(G)$. Since $W$ is orientable, the double edge at $v$, call it $\delta_W(v)$, is used in both directions by $W$, so one of the solo edges at $v$, call it $\sigma^-_W(v)$, must be used by $W$ to enter $v$, and the other, $\sigma^+_W(v)$, must be used by $W$ to leave $v$. Since $W$ is retraction-free, it must use the edge sequences $\sigma^-_W(v)\,\delta_W(v)$ and $\delta_W(v)\,\sigma^+_W(v)$ to pass through $v$.

Therefore, we can reconstruct $W$ from the choice of double edges (which form a matching) and of orientations for the remaining solo edges (one entering, one leaving each vertex). To consider possible CPRS walks we make such choices and try to trace $W$ by following the edge sequences $\sigma^-_W(v)\,\delta_W(v)$ and $\delta_W(v)\,\sigma^+_W(v)$ at $v$. In general this tracing procedure may fail by finding a closed walk that is not edge-spanning. If this does not happen we obtain a CPRS walk.

Figure 4: Construction of $P$, and how a CPRS walk passes through $P_{uv}$.

A connected graph $H$ with two vertices $v_1$, $v_2$ of degree 2 and all other vertices of degree 3 is called an edge gadget. If $G$ is a graph disjoint from $H$ and $u_1u_2 \in E(G)$, then we say $J = (G - u_1u_2) \cup H \cup \{u_1v_1, u_2v_2\}$ is obtained by bisecting $u_1u_2$ in $G$ with $H$. The cubic completion of $H$ is $H^+ = H \cup v_1v_2$. We leave the proof of the following straightforward result to the reader.

Lemma 3.4.
Let $G$ be a cubic graph, and $H$ an edge gadget. Construct $J$ by bisecting an edge of $G$ with $H$. Then $J$ is cubic. If $G$ and the cubic completion $H^+$ are both 2-connected, planar, and simple, then $J$ is 2-connected, planar, and simple.

3.3. The NP-completeness result

Construction 3.5.
Given a 3-connected cubic planar simple graph $N$, construct a new graph $P$ by replacing each edge $uv$ by a subgraph $P_{uv}$ ($= P_{vu}$) consisting of a 4-cycle $(a_{uv} d_{uv} a_{vu} d_{vu})$ on four new vertices and three additional edges $ua_{uv}$, $va_{vu}$ and $d_{uv}d_{vu}$. Note that order matters for subscripts in new vertex names. We use $\pi_{uv}$ ($= \pi_{vu}$) to refer to the automorphism of $P_{uv}$ that swaps $d_{uv}$ and $d_{vu}$ and fixes the other vertices. See Figure 4.

Claim.
The graph $P$ is a 2-connected cubic planar simple graph.

Proof of claim.
The graph $P'_{uv} = P_{uv} - \{u, v\}$, obtained by removing $u$ and $v$ and all their incident edges from $P_{uv}$, is an edge gadget. Moreover, $(P'_{uv})^+ \cong K_4$ is 2-connected, planar, and simple. Replacing $uv$ by $P_{uv}$ is equivalent to bisecting $uv$ with $P'_{uv}$, so the claim follows by repeated application of Lemma 3.4.

Figure 5: Constructing a CPRS walk in $P$ from a hamilton cycle in $N$.

Lemma 3.6.
Suppose we construct $P$ as in Construction 3.5. Let $X^1_{uv} = u\, a_{uv}\, d_{uv}\, a_{vu}\, d_{vu}\, d_{uv}\, a_{uv}\, d_{vu}\, a_{vu}\, v$ and $X^2_{uv} = u\, a_{uv}\, d_{uv}\, d_{vu}\, a_{uv}\, u$. Then a CPRS walk $W$ in $P$ must pass through each subgraph $P_{uv}$ in one of two ways,
(a) as a single walk $X^1_{uv}$, $\pi_{uv}(X^1_{uv})$, $(X^1_{uv})^{-1}$ or $(\pi_{uv}(X^1_{uv}))^{-1}$; or
(b) as two walks $X^2_{uv}$ and $X^2_{vu}$, or $\pi_{uv}(X^2_{uv}) = (X^2_{uv})^{-1}$ and $\pi_{uv}(X^2_{vu}) = (X^2_{vu})^{-1}$.

Proof. If $d_{uv}d_{vu}$ is a double edge of $W$, then $ua_{uv}$ and $va_{vu}$ are also double edges. The solo edges in $P_{uv}$ form a single cycle, which must be oriented consistently, as either $(a_{uv} d_{uv} a_{vu} d_{vu})$ or its reverse. Applying the tracing procedure described above, (b) holds.

If $d_{uv}d_{vu}$ is not a double edge, the set of double edges in $P_{uv}$ is either $\{a_{uv}d_{uv}, d_{vu}a_{vu}\}$ or $\{a_{uv}d_{vu}, d_{uv}a_{vu}\}$. By symmetry (from $\pi_{uv}$) we may assume the former. The solo edges form a single path $u\, a_{uv}\, d_{vu}\, d_{uv}\, a_{vu}\, v$ which must be oriented in this direction or its reverse. Applying the tracing procedure, (a) holds.

Thus, Lemma 3.6 says that up to symmetry or reversal a CPRS walk must pass through $P_{uv}$ in one of the ways shown on the right in Figure 4.

Proposition 3.7.
For $N$ and $P$ as in Construction 3.5, the following are equivalent.
(a) $N$ has a hamilton cycle.
(b) $P$ has a Chinese postman reporter strand walk.

Proof. Suppose $N$ has a hamilton cycle, $C$. First, replace each (directed) edge $uv$ of $C$ by the walk $X^1_{uv}$. This gives a walk in $P$ that uses every vertex of $N$ once. Now for each edge $uw \in E(N) - E(C)$ splice $X^2_{uw}$ into this walk at $u$, and splice $X^2_{wu}$ into this walk at $w$. The result is a CPRS walk in $P$. See Figure 5.

Conversely, suppose $P$ has a CPRS walk $W$. By Lemma 3.6, at each $u \in V(N)$, $\sigma^-_W(u)$ belongs to some subwalk $W_{tu} = X^1_{tu}$ or $\pi_{tu}(X^1_{tu})$ of $W$, $\sigma^+_W(u)$ belongs to some $W_{uv} = X^1_{uv}$ or $\pi_{uv}(X^1_{uv})$, and both occurrences of $\delta_W(u)$ belong to some $W_{uw} = X^2_{uw}$ or $(X^2_{uw})^{-1}$, where $t, v, w$ are the neighbors of $u$ in $N$. Thus, deleting all subwalks $W_{uw}$ and replacing each subwalk $W_{uv}$ by the edge $uv$ of $N$ gives a hamilton cycle in $N$.

Construction 3.5 therefore gives a polynomial time transformation from the hamilton cycle problem for 3-connected cubic planar simple graphs, which is NP-complete [8], to the CPRS Walk problem for 2-connected cubic planar simple graphs. This yields the following theorem.
Theorem 3.8.
The problems Shortest Reporter Strand Walk and Chinese Postman Reporter Strand Walk are NP-complete for 2-connected cubic planar simple graphs.
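The forward direction of Proposition 3.7 can be replayed on a small instance. The Python below is an illustrative sketch and not part of the original argument. The vertex labels `("a", u, v)` and `("d", u, v)`, standing for the new vertices of Construction 3.5, and all helper names (`gadget`, `build_p`, `x1`, `x2`, `walk_from_hamilton_cycle`, `is_cprs_walk`) are assumptions of this example; `x1` and `x2` encode the two kinds of walks through a gadget described in Lemma 3.6. The sketch builds $P$ from a graph $N$, assembles a closed walk from a hamilton cycle of $N$, and checks the conditions of Corollary 3.3.

```python
from collections import Counter


def gadget(u, v):
    """Labels for the four new vertices a_uv, d_uv, a_vu, d_vu of P_uv."""
    return ("a", u, v), ("d", u, v), ("a", v, u), ("d", v, u)


def build_p(n_edges):
    """Edge set of P from Construction 3.5, each edge stored as a frozenset."""
    p = set()
    for u, v in n_edges:
        a_uv, d_uv, a_vu, d_vu = gadget(u, v)
        cycle = [a_uv, d_uv, a_vu, d_vu]
        for i in range(4):
            p.add(frozenset((cycle[i], cycle[(i + 1) % 4])))
        p |= {frozenset((u, a_uv)), frozenset((v, a_vu)), frozenset((d_uv, d_vu))}
    return p


def x1(u, v):
    """Single walk through P_uv from u to v (case (a) of Lemma 3.6)."""
    a_uv, d_uv, a_vu, d_vu = gadget(u, v)
    return [u, a_uv, d_uv, a_vu, d_vu, d_uv, a_uv, d_vu, a_vu, v]


def x2(u, v):
    """Closed walk from u into P_uv and back to u (case (b) of Lemma 3.6)."""
    a_uv, d_uv, _, d_vu = gadget(u, v)
    return [u, a_uv, d_uv, d_vu, a_uv, u]


def walk_from_hamilton_cycle(n_edges, cycle):
    """Cyclic vertex sequence in P built as in the proof of Proposition 3.7."""
    k = len(cycle)
    on_cycle = {frozenset((cycle[i], cycle[(i + 1) % k])) for i in range(k)}
    walk = []
    for i, u in enumerate(cycle):
        for s, t in n_edges:
            if u in (s, t) and frozenset((s, t)) not in on_cycle:
                walk += x2(u, t if s == u else s)[:-1]  # splice at u
        walk += x1(u, cycle[(i + 1) % k])[:-1]
    return walk


def is_cprs_walk(p_edges, walk):
    """Check the conditions of Corollary 3.3 for a cyclic vertex sequence."""
    n = len(walk)
    arcs = [(walk[i], walk[(i + 1) % n]) for i in range(n)]
    used = Counter(frozenset(a) for a in arcs)
    spanning = set(used) == p_edges  # also rejects steps that are not edges
    orientable = len(set(arcs)) == n
    retraction_free = all(arcs[i] != arcs[(i + 1) % n][::-1] for i in range(n))
    doubled = Counter(x for e, c in used.items() if c == 2 for x in e)
    vertices = {x for e in p_edges for x in e}
    matching = all(doubled[x] == 1 for x in vertices)
    return spanning and orientable and retraction_free and matching
```

For $N = K_4$ with the hamilton cycle `[0, 1, 2, 3]`, the graph $P$ has 28 vertices and 42 edges, and the walk produced has length 56, which equals $2|V(P)|$ as required of a Chinese postman walk in a 2-connected cubic graph.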
4. A stronger NP-completeness result
In this section we show that the problems SRS Walk and CPRS Walk are NP-complete even for 3-connected planar graphs.

3-connectedness

While Section 3 provides a simple proof of NP-completeness for the problems SRS Walk and CPRS Walk, the graphs that it uses do not have a stable 3-dimensional structure, so they are not likely to occur in situations where we design a DNA molecule to have a specified geometric embedding in space. In particular, the graphs $P$ produced by Construction 3.5 have connectivity 2, while the graph formed by the edges of any polyhedron in 3-dimensional space is 3-connected. Theorem 3.8 leaves open the possibility that SRS Walk and CPRS Walk can be solved easily for 3-connected graphs, or even that all 3-connected graphs with more than two vertices have a CPRS walk. Here we show that for 3-connected graphs (in fact, 3-connected cubic planar graphs) the problems SRS Walk and CPRS Walk are NP-complete, and hence unlikely to have polynomial-time solutions. The construction in our proof yields arbitrarily large 3-connected cubic planar graphs that do not have a CPRS walk.

First we modify the graph $P$ from Construction 3.5 to obtain a new graph $Q$ with improved connectivity, in Construction 4.1. However, CPRS walks in $Q$ do not necessarily correspond to CPRS walks in $P$, so later we further modify $Q$ into a graph $R$ where we can control the CPRS walks so that they do correspond to CPRS walks in $P$, and hence to hamilton cycles in $N$.

Given a graph $G$ with a plane embedding, let $\mathrm{cwn}_G(u, v)$ denote the neighbor of $u$ that is immediately clockwise from $v$ in the rotation around $u$.

Figure 6: Construction of $Q$.

Construction 4.1.
Suppose we have $N$ and $P$ as in Construction 3.5. To construct $Q$, take a plane embedding of $N$, and a corresponding plane embedding of $P$ in which each 4-cycle $(a_{uv} d_{uv} a_{vu} d_{vu})$ is clockwise. Replace each edge $a_{uv}d_{uv}$ of $P$ by a path $a_{uv} b_{uv} c_{uv} d_{uv}$ involving two new vertices $b_{uv}$, $c_{uv}$. Then incident to each vertex $c_{uv}$ add a bracing edge $c_{uv}b_{vw}$ where $w = \mathrm{cwn}_N(v, u)$. See Figure 6.

Given a graph $G$, define a relation $E_G$, or just $E$, on $V(G)$ by $u \mathrel{E} v$ when there are three edge-disjoint $uv$-paths in $G$. Therefore, by the edge version of Menger’s Theorem (see [21, Theorem 4.2.19]), $u \mathrel{E} v$ if and only if no set of fewer than 3 edges separates $u$ and $v$. It follows that $G$ is 3-edge-connected precisely when all vertices of $G$ are $E$-equivalent.

Lemma 4.2. $E$ is an equivalence relation.

Proof. $E$ is reflexive (take three copies of the trivial walk at a vertex) and clearly symmetric; we must show it is transitive. Suppose that $u \mathrel{E} v$ and $v \mathrel{E} w$. If we do not have $u \mathrel{E} w$ then some set of fewer than 3 edges separates $u$ and $w$. But then this set either separates $u$ and $v$, contradicting $u \mathrel{E} v$, or $v$ and $w$, contradicting $v \mathrel{E} w$. Hence, $u \mathrel{E} w$.

Lemma 4.3.
The graph $Q$ is 3-connected, planar, and simple.

Proof. Clearly $Q$ is planar and simple (see Figure 6). For cubic graphs such as $N$ and $Q$, 3-connectedness is equivalent to 3-edge-connectedness, which is equivalent to showing that all vertices are $E$-equivalent.

Vertices of $Q$ are either original vertices, namely vertices of $N$, or new vertices, added by Constructions 3.5 and 4.1. If $u$ and $v$ are original vertices then there are three edge-disjoint $uv$-paths in $N$, which easily provide three edge-disjoint $uv$-paths in $Q$. Hence all original vertices are $E_Q$-equivalent. So it suffices to show that each new vertex is $E_Q$-equivalent to some original vertex.

Suppose that in the plane embedding of $N$, the neighbors of $u$ are $s, t, v$ in clockwise order. The following paths from new vertices of $Q$ to the original vertex $u$ (see Figure 6) show that $a_{uv}$, $b_{uv}$ and $d_{uv}$ are $E_Q$-equivalent to $u$:

$a_{uv}u$-paths: $a_{uv}\,u$; $a_{uv}\,b_{uv}\,c_{tu}\,d_{tu}\,a_{ut}\,u$; $a_{uv}\,d_{vu}\,c_{vu}\,b_{us}\,a_{us}\,u$.
$b_{uv}u$-paths: $b_{uv}\,a_{uv}\,u$; $b_{uv}\,c_{uv}\,d_{uv}\,d_{vu}\,c_{vu}\,b_{us}\,a_{us}\,u$; $b_{uv}\,c_{tu}\,d_{tu}\,a_{ut}\,u$.
$d_{uv}u$-paths: $d_{uv}\,d_{vu}\,a_{uv}\,u$; $d_{uv}\,c_{uv}\,b_{uv}\,c_{tu}\,d_{tu}\,a_{ut}\,u$; $d_{uv}\,a_{vu}\,b_{vu}\,c_{vu}\,b_{us}\,a_{us}\,u$.

Rather than $c_{uv}$ it is more convenient to show that $c_{vu}$ is $E_Q$-equivalent to $u$:

$c_{vu}u$-paths: $c_{vu}\,d_{vu}\,a_{uv}\,u$; $c_{vu}\,b_{us}\,a_{us}\,u$; $c_{vu}\,b_{vu}\,a_{vu}\,d_{uv}\,c_{uv}\,b_{uv}\,c_{tu}\,d_{tu}\,a_{ut}\,u$.

Since every new vertex is $a_{uv}$, $b_{uv}$, $d_{uv}$ or $c_{vu}$ for some choice of $u$ and $v$, every new vertex is $E_Q$-equivalent to an original vertex, as required.

Figure 7: Vertex gadget $A$, and how a CPRS walk passes through it.

A connected graph $H$ with three vertices $v_1, v_2, v_3$ of degree 2 and all other vertices of degree 3 is called a vertex gadget. If $G$ is a graph disjoint from $H$ and $u \in V(G)$ has degree 3 with neighbors $u_1, u_2, u_3$, then we say the graph $J = (G - u) \cup H \cup \{u_1v_1, u_2v_2, u_3v_3\}$ is obtained by replacing $u$ in $G$ by $H$. The cubic completion of $H$ is $H^+ = H \cup \{vv_1, vv_2, vv_3\}$ where $v$ is a new vertex.
We leave the proof of the following straightforward result to the reader.

Lemma 4.4.
Let $G$ be a cubic graph, and $H$ a vertex gadget. Construct $J$ by replacing a vertex of $G$ with $H$. Then $J$ is cubic. If $G$ and the cubic completion $H^+$ are both 3-connected, planar, and simple, then $J$ is 3-connected, planar, and simple.

Now we construct subgraphs in which the route taken by a CPRS walk is constrained in various ways.

Let $A$ be the vertex gadget shown (with additional incident edges p p, x x , y y ) in Figure 7. Let $\alpha$ be the automorphism of $A$ that swaps the two paths x x x x and y y y y while fixing p. Note that the cubic completion $A^+$ is 3-connected (to see this, observe that for every $v \in V(A^+)$, $A^+ - v$ has a hamilton cycle and is therefore 2-connected). Also, $A^+$ is planar and simple.

Figure 8: Vertex gadget $B$, and how a CPRS walk passes through it.

Lemma 4.5.
Suppose the vertex gadget A described above is an induced subgraph of a 3-connected cubic graph G. Let Y = px x x x and Y = x y y x x y y x py y y y . If W is a CPRS walk in G then W passes through A and its incident edges as two walks, either p p · Y · x x and x x · Y · y y , or p p · α ( Y ) · y y and y y · α ( Y ) · x x , or reversing both walks in one of these pairs.

Proof. Suppose first that p p is a double edge in W. If x y is not a double edge then x x and y y are double edges. We have a triangle ( px y ) of solo edges; we may assume its edges are oriented in that direction by W. We have another path of solo edges x x y y . If this is oriented as y y x x then the tracing procedure fails by finding a 4-cycle ( x y y x ). So it is oriented as x x y y . If x y is a double edge then the tracing algorithm fails by finding a 6-cycle ( x y y y x x ). Thus, x y is a solo edge, it must be oriented as y x , x x and y y are double edges, and x y is a solo edge. If x y is oriented as x y , then the tracing procedure fails by finding a 4-cycle ( y x x y ), and if it is oriented as y x , then the tracing algorithm fails by finding an 8-cycle ( x y y y y x x x ).

If x y is a double edge we have a path of solo edges y y px x which without loss of generality is oriented in that direction. If x y is a double edge, then the tracing procedure fails by finding the 4-cycle ( y y x x ). So x y is a solo edge, it must be oriented as x y , x x and y y are double edges, and x y is a single edge. If x y is oriented as x y then our tracing algorithm fails by finding a 6-cycle ( x y y y x y ), and if it is oriented as y x then the tracing algorithm fails by finding a 4-cycle ( y x x y ).

Therefore, p p is a solo edge; without loss of generality, p p = σ − W ( p ). By symmetry (from α) we may assume that δ W ( p ) = px . Then y y , x x , y y and x x must all be double edges. 
The solo edges form a single path which is oriented p py x x y y x x y y . Now the tracing procedure gives p p · Y · x x and x x · Y · y y .

Thus, Lemma 4.5 says that up to symmetry or reversal a CPRS walk must pass through A as shown in Figure 7. Loosely, A acts like a vertex, in that a CPRS walk passes through it as two walks of the form (entering solo edge)(intermediary edges)(exiting double edge) and (entering double edge)(intermediary edges)(exiting solo edge), but with a restriction: the edge p p must be a solo edge.

Now we build a larger vertex gadget. Let A′ be a copy of A, with a plane embedding that is the mirror image of the embedding of A in Figure 7. Let p′ in A′ correspond to p in A, and so on. Let B = A ∪ A′ ∪ { x x′ , y q, y′ q } where q is a new vertex. Then B is a vertex gadget. Note that $B^+$ can be considered as obtained from $K_4$ by replacing two vertices by copies of A, so by Lemma 4.4 applied twice, $B^+$ is 3-connected, planar, and simple.

Lemma 4.6.
Suppose the vertex gadget B described above is an induced subgraph of a 3-connected cubic graph G, with incident edges p p , qq and p′ p′ . Let Z = Y · x x′ · ( Y′ )^{-1} and Z = qy′ · ( Y′ )^{-1} · x′ x · Y · y q . If W is a CPRS walk in G then W passes through B and its incident edges as two walks, either p p · Z · p′ p′ and q q · Z · qq , or reversing both of these walks.

Proof. Applying Lemma 4.5 to both A and A′, p p and p′ p′ are solo edges. The perfect matching of double edges of W must have an odd number of edges leaving the odd set V(B), so q q must be a double edge. Therefore, qy and qy′ are solo edges. Now Lemma 4.5, applied to both A and A′, gives the result.

Thus, Lemma 4.6 says that up to reversal (or, equivalently, up to the automorphism of B swapping p and p′) a CPRS walk must pass through B as shown in Figure 8.

3-connected cubic planar graphs

Construction 4.7.
Suppose we have N, P and Q as in Constructions 3.5 and 4.1. For each vertex $b_{uv}$ take a copy $B_{uv}$ of B, where $p_{uv}$, $q_{uv}$, $p'_{uv}$, $Z_{uv}$, $Z_{uv}$ correspond to p, q, p′, Z, Z in B, respectively. Construct R by replacing each vertex of the form $b_{uv}$ in Q by $B_{uv}$, so that if $b_{uv}$ is adjacent to $a_{uv}$, $c_{uv}$, $c_{tu}$ then the edges incident with $B_{uv}$ are $a_{uv}p_{uv}$, $q_{uv}c_{tu}$ and $p'_{uv}c_{uv}$.

Claim.
The graph R is a 3-connected cubic planar simple graph. Proof of claim.
As noted above, $B^+$ is 3-connected, planar and simple, and so is Q by Lemma 4.3. The claim follows by repeated application of Lemma 4.4.

Proposition 4.8.
For N, P, Q and R as in Constructions 3.5, 4.1 and 4.7, the following are equivalent.
(a) N has a hamilton cycle.
(b) P has a Chinese postman reporter strand walk.
(c) P has a Chinese postman reporter strand walk using every edge of the form $a_{uv}d_{uv}$ as a solo edge.
(d) R has a Chinese postman reporter strand walk.

Proof. By Proposition 3.7, (a) ⇔ (b). Clearly (c) ⇒ (b). Suppose (b) holds and we have a CPRS walk W in P. Suppose some $a_{uv}d_{uv}$ is not a solo edge of W. By Lemma 3.6, W must use $X_{uv}$ or its reverse; replacing this by $\pi_{uv}(X_{uv})$ or its reverse we still have a CPRS walk, and now $a_{uv}d_{uv}$ (and also $a_{vu}d_{vu}$) is a solo edge. Applying this to all $a_{uv}d_{uv}$ that are not solo edges, we obtain a CPRS walk W′ satisfying (c). Thus, (b) ⇒ (c).

So now we show that (c) ⇔ (d). Suppose that (c) holds, with a walk W using each $a_{uv}d_{uv} \in E(P)$ as a solo edge. The bracing edge of Q incident with $c_{uv}$ has the form $c_{uv}b_{vw}$, where w follows u in clockwise order around v in N. Replace each directed edge $a_{uv}d_{uv}$, or its reverse, in W by a walk in R according to the following rules:

W uses $a_{uv}d_{uv}$, $a_{vw}d_{vw}$: $a_{uv}d_{uv} \to T_{uv} = a_{uv}p_{uv} \cdot Z_{uv} \cdot p'_{uv}c_{uv}q_{vw} \cdot Z_{vw} \cdot q_{vw}c_{uv}d_{uv}$.
W uses $a_{uv}d_{uv}$, $d_{vw}a_{vw}$: $a_{uv}d_{uv} \to T_{uv} = a_{uv}p_{uv} \cdot Z_{uv} \cdot p'_{uv}c_{uv}q_{vw} \cdot (Z_{vw})^{-1} \cdot q_{vw}c_{uv}d_{uv}$.
W uses $d_{uv}a_{uv}$, $a_{vw}d_{vw}$: $d_{uv}a_{uv} \to T_{uv} = d_{uv}c_{uv}q_{vw} \cdot Z_{vw} \cdot q_{vw}c_{uv}p'_{uv} \cdot (Z_{uv})^{-1} \cdot p_{uv}a_{uv}$.
W uses $d_{uv}a_{uv}$, $d_{vw}a_{vw}$: $d_{uv}a_{uv} \to T_{uv} = d_{uv}c_{uv}q_{vw} \cdot (Z_{vw})^{-1} \cdot q_{vw}c_{uv}p'_{uv} \cdot (Z_{uv})^{-1} \cdot p_{uv}a_{uv}$.

The rules guarantee that in each $B_{uv}$ we use both $Z_{uv}$ and $Z_{uv}$, or both $(Z_{uv})^{-1}$ and $(Z_{uv})^{-1}$. Therefore, the result is a CPRS walk W′ in R. Thus, (d) holds.

Conversely, suppose (d) holds, so R has a CPRS walk W. Consider each $a_{uv}d_{uv} \in E(P)$ and the corresponding bracing edge $c_{uv}b_{vw} \in E(Q)$. 
Applying Lemma 4.6 to $B_{uv}$ and $B_{vw}$, we see that W must either travel from $a_{uv}$ to $d_{uv}$ along $T_{uv}$ or $T_{uv}$ from above, or travel from $d_{uv}$ to $a_{uv}$ along $T_{uv}$ or $T_{uv}$. In the former case, replace this subwalk of W by the edge $a_{uv}d_{uv}$ of P; in the latter case replace it by $d_{uv}a_{uv}$. Making all such replacements gives a CPRS walk W′ in P in which each $a_{uv}d_{uv}$ is a solo edge. Thus, (c) holds.

Constructions 3.5, 4.1 and 4.7 therefore give a polynomial time transformation from the hamilton cycle problem for 3-connected cubic planar simple graphs to the CPRS Walk problem for the same family of graphs. Applying these constructions to nonhamiltonian 3-connected cubic planar graphs N proves the existence of arbitrarily large 3-connected cubic planar simple graphs R with no CPRS walk (or we can construct small examples of such graphs easily using vertex gadgets A and B). Our final theorem also follows immediately.

Theorem 4.9.
The problems Shortest Reporter Strand Walk and Chinese Postman Reporter Strand Walk are NP-complete for 3-connected cubic planar simple graphs.
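Membership in NP is usually argued by checking a proposed walk in polynomial time. As a rough illustration, the sketch below checks only the edge conditions stated in the abstract: the walk is closed, every edge is covered once or twice, and an edge covered twice is traversed in opposite directions. It does not check the condition on how the walk passes through each vertex, so it is not a full certificate checker. The function name and the input format (an edge list for a simple graph and a walk given as a vertex sequence) are assumptions made for this example.

```python
from collections import Counter


def satisfies_edge_conditions(edges, walk):
    """Check the edge-coverage part of the reporter strand conditions.

    edges: iterable of pairs (u, v) describing a simple undirected graph.
    walk:  list of vertices with walk[0] == walk[-1] (a closed walk).
    Returns True if every step uses an edge of the graph, no edge is
    traversed twice in the same direction, and every edge is covered.
    The vertex condition of a reporter strand walk is NOT checked here.
    """
    edges = list(edges)
    if len(walk) < 2 or walk[0] != walk[-1]:
        return False
    edge_set = {frozenset(e) for e in edges}
    # Count how often each directed step (a, b) occurs along the walk.
    directed = Counter(zip(walk, walk[1:]))
    for (a, b), count in directed.items():
        if frozenset((a, b)) not in edge_set:
            return False  # step is not an edge of the graph
        if count > 1:
            return False  # same direction used twice
    # Every edge must be used in at least one direction.
    return all((u, v) in directed or (v, u) in directed for u, v in edges)


def walk_length(walk):
    """Number of edge traversals in a closed walk given as a vertex list."""
    return len(walk) - 1
```

For a triangle on vertices 0, 1, 2 the walk `[0, 1, 2, 0]` passes this check with length 3, while for a path on 0, 1, 2 the walk `[0, 1, 2, 1, 0]` passes with each edge used twice in opposite directions.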
5. Conclusion
This application brings to light a new, natural area of investigation in topological graph theory, edge-outer embeddability, which seems quite rich in attractive questions and new directions:

1. The algorithm in Section 2 provides a fast routing solution that is within 100% of optimal (at most twice the length). Is there a polynomial-time algorithm that will return a reporter strand walk that is within a smaller percentage of minimum length? A related result appears in [12], which gives a cubic-time -approximation algorithm in the special case that the graph is a triangulation of an orientable surface.
2. Can we extend Corollary 2.2 to say more about the genus range of embeddings that yield reporter strand walks, or reporter strand walks of minimum length? Are these ranges intervals?
3. Are there classes of graphs for which a minimum length reporter strand walk can be found in polynomial time? Eulerian graphs are one such class. We have shown that the problem is NP-hard for 3-connected graphs, but can it be solved in polynomial time for graphs with higher connectivity?
4. What pragmatic approaches might there be to finding suitable scaffolding strand routes, albeit possibly with restrictions or other design costs? One such approach is provided by [1], which describes a strand routing design algorithm using an A-trail heuristic that performs well on reasonably sized triangulations of the sphere, provided that some ‘double-width’ edges (using two double helices) are acceptable in the final product. Another approach may be found in [20], which gives a fast algorithm, but essentially makes all of the edges ‘double-width’. Other methods of efficiently determining suitable routes with reasonable design trade-offs would help advance the field of DNA origami.

References

[1] E. Benson, A. Mohammed, J. Gardell, S. Masich, E. Czeizler, P. Orponen, B. Högberg, DNA rendering of polyhedral meshes at the nanoscale, Nature 523 no. 7561 (2015) 441–444.
[2] S. M. Douglas, H. Dietz, T. Liedl, B. Högberg, F. Graf, W. M. Shih, Self-assembly of DNA into nanoscale three-dimensional shapes, Nature 459 (2009) 414–418.
[3] R. A. Duke, The genus, regional number, and Betti number of a graph, Canad. J. Math. 18 (1966) 817–822.
[4] J. Edmonds, E. L. Johnson, Matching, Euler tours and the Chinese postman, Math. Programming 5 (1973) 88–124.
[5] J. A. Ellis-Monaghan, A. McDowell, I. Moffatt, G. Pangborn, DNA origami and the complexity of Eulerian circuits with turning costs, Nat. Comp. 14 (2015) 1–13.
[6] J. A. Ellis-Monaghan, I. Moffatt, Graphs on surfaces: Dualities, polynomials, and knots, SpringerBriefs in Mathematics, Springer, New York, 2013.
[7] J. A. Ellis-Monaghan, G. Pangborn, N. C. Seeman, S. Blakeley, C. Disher, M. Falcigno, B. Healy, A. Morse, B. Singh, M. Westland, Design tools for reporter strands and DNA origami scaffold strands, Theoret. Comput. Sci. 671 (2017) 69–78.
[8] M. R. Garey, D. S. Johnson, R. Endre Tarjan, The planar Hamiltonian circuit problem is NP-complete, SIAM J. Comput. 5 (1976) 704–714.
[9] J. L. Gross, T. W. Tucker, Topological graph theory, Dover, Mineola, New York, 2001.
[10] Meigu Guan (Guan Meigu), Graphic programming using odd or even points, Acta Mathematica Sinica 10 (1960) 263–266 (in Chinese); translated as Mei-ko Kwan (Kwan Mei-ko), Chinese Mathematics 1 (1962) 273–277.
[11] N. Jonoska, N. Seeman, G. Wu, On existence of reporter strands in DNA-based graph structures, Theoret. Comput. Sci. 410 (2009) 1448–1460.
[12] A. Mohammed, M. Hajij, Unknotted strand routings of triangulated meshes, in: Proceedings of DNA Computing and Molecular Programming: 23rd International Conference, DNA 23 (Austin, TX, USA, September 24-28, 2017), Lecture Notes in Computer Science 10467 (2017) 46–63.
[13] B. Mohar, C. Thomassen, Graphs on surfaces, Johns Hopkins University Press, Baltimore, 2001.
[14] A. Morse, W. Adkisson, J. Greene, D. Perry, B. Smith, G. Pangborn, J. Ellis-Monaghan, DNA origami and unknotted A-trails in torus graphs, preprint. https://arxiv.org/abs/1703.03799
[15] J. Pelesko, Self Assembly: The Science of Things That Put Themselves Together, Chapman and Hall/CRC, 2007.
[16] J. Petersen, Die Theorie der regulären Graphen, Acta Math. 15 (1891) 193–220.
[17] P. W. K. Rothemund, Folding DNA to create nanoscale shapes and patterns, Nature 440 (2006) 297–302.
[18] N. C. Seeman, Structural DNA Nanotechnology, Cambridge University Press, Cambridge, 2015.
[19] M. Škoviera, J. Širáň, Oriented relative embeddings of graphs, in: Proceedings of the International Conference on Combinatorial Analysis and its Applications (Pokrzywna, 1985), Zastos. Mat. 19 (1987-8) 589–597.
[20] R. Veneziano, S. Ratanalert, K. Zhang, F. Zhang, H. Yan, W. Chiu, M. Bathe, Designer nanoscale DNA assemblies programmed from the top down, Science 352 no. 6293 (2016) 1534.
[21] Douglas B. West, Introduction to Graph Theory, 2nd edition, Prentice Hall, Upper Saddle River, New Jersey, 2001.