Subcubic Certificates for CFL Reachability

Abstract

Many problems in interprocedural program analysis can be modeled as the context-free language (CFL) reachability problem on graphs and can be solved in cubic time. Despite years of efforts, there are no known truly sub-cubic algorithms for this problem. We study the related certification task: given an instance of CFL reachability, are there small and efficiently checkable certificates for the existence and for the non-existence of a path? We show that, in both scenarios, there exist succinct certificates (O(n^2) in the size of the problem) and these certificates can be checked in subcubic (matrix multiplication) time. The certificates are based on grammar-based compression of paths (for positive instances) and on invariants represented as matrix constraints (for negative instances). Thus, CFL reachability lies in nondeterministic and co-nondeterministic subcubic time. A natural question is whether faster algorithms for CFL reachability will lead to faster algorithms for combinatorial problems such as Boolean satisfiability (SAT). As a consequence of our certification results, we show that there cannot be a fine-grained reduction from SAT to CFL reachability for a conditional lower bound stronger than n^\omega, unless the nondeterministic strong exponential time hypothesis (NSETH) fails. Our results extend to related subcubic equivalent problems: pushdown reachability and two-way nondeterministic pushdown automata (2NPDA) language recognition. For example, we describe succinct certificates for pushdown non-reachability (inductive invariants) and observe that they can be checked in matrix multiplication time. We also extract a new hardest 2NPDA language, capturing the "hard core" of all these problems.


arXiv [cs.FL], Feb

Dmitry Chistikov, Rupak Majumdar, Philipp Schepper

Centre for Discrete Mathematics and its Applications (DIMAP) & Department of Computer Science, University of Warwick, Coventry, United Kingdom
Max Planck Institute for Software Systems, Kaiserslautern, Germany
CISPA Helmholtz Center for Information Security, Saarbrücken, Germany & Saarbrücken Graduate School of Computer Science, Saarland Informatics Campus, Germany

Introduction

Context-free reachability is a fundamental problem in interprocedural program analysis, verification of recursive programs, and database theory [20, 56, 38, 43, 8]. For a fixed context-free language (CFL) $L$ over an alphabet $\Sigma$, given a directed graph $G = (V, E)$, an edge-labeling function $\lambda : E \to \Sigma$, and two vertices $s, t \in V$, the $L$-reachability problem asks if there is a path from $s$ to $t$ in $G$ such that the word formed by concatenating the labels along the path belongs to $L$. It is well known that the problem can be solved in time cubic in the size of the graph for any fixed CFL. However, despite many years of effort, we only know speedups by logarithmic factors (i.e., to $O(n^3 / \log n)$) [45, 16], leading to a conjecture that no better algorithms are possible for this and several related problems [29]. In recent years, a number of results in fine-grained complexity give credence to the conjecture by demonstrating various conditional lower bounds for the problem [14, 1, 37], but even so the possibility of algorithms with running time $n^\omega$ or above has not been ruled out. Here, $\omega < 2.373$ is the matrix multiplication exponent.

In this paper, we consider the problem of certifying an instance of CFL reachability. Intuitively, this problem asks for easily verifiable proofs of inclusion or non-inclusion. Given a (positive or negative) instance of CFL reachability, we ask if there is an efficiently checkable proof that will convince anyone that the instance is indeed positive or negative.

Formally, a certificate system for CFL reachability consists of two algorithms (the checkers), one for positive instances and one for negative instances. Each checker takes as input an instance of the problem and an additional string (called the certificate) and accepts or rejects. The positive (resp. negative) checker is complete if for each positive (resp. negative) instance, there is a certificate that makes it accept, and sound if for each negative (resp. positive) instance, there is no certificate that makes it accept.
Of course, since the instance can be decided in cubic time, a certificate system is non-trivial only if the checkers run in subcubic time (in the size of the instance).

Our main result shows the existence of subcubic certificate systems for CFL reachability: every positive or negative instance has a quadratic certificate and a checker that runs in $O(n^\omega)$ time.

• For a positive instance of the problem, a naive certificate is a path from $s$ to $t$ witnessing inclusion. Unfortunately, this is not an efficient certificate, since it is known that the shortest path can be exponentially long in the size of the graph. We show that the shortest path is well-compressible by a context-free grammar of size $O(n^2)$ in the number of vertices of the graph. Moreover, given such a compressed representation, there is a checker verifying in time $O(n^2)$ that the grammar indeed encodes a witness path.

• For a negative instance of the problem, a certificate is an inductive invariant that demonstrates non-reachability. We show that such an inductive invariant can be represented as relations between a constant number of $n \times n$ matrices, and there is a checker verifying in time $O(n^\omega)$ that such an encoding does represent an inductive invariant. Additionally, if we allow randomization, there is a randomized checker running in $O(n^2)$ time.

Summing up, CFL reachability can be certified in subcubic time. In retrospect, the certificate system is simple but illuminates a conceptually new aspect of an old problem. Certificate systems make it possible to separate two possibly independent phases of computation: finding a solution to a computational problem and verifying it.

We consider complexity-theoretic implications. Impagliazzo and Paturi [31] introduced the strong exponential time hypothesis (SETH), which informally states that SAT has no algorithms better than exhaustive search. Over the years, SETH has become a fundamental assumption relative to which many fine-grained complexity results are proved [55]. For example, SETH implies that the current (quadratic) algorithms for the orthogonal vectors and edit distance problems are optimal. A natural question is whether SETH also implies that cubic algorithms for CFL reachability are optimal.

Our result shows that such a reduction would be very difficult to find. Carmosino et al. [13] extended SETH to the nondeterministic strong exponential time hypothesis (NSETH), which states that there is no algorithm for Boolean tautology better than exhaustive search, even with nondeterministic guessing. They show that both proving and refuting NSETH imply breakthroughs in computational complexity. Our subcubic certification result implies that any conditional lower bound stronger than $n^\omega$ for CFL reachability from SAT and SETH would show that NSETH does not hold.

A model checking problem closely related to CFL reachability is pushdown reachability [8, 23]. Our results lead to a subcubic certificate system for pushdown reachability too, by extracting quadratic certificates from the standard saturation-based algorithm and the triplet construction for PDA to CFG conversion. Indeed, by exploiting fine-grained reductions between CFL reachability, pushdown reachability, the emptiness problem for pushdown automata, and the recognition problem for two-way nondeterministic pushdown automata (2NPDA), we show that all these problems (as well as other related problems known in the literature) have subcubic certificate systems. Our constructions and reductions have several implications. First, the existence of succinct certificates for pushdown (non-)reachability checkable in subcubic time is a new observation; it can have potentially practical applications in checking proofs of programs [40] and in "exports" of model checking such as certificate set analysis in trust management systems [32]. Second, our reductions lead to a new insight beyond certification. We identify a new hardest 2NPDA language, that is, a fixed 2NPDA language $L_0$ such that for every 2NPDA language $L$ there is a homomorphism $h$ such that $w \in L$ iff $h(w) \in L_0$. A different hardest language was previously found by Rytter [44] using language-theoretic techniques. However, our proof and reductions strengthen the link between 2NPDA language recognition and CFL reachability, pointing to the hardest instances of the latter.

Related work.

In a quest to classify the complexity of problems in P, fine-grained reductions interlink the asymptotic running time of algorithms for various problems. A fine-grained reduction shows that a faster algorithm for one problem automatically implies a faster algorithm for another problem. Conversely, the existence of fine-grained reductions can be interpreted as conditional lower bounds: no faster algorithm exists, unless a state-of-the-art algorithm for a well-known problem is actually suboptimal. For example, a truly sub-quadratic algorithm for Orthogonal Vectors will lead to a $2^{(1-\varepsilon)n}$-time algorithm for SAT, breaking SETH [55]. Similarly, the $k$-Clique conjecture states that no (randomized or deterministic) algorithm can detect a $k$-Clique on an $n$-vertex graph in time $O(n^{\omega k/3 - \varepsilon})$ for $\varepsilon > 0$. Abboud et al. [1] show a reduction from the $k$-Clique problem to CFL recognition, giving a conditional lower bound of order $n^\omega$ and matching Valiant's $\tilde{O}(n^\omega)$ upper bound for the problem [50]. This lower bound applies to CFL reachability as well. Chatterjee et al. [14], using Lee's result [35], reduce Boolean matrix multiplication to Dyck-$k$ reachability (for growing $k$), showing that faster algorithms for the latter avoiding matrix multiplication would be a breakthrough. Chatterjee and Osang [15] show a similar reduction to PDA emptiness.

More broadly, a range of problems in formal languages are now being approached with tools from modern algorithms and complexity [21]. Our work contributes to this ongoing effort. Backurs and Indyk [4] and Bringmann, Grønlund, and Larsen [11] find SAT-based conditional lower bounds for regular expression matching problems. Oliveira and Wehar [18] show reductions between triangle finding, 3SUM, and the non-emptiness of intersection of two or three DFA. Potechin and Shallit [42] show a reduction from Orthogonal Vectors to the acceptance problem for (a subclass of) NFA and a reduction from triangle finding to (unary) NFA acceptance. Fernau and Krebs [22] establish conditional lower bounds for a variety of automata-theoretic problems beyond P. Wehar and co-authors have shown that faster algorithms for various intersection non-emptiness problems have consequences for structural complexity classes [53, 48, 19]. We discuss further related work in Section 6.

Let $L$ be a fixed language. Given a directed graph $G = (V, E)$, an edge-labeling function $\lambda : E \to \Sigma$, and two vertices $s, t \in V$, the $L$-reachability problem asks if there is a path from $s$ to $t$ (possibly repeating vertices and edges) such that the word formed by concatenating the labels along the path belongs to $L$ [56]. When $L$ is a fixed context-free language, the problem is called CFL reachability.
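To make the problem statement concrete, the following is a minimal sketch of a naive fixpoint computation for one particular fixed CFL, the language of balanced strings over two kinds of brackets. The function names, the edge encoding as `(u, label, v)` triples, and the vertex naming `0 .. n-1` are assumptions made for illustration. The sketch is not optimized and does not reproduce the cubic-time algorithms cited in the text; it only illustrates what is being decided.

```python
# Matching bracket pairs of the assumed label alphabet "()[]".
MATCH = {"(": ")", "[": "]"}

def dyck2_reachable_pairs(n, edges):
    """Return all pairs (u, v) joined by a walk with a balanced label.

    n     -- number of vertices, named 0 .. n-1 (illustrative convention)
    edges -- iterable of (u, label, v) with label in "()[]"
    """
    edges = list(edges)
    reach = {(v, v) for v in range(n)}             # empty walk is balanced
    opens = [e for e in edges if e[1] in MATCH]
    closes = [e for e in edges if e[1] in MATCH.values()]
    changed = True
    while changed:                                  # iterate to a fixpoint
        new = set()
        # Wrap a balanced walk x..y in a matching pair of brackets.
        for (u, a, x) in opens:
            for (y, b, v) in closes:
                if MATCH[a] == b and (x, y) in reach:
                    new.add((u, v))
        # Concatenate two balanced walks u..w and w..v.
        for (u, w) in reach:
            for (w2, v) in reach:
                if w == w2:
                    new.add((u, v))
        changed = bool(new - reach)
        reach |= new
    return reach

def dyck2_reach(n, edges, s, t):
    """Decide whether t is reachable from s along a balanced walk."""
    return (s, t) in dyck2_reachable_pairs(n, edges)
```

Because walks may repeat vertices and edges, the computation works on pairs of vertices rather than on explicit walks, which is what keeps it polynomial.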

CFL reachability plays a foundational role in several areas within computer science. To the best of our knowledge, it first appears in the work by Dolev, Even, and Karp [20] as the combinatorial core in the security analysis of a cryptographic protocol. Yannakakis [56] and Melski and Reps [38] elucidate the role of this problem in the context of database theory and interprocedural program analysis, respectively, providing in particular a historical sketch.

In formal language theory, $L$-reachability and CFL reachability can be seen as providing an algorithmic perspective on the classic definition of rational index of a language [6, 41]. These problems have also been studied under the name "regular realizability" (see, e.g., [51, 52]).

For CFL reachability, without loss of generality, the fixed language can be assumed to be the Dyck-2 language. This is the language of balanced parentheses with two kinds of parenthesis symbols. Formally, it is the context-free language over the alphabet $\{ (, ), [, ] \}$ defined by the following context-free grammar:

$$S \to SS \mid (S) \mid [S] \mid \varepsilon$$

The Dyck-2 reachability problem, denoted $D_2$Reach, is the $L$-reachability problem when $L$ is the Dyck-2 language.

Claim 1. Let $(G, \lambda, s, t)$ be an instance of the CFL reachability problem. There is a linear-time reduction (in the bit-size of the input) to an instance $(G', \lambda', s', t')$ of the Dyck-2 reachability problem.

We call an algorithm truly subcubic if it has (worst-case) running time $O(n^{3-\varepsilon})$ for some constant $\varepsilon > 0$, where $n$ denotes the bit length of the input. Practical implementations use a summarization-based $O(|V|^3)$ algorithm [43]; note that $|V| \le n$. Using Rytter's trick [45], Chaudhuri [16] shows that the $L$-reachability problem is $O(|V|^3 / \log |V|)$ for any fixed context-free language. However, no truly subcubic algorithm is known for this problem. The best known conditional lower bound for the problem has order $|V|^\omega$. On the other hand, Dyck-1 reachability (the language of balanced parentheses with one kind of parentheses) can be solved in time $\tilde{O}(|V|^\omega)$ [9, 10, 37], matching the best conditional lower bounds.

In this section we show that, while truly subcubic algorithms for Dyck-2 reachability are not known, solutions to Dyck-2 reachability have small and efficiently checkable certificates.

An instance $(G, \lambda, s, t)$ of $D_2$Reach is a yes-instance if there is a walk from $s$ to $t$ labeled with a string from Dyck-2, and a no-instance otherwise.

Definition 2. We say that $D_2$Reach has subcubic certificates for yes-instances (respectively, no-instances) if, for some real number $\varepsilon > 0$, there is an algorithm $M$ and a function $p(x) = O(x^{3-\varepsilon})$ such that for every instance $(G, \lambda, s, t)$ of $D_2$Reach:

• (completeness) if the instance is a yes-instance (respectively, no-instance), then there is a string $u$ of length $p(|V|)$, called a certificate, such that $M$ accepts $(G, \lambda, s, t, u)$ in $p(|V|)$ time, and

• (soundness) if the instance is a no-instance (respectively, yes-instance), then for every string $u$ of length $p(|V|)$, the algorithm $M$ rejects $(G, \lambda, s, t, u)$ in $p(|V|)$ time.

(Note that the running time of $M$ is subcubic in $|V|$, which is at most the bit size of the instance, and not in the size of the certificate.) That is, a subcubic certificate for yes- and no-instances allows us to verify, given the additional certificate, whether an instance of $D_2$Reach is a positive or a negative instance in subcubic time.

We will refer extensively to walks in labelled directed graphs. For a labelled directed graph $(V, E, \lambda : E \to \{ (, ), [, ] \})$, a walk from $u \in V$ to $v \in V$ is a sequence of edges $\pi := e_1 \ldots e_k$ from $E$, for $k \ge 0$, such that for each $i \in \{2, \ldots, k\}$, edge $e_{i-1}$ arrives at the same vertex that edge $e_i$ departs from, and moreover $e_1$ departs from $u$ and $e_k$ arrives at $v$. This walk is valid if the word $\lambda(e_1) \ldots \lambda(e_k)$ belongs to the Dyck-2 language. A subwalk of a walk $e_1 \ldots e_k$ is a contiguous subsequence $e_i \ldots e_j$ of edges, possibly empty.

We describe our certificate system for yes-instances of $D_2$Reach. These certificates are witnesses for reachability. We fix an instance of $D_2$Reach: $G = (V, E)$ a directed graph, $\lambda : E \to \{ (, ), [, ] \}$ an edge-labeling function, and $s, t \in V$ source and target vertices.

A first attempt is to provide a valid walk as a certificate (witness). However, it is well known that the shortest valid walk can be exponential in the size of the input, namely it can be of length $\exp \Theta(|V|^2 / \log |V|)$, and this bound is tight [41]. (For an intuition, one can think of a pushdown automaton accepting only words of length exponential in its size and longer.) The main observation to get subcubic certificates is that there is always some valid walk (including the shortest one in particular) that is well-compressible: it has a small representation ($O(|V|^2)$ in the size of the graph), and it is efficient to check (in time $O(|V|^2)$) that such a compressed walk is indeed a valid walk. Moreover, for every no-instance, one cannot get any valid walks, compressed or otherwise.

The following definition "inlines" the concept of a straight-line program, which is an "acyclic" context-free grammar that generates one word only. Straight-line programs are at the core of general-purpose compression algorithms such as LZ77 (see, e.g., [36]).

Definition 3.

For an instance of $D_2$Reach, denote by $\overrightarrow{V^2}$ a fresh copy of the set $V^2$, written as $\overrightarrow{V^2} = \{ \overrightarrow{uv} \mid (u, v) \in V^2 \}$. A walk scheme is a context-free grammar with the set of terminal symbols $E$, a set of nonterminal symbols $NT \subseteq \overrightarrow{V^2}$, and the axiom $\overrightarrow{st} \in NT$, where:

• for each nonterminal $\overrightarrow{uv} \in NT$ there is exactly one production, which moreover has the form:
  (a) $\overrightarrow{uv} \to \overrightarrow{uw}\, \overrightarrow{wv}$ for some $w \in V$, or
  (b) $\overrightarrow{uv} \to e\, \overrightarrow{xy}\, f$ for some edges $e = (u, x) \in E$ and $f = (y, v) \in E$ with $\lambda(e) \cdot \lambda(f) \in \{ (), [] \}$, or
  (c) $\overrightarrow{uu} \to \varepsilon$ for some $u \in V$, and

• the directed graph with vertices $NT$ and the following set of edges is acyclic:

$$\{ (\overrightarrow{ab}, \overrightarrow{cd}) \mid \overrightarrow{cd} \text{ occurs on the right-hand side of the production of } \overrightarrow{ab} \}. \qquad (1)$$

Proposition 4. Every walk scheme has size $O(|V|^2)$ and bit size $O(|V|^2 \log |V|)$.

Theorem 5. The following statements hold:

• An instance of $D_2$Reach is a yes-instance if and only if there exists a walk scheme for it.

• There is a deterministic algorithm that runs in time $O(|V|^2)$ and decides if a given grammar is a walk scheme for a given instance of $D_2$Reach.

For the proof of Theorem 5, we need the following auxiliary result.

Lemma 6. Let $\mathcal{G}$ be a context-free grammar with $L(\mathcal{G}) \ne \emptyset$. Suppose $\mathcal{G}$ contains more than one production with the same nonterminal on the left-hand side. Then by removing all of them but one we can obtain a grammar $\mathcal{G}'$ with $L(\mathcal{G}') \ne \emptyset$.

Proof of Theorem 5. We split the proof into three parts.

Soundness. We first suppose that for a given instance of $D_2$Reach there exists a walk scheme, $W$, and show that the instance must be a yes-instance. Consider the directed graph from the acyclicity condition in the definition of walk schemes, and denote it $D$. We will consider all vertices of $D$, i.e., nonterminals from $NT$, in any reversed topological ordering. In other words, whenever $\overrightarrow{cd}$ occurs on the right-hand side of the production of $\overrightarrow{ab}$, we will consider $\overrightarrow{cd}$ before $\overrightarrow{ab}$. We will show by induction that, for every $\overrightarrow{uv} \in NT$, the (one) word generated by $\overrightarrow{uv}$ is a valid walk from $u$ to $v$. (Recall that a walk is valid if it is labelled by a Dyck-2 word.) Indeed, it suffices to consider the three types of productions:

(a) for a production of the form $\overrightarrow{uv} \to \overrightarrow{uw}\, \overrightarrow{wv}$, we know from the inductive hypothesis that $\overrightarrow{uw}$ generates a valid walk from $u$ to $w$, and $\overrightarrow{wv}$ a valid walk from $w$ to $v$, so their concatenation is a valid walk from $u$ to $v$;

(b) for a production of the form $\overrightarrow{uv} \to e\, \overrightarrow{xy}\, f$ with edges $e = (u, x) \in E$ and $f = (y, v) \in E$, we know from the inductive hypothesis that $\overrightarrow{xy}$ generates a valid walk from $x$ to $y$, and since $\lambda(e) \cdot \lambda(f) \in \{ (), [] \}$, the result of the concatenation is a valid walk from $u$ to $v$;

(c) finally, productions of the form $\overrightarrow{uu} \to \varepsilon$ correspond to trivial valid walks (containing no edges) and represent the induction base.

As the axiom of the grammar $W$ is $\overrightarrow{st}$, we conclude that there is a valid walk from $s$ to $t$, which means that the instance of $D_2$Reach we consider is a yes-instance.
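The soundness argument can be illustrated on a toy example. The sketch below assumes a hypothetical encoding that is not part of the paper: a nonterminal $\overrightarrow{uv}$ is the tuple `(u, v)`, an edge is a triple `(u, label, v)`, and a production is one of `("cat", w)`, `("wrap", e, (x, y), f)`, or `("eps",)` for cases (a), (b), (c). It expands a scheme into its explicit walk and, separately, computes the length of that walk bottom-up without expanding it, which matters because the explicit walk may be exponentially long.

```python
from functools import lru_cache

def is_dyck2(word):
    """Stack-based membership test for balanced strings over "()[]"."""
    match, stack = {")": "(", "]": "["}, []
    for c in word:
        if c in "([":
            stack.append(c)
        elif not stack or stack.pop() != match[c]:
            return False
    return not stack

def expand(prods, nt):
    """Expand nonterminal nt = (u, v) into its unique walk (list of edges).

    Assumes prods is acyclic; the result can be exponentially long.
    """
    p = prods[nt]
    if p[0] == "eps":                      # (c)  uu -> epsilon
        return []
    if p[0] == "cat":                      # (a)  uv -> uw wv
        w = p[1]
        return expand(prods, (nt[0], w)) + expand(prods, (w, nt[1]))
    _, e, inner, f = p                     # (b)  uv -> e xy f
    return [e] + expand(prods, inner) + [f]

def walk_length(prods, axiom):
    """Length of the generated walk, computed once per nonterminal."""
    @lru_cache(maxsize=None)
    def length(nt):
        p = prods[nt]
        if p[0] == "eps":
            return 0
        if p[0] == "cat":
            return length((nt[0], p[1])) + length((p[1], nt[1]))
        return 2 + length(p[2])
    return length(axiom)
```

The memoized `walk_length` visits each nonterminal once, mirroring the reversed topological order used in the induction.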

Completeness. In the converse direction, let us prove that every yes-instance of $D_2$Reach has a walk scheme. Consider such an instance, $(G, \lambda, s, t)$, and consider a valid walk from $s$ to $t$, call it $\pi$. We construct a walk scheme in several steps.

First consider a context-free grammar $\mathcal{G}$ with the set of terminal symbols $E$, set of nonterminal symbols $\overrightarrow{V^2}$, and axiom $\overrightarrow{st}$. The set of productions is determined as follows. For each nonterminal $\overrightarrow{uv} \in \overrightarrow{V^2}$, we include all productions of the form:

• $\overrightarrow{uv} \to \overrightarrow{uw}\, \overrightarrow{wv}$ for all $w \in V$;

• $\overrightarrow{uv} \to e\, \overrightarrow{xy}\, f$ where $e = (u, x) \in E$ and $f = (y, v) \in E$ such that $\lambda(e) \cdot \lambda(f) \in \{ (), [] \}$;

• $\overrightarrow{uu} \to \varepsilon$ for all $u \in V$.

Induction on the structure of $\pi$ shows that $\pi \in L(\mathcal{G})$, so $L(\mathcal{G}) \ne \emptyset$.

We can now prune the set of productions of the grammar $\mathcal{G}$ using Lemma 6, as well as apply standard procedures of removing useless (non-productive or unreachable) nonterminals in context-free grammars (see, e.g., [30, Section 7.1]). We perform these steps until all three have no effect on the grammar. The resulting grammar $W$ satisfies all conditions in the definition of walk schemes, except possibly the acyclicity condition.

We claim that $W$ must satisfy that condition too. Indeed, the transformations applied so far ensure that $L(W) \ne \emptyset$. Let $NT \subseteq \overrightarrow{V^2}$ be the set of nonterminals of $W$. Assume for the sake of contradiction that the directed graph with vertices $NT$ and edges (1) contains a directed cycle. Let $\overrightarrow{ab} \in NT$ be a vertex on this cycle. Since all nonterminals of $W$ are reachable and productive, there exists a valid parse tree with respect to $W$ that contains a node labelled by $\overrightarrow{ab}$. By definition of the graph, and since every nonterminal in $W$ has exactly one production, this node has a descendant labelled with $\overrightarrow{ab}$. By the same reasoning, this descendant also has a descendant labelled with $\overrightarrow{ab}$, etc., which cannot be the case as the tree is finite. This contradiction means that the graph must be acyclic, so $W$ is in fact a walk scheme.

Verification algorithm. The condition $NT \subseteq \overrightarrow{V^2}$ and the choice of the axiom can be checked in time $O(|V|^2)$. The fact that there is exactly one production per nonterminal can be checked under the same time constraints; and so can the form of these productions and compatibility with the instance of $D_2$Reach. Finally, a depth-first-search-based topological sort procedure can be used to detect the existence of directed cycles; it runs in time linear in the number of edges, which is at most $2|V|^2$ because each production has at most two nonterminals on its right-hand side.

Remark 7. There is nothing special about Dyck-2 in the construction, and a similar certificate can be constructed for any fixed CFG.

We already mentioned a link to compressed words above. Our proof of Theorem 5 finds a context-free grammar that generates exactly one word and has $O(|V|^2)$ nonterminals in Chomsky normal form. Importantly, while it is in general a PSPACE-complete problem to decide whether such a compressed word is accepted by a pushdown automaton (see, e.g., the survey [36, section 9.4] and references therein), our grammar has special structure, leading to an efficient verification algorithm.
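The verification algorithm from the proof of Theorem 5 can be sketched as follows. The encoding is an assumption made for illustration: nonterminals are tuples `(u, v)`, edges are triples `(u, label, v)`, and `prods` is a dictionary mapping each nonterminal to its production, one of `("cat", w)`, `("wrap", e, (x, y), f)`, or `("eps",)`. Using a dictionary enforces "exactly one production per nonterminal" by construction; a checker reading a certificate from a string would have to test this explicitly.

```python
def is_walk_scheme(vertices, edges, s, t, prods):
    """Check the walk-scheme conditions for the instance (vertices, edges, s, t).

    vertices -- set of vertex names
    edges    -- set of (u, label, v) triples with label in "()[]"
    prods    -- dict: nonterminal (u, v) -> its single production
    """
    if (s, t) not in prods:                        # axiom must be present
        return False
    succ = {}                                      # dependency graph (1)
    for (u, v), p in prods.items():
        if u not in vertices or v not in vertices:
            return False
        if p == ("eps",):                          # (c)  uu -> epsilon
            if u != v:
                return False
            succ[(u, v)] = []
        elif p[0] == "cat" and len(p) == 2:        # (a)  uv -> uw wv
            succ[(u, v)] = [(u, p[1]), (p[1], v)]
        elif p[0] == "wrap" and len(p) == 4:       # (b)  uv -> e xy f
            _, e, (x, y), f = p
            if e not in edges or f not in edges:
                return False
            if (e[0], e[2]) != (u, x) or (f[0], f[2]) != (y, v):
                return False
            if e[1] + f[1] not in ("()", "[]"):
                return False
            succ[(u, v)] = [(x, y)]
        else:
            return False
        if any(nt not in prods for nt in succ[(u, v)]):
            return False
    # Cycle detection by iterative depth-first search with three colours.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {nt: WHITE for nt in prods}
    for root in prods:
        if colour[root] != WHITE:
            continue
        colour[root] = GREY
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            child = next(it, None)
            if child is None:
                colour[node] = BLACK
                stack.pop()
            elif colour[child] == GREY:
                return False                       # back edge: a cycle
            elif colour[child] == WHITE:
                colour[child] = GREY
                stack.append((child, iter(succ[child])))
    return True
```

With hash-based sets and dictionaries, each production is inspected a constant number of times, so the running time is linear in the number of productions under the usual expected-time assumption for hashing.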

Fix an instance of $D_2$Reach. For ease of notation, we will assume that $V = \{1, \ldots, |V|\}$. A certificate for no-instances will be a separator, as defined next. Such a certificate is essentially an inductive invariant, certifying non-reachability.

Let $A_{(}$, $A_{[}$, $A_{)}$, $A_{]}$ be four 0–1 matrices of size $|V| \times |V|$ that are adjacency matrices for the graph $G$ restricted to sets of edges with labels (, [, ), ], respectively.

For a nonnegative integer matrix $N$, denote by $\mathrm{bool}(N)$ the matrix obtained from $N$ by replacing every nonzero element by 1. Let $I$ denote the $|V| \times |V|$ identity matrix. We write $A \le B$ for matrices $A = (a_{ij})$ and $B = (b_{ij})$ of the same size whenever $a_{ij} \le b_{ij}$ for all $i$, $j$.

Definition 8. A separator for an instance of $D_2$Reach is a sextuple of $|V| \times |V|$ matrices, $(M_S, M_{SS}, M_{(S}, M_{[S}, M_{(S)}, M_{[S]})$, where all entries belong to $\{0, 1, \ldots, |V|^2\}$, and moreover all entries of $M_S$ belong to $\{0, 1\}$, and such that the following ten conditions are satisfied:

$$\begin{gathered}
I \le M_S, \qquad A_{(} \cdot M_S = M_{(S}, \qquad A_{[} \cdot M_S = M_{[S}, \\
M_S \cdot M_S = M_{SS}, \qquad M_{(S} \cdot A_{)} = M_{(S)}, \qquad M_{[S} \cdot A_{]} = M_{[S]}, \\
\mathrm{bool}(M_{SS}) \le M_S, \qquad \mathrm{bool}(M_{(S)}) \le M_S, \qquad \mathrm{bool}(M_{[S]}) \le M_S, \qquad \text{and} \qquad (M_S)_{s,t} = 0,
\end{gathered} \qquad (2)$$

where $s$ and $t$ are the source and target vertex in the instance of $D_2$Reach.

Proposition 9. Every separator has $O(|V|^2)$ entries and bit size $O(|V|^2 \log |V|)$.

Theorem 10. The following statements hold:

• An instance of $D_2$Reach is a no-instance if and only if there exists a separator for it.

• There is a deterministic algorithm that runs in time $O(|V|^\omega)$ and decides if a given sextuple of $|V| \times |V|$ matrices is a separator for a given instance of $D_2$Reach.

• There is a randomized algorithm that runs in time $O(|V|^2)$ and decides if a given sextuple of $|V| \times |V|$ matrices is a separator for a given instance of $D_2$Reach. In the case it is, the algorithm never errs; otherwise the algorithm flags an issue with probability ≥ . .

Proof. We split the proof into four parts.

Completeness. First consider a no-instance of $D_2$Reach. Take the matrix $M_S = (m_{ij})$, where each $m_{ij}$ is 1 if there is a valid walk from vertex $i$ to vertex $j$, and 0 otherwise. It is clear that $m_{st} = 0$, because the instance is a no-instance. We now show that picking the other matrices $M_{SS}, M_{(S}, M_{[S}, M_{(S)}, M_{[S]}$ so that all the five matrix equalities among the constraints (2) are satisfied leads to the satisfaction of the remaining (four) inequality constraints. Indeed:

• $I \le M_S$ because for each vertex $i$ the empty walk from $i$ to $i$ is valid;

• $\mathrm{bool}(M_{SS}) \le M_S$ because the concatenation of two valid walks is a valid walk;

• $\mathrm{bool}(M_{(S)}) \le M_S$ and $\mathrm{bool}(M_{[S]}) \le M_S$ because every walk $e \cdot \pi \cdot e'$ is valid whenever $\pi$ is valid and $e$ and $e'$ are labelled by a matching pair of parentheses, either ( , ) or [ , ].

This shows that there is a separator for each no-instance.

Soundness. In the converse direction, consider an arbitrary instance of $D_2$Reach. We show that for every valid walk $\pi$ from a vertex $u$ to a vertex $v$ in the graph, all separators must satisfy the condition $m_{uv} = 1$ where $M_S = (m_{ij})$. (It then follows that yes-instances have no separators.) We use induction on the label of the walk $\pi$, which is simply the concatenation of individual edge labels:

• The base case is the empty label, $\varepsilon$. The walk $\pi$ must then be the empty walk, from some vertex $u$ to itself. We recall that $I \le M_S$ for every separator; so indeed $m_{ii}$ must be set to 1 for all vertices $i$, and for the chosen vertex $i = u$ in particular.

• If the walk $\pi$ is labelled by $\alpha \cdot \beta$, where both $\alpha$ and $\beta$ are nonempty Dyck-2 words, then there exists a vertex $w$ such that $\pi = \pi' \cdot \pi''$ and $\pi'$ and $\pi''$ are valid walks from $u$ to $w$ and from $w$ to $v$, respectively. By the inductive hypothesis, $m_{u,w} = m_{w,v} = 1$. Since $\mathrm{bool}(M_{SS}) = \mathrm{bool}(M_S \cdot M_S) \le M_S$, we conclude that $m_{u,v} = 1$ in this case as well.

• Finally, suppose the label of the walk $\pi$ is $(\alpha)$, for some Dyck-2 word $\alpha$. (The case $[\alpha]$ is analogous.) Then $\pi = e \cdot \pi' \cdot f$, where $e$ and $f$ are individual edges, say from $u$ to $u'$ and from $v'$ to $v$ (for some $u', v' \in V$), and $\pi'$ is a valid walk from $u'$ to $v'$. The edges $e = (u, u')$ and $f = (v', v)$ have labels ( and ), respectively. By the inductive hypothesis, $m_{u'v'} = 1$. We now observe that $\mathrm{bool}(M_{(S)}) = \mathrm{bool}(M_{(S} \cdot A_{)}) = \mathrm{bool}(A_{(} \cdot M_S \cdot A_{)}) \le M_S$. On the left-hand side, the matrix product has a positive entry in position $uv$, because $(A_{(})_{u,u'} = (A_{)})_{v',v} = 1$ by the definition of $A_{(}$ and $A_{)}$. Therefore $m_{uv} = 1$.

This concludes the proof of the first assertion of the theorem.

Deterministic algorithm. The algorithm from the second assertion of the theorem verifies all conditions in the definition of separator directly. This means in particular five matrix multiplications where the factors are matrices with elements from $\{0, \ldots, |V|\}$ (worst-case time $O(|V|^\omega)$), four inequalities between individual matrices (worst-case time $O(|V|^2)$), and a single equality constraint on one of the entries (constant time).

Remark. This algorithm reduces the verification of separators to 5 matrix multiplications over the nonnegative integers. While this result has complexity-theoretic consequences (see Section 4 below), it may appear unsatisfactory, as many theoretical algorithms for fast matrix multiplication are impractical. This brings the randomized algorithm to the fore.
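The deterministic check can be sketched directly from the ten conditions in (2). In this sketch, matrices are lists of rows, vertices are numbered from 0 rather than 1, edges are `(u, label, v)` triples, and all function names are illustrative assumptions. The sketch uses schoolbook multiplication, so it runs in cubic time; the $O(|V|^\omega)$ bound in the text would require substituting a fast matrix multiplication routine for `matmul`.

```python
def matmul(A, B):
    """Schoolbook product of two square integer matrices (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjacency(n, edges, label):
    """0-1 adjacency matrix of the edges carrying the given label."""
    A = [[0] * n for _ in range(n)]
    for (u, a, v) in edges:
        if a == label:
            A[u][v] = 1
    return A

def bool_leq(N, M):
    """Entrywise test bool(N) <= M."""
    return all(y >= (1 if x else 0)
               for row_n, row_m in zip(N, M) for x, y in zip(row_n, row_m))

def derive(n, edges, M_S):
    """Compute the sextuple determined by M_S via the five equalities."""
    A = {c: adjacency(n, edges, c) for c in "()[]"}
    M_pS, M_bS = matmul(A["("], M_S), matmul(A["["], M_S)
    return (M_S, matmul(M_S, M_S), M_pS, M_bS,
            matmul(M_pS, A[")"]), matmul(M_bS, A["]"]))

def is_separator(n, edges, s, t, sep):
    """Check all ten conditions of (2) for the candidate sextuple sep."""
    M_S, M_SS, M_pS, M_bS, M_pSp, M_bSb = sep
    if any(x not in (0, 1) for row in M_S for x in row):
        return False
    if any(M_S[i][i] != 1 for i in range(n)):       # I <= M_S
        return False
    if tuple(sep) != derive(n, edges, M_S):         # the five equalities
        return False
    return (bool_leq(M_SS, M_S) and bool_leq(M_pSp, M_S)
            and bool_leq(M_bSb, M_S) and M_S[s][t] == 0)
```

The helper `derive` reflects the observation that the five equalities determine every matrix of the sextuple once $M_S$ is fixed.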

Randomized algorithm. The algorithm from the final assertion of the theorem is the same as the previous one, except that instead of computing matrix multiplication it runs Freivalds' algorithm for verifying matrix multiplication [24].

Recall that Freivalds' algorithm for verifying $A \cdot B = C$ for some $n \times n$ matrices $A$, $B$, and $C$ proceeds by picking a 0–1 vector $u \in \{0, 1\}^n$ uniformly at random and checking if $A \cdot (Bu) = Cu$. The algorithm runs in $O(n^2)$ time and has error probability 1/2. The properties of the algorithm are transferred directly to give an $O(|V|^2)$ bound. Since we have five products to check, we reduce the error probability in an individual check to 1 / / ≤ /

Remark 11.

For the deterministic verification algorithm, it suffices to specify the 0–1 matrix $M_S$ only, because the other five matrices can be computed in time $O(|V|^\omega)$ from it.

Remark 12. Once again, there is nothing special about the Dyck-2 language in our certificate system. One can readily see that the conditions we impose on separators correspond to the following context-free grammar for the Dyck-2 language:

$$S \to SS \mid P\,) \mid Q\,] \mid \varepsilon, \qquad P \to (\,S, \qquad Q \to [\,S.$$

Replacing this grammar with a different one, we obtain a certificate system (for no-instances) for the CFL reachability problem where the fixed CFL is represented by any fixed CFG.
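Returning to the randomized algorithm of Theorem 10, Freivalds' check of a single product $A \cdot B = C$ can be sketched as below. The function name, the `rounds` parameter for independent repetition, and the `rng` parameter are assumptions made for illustration; the error bound in the final comment is the standard one-sided guarantee for independently repeated rounds, not a figure taken from the paper.

```python
import random

def freivalds(A, B, C, rounds=1, rng=random):
    """One-sided randomized test of A * B == C for n x n integer matrices.

    Each round costs three matrix-vector products, i.e. O(n^2) arithmetic
    operations, instead of a full matrix multiplication.
    """
    n = len(A)
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]              # random 0-1 vector
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False       # witness found: certainly A * B != C
    return True                # equal, or undetected with prob. at most 2**-rounds
```

A separator checker in this style would call `freivalds` once for each of the five equalities in (2) and keep the inequality and single-entry checks unchanged. When the product is correct the test never rejects, which matches the "never errs" half of the theorem.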

Remark 13.

In a model of computation with unit-cost integer arithmetic, integer matrixmultiplication can be veriﬁed in deterministic time O ( n ) [33]. For RAM with O (log n )-bit arithmetic operations, derandomization of Freivalds’ algorithm is an open problemeven in the nondeterministic setting. However, if the number of errors in the product isguaranteed to be O ( n − ε ), then a deterministic O ( n − ε )-time algorithm is known [34]. Complexity-theoretic summary of Section 3.

Leaving out sharper bounds on certificate size and polylog($n$) factors (required in the Turing model), Theorems 5 and 10 imply:

Theorem 14. $D_2$ Reach $\in \mathrm{NTIME}(n^2) \cap \mathrm{coNTIME}(n^\omega) \cap \mathrm{co\text{-}MATIME}(n^2)$.

Here $L \in \mathrm{MATIME}(t)$ (Merlin-Arthur time, introduced by Babai [3]) iff there exists a deterministic machine $M$ that takes inputs $x, y, z$ where $|y| = |z| = O(t(|x|))$, runs in time $O(t(|x|))$, and such that for every $x$,

$$x \in L \;\Rightarrow\; \exists y.\ \Pr_z[M(x,y,z) \text{ accepts}] = 1, \qquad x \notin L \;\Rightarrow\; \forall y.\ \Pr_z[M(x,y,z) \text{ accepts}] \le 1/2,$$

where the probability is with respect to the uniform distribution of $z$ in $\{0,1\}^{t(|x|)}$. Finally, $\mathrm{co\text{-}MATIME}(t)$ is the class of complements of languages in $\mathrm{MATIME}(t)$.

Fine-grained complexity of $D_2$ Reach. Fine-grained complexity research shows that even small improvements in (the exponent of) the running time of many algorithmic problems, such as orthogonal vectors or edit distance, would automatically give faster algorithms for Boolean satisfiability, SAT [55]. Would improvements over Chaudhuri's $O(n^3/\log n)$-time algorithm for $D_2$ Reach also have consequences for SAT? Here we show that subcubic certificates give an answer to this question.

In fine-grained complexity, perhaps the most influential hypothesis, and the ultimate source of many lower bounds, is the strong exponential-time hypothesis (SETH) [31], stating (roughly) that there is no algorithm for SAT better than exhaustive enumeration. The nondeterministic strong exponential-time hypothesis (NSETH) [13] extends it further.

Hypothesis 15 (SETH). For every $\varepsilon > 0$, there exists a $k$ so that $k$-SAT is not in $\mathrm{DTIME}[2^{n(1-\varepsilon)}]$, where $k$-SAT is the language of all satisfiable Boolean formulas in $k$-CNF.

Hypothesis 16 (NSETH). For every $\varepsilon > 0$, there exists a $k$ so that $k$-TAUT is not in $\mathrm{NTIME}[2^{n(1-\varepsilon)}]$, where $k$-TAUT is the language of all Boolean tautologies in $k$-DNF.

In both hypotheses, $n$ is the number of variables. It is unknown whether SETH and NSETH are true. NSETH implies SETH, and SETH implies $\mathrm{P} \neq \mathrm{NP}$. Carmosino et al. [13] explore consequences of NSETH and show that both proving and refuting it would lead to interesting consequences. In particular, NSETH implies the absence of fine-grained reductions from SAT to a number of problems and $\neg$NSETH implies circuit lower bounds.

It turns out that, because of our subcubic certificate systems (Section 3), there exists no fine-grained reduction from SAT (as well as from any SETH-hard problem) to $D_2$ Reach that would imply hardness beyond $n^\omega$, unless NSETH fails.

Because of space constraints, we relegate the formal definition of fine-grained reductions to Appendix D. Intuitively, a fine-grained reduction from $(L, t(n))$ to ($D_2$ Reach, $n^c$) means that, for every $\varepsilon > 0$, an $O(n^{c-\varepsilon})$-time algorithm for $D_2$ Reach implies an $O(t(n)^{1-\delta})$ algorithm for problem $L$ for some $\delta = \delta(\varepsilon) > 0$. This is not unlike usual Turing reductions (allowing multiple queries), tracking the precise exponents in the running time bounds. The following result is a consequence of Theorem 14.

Theorem 17. Unless NSETH fails, there is no fine-grained reduction from (SAT, $2^n$) to ($D_2$ Reach, $n^{\omega+\gamma}$) for any $\gamma > 0$.

While CFL reachability is a central problem in program analysis, an analogous problem in model checking is pushdown reachability [8, 23, 7, 47], formalized as follows.

We are given a pushdown automaton (PDA) $\mathcal{P} = (Q, \Gamma, \Delta)$, where $Q$ is a finite set of states, $\Gamma$ is a finite alphabet of stack symbols, and $\Delta \subseteq (Q \times \Gamma) \times (Q \times \Gamma^{\le 2})$ is a set of transitions, and an initial configuration $(q_0, \gamma_0) \in Q \times \Gamma$. We are additionally given a regular set of configurations $R$ specified by a $\mathcal{P}$-automaton: this is a usual, $\varepsilon$-free nondeterministic finite automaton (NFA) over the alphabet $\Gamma$ in which the set of control states is $S \supseteq Q$ and the transition relation is $\delta \subseteq S \times \Gamma \times S$. A set of final states, $F \subseteq S$, is usually taken to be disjoint from $Q$. Such a $\mathcal{P}$-automaton is said to accept a configuration $(q, w) \in Q \times \Gamma^*$ of the PDA $\mathcal{P}$ iff there is a walk from control state $q$ to some $\bar q \in F$ labelled by the word $w$; in other words, if $w$ is accepted by this NFA when started from $q$ as initial state. We ask if the PDA $\mathcal{P}$ has a run from $(q_0, \gamma_0)$ to some configuration from $R$.

We adapt our certificate system to pushdown reachability. For yes-certificates of size $O(|\Gamma||S|^2)$, we can convert the PDA to an equivalent CFG using the standard triplet construction (see, e.g., [30, Chapter 6]) and repeat the second half of the completeness argument from Subsection 3.1. Explicitly, a certificate is a "sub-grammar" of this CFG that is a straight-line program.

We now show how to certify that a given initial configuration cannot reach any configuration from a given regular set $R$.
The classic saturation algorithm for computing $\mathrm{Pre}^*(R)$, the set of (reflexive, transitive) predecessors of configurations in $R$, takes a $\mathcal{P}$-automaton $\mathcal{A}$ as input and iteratively adds transitions to it by the following rule:

$$\mathcal{P} \text{ has transition } (p, A) \to (q, w),\quad \mathcal{A} \text{ has walk } q \xrightarrow{\,w\,} s \quad\Longrightarrow\quad \text{add transition } p \xrightarrow{\,A\,} s \text{ to } \mathcal{A}. \tag{3}$$

By the following claim, saturation under (3) implies overapproximation of $\mathrm{Pre}^*(R)$. The converse inclusion is more subtle and will not be required.

Claim 18 (see, e.g., Carayol and Hague [12, Section 3.2]). A $\mathcal{P}$-automaton $\mathcal{A}$ accepts all configurations from $\mathrm{Pre}^*(R)$ if (i) it contains all transitions of the original $\mathcal{P}$-automaton and (ii) it is saturated, i.e., applying rule (3) does not change the transition relation.

Our certificate system for non-reachability relies on the observation that the update rule (3) can be expressed using matrix multiplication. A certificate is a finite family of matrices, $M_A$, $M_{A,B}$, $M_{A,B,C}$, $M'_{A,B,C}$, for all $A, B, C \in \Gamma$, satisfying the following conditions:

$$\begin{aligned}
P_A &\le M_A, & (M_{\gamma_0})_{q_0,f} &= 0 \text{ for all } f \in F, \\
T_{A,\varepsilon} &\le M_A, & & \\
\mathrm{bool}(M_{A,B}) &\le M_A, & M_{A,B} &= T_{A,B} \cdot M_B, \\
\mathrm{bool}(M_{A,B,C}) &\le M_A, & M'_{A,B,C} &= T_{A,BC} \cdot M_B, \qquad M_{A,B,C} = M'_{A,B,C} \cdot M_C,
\end{aligned} \tag{4}$$

where we assume with no loss of generality that $S = \{1, \dots, |S|\}$ and denote by $P_A$ the $A$-transition matrix of the original $\mathcal{P}$-automaton and, for all $A \in \Gamma$, $w \in \Gamma^{\le 2}$, by $T_{A,w} = (t^{(A,w)}_{ij})$ the 0–1 matrix of size $|S| \times |S|$ in which $t^{(A,w)}_{ij} = 1$ if $i, j \in Q$ and $\mathcal{P}$ contains a transition $(i, A) \to (j, w)$. The following proposition summarises the properties of this system:

Proposition 19. Certificates have $O(|\Gamma|^3 |S|^2)$ entries. An instance of PDA emptiness is a no-instance iff there exists a certificate for it. The conditions can be verified by a deterministic algorithm with running time $O(|\Gamma|^3 |S|^\omega)$ or a randomized algorithm with running time $O(|\Gamma|^3 |S|^2)$ that accepts valid certificates with probability one and rejects invalid ones with probability $\ge 1/2$.

A certificate represents a backwards invariant for the pushdown system $\mathcal{P}$ in question, an overapproximation of the set of configurations from which $R$ is reachable.

$D_2$ Reach instance

In interprocedural program analysis, the lack of algorithms with running time $O(n^{3-\varepsilon})$ is referred to as "the cubic bottleneck". Heintze and McAllester [29] captured this phenomenon by the class of "2NPDA-complete" problems. Here "2NPDA" stands for two-way nondeterministic pushdown automata, a model of computation that extends standard PDA with the ability to move back and forth on the (read-only) input tape [2]. A problem is 2NPDA-complete (following Neal [39]) if it is subcubic equivalent to 2NPDA recognition: given a word, does it belong to the language of a fixed 2NPDA? Heintze and McAllester show a number of 2NPDA-complete problems, including ground monadic rewriting reachability (see also [39]), data flow reachability, control flow reachability, and certain (non-)typability problems. Melski and Reps [38] show a reduction from CFL reachability to data flow reachability and set constraints (and thus to 2NPDA recognition) and a reverse reduction from data flow reachability to an instance of CFL reachability where the language is not fixed.

The following result appears to be folklore but is not found in the literature, strengthening the reduction of Melski and Reps to show hardness of CFL reachability for the fixed Dyck-2 language. The equivalence between problems (1) and (2) is sketched by Chaudhuri [16]. While we state the result for PDA emptiness, one can equivalently (or additionally) state it for pushdown reachability. We provide full proofs in the appendix.

Proposition 20. The following problems either all have truly subcubic algorithms, or none of them do: (1) 2NPDA language recognition, (2) PDA language emptiness, and (3) $D_2$ Reach.

Proof (sketch). We show three reductions:

• In 2NPDA recognition to PDA emptiness, each control state of the PDA remembers the position of the 2NPDA on the input tape and the control state of the 2NPDA. The size of the PDA is linear in the length of the input word, because the 2NPDA is fixed.
• In PDA emptiness to $D_2$ Reach, the graph mimics the transition diagram of the PDA. Stack symbols from $\Gamma$ are encoded by sequences of opening parentheses of two kinds of length $\lceil \log |\Gamma| \rceil$. Push transitions are modelled by sequences of edges with these labels, and pop transitions by sequences with matching closing parentheses. The reduction is linear-time, because the bit size of the PDA accounts for the $\log |\Gamma|$ factor.
• In the last reduction, we give a fixed 2NPDA that solves $D_2$ Reach. The 2NPDA guesses a path through the graph, maintaining at the bottom of the stack a sequence $\sigma \in \{(, [\}^*$, and the current vertex at the top of the stack. The length of the input word is proportional to the bit size of the graph (adjacency lists).

As a corollary, all of these problems have subcubic certificate schemes, and an analogue of Theorem 14 holds for them too (worked out for PDA emptiness in Section 5). Theorem 17 on the absence of SETH-hardness also extends to PDA emptiness and 2NPDA recognition.

For upper bounds, note that 2NPDA recognition is solvable in time $O(|w|^3/\log |w|)$ [45], and language emptiness for PDA in time $O(n^3/\log n)$. Indeed, the reduction of Proposition 20, combined with Chaudhuri's algorithm for CFL reachability [16], implies an $O(n^3/\log n)$ bound for PDA emptiness where $n$ is the bit size of the input (we give a sketch in Appendix F). In contrast, "textbook" algorithms for PDA emptiness go through equivalent context-free grammars [30], for which a cubic blow-up is unavoidable in the worst case [26].

We observe that the hardness of 2NPDA recognition is witnessed by a single "hardest" 2NPDA language: recognition for an arbitrary 2NPDA can be reduced to a single 2NPDA. Suppose some 2NPDA $\mathcal{A}$ over $\Sigma$ is given and the input to 2NPDA recognition for $\mathcal{A}$ is a word $w$. Applying our cycle of reductions from Proposition 20 (to PDA emptiness, then to CFL reachability, and then back to 2NPDA recognition), we get another word $u = u(\mathcal{A}, w)$ and a 2NPDA $\mathcal{B} = \mathcal{B}(\mathcal{A}, w)$ such that $\mathcal{B}$ accepts $u$ iff $\mathcal{A}$ accepts $w$. But $\mathcal{B}$ in fact does not depend on $\mathcal{A}$ or $w$, because it is a fixed 2NPDA for $D_2$ Reach. One refers to such languages as hardest: the recognition problem for $L(\mathcal{B})$ cannot be easier than the recognition problem for any 2NPDA language $L$. The following theorem states this result in language-theoretic terms. (Recall that a homomorphism is a mapping, say $h\colon \Sigma^* \to \Sigma^*$, such that $h(uv) = h(u)h(v)$ for all $u, v \in \Sigma^*$.)

Theorem 21. There exists a 2NPDA $\mathcal{A}_0$ over an input alphabet $\Sigma_0$ with the following property: for every 2NPDA $\mathcal{A}$ over every finite $\Sigma$ there is a homomorphism $h\colon \Sigma^* \to \Sigma_0^*$ such that, for all $w \in \Sigma^+$, $w \in L(\mathcal{A})$ if and only if $h(w) \in L(\mathcal{A}_0)$.

Essentially, $\mathcal{B} = \mathcal{A}_0$. Working out the details shows that the mapping $u(\mathcal{A}, \cdot)$ can be made a homomorphism for every $\mathcal{A}$. This requires an appropriate encoding for inputs to $\mathcal{A}_0$.

Remark 22. Rytter [44] showed there is a fixed hardest 2NPDA language $L_0$, based on the classic hardest context-free language by Greibach [28]. (Actually, Rytter only proves that, for all $w \in \Sigma^+$, one has $w \in L$ iff $h(w\$) \in L_0$.) Theorem 21 identifies a different hardest 2NPDA language. In contrast with Rytter's proof, our construction is self-contained and does not depend on Greibach's hardest CFL. Instead, our new hardest 2NPDA language is an encoding of a restricted version of Dyck-2 Reachability.

We now describe the hardest language $L(\mathcal{A}_0)$. The alphabet is $\Sigma_0 = \{\, (,\ ),\ [,\ ],\ 0,\ 1,\ -,\ * \,\}$. The language contains only words of the form

$$\ell_1 o_1 * \ell_2 o_2 * \dots * \ell_q o_q\, \ell_{q+1} o_{q+1} \dots\ \dots\ \ell_m o_m \tag{5}$$

and the membership of such words in the language is determined as follows. Consider a directed graph $G = (V, E)$ with $V = \{1, \dots, n\}$ where $n$ is the number of blocks separated by the vertex marker $*$. An edge $e = (i, j)$ belongs to $E$ if and only if the $i$th block has a subword $\ell_p o_p$ with $\ell_p \in \{(, ), [, ]\}$, $o_p = 1^k$ or $o_p = -^k$ where $j = i + k$ and this subword is preceded and followed by symbols from $\{0, *\}$ or tape endmarker. The edge label is in this case $\lambda(e) = \ell_i$. (If for some $i$ and $k$ the index $j$ is "off the tape", the tape endmarker counts as one virtual vertex and then the counting reverses the direction, "reflecting" off the endmarker.) The word belongs to $L(\mathcal{A}_0)$ if and only if $(G, \lambda, 1, n)$ is a yes-instance of $D_2$ Reach, i.e., if $G$ contains a walk from 1 to $n$ labelled by a word from the Dyck-2 language.

To sum up, this restricted version of 2NPDA recognition is the "hard core" of the problem: by Theorem 21 and Proposition 20, in order to find subcubic algorithms for $D_2$ Reach, it suffices to handle instances obtained from it (exploiting any structural properties). PDA emptiness and $D_2$ Reach are already hard for sparse graphs: a truly subcubic algorithm for either problem restricted to graphs with a linear number of edges would already result in a breakthrough algorithm for 2NPDA recognition.
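For orientation, the yes-instance condition above (a walk between two vertices labelled by a Dyck-2 word) can be decided by a simple fixed-point computation. The sketch below is a deliberately naive baseline, not one of the algorithms discussed in the paper: it assumes vertices numbered from 0, edges given as hypothetical `(source, label, target)` triples, and it makes no attempt to match the cubic or slightly subcubic running times cited above.

```python
def dyck2_reachable(num_vertices, edges, source, target):
    """Return True iff some walk from `source` to `target` spells a Dyck-2 word.

    `edges` is a list of (u, label, v) with label in "()[]" (assumed format).
    """
    closing_of = {"(": ")", "[": "]"}
    opens_into = {}    # x -> list of (opening label, u) for edges u -label-> x
    closes_from = {}   # y -> list of (closing label, v) for edges y -label-> v
    for u, label, v in edges:
        if label in closing_of:
            opens_into.setdefault(v, []).append((label, u))
        else:
            closes_from.setdefault(u, []).append((label, v))

    # reach holds pairs (x, y) joined by a walk labelled with a Dyck-2 word.
    reach = {(v, v) for v in range(num_vertices)}   # the empty word
    worklist = list(reach)

    def add(pair):
        if pair not in reach:
            reach.add(pair)
            worklist.append(pair)

    while worklist:
        x, y = worklist.pop()
        # Wrap a balanced walk in a matching pair of brackets.
        for open_label, u in opens_into.get(x, []):
            for close_label, v in closes_from.get(y, []):
                if closing_of[open_label] == close_label:
                    add((u, v))
        # Concatenate with every balanced walk known so far.
        for a, b in list(reach):
            if a == y:
                add((x, b))
            if b == x:
                add((a, y))
    return (source, target) in reach
```

For example, on the path 0 -(-> 1 -[-> 2 -]-> 3 -)-> 4 the pair (0, 4) is reachable while (0, 3) is not, since the walk to vertex 3 leaves an opening parenthesis unmatched.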

Acknowledgements.

We thank Sayan Bhattacharya and Karl Bringmann for interesting discussions. Philipp Schepper is supported by the European Research Council (ERC) consolidator grant no. 725978 SYSTEMATICGRAPH. Rupak Majumdar was funded in part by the Deutsche Forschungsgemeinschaft project 389792660-TRR 248 and by the European Research Council under the Grant Agreement 610150 (ERC Synergy Grant ImPACT).

References

[1] Amir Abboud, Arturs Backurs, and Virginia Vassilevska Williams. If the current clique algorithms are optimal, so is Valiant's parser. In IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 98–117. IEEE Computer Society, 2015.
[2] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. Time and tape complexity of pushdown automaton languages. Information and Control, 13(3):186–206, 1968.
[3] László Babai. Trading group theory for randomness. In Robert Sedgewick, editor, Proceedings of the 17th Annual ACM Symposium on Theory of Computing, May 6-8, 1985, Providence, Rhode Island, USA, pages 421–429. ACM, 1985. URL: https://doi.org/10.1145/22145.22192.
[4] Arturs Backurs and Piotr Indyk. Which regular expression patterns are hard to match? In Irit Dinur, editor, IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, 9-11 October 2016, Hyatt Regency, New Brunswick, New Jersey, USA, pages 457–466. IEEE Computer Society, 2016. URL: https://doi.org/10.1109/FOCS.2016.56.
[5] Daniel Bienstock, Neil Robertson, Paul D. Seymour, and Robin Thomas. Quickly excluding a forest. J. Comb. Theory, Ser. B, 52(2):274–283, 1991. URL: https://doi.org/10.1016/0095-8956(91)90068-U.
[6] Luc Boasson, Bruno Courcelle, and Maurice Nivat. The rational index: A complexity measure for languages. SIAM J. Comput., 10(2):284–296, 1981. URL: https://doi.org/10.1137/0210020.
[7] Ahmed Bouajjani, Javier Esparza, Alain Finkel, Oded Maler, Peter Rossmanith, Bernard Willems, and Pierre Wolper. An efficient automata approach to some problems on context-free grammars. Inf. Process. Lett., 74(5-15):221–227, 2000. URL: https://doi.org/10.1016/S0020-0190(00)00055-7.
[8] Ahmed Bouajjani, Javier Esparza, and Oded Maler. Reachability analysis of pushdown automata: Application to model-checking. In CONCUR '97: Concurrency Theory, 8th International Conference, Warsaw, Poland, July 1-4, 1997, Proceedings, volume 1243 of Lecture Notes in Computer Science, pages 135–150. Springer, 1997.
[9] Phillip G. Bradford. Efficient exact paths for Dyck and semi-Dyck labeled path reachability. CoRR, abs/1802.05239, 2018. arXiv:1802.05239.
[10] Karl Bringmann. Personal communication. 2018.
[11] Karl Bringmann, Allan Grønlund, and Kasper Green Larsen. A dichotomy for regular expression membership testing. In Chris Umans, editor, , pages 307–318. IEEE Computer Society, 2017. URL: https://doi.org/10.1109/FOCS.2017.36.
[12] Arnaud Carayol and Matthew Hague. Saturation algorithms for model-checking pushdown systems. In Zoltán Ésik and Zoltán Fülöp, editors, Proceedings 14th International Conference on Automata and Formal Languages, AFL 2014, Szeged, Hungary, May 27-29, 2014, volume 151 of EPTCS, pages 1–24, 2014. URL: https://doi.org/10.4204/EPTCS.151.1.
[13] Marco L. Carmosino, Jiawei Gao, Russell Impagliazzo, Ivan Mihajlin, Ramamohan Paturi, and Stefan Schneider. Nondeterministic extensions of the strong exponential time hypothesis and consequences for non-reducibility. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, Cambridge, MA, USA, January 14-16, 2016, pages 261–270. ACM, 2016.
[14] Krishnendu Chatterjee, Bhavya Choudhary, and Andreas Pavlogiannis. Optimal Dyck reachability for data-dependence and alias analysis. PACMPL, 2(POPL):30:1–30:30, 2018. URL: https://doi.org/10.1145/3158118.
[15] Krishnendu Chatterjee and Georg Osang. Pushdown reachability with constant treewidth. Inf. Process. Lett., 122:25–29, 2017.
[16] Swarat Chaudhuri. Subcubic algorithms for recursive state machines. In POPL '08, pages 159–169. ACM, 2008.
[17] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. J. Symb. Comput., 9(3):251–280, 1990.
[18] Mateus de Oliveira Oliveira and Michael Wehar. Intersection non-emptiness and hardness within polynomial time. In DLT 2018, volume 11088 of Lecture Notes in Computer Science, pages 282–290. Springer, 2018.
[19] Mateus de Oliveira Oliveira and Michael Wehar. On the fine grained complexity of finite automata non-emptiness of intersection. In Natasa Jonoska and Dmytro Savchuk, editors, Developments in Language Theory - 24th International Conference, DLT 2020, Tampa, FL, USA, May 11-15, 2020, Proceedings, volume 12086 of Lecture Notes in Computer Science, pages 69–82. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-48516-0_6.
[20] Danny Dolev, Shimon Even, and Richard M. Karp. On the security of ping-pong protocols. Inf. Control., 55(1-3):57–68, 1982. URL: https://doi.org/10.1016/S0019-9958(82)90401-6.
[21] Henning Fernau. Modern aspects of complexity within formal languages. In Carlos Martín-Vide, Alexander Okhotin, and Dana Shapira, editors, Language and Automata Theory and Applications - 13th International Conference, LATA 2019, St. Petersburg, Russia, March 26-29, 2019, Proceedings, volume 11417 of Lecture Notes in Computer Science, pages 3–30. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-13435-8_1.
[22] Henning Fernau and Andreas Krebs. Problems on finite automata and the exponential time hypothesis. Algorithms, 10(1):24, 2017. URL: https://doi.org/10.3390/a10010024.
[23] Alain Finkel, Bernard Willems, and Pierre Wolper. A direct symbolic approach to model checking pushdown systems. In Faron Moller, editor, Second International Workshop on Verification of Infinite State Systems, Infinity 1997, Bologna, Italy, July 11-12, 1997, volume 9 of Electronic Notes in Theoretical Computer Science, pages 27–37. Elsevier, 1997. URL: https://doi.org/10.1016/S1571-0661(05)80426-8.
[24] Rusins Freivalds. Fast probabilistic algorithms. In Jirí Becvár, editor, Mathematical Foundations of Computer Science 1979, Proceedings, 8th Symposium, Olomouc, Czechoslovakia, September 3-7, 1979, volume 74 of Lecture Notes in Computer Science, pages 57–69. Springer, 1979. URL: https://doi.org/10.1007/3-540-09526-8_5.
[25] Zvi Galil. Some open problems in the theory of computation as questions about two-way deterministic pushdown automaton languages. Mathematical Systems Theory, 10:211–228, 1977.
[26] Jonathan Goldstine, John K. Price, and Detlef Wotschke. A pushdown automaton or a context-free grammar: which is more economical? Theoret. Comput. Sci., 18:33–40, 1982.
[27] Jim Gray, Michael A. Harrison, and Oscar H. Ibarra. Two-way pushdown automata. Information and Control, 11(1/2):30–70, 1967.
[28] Sheila A. Greibach. The hardest context-free language. SIAM J. Comput., 2(4):304–310, 1973. URL: https://doi.org/10.1137/0202025.
[29] Nevin Heintze and David McAllester. On the cubic bottleneck in subtyping and flow analysis. In LICS'97. IEEE, 1997.
[30] John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation (3rd Edition). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2006.
[31] Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-sat. J. Comput. Syst. Sci., 62(2):367–375, 2001. URL: https://doi.org/10.1006/jcss.2000.1727.
[32] Somesh Jha and Thomas W. Reps. Model checking SPKI/SDSI. J. Comput. Secur., 12(3-4):317–353, 2004. URL: http://content.iospress.com/articles/journal-of-computer-security/jcs209.
[33] Ivan Korec and Jirí Wiedermann. Deterministic verification of integer matrix multiplication in quadratic time. In Viliam Geffert, Bart Preneel, Branislav Rovan, Julius Stuller, and A Min Tjoa, editors, SOFSEM 2014: Theory and Practice of Computer Science - 40th International Conference on Current Trends in Theory and Practice of Computer Science, Nový Smokovec, Slovakia, January 26-29, 2014, Proceedings, volume 8327 of Lecture Notes in Computer Science, pages 375–382. Springer, 2014. URL: https://doi.org/10.1007/978-3-319-04298-5_33.
[34] Marvin Künnemann. On nondeterministic derandomization of freivalds' algorithm: Consequences, avenues and algorithmic progress. In Yossi Azar, Hannah Bast, and Grzegorz Herman, editors, , volume 112 of LIPIcs, pages 56:1–56:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. URL: https://doi.org/10.4230/LIPIcs.ESA.2018.56.
[35] Lillian Lee. Fast context-free grammar parsing requires fast boolean matrix multiplication. J. ACM, 49(1):1–15, 2002.
[36] Markus Lohrey. Algorithmics on slp-compressed strings: A survey. Groups Complex. Cryptol., 4(2):241–299, 2012. URL: https://doi.org/10.1515/gcc-2012-0016.
[37] Anders Alnor Mathiasen and Andreas Pavlogiannis. The fine-grained and parallel complexity of andersen's pointer analysis. Proc. ACM Program. Lang., 5(POPL):1–29, 2021. URL: https://doi.org/10.1145/3434315.
[38] David Melski and Thomas Reps. Interconvertibility of a class of set constraints and context-free-language reachability. Theor. Comput. Sci., 248(1-2):29–98, 2000.
[39] Radford Neal. The computational complexity of taxonomic inference. Unpublished manuscript. Available at , 1989.
[40] G.C. Necula. Proof carrying code. In POPL 97: Principles of Programming Languages, pages 106–119. ACM, 1997.
[41] Laurent Pierre. Rational indexes of generators of the cone of context-free languages. Theor. Comput. Sci., 95(2):279–305, 1992. URL: https://doi.org/10.1016/0304-3975(92)90269-L.
[42] Aaron Potechin and Jeffrey O. Shallit. Lengths of words accepted by nondeterministic finite automata. Inf. Process. Lett., 162:105993, 2020. URL: https://doi.org/10.1016/j.ipl.2020.105993.
[43] T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In POPL 95: Principles of Programming Languages, pages 49–61. ACM, 1995.
[44] Wojciech Rytter. A hardest language recognized by two-way nondeterministic pushdown automata. Inf. Process. Lett., 13(4/5):145–146, 1981. URL: https://doi.org/10.1016/0020-0190(81)90045-4.
[45] Wojciech Rytter. Fast recognition of pushdown automaton and context-free languages. Information and Control, 67(1-3):12–22, 1985.
[46] Wojciech Rytter. 100 exercises in the theory of automata and formal languages, April 1987. Research report RR-99, University of Warwick, Department of Computer Science. URL: http://wrap.warwick.ac.uk/60795/.
[47] Stefan Schwoon. Model checking pushdown systems. PhD thesis, Technical University Munich, Germany, 2002. URL: http://tumb1.biblio.tu-muenchen.de/publ/diss/in/2002/schwoon.html.
[48] Joseph Swernofsky and Michael Wehar. On the complexity of intersecting regular, context-free, and tree languages. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part II, volume 9135 of Lecture Notes in Computer Science, pages 414–426. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-47666-6_33.
[49] Roei Tell. Proving that prBPP = prP is as hard as proving that "almost NP" is not contained in P/poly. Inf. Process. Lett., 152, 2019. URL: https://doi.org/10.1016/j.ipl.2019.105841.
[50] Leslie G. Valiant. General context-free recognition in less than cubic time. J. Comput. Syst. Sci., 10(2):308–315, 1975.
[51] Mikhail N. Vyalyi. On regular realizability problems. Probl. Inf. Transm., 47(4):342–352, 2011. URL: https://doi.org/10.1134/S003294601104003X.
[52] Mikhail N. Vyalyi and Alexander A. Rubtsov. On regular realizability problems for context-free languages. Probl. Inf. Transm., 51(4):349–360, 2015. URL: https://doi.org/10.1134/S0032946015040043.
[53] Michael Wehar. Hardness results for intersection non-emptiness. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part II, volume 8573 of Lecture Notes in Computer Science, pages 354–362. Springer, 2014. URL: https://doi.org/10.1007/978-3-662-43951-7_30.
[54] Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. In STOC, pages 887–898. ACM, 2012.
[55] Virginia Vassilevska Williams. On some fine-grained questions in algorithms and complexity. In International Congress of Mathematicians (ICM'18), 2018. Available at https://eta.impa.br/dl/194.pdf and https://people.csail.mit.edu/virgi/eccentri.pdf.
[56] Mihalis Yannakakis. Graph-theoretic methods in database theory. In Proceedings of the Ninth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, April 2-4, 1990, Nashville, Tennessee, USA, pages 230–242. ACM Press, 1990.

A Proof of Claim 1

Let $P = (Q, \Sigma, \Gamma, \delta, q_0, F)$. With at most a constant factor blow-up, we can assume that $P$ is in a normal form, in which each transition $(q, a, Z, q', \gamma, d) \in \delta$ is either a "push" ($\gamma = Z'Z \in \Gamma^2$) or a "pop" ($\gamma = \varepsilon$) or an "unchanged" ($\gamma = Z$).

Let $\ell = \lceil \log |\Gamma| \rceil$. Fix any injective maps $\varphi\colon \Gamma \to \{(, [\}^\ell$ and $\psi\colon \Gamma \to \{), ]\}^\ell$ such that, for all $Z \in \Gamma$, the word $\psi(Z)$ is obtained from $\varphi(Z)$ by switching opening brackets to closing brackets without changing their type (that is, ( is replaced by ) and [ by ]) and then reversing the word.

The graph $G'$ has vertices $(V \times Q) \cup V' \cup \{q_f\}$, where $V'$ is a set of new vertices and $q_f$ is a new vertex. We shall specify $V'$ later.

The source and target vertices are $s' = (s, q_0)$ and $t' = q_f$. There is a path from $(v, q)$ to $(v', q')$ labelled with the consecutive letters of $\varphi(Z')$ if there is an edge $v \xrightarrow{a} v'$ in $G$ and $(q, a, Z, q', Z'Z) \in \delta$. The intermediate vertices along this path are distinct and are not incident to any other edge of $G'$. Similarly, there is a path from $(v, q)$ to $(v', q')$ labelled with the consecutive letters of $\psi(Z)$ if there is an edge $v \xrightarrow{a} v'$ in $G$ and $(q, a, Z, q', \varepsilon) \in \delta$. There is an edge $(v, q) \to (v', q')$ labelled with $\varepsilon$ if there is an edge $v \xrightarrow{a} v'$ in $G$ and $(q, a, Z, q', Z) \in \delta$. The set of all intermediate vertices added along the way constitutes $V'$. Finally, there is an edge $(t, q) \to q_f$ labelled with $\varepsilon$ for each $q \in F$.

Since $P$ is fixed, the algorithm runs in linear time in $G$ and outputs $G'$, which is linear in the size of $G$. By induction, we can show that there is a path from $s$ to $t$ in $G$ labelled with a word accepted by $P$ iff there is a path from $(s, q_0)$ to $q_f$ in $G'$ labelled with a word in Dyck-2.

B Proof of Lemma 6

Let $N$ be the nonterminal from the statement of the lemma. If $N$ is not productive, i.e., cannot derive any word, then all of its productions can be removed without any effect on $L(G)$. This is simply because $N$ cannot appear in any successful derivation. We will therefore assume that $N$ is productive.

Consider the parse tree of any successful derivation from $N$. We can find in this parse tree a vertex labelled with $N$ such that none of its descendants is labelled with $N$. The subtree $T_N$ rooted at this vertex corresponds to a derivation that applies some production $P\colon N \to \xi$ first and never uses $N$ again.

By removing all other productions with left-hand side $N$ from $G$, we obtain a new grammar $G'$. Let us show that $L(G') \neq \emptyset$. Indeed, let $S$ be the axiom of $G$. As $S$ is productive, $u \in L(G)$ for some word $u$. Consider any parse tree $T$ of $u$ in $G$. If $T$ contains no occurrence of $N$, then it is already a valid parse tree with respect to $G'$, and we are done. Otherwise, for every node labelled with $N$ in $T$ from which the shortest path to the root has no other occurrence of $N$, we replace the corresponding subtree by $T_N$. This results in a valid parse tree with respect to $G'$, because $T_N$ has one occurrence of $N$ only, namely at its root, where the production applied is $P$. The new parse tree is a derivation of some word in $L(G')$, which concludes the proof.

C Proof of Proposition 19

Let $\mathcal{A}$ be a $\mathcal{P}$-automaton (saturated or not). For each $A \in \Gamma$, let $M_A = (m_{ij})$ denote the $A$-transition matrix of $\mathcal{A}$, that is, the 0–1 matrix of size $|S| \times |S|$ in which $m_{ij} = 1$ if $\mathcal{A}$ contains a transition $i \xrightarrow{A} j$ and $m_{ij} = 0$ otherwise. Then rule (3) can be decomposed into the following updates, for all $A, B, C \in \Gamma$:

$$\begin{aligned}
M_A &:= \mathrm{bool}(M_A + T_{A,\varepsilon}), \\
M_A &:= \mathrm{bool}(M_A + T_{A,B} \cdot M_B), \\
M_A &:= \mathrm{bool}(M_A + T_{A,BC} \cdot M_B \cdot M_C).
\end{aligned}$$

The composition of certificates (4) and the existence of verification algorithms follow as in Subsection 3.2.
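The three updates above, and the verification they lead to, can be sketched in a few lines. The code below is an illustrative assumption-laden sketch, not the paper's implementation: matrices are plain nested lists indexed from 0, `T` is a hypothetical dictionary mapping a pair (stack symbol, tuple of at most two stack symbols) to the matrix $T_{A,w}$, and the verifier recomputes the products itself instead of receiving $M_{A,B}$ and $M_{A,B,C}$ as part of the certificate. The function `freivalds` shows the randomized product check that a verifier given those products could use; it never rejects a correct product.

```python
import random

def zeros(n):
    return [[0] * n for _ in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bool_leq(X, Y):
    """Entrywise bool(X) <= Y for a 0-1 matrix Y."""
    return all(x == 0 or y != 0 for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def matvec(X, v):
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def freivalds(X, Y, Z, rounds=20):
    """Randomized test of X*Y == Z using only matrix-vector products."""
    for _ in range(rounds):
        r = [random.randint(0, 1) for _ in range(len(X))]
        if matvec(X, matvec(Y, r)) != matvec(Z, r):
            return False
    return True

def rule_product(T_Aw, w, M):
    """T_{A,w} multiplied by M_B for each symbol B of w (w has length <= 2)."""
    prod = T_Aw
    for B in w:
        prod = matmul(prod, M[B])
    return prod

def saturate(P0, T):
    """Apply the three updates until no matrix M_A changes."""
    M = {A: [row[:] for row in rows] for A, rows in P0.items()}
    changed = True
    while changed:
        changed = False
        for (A, w), T_Aw in T.items():
            for i, row in enumerate(rule_product(T_Aw, w, M)):
                for j, x in enumerate(row):
                    if x and not M[A][i][j]:
                        M[A][i][j], changed = 1, True
    return M

def verify(P0, T, M, q0, gamma0, final):
    """Deterministic check of the conditions (4), recomputing all products."""
    if not all(bool_leq(P0[A], M[A]) for A in P0):
        return False                      # original transitions must be kept
    if any(M[gamma0][q0][f] for f in final):
        return False                      # initial configuration is accepted
    return all(bool_leq(rule_product(T_Aw, w, M), M[A])
               for (A, w), T_Aw in T.items())

def mat(n, entries):
    X = zeros(n)
    for i, j in entries:
        X[i][j] = 1
    return X

# Toy instance (assumed): states Q = {0, 1}, extra automaton state 2 is final.
P0 = {"a": zeros(3), "b": mat(3, [(1, 2)])}          # R = {(1, "b")}
T = {("a", ()): mat(3, [(0, 1), (1, 1)]),            # pops (0,a)->(1,eps), (1,a)->(1,eps)
     ("a", ("a", "a")): mat(3, [(0, 0)])}            # push (0,a)->(0,aa)
M_no = saturate(P0, T)
T_yes = dict(T)
T_yes[("a", ("b",))] = mat(3, [(1, 1)])              # extra rule (1,a)->(1,b)
M_yes = saturate(P0, T_yes)
```

In the toy instance the stack never contains `b`, so the saturated matrices form a valid certificate; adding the rule that rewrites `a` to `b` makes the target reachable, and no family of matrices passes `verify`.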

D Fine-grained reductions and proof of Theorem 17

We discuss further preliminaries on fine-grained complexity, referring the reader to the recent survey by Vassilevska Williams [55] and to the paper on the nondeterministic strong exponential-time hypothesis by Carmosino et al. [13].

Let $L_1$ and $L_2$ be languages, and let $T_1$ and $T_2$ be time bounds, i.e., functions $\mathbb{N} \to \mathbb{N}$. We interpret pairs $(L_i, T_i)$ as problems with their conjectured (or presumed) complexities. We say that $(L_1, T_1)$ fine-grained reduces to $(L_2, T_2)$, written $(L_1, T_1) \le_{\mathrm{FGR}} (L_2, T_2)$, if (a) for all $\varepsilon > 0$, there is $\delta > 0$ and a deterministic Turing reduction $M^{L_2}$ from $L_1$ to $L_2$ such that $\mathrm{DTIME}[M] \le T_1^{1-\delta}$ and such that (b) if $Q(M, x)$ denotes the set of queries made by $M$ to the $L_2$ oracle on an input $x$ of length $n$, then the query lengths obey the time bound

$$\sum_{q \in Q(M,x)} \bigl(T_2(|q|)\bigr)^{1-\varepsilon} \le \bigl(T_1(n)\bigr)^{1-\delta}.$$

Intuitively, a fine-grained reduction from $(L_1, T_1)$ to $(L_2, T_2)$ enables algorithmic savings for $L_2$ to be transferred to $L_1$. That is, if $L_2$ can be solved in time $T_2^{1-\varepsilon}$, then $L_1$ can be solved in time $T_1^{1-\delta}$. A language $L$ with time complexity $T$ is SETH-hard if (SAT, $2^n$) $\le_{\mathrm{FGR}} (L, T)$.

Theorem 23 ([13], Theorem 2 and Corollary 2). Suppose NSETH holds and a problem $L$ belongs to $\mathrm{NTIME}[T] \cap \mathrm{coNTIME}[T]$. Then (SAT, $2^n$) $\not\le_{\mathrm{FGR}} (L, T^{1+\gamma})$ for any $\gamma > 0$. Also, for any $L'$ that is SETH-hard with time $T'$, and any $\gamma > 0$, we have $(L', T') \not\le_{\mathrm{FGR}} (L, T^{1+\gamma})$.

We are now ready to formulate Theorem 17 rigorously.

Theorem 24 (Theorem 17 restated). Unless NSETH fails, (SAT, $2^n$) $\not\le_{\mathrm{FGR}}$ ($D_2$ Reach, $n^{\omega+\gamma}$) for any $\gamma > 0$.

It remains to observe that Theorem 17 follows from Theorems 14 and 23.

E Proof of Proposition 20

Preliminary Definitions

Two-way nondeterministic pushdown automata (2NPDA) are a powerful formalism introduced in 1967 by Gray, Harrison, and Ibarra [27].

2NPDA have the form $\mathcal{A} = (Q, \Sigma, \Gamma, \delta, q_0, F)$, where $Q$ is a finite set of states, $\Sigma$ and $\Gamma$ are finite alphabets of input and stack symbols, respectively, $q_0 \in Q$ is the initial state, $F \subseteq Q$ is the set of final states, and $\delta \subseteq Q \times \Sigma \times \Gamma \times Q \times \Gamma^* \times \{-1, 0, +1\}$ is a transition relation. We assume $\Sigma$ contains two designated "end of tape" symbols ⊳ and ⊲. We assume that $\Gamma$ contains a designated "end of stack" symbol $Z_0$ such that any transition $(q, \sigma, Z_0, q', w, d) \in \delta$ satisfies $w = Z_0$. Thus, no transition of $\mathcal{A}$ replaces $Z_0$ on the stack with a different symbol and no transition pushes $Z_0$.

Informally, the 2NPDA $\mathcal{A}$ has a finite control (states from $Q$) which reads a symbol of $\Sigma$ on its input tape and the top symbol in $\Gamma$ of a pushdown store. Based on the transition relation $\delta$, the 2NPDA moves by changing the control state, replacing the top symbol of the pushdown store by a finite string of symbols (possibly the empty string), and moving its input head at most one symbol left or right. Initially, the 2NPDA is in state $q_0$, and its pushdown store consists of the single symbol $Z_0$. The input tape consists of a word $w \in (\Sigma \setminus \{⊳, ⊲\})^*$ surrounded by a left marker ⊳ and a right marker ⊲, and the 2NPDA scans the left marker ⊳.

Remark 25.

We include the endmarkers ⊳ and ⊲ in the set Σ here, even though wedid not mention them back in Section 6 when specifying the alphabet Σ for our hardest2NPDA language. Naturally, all symbols used by automata (including the endmarkers)should be included in the tape alphabet of these automata.A conﬁguration of the 2NPDA A is a triple ( q, w ˆ ax, γ ), where q ∈ Q , w, x ∈ Σ ∗ , a ∈ Σ,and γ ∈ Γ ∗ . The “hat” on a denotes that the machine is currently scanning the letter a .We write ( q , a . . . ˆ a i . . . a n , Zγ ) → ( q , a . . . ˆ a j . . . a n , γ ′ γ ) whenever ( q , a i , Z, q , γ ′ , d ) ∈ δ for d ∈ {− , , +1 } , and j = i + d . We require j ∈ { , . . . , n } , that is, the scan positiondoes not “fall oﬀ” the input word. Note that the input tape is not changed, only thescan position may change. We write → ∗ for the reﬂexive and transitive closure of → . Aword w ∈ (Σ \ { ⊳ , ⊲ } ) ∗ is accepted by the 2NPDA if ( q , ˆ ⊳ w ⊲ , Z ) → ∗ ( q, ⊳ w ˆ ⊲ , Z ) forsome q ∈ F . The language L ( A ) of A is the set of all accepted words (in (Σ \ { ⊳ , ⊲ } ) ∗ ).Informally, the 2NPDA has some run that leads it from the initial conﬁguration withthe word on the input tape to a ﬁnal state. Wlog, we can assume above that a word isaccepted in a ﬁnal state with the 2NPDA scanning the right end marker and the pushdownstore only contains Z . The transition relation is nondeterministic; we only require thatsome run is accepting. For the reader familiar with one-way automata, we remark thatthe role of epsilon-transitions is played by explicit speciﬁcation of head movements.A 1NPDA, or just PDA for short, is a 2NPDA such that δ ⊆ Q × Σ × Γ × Q × Γ ∗ × { , +1 } . Informally, the transitions of a PDA do not allow the scan position tomove left, so PDA can only move left to right. PDA accept exactly the context-freelanguages. In comparison, 2NPDA are surprisingly powerful devices. 
In fact, even theirdeterministic counterparts can recognize languages such as { a n b p ( n ) | n ≥ } where p is aﬁxed polynomial with natural coeﬃcients and { x y | x is a subword (factor) of y } [46,25].We consider the following decision problems for these machine classes. The recognition problem for a class of machines C asks, for a ﬁxed machine M ∈ C and an input word w ∈ Σ ∗ , if w is accepted by M , i.e., if w ∈ L ( M ). The emptiness problem for class C asks, given a machine M ∈ C , if L ( M ) = ∅ . Proof.

Proposition 20 follows from Lemmas 26, 27, and 28, which we prove next.

Lemma 26. There exists a linear-time algorithm that, given a 2NPDA $B$ and a word $w$, outputs a PDA $P$ such that:
• $|P| \le O(|w|)$ for any fixed $B$ and
• the language of $P$ is nonempty iff $B$ accepts $w$.

Remark. In fact, $|P| \le O(|B| \cdot |w|)$.

Proof. Denote $n = |w|$ and let $S$ be the set of control states of $B$. Construct a PDA $P$ with the set of control states $Q = \{0, 1, \ldots, n + 1\} \times S$. The first component of the states of $P$ corresponds to a possible position of the input head of the 2NPDA $B$ run on $w$. Indeed, when $B$ is run on the word $w$, its head has $n + 2$ possible positions: over any of the $n$ letters of $w$, over the left endmarker, and over the right endmarker.

PDA $P$ has the initial state $(0, s_0)$, where $s_0$ is the initial state of $B$. Transitions of the (nondeterministic) PDA $P$ are defined so that $P$ simulates the (nondeterministic) computation of $B$ on $w$. The stack of $P$ is always the same as the stack of $B$, and the second component of the control state of $P$ is the same as the control state of $B$. Transitions of $B$ depend on the input letter, which is available to $P$, because $P$ "remembers" in the control state where the input head of $B$ is positioned, and the input word $w$ is fixed. Transitions of $P$ need not read any letter from the input; $P$ accepts (rejects) whenever $B$ does. It is straightforward to see that both assertions of the lemma hold.

Lemma 27.

There exists a linear-time algorithm that, given a PDA $P$, outputs a directed graph $G = (V, E)$, labels $\lambda : E \to \{(, ), [, ]\}$ and two vertices $s, t \in V$ such that $(G, \lambda, s, t)$ is a yes-instance of $D_2$ Reach iff the language of $P$ is nonempty.

Proof. We show how to construct the required instance of $D_2$ Reach given a PDA $P = (Q, \Sigma, \Gamma, \delta, q_0, F)$.

The idea is that we encode stack symbols from $\Gamma$ by words over the alphabet $\{(, [\}$; pushing symbols on the stack corresponds to traversing edges of $G$ labeled by opening brackets, and popping symbols corresponds to traversing edges labeled by closing brackets.

Let $\ell = \lceil \log |\Gamma| \rceil$. Fix any injective maps $\varphi : \Gamma \to \{(, [\}^\ell$ and $\psi : \Gamma \to \{), ]\}^\ell$ such that, for all $Z \in \Gamma$, the word $\psi(Z)$ is obtained from $\varphi(Z)$ by switching opening brackets to closing brackets without changing their type (i.e., ( is replaced by ) and [ by ]) and then reversing the word.

We next construct an auxiliary graph $G' = (V', E')$ with labels $\lambda : E' \to \{(, ), [, ]\}$. The set $V'$ contains $Q$ as a subset. For each transition $(q, a, Z, q', \gamma, d) \in \delta$, the graph $G'$ contains a path from $q$ to $q'$ of length $\ell \cdot (1 + |\gamma|)$. The edges of this path are labelled by consecutive letters of the word $\psi(Z) \cdot \varphi(\gamma)$; all intermediate vertices are distinct and are incident to no other edge of $G'$. It is easy to see that the number of edges of $G'$ does not exceed $|P|$. (Notice that the input letter $a$ is ignored in this construction.)

Recall that the automaton $P$ has a nonempty language if and only if there is a path from its initial configuration to a final configuration, enabled by some input word from $\Sigma^*$. The initial configuration $c_0$ of $P$ has control state $q_0$ and stack content $Z_0$; and any final configuration $c_f$ has some control state $q_f \in F$ and the same stack content $Z_0$. By construction, $c_0 \to^* c_f$ in the PDA $P$ if and only if the graph $G'$ has a walk from $q_0$ to $q_f$ labeled by a word $u \in \{(, ), [, ]\}^*$ such that $\varphi(Z_0) \cdot u \cdot \psi(Z_0)$ is a Dyck-2 word.

It now remains to obtain the graph $G$ from $G'$ by adding fresh vertices $s$ and $t$ and connecting them to the other vertices by (1) a path from $s$ to $q_0$ labeled by $\varphi(Z_0)$ and (2) paths from each $q \in F$ to $t$ labeled by $\psi(Z_0)$. Each of these paths has length $\ell$; paths of type (2) have $\ell - {}$[…] $(G, \lambda, s, t)$ is the instance of $D_2$ Reach with the required property.
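The construction in the proof of Lemma 27 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `bracket_codes` and `pda_to_dyck_graph`, the tuple format of transitions, the vertex names "s" and "t", and the convention that the first symbol of the pushed word ends up on top of the stack are all choices made here. The sketch also uses at least one bracket per stack symbol, which deviates from the formula for the code length when the stack alphabet has a single symbol, and it does not share edges among the paths into the sink.

```python
from itertools import count
from math import ceil, log2


def bracket_codes(gamma):
    """Return (phi, psi): fixed-length bracket codes for the stack symbols."""
    # Assumption: at least one bracket per symbol, so that paths are nonempty
    # even when the stack alphabet has a single symbol.
    ell = max(1, ceil(log2(len(gamma))))
    flip = {"(": ")", "[": "]"}
    phi, psi = {}, {}
    for idx, sym in enumerate(gamma):
        bits = format(idx, "0{}b".format(ell))
        phi[sym] = "".join("(" if b == "0" else "[" for b in bits)
        # psi: same bracket types, closing instead of opening, reversed.
        psi[sym] = "".join(flip[c] for c in reversed(phi[sym]))
    return phi, psi


def pda_to_dyck_graph(transitions, gamma, q0, finals, z0):
    """Build labelled edges (u, bracket, v) plus a fresh source and sink."""
    phi, psi = bracket_codes(gamma)
    edges, fresh = [], count()

    def add_path(u, word, v):
        # One edge per bracket; intermediate vertices are fresh.
        cur = u
        for i, bracket in enumerate(word):
            nxt = v if i == len(word) - 1 else ("aux", next(fresh))
            edges.append((cur, bracket, nxt))
            cur = nxt

    for (q, _letter, top, q2, pushed, _move) in transitions:
        # Pop `top`, then push `pushed` so that pushed[0] ends up on top.
        # The input letter and the head move are ignored, as in the proof.
        word = psi[top] + "".join(phi[z] for z in reversed(pushed))
        add_path(q, word, q2)
    add_path("s", phi[z0], q0)  # assumes no PDA state is named "s" or "t"
    for f in finals:
        add_path(f, psi[z0], "t")
    return edges, "s", "t"
```

Each transition contributes a path whose length is the code length times one plus the number of pushed symbols, matching the path length stated in the proof.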

Lemma 28.

There exist a 2NPDA language $L'$ and a linear-time algorithm that, given a directed graph $G = (V, E)$ with labels $\lambda : E \to \{(, ), [, ]\}$ and two vertices $s, t \in V$, outputs a word $w$ such that $w \in L'$ iff $(G, \lambda, s, t)$ is a yes-instance of $D_2$ Reach.

Remark 29. For the linear time bound, we assume that the graph $G$ is encoded in binary in the input. If this is not the case, the running time of the algorithm suffers a slowdown by a factor of $O(\log |V|)$.

Proof. Words of the language $L'$ are encodings of the quadruples $(G, \lambda, s, t)$, where the vertices of $G$ are encoded in binary. In more detail, every $w \in L'$ has the following form: first an encoding of $s$, then an encoding of $t$, and finally a sequence of encodings of edges of $G$, where every edge $e \in E$ is followed by its label $\lambda(e)$. All these encodings are separated by delimiters.

The language $L'$ is over an alphabet of size $O(1)$; a word belongs to $L'$ iff it follows the format we have just described and the graph $G$ has a walk from $s$ to $t$ labeled with a sequence from the Dyck-2 language over $\{(, ), [, ]\}$.

The algorithm from the assertion of the lemma simply writes down the encodings in the required format; it is clear that the algorithm runs in linear time and the obtained word belongs to $L'$ iff $(G, \lambda, s, t)$ is a yes-instance of $D_2$ Reach.

It remains to prove that the language $L'$ is recognized by a 2NPDA. Let us describe this 2NPDA $R$. It first reads the input word and checks that it follows the format described above. If this is not the case, $R$ rejects; otherwise it guesses the required walk in $G$ from $s$ to $t$ as follows.

A configuration of $R$ stores on the stack the following data:
• (at the bottom) a sequence $\sigma \in \{(, [\}^*$, and
• (at the top) a vertex $v \in V$.
In this configuration, $R$ has already found a walk from $s \in V$ to $v \in V$ labeled with some word $\sigma' \in \{(, ), [, ]\}^*$ that reduces to $\sigma$. (A word $\sigma' \in \{(, ), [, ]\}^*$ reduces to $\sigma$ if $\sigma$ can be obtained from $\sigma'$ by a sequence of transformations that replace the subwords () and [] with $\varepsilon$.)

Here is how $R$ works:
1. At the beginning, initialize $\sigma$ with the empty word and $v$ with $s \in V$, pushing them to the stack.
2. Repeatedly guess the next edge $e \in E$ in the walk (leaving the loop nondeterministically after some iteration):
   (a) move the head to the encoding of $e = (u_1, u_2)$ written on the input tape;
   (b) pop the encoding of $v \in V$ from the stack, reading the encoding of $u_1$ from the input tape in sync; if $u_1 \ne v$, reject;
   (c) look at the label $\lambda(e)$:
       • if $\lambda(e) \in \{(, [\}$, then push $\lambda(e)$ onto the stack, extending the current $\sigma \in \{(, [\}^*$, and
       • if $\lambda(e) \in \{), ]\}$, then pop the last symbol of $\sigma \in \{(, [\}^*$; proceed if the two symbols form a matching pair, otherwise reject (also reject if $\sigma$ is empty);
   (d) push the encoding of $u_2$ to the stack.
3. Check if the current vertex $v$ is equal to $t$ and $\sigma$ is empty. Accept if the check succeeds, otherwise reject.
It is easy to see that an accepting computation of $R$ exists iff $(G, \lambda, s, t)$ is a yes-instance of $D_2$ Reach.

F On an $O(n^3 / \log n)$ algorithm for PDA emptiness

As mentioned in Section 6, the PDA emptiness to Dyck-2 reachability reduction from Proposition 20, combined with Chaudhuri's algorithm for CFL reachability [16], implies a slightly subcubic bound for PDA emptiness.

Indeed, Chaudhuri shows how to solve instances of CFL reachability for a fixed language (including the Dyck-2 language) in time $O(n^3 / \log n)$, where $n$ is the number of nodes in the graph.

Suppose we start with a PDA emptiness instance with $s$ states, $t$ transitions, and $r$ stack symbols. Note that we can safely ignore the input alphabet symbols. The bit size of the instance is $b = O(t \log(s + r))$. The reduction from Lemma 27 gives an instance of Dyck-2 reachability with $O(b)$ nodes. Chaudhuri's algorithm solves it in time $O(b^3 / \log b)$, which is subcubic in the bit size of the input of PDA emptiness (although not necessarily subcubic in $s + t$).

This complexity seems folklore but was never made explicit. In particular, "textbook" algorithms for PDA emptiness go through equivalent context-free grammars [30], for which a cubic blow-up is unavoidable in the worst case [26].
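To make the target problem of these reductions concrete, the following Python sketch decides small instances of Dyck-2 reachability by naive saturation. It is not Chaudhuri's algorithm [16] and makes no attempt at the subcubic bound discussed above; it is only a baseline that computes all vertex pairs joined by a walk labelled with a Dyck-2 word. The function name `dyck2_reachable` and the edge format `(u, label, v)` are assumptions of this sketch.

```python
def dyck2_reachable(vertices, edges, s, t):
    """Decide whether some walk from s to t is labelled by a Dyck-2 word."""
    match = {"(": ")", "[": "]"}
    opens = [(x, lab, u) for (x, lab, u) in edges if lab in match]
    closes = {}
    for (v, lab, y) in edges:
        if lab in (")", "]"):
            closes.setdefault((v, lab), set()).add(y)

    # reach holds pairs (u, v) joined by a walk whose label is balanced.
    reach = {(v, v) for v in vertices}  # the empty word is balanced
    changed = True
    while changed:
        changed = False
        new = set()
        # Rule 1: wrap a balanced walk u ~> v in a matching bracket pair.
        for (x, lab, u) in opens:
            for (u2, v) in reach:
                if u2 == u:
                    for y in closes.get((v, match[lab]), ()):
                        new.add((x, y))
        # Rule 2: concatenate two balanced walks.
        for (u, v) in reach:
            for (v2, w) in reach:
                if v2 == v:
                    new.add((u, w))
        if not new <= reach:
            reach |= new
            changed = True
    return (s, t) in reach
```

The two rules mirror the grammar of Dyck-2 words (empty word, concatenation, and wrapping in a matching pair), so the loop terminates once no new pair of vertices can be derived.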
G Proof of Theorem 21

Fix an arbitrary 2NPDA $A$ over a finite alphabet $\Sigma$. We can assume with no loss of generality that $A$ has a single final state and that it is different from its initial state: $|F| = 1$, $q_0 \notin F$. (It is an easy exercise to modify $A$ to ensure this assumption holds.)

Suppose an input word $w \in \Sigma^*$ is given. Lemma 26 reduces recognition of $L(A)$ to the emptiness problem for a PDA defined as a product of the word $w$ and 2NPDA $A$. More concretely, this PDA has control states $Q = \{0, 1, \ldots, n + 1\} \times S$ where $S$ is the set of control states of $A$. Note that $|Q| = O(|w| \cdot |A|) = O(|w|)$ since $A$ is fixed. Here and below, the constant behind $O(\cdot)$ depends on $A$ but not on $w$. Similarly, the stack alphabet of the PDA is fixed too. We now give this PDA as input to a further reduction to Dyck-2 Reachability (Lemma 27), which produces an instance $(G, \lambda, s, t)$.

Claim 30. The graph $G$ has the following properties:
(a) it has $O(|w|)$ vertices (including intermediate ones, resulting from mapping the stack alphabet into binary words);
(b) its edges are labeled with symbols from $\{(, ), [, ]\}$;
(c) there is a linear order on the vertices such that each edge connects two vertices that are $O(1)$ positions away from each other in this order;
(d) the source is first and the sink is last in the order.

Proof. Property (a) is due to the fact that $A$, and thus its stack alphabet, is fixed. Property (b) is immediate. Property (c) ultimately reflects the fact that $A$, as a two-way pushdown automaton, cannot jump cells of the input tape, that is, its head can only move one cell left or right if it moves at all; this is represented by $d \in \{-1, 0, +1\}$ in the syntax of 2NPDA. Thus, the linear order on vertices of the graph is inherited from the natural ordering of letters of the input tape, $\rhd w \lhd$. Reductions to PDA emptiness and $D_2$ Reach effectively apply a direct product construction with a constant factor expansion. Within each block corresponding to an input letter, vertices can be ordered arbitrarily, provided that the initial state of $A$ comes first and the final state last, which ensures property (d). Note that our previous preprocessing of $A$ ensures that these two states are different, and our acceptance condition and subsequent reductions do the rest of the work.

We refer to instances $(G, \lambda, s, t)$ with the properties stated in Claim 30 as those of Restricted Dyck-2 Reachability.

Suppose $k \in \mathbb{N}$ is chosen such that the constants behind $O(\cdot)$ in conditions (a) and (c) are at most $k$ and every vertex has at most $k$ outgoing edges. We think of this $k = O(1)$ as the "width" of the instance, which depends on the original 2NPDA $A$ but not on $w$.

Remark 31.

The constant $O(1)$ in property (c) is reminiscent of the bounded pathwidth condition (see, e.g., Bienstock et al. [5]). However, in our case the graph has an even more "regular" structure. We leave it open whether this structure can be characterized by constant pathwidth and constant degree (and restricting the direction and labels of the edges). In comparison, Chatterjee and Osang look at pushdown reachability with constant treewidth [15].

It remains to map this instance of Dyck-2 Reachability to an instance of 2NPDA recognition, for a fixed 2NPDA which we now define.

For each vertex $v$, let $\mathit{index}(v)$ denote the position of $v$ in the order specified in property (c), ranging from 1 to $O(|w|)$. (Once again, the constant behind $O(\cdot)$ depends on $A$ but not on $w$.) The construction below follows in spirit the proof of Lemma 28 and refines the details in order to produce a homomorphism $h$. The key difference is that, to produce the new input word, we will not write edges as "$(u, v), \lambda(u, v)$". Instead we will:
1) sort the vertices $u$ according to their $\mathit{index}(u)$ ascending and, for each $u$, group all the edges departing from $u$ together (each $u$ will have at most $k$ outgoing edges);
2) write edges $(u, v)$ as pairs $(\lambda(u, v), \mathit{offset}(u, v))$ where $\mathit{offset}(u, v) = \mathit{index}(v) - \mathit{index}(u)$, i.e., how many vertices to the right the destination of the edge is; this difference is written in unary notation (without incurring blowup, as this difference cannot exceed $k$);
3) write vertices as "separators" between groups of edges.
Putting everything together, the input to the new 2NPDA has the form (5) (see page 14), where $\ell_i \in \{(, ), [, ]\}$, and each $o_i$ is either the empty word, or $1 \ldots 1$, or $-1 \ldots 1$.

Claim 32. The set of valid encodings (5) of Restricted Dyck-2 Reachability can be recognized by a fixed 2NPDA.

The construction of the 2NPDA in Claim 32 is similar to the reduction of Lemma 28 which we already have. Instead of guessing the next vertex, this new 2NPDA $A'$ "scrolls" left and right in a deterministic way to the destination of the current edge, counting in unary with the help of its stack. The nondeterministic choices that $A'$ makes are which outgoing edge from the current vertex to choose next.

Note that the construction of $A'$ is independent of $k$, thus identifying a single hardest language, $L(A')$. Moreover, for a given initial 2NPDA $A$ this reduction replaces each symbol in $w$ with $O(1)$ vertices and $O(1)$ edges, where this $O(1)$ depends just on $A$ and not on $w$. The exact collection of these vertices and edges is fully determined by each symbol of $w$, independently of its position within $w$. The vertices are not addressed in any "absolute" numbering scheme, so this mapping can be realised as a homomorphism.

Remark 33.

The use of relative rather than absolute addresses (to encode offsets) appears in a related context but for a different problem in Neal's work on taxonomic inference [39], which is at the origin of the connection between 2NPDA and program analysis.
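The relative-address encoding of edges can be illustrated with a short Python sketch. The concrete syntax below is hypothetical and is not the form (5) of the paper: the separator symbol "#", the ASCII minus sign, and the function name `encode_offsets` are assumptions made for illustration. The sketch writes one separator per vertex, in index order, followed by that vertex's outgoing edges as a label and a unary offset.

```python
def encode_offsets(order, edges, sep="#"):
    """Encode a labelled graph using relative (unary) vertex offsets.

    order: vertices listed in the linear order of property (c)
    edges: triples (u, label, v) with label one of ( ) [ ]
    """
    index = {v: i for i, v in enumerate(order)}
    out_edges = {v: [] for v in order}
    for (u, label, v) in edges:
        out_edges[u].append((label, index[v] - index[u]))

    parts = []
    for u in order:
        parts.append(sep)  # the vertex itself is only a separator
        for label, offset in out_edges[u]:
            # Offset in unary: empty for 0, "1"*m for +m, "-" + "1"*m for -m.
            unary = ("-" if offset < 0 else "") + "1" * abs(offset)
            parts.append(label + unary)
    return "".join(parts)
```

Because only differences of indices are written, the encoding of a block of vertices does not change when the block is shifted along the order, which is the property that makes a position-independent (homomorphic) encoding possible.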

Summary and the endmarkers problem.

We now have achieved the following: for every 2NPDA $A$ there is a homomorphism $h$ such that $w \in L(A)$ if and only if $h(E w D) \in L(A')$. Note the appearance of the endmarkers here. (We use $E$ instead of $\rhd$ and $D$ instead of $\lhd$ to avoid a notation clash in the discussion that follows.) They reflect the fact that, in the chain of our reductions, the set of control states of the PDA is $\{0, 1, \ldots, n + 1\} \times S$ not $\{1, \ldots, n\} \times S$.

To lift our construction from $E w D$ to just $w$, it may be tempting to appeal to the following fact, which is not difficult to prove. Let $x, y \in \Sigma^*$ be fixed. Suppose a 2NPDA accepts a language $L \subseteq x \cdot \Sigma^* \cdot y$. Then there exists another 2NPDA which accepts the language $\{w \mid xwy \in L\}$.

Unfortunately, this fact does not quite achieve our goal. This is because the new 2NPDA we would obtain from it depends on $x$ and $y$. In our context, $x$ and $y$ should be the images of the original endmarkers, i.e., we would like to have $x = h(E)$ and $y = h(D)$. But these two words depend on the homomorphism $h$, and thus on the 2NPDA $A$ that we started from. This is at odds with our objective: we need a single 2NPDA for our hardest language, not an entire family dependent on $A$.

There are several ways to deal with this issue. One is reminiscent of Rytter's approach [44]: we can decide we are content with keeping a single endmarker in, i.e., we would only like to find an $L_1$ such that, for all $w \in \Sigma^+$, one has $w \in L$ iff $h_1(w\$) \in L_1$. Here \$ is a fresh symbol. It is not very difficult to find such an $h_1$ and $L_1$ based on our construction: essentially, the word $h(E)$ needs to be merged with the word $h(D)$ and placed to the right of $h(w)$. So we would like to choose $h_1(\$) = h(D) h(E)$ and $h_1(a) = h(a)$ for all other symbols $a$. The 2NPDA for $L_1$ is the same as our 2NPDA $A'$ constructed above, with the following modification. Suppose it starts following an edge from some vertex (block) to the left but hits the left end of the tape, i.e., the left endmarker $\rhd$. We now use this symbol to refer to the tape alphabet of the 2NPDA $A'$ (and not the tape alphabet of the original machine $A$). The new 2NPDA will move all the way to the right end of the tape and continue its search for the destination vertex from the right endmarker $\lhd$. Edges within $h(D)$ need not be changed, but the ones among them that lead to the right ($\mathit{offset}(u, v) > 0$, or equivalently $o_i \in 1^+$) will make the 2NPDA hit $\lhd$, go back to the left of the tape and continue the search from $\rhd$.

One further technicality that needs to be dealt with is the beginning and end of the computation. Recall that our Restricted Dyck-2 Reachability asked for a path from the very first vertex to the very last one. Since $h(E)$ moved, we now need to change this convention. More concretely, the only two vertices that we can distinguish correspond to the last two control states and the head position over the left endmarker. So the original 2NPDA $A$ needs to be changed accordingly.

While this approach recovers Rytter's result, we show below that there is a way to eliminate the extra symbol \$ altogether.

Merging endmarker blocks into other symbols.

Our solution to the endmarkers problem acknowledges that the words $h(E)$ and $h(D)$ cannot be eliminated completely. Indeed, the vertices and edges that these two words encode correspond to the behaviour of the original 2NPDA $A$ over the tape endmarkers, and this behaviour can contribute to the computations of $A$ in a nontrivial way.

However, what we can do is to embed all this information into words $h(a)$ for all other symbols $a$. For a first intuition (to be amended later), we would like to set $h_0(a) = h(E) \,\|\, h(a) \,\|\, h(D)$ for all non-endmarker symbols $a$, where $\|$ denotes a specially tailored ternary version of the perfect shuffle operation. More concretely, let $w_1, w_2, w_3$ be arbitrary words such that, for some single $\ell$, we have $w_i = \prod_{j=1}^{\ell} w_{i,j}$ where none of the words $w_{i,j}$ contains the vertex marker symbol […]

$$w_1 \,\|\, w_2 \,\|\, w_3 := \prod_{j=1}^{\ell} w_{1,j}\, w_{2,j}\, w_{3,j}.$$

Note that this shuffling relies on $\ell$ being the same for all three arguments, and ultimately this means the same number of vertices (blocks) in all words $h(a)$. This is in fact ensured by our constructions above (although we could always achieve this by introducing extra dummy vertices where necessary).

As a result of this shuffling arrangement, we can think of new input words as having three interleaving "tracks", each containing a separate sequence of vertices. Naturally, this requires some changes to the wiring, as follows.

First, the offsets that specify the edges of the graph departing from the vertices of $h(a)$ need to be updated. This is not difficult. Recall that edge destinations are specified using relative addresses of vertices. For every edge from a vertex in $h(a)$, its offset needs to be multiplied by 3, so that the edge skips intermediate vertices from copies of $h(E)$ and $h(D)$.

Second, we need to provide a way for the new 2NPDA $A_0$ to reach the vertices in $h(E)$ and $h(D)$. To achieve this, we consider the scenario in which $A_0$ will traverse edges leading to a vertex in $h(E)$. (The case of $h(D)$ is handled in a symmetric way.) Suppose the head of the 2NPDA is over a block (vertex) within the leftmost $h(a)$. Taking an edge with a negative offset, it moves left but then hits the left tape endmarker $\rhd$. When it does so, the stack of the 2NPDA still contains the number of vertices to be skipped. The 2NPDA then needs to change from the second (main) track, which contains the information from $h(a)$s, to the first track, which stores multiple copies of the word $h(E)$. Effectively, this amounts to treating $\rhd$ as just another $h(E)$ and not from the left where the head of the automaton is now located. Thus, we reverse the encoding of each of $h(E)$ and $h(D)$, as follows: we re-define our special shuffle as

$$w_1 \,\|\, w_2 \,\|\, w_3 := \prod_{j=1}^{\ell} \overline{w_{1,\ell+1-j}}\; w_{2,j}\; \overline{w_{3,\ell+1-j}},$$

where $\overline{u}$ is the same word as $u$ in which every maximal subword of the form $-1^m$ is replaced with $1^m$ and each $1^m$, without a preceding $-$, with $-1^m$. Our 2NPDA must remember, in its control state, which "track" of the input it is over. The second track corresponds to the usual operation. Over the first track:
• Edges previously specified by a positive offset need to be followed to the left instead of to the right (hence the $\overline{w_{1,\ell+1-j}}$ above and not $w_{1,\ell+1-j}$). If the left tape endmarker $\rhd$ is encountered, the automaton transitions to the second (main) track, and only then continues to the right (in the normal mode).
• Edges specified by a negative offset need to be followed to the right instead of to the left (again, this matches the $\overline{w_{1,\ell+1-j}}$ above). We note that if the input word for our 2NPDA is the homomorphic image under $h_0$ of some word in $\Sigma^*$, then the automaton will never leave the leftmost $h(a)$ while being on the first track, because the original 2NPDA $A$ cannot move left from the left endmarker.
The third track is arranged in a similar way.

Importantly, while we apply these changes to $h_0$ and the "wiring" of the graph, we can keep the semantics of our hardest language untouched. The "tracks" themselves need not enter the description of the language. The only new "feature" that is necessary is changing the tracks, and this can be achieved simply by specifying that when our new 2NPDA $A_0$ encounters a tape endmarker during its operation, this endmarker is counted as a virtual vertex and the head "reflects" off it, continuing the countdown in the opposite direction.

However, as was the case with the approach described above and involving \$, our new construction of $h_0$ breaks the convention about the source and target vertices in the Dyck-2 reachability instance (albeit in a slightly different way). Because of the effective reversal of vertex ordering within $h(E)$ and $h(D)$, the required adjustment to the original 2NPDA $A$ is that its initial control state needs to be the last and its (only) final control state the first in the ordering.

To sum up, by applying these adjustments and "compiling" the homomorphism $h_0$ the way we have described, we arrive at the desired 2NPDA $A_0$. (Note that there is freedom in whether we take $\varepsilon \in L(A_0)$ or $\varepsilon \notin L(A_0)$, but as some 2NPDA languages contain $\varepsilon$ and some do not, their homomorphic images will necessarily disagree on εε