A Quantum Interior-Point Predictor-Corrector Algorithm for Linear Programming
P. A. M. Casares∗ and M. A. Martin-Delgado†
Departamento de Física Teórica, Universidad Complutense de Madrid.
(Dated: July 7, 2020)

We introduce a new quantum optimization algorithm for dense Linear Programming problems, which can be seen as the quantization of the Interior Point Predictor-Corrector algorithm [1] using a Quantum Linear System Algorithm [2]. The (worst-case) work complexity of our method is, up to polylogarithmic factors, $O(L\sqrt{n}(n+m)\|M\|_F\bar\kappa\epsilon^{-2})$, for $n$ the number of variables in the cost function, $m$ the number of constraints, $\epsilon^{-1}$ the target precision, $L$ the bit length of the input data, $\|M\|_F$ an upper bound to the Frobenius norm of the linear systems of equations that appear, and $\bar\kappa$ an upper bound to the condition number $\kappa$ of those systems of equations. This represents a quantum speed-up in the number $n$ of variables in the cost function with respect to comparable classical Interior Point algorithms when the initial matrix of the problem $A$ is dense: if we substitute the quantum part of the algorithm by classical algorithms such as Conjugate Gradient Descent, the whole algorithm would have complexity $O(L\sqrt{n}(n+m)^2\bar\kappa\log(\epsilon^{-1}))$, or, with exact methods, at least $O(L\sqrt{n}(n+m)^{2.373})$. Also, in contrast with any Quantum Linear System Algorithm, the algorithm described in this article outputs a classical description of the solution vector and the value of the optimal solution.

Keywords: Linear Programming Problem, Quantum Algorithms, Quantum Linear Approximation, Interior Point Method, Iteration Complexity, Strong Polynomiality.
I. INTRODUCTION
Linear Programming problems are among the most fundamental optimization problems [3–5]. Applications abound on both the personal and professional fronts: improving a project delivery, scheduling of tasks, analyzing supply chain operations, shelf space optimization, designing better strategies and logistics, and scheduling problems in general. Linear Programming is also used in Machine Learning, where Supervised Learning works on the basis of linear programming: a system is trained to fit a mathematical model of an objective (cost) function from labeled input data, so that it can later predict values for unknown test data [6, 7]. More specifically, linear programming is a method to find the best outcome of a linear function, such as maximum profit or lowest cost, in a mathematical model whose requirements are represented by linear constraints on the variables. Semi-Definite Programming (SDP) is an extension of Linear Programming where the objective or cost function is formulated with a non-diagonal matrix and the constraints contain more general inequalities [8–11].

We are in the time of small quantum computers with reduced computational capabilities due to noisy physical qubits [12–15]. The challenge of surpassing the power of current and foreseeable classical computers is attracting a lot of attention in academia [16, 17] and in technological companies. This motivates the endeavour of searching for new quantum algorithms beyond the standard ones that spurred the field of quantum computation in the mid 90s (Shor, Grover, etc.) [18–21]. Only recently has a quantum algorithm for solving SDP problems been proposed by Brandão and Svore, providing us with the first quantum advantage for these optimization problems [22–26].

∗ [email protected] † [email protected]

A. Background on Linear Programming.
The development of methods to solve Linear Programming problems has a long tradition, starting with the Simplex Method [5], which is simple and widely used in practice, but has (in the worst case) exponential time complexity in the number of variables. In 1979 Khachiyan proved that the ellipsoid method ensured (weak) polynomial complexity in the number of variables, $O(n^6 L)$ [27]. However, in practice the ellipsoid algorithm is complicated and not competitive. In 1984 Karmarkar proposed the first Interior Point algorithm [28], with complexity $O(n^{3.5}L)$. It was more practical than the ellipsoid method and gave rise to a large variety of available Interior Point methods [29]. The greatest advantage of these methods is that, contrary to what happens with the Simplex Method, Interior Point algorithms have a worst-case runtime polynomial in the number of variables. Among them, the Predictor-Corrector Method [1, 30] is arguably one of the best procedures to achieve an extremely well-behaved solution, and requires just $O(\sqrt{n}L)$ iterations. However, the Predictor-Corrector method does not explicitly indicate what method should be used to solve the linear system of equations that appears in each iteration, so the complexity of solving the system will depend on the method used, as can be seen in Table I. For further background on Interior Point methods we refer the reader to the review [29].

B. Our algorithm
Here we present a quantum algorithm that relies on the quantization of this method. One important feature of our quantum Interior Point algorithm is that it is a hybrid algorithm: partially classical, partially quantum. This feature has become very common; a similar situation occurs with the Brandão-Svore algorithm for SDP, the Quantum Eigen-Solver for quantum chemistry [37–41], and many others, and it has the advantage of requiring shorter coherence times. The core of the quantization of the Interior Point algorithm relies on the use of block encoding techniques [2], which extend the applicability of the Quantum Linear System Algorithm proposed by Harrow, Hassidim and Lloyd (HHL) [42] to the case where $A$ is dense, in order to solve the linear systems of equations that appear in the Predictor-Corrector steps. However, in order to apply the QLSA in the context of Linear Programming, we have to address several caveats, since its straightforward application is doomed to failure.

The quantum Interior Point algorithm we propose benefits from several fundamental properties inherited from the classical Predictor-Corrector algorithm, and has a better performance than other classical Interior Point algorithms. In particular [1]:

1. The Predictor-Corrector method can solve the Linear Programming problem without assuming the existence of feasible or optimal solutions.

2. If the Linear Programming problem has a solution, the loop of this interior point algorithm approaches feasibility and optimality at the same time for both the primal and the dual problem, and if the problem is infeasible or unbounded the algorithm detects infeasibility for either the primal or the dual problem.

3. The algorithm can start from any point near the center of the positive orthant.

The notions of feasible and optimal solutions etc. are defined in Sec.
II, where a self-contained review of the Predictor-Corrector method is presented.

The work complexity of the algorithm proposed here is $O(L\sqrt{n}(n+m)\|M\|_F\bar\kappa\epsilon^{-2})$, where $n$ is the number of variables of the cost function, $m$ is the number of constraints, $L$ is the bit length of the input data (see Eq. (1)), $\|M\|_F$ is an upper bound to the Frobenius norm of the linear systems of equations that appear, $\bar\kappa$ is an upper bound to the condition numbers of the linear systems of equations that appear in the Predictor-Corrector steps, and $\epsilon^{-1}$ is the precision with which one wants to solve the linear systems of equations. To avoid confusion, notice that in the text we will call $\epsilon$ the error and $\epsilon^{-1}$ its associated precision, because a low error means high precision and vice versa. The time complexity of the proposed quantum Interior Point algorithm can be reduced from $O(L\sqrt{n}(n+m)\|M\|_F\bar\kappa\epsilon^{-2})$ to $O(L\sqrt{n}\|M\|_F\bar\kappa)$ by distributing the work of each iteration between $O((n+m)\epsilon^{-2})$ quantum processors.

If we substituted the QLSA by a classical Linear System Algorithm, the price to pay would be, at least, an $O(\sqrt{n+m})$ increase in the work complexity, as $\|M\|_F = O(\sqrt{n+m})$ if the spectral norm of $M$ is bounded [43]. For example, if we used conjugate gradient descent, the overall algorithm complexity would be $O(L\sqrt{n}(n+m)^2\bar\kappa\log(\epsilon^{-1}))$. Also, if we wanted to use an exact Linear System Algorithm, the best we could hope for is the complexity it takes to exactly invert a matrix [33], $O((n+m)^{2.373})$ [44], thus implying an overall work complexity for the algorithm of $O(L\sqrt{n}(n+m)^{2.373})$, which could be parallelized over $(n+m)^{2.373}$ processors to lower the time complexity to $O(\sqrt{n}L)$ up to polylogarithmic terms [33].
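To make these leading-order scalings concrete, the following toy script (our own illustration, not from the paper; constants and polylogarithmic factors are dropped, and $\|M\|_F$ is replaced by its spectrally-bounded estimate $O(\sqrt{n+m})$) computes the bit length $L$ of Eq. (1) for a random integer matrix and compares the dominant work terms:

```python
import numpy as np

def bit_length_L(A):
    """Bit length of the input data, Eq. (1): L = sum_ij ceil(log2(|a_ij| + 1) + 1)."""
    return int(np.sum(np.ceil(np.log2(np.abs(A) + 1.0) + 1.0)))

def work_estimates(n, m, L, kappa, eps):
    """Leading-order work terms only; polylogs and constants dropped, illustrative."""
    N = n + m
    return {
        "conjugate_gradient": L * np.sqrt(n) * N**2 * kappa * np.log(1.0 / eps),
        "cholesky": L * np.sqrt(n) * N**3,
        # quantum scaling, using ||M||_F ~ sqrt(N) for spectrally bounded M
        "block_encoding_qlsa": L * np.sqrt(n) * N * np.sqrt(N) * kappa / eps**2,
    }

rng = np.random.default_rng(0)
L = bit_length_L(rng.integers(-8, 9, size=(50, 50)))
est = work_estimates(n=10**6, m=10**6, L=L, kappa=50.0, eps=0.1)
print(est["block_encoding_qlsa"] < est["conjugate_gradient"])  # True for these sizes
```

At fixed precision the quantum term undercuts the classical solvers once $n+m$ is large enough, while shrinking $\epsilon$ at fixed size favours the classical solvers; this is exactly the trade-off discussed above.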
A summary of these results is presented in Table I.

It is worth mentioning that our quantization approach to Linear Programming problems is radically different from the method of Brandão and Svore, and this comes with several benefits. Namely, the problem of quantizing linear programming using multiplicative weight methods [45] as in Brandão-Svore is that they yield an efficiency depending on parameters $R$ and $r$ of the primal and dual problems. In fact, these parameters might depend on the sizes $n, m$ of the cost function, so the real time complexity of the algorithm remains hidden. For instance, for some classes of problems $Rr/\epsilon = O(n)$ according to theorem 24 of [26]. Moreover, generically, unless specified, these $R, r$ parameters cannot be computed beforehand, but only after running the algorithm (we will have a similar situation with $\bar\kappa$). Thus, the real efficiency of the quantum algorithm is masked by overhead factors behaving badly in $R$ and $r$. Their algorithm has nevertheless a good complexity in $n$, $O(\sqrt{n}+\sqrt{m})$, but a much worse complexity in the precision, $O(\epsilon^{-5})$, for the most recent improvement of the Brandão-Svore algorithm [25].

Algorithms for Linear Programming | Work complexity | Parallelizable?
Pred-Corr. [1] + Conjugate Gradient [31] | $O(L\sqrt{n}(n+m)^2\bar\kappa\log(\epsilon^{-1}))$ | $O((n+m)^2)$
Pred-Corr. [1] + Cholesky decomposition [32] | $O(L\sqrt{n}(n+m)^3)$ | $O((n+m)^2)$
Pred-Corr. [1] + Optimal exact [33] | $O(L\sqrt{n}(n+m)^{2.373})$ | $O((n+m)^{2.373})$
Multiplicative weights [25] | $O((\sqrt{n}(Rr/\epsilon)+\sqrt{m})(Rr/\epsilon)^4)$ | $O(1)$
Quantum Interior Point with Newton system [34] | $O(L\sqrt{n}(n+m)\mu\bar\kappa^3\epsilon^{-2}\log(\epsilon'^{-1}))$ | $O((n+m)\epsilon^{-2})$
Pred-Corr. [1] + Block-encoding [2] (This algorithm) | $O(L\sqrt{n}(n+m)\|M\|_F\bar\kappa\epsilon^{-2})$ | $O((n+m)\epsilon^{-2})$

TABLE I. Comparison of the complexity of different algorithms^a that can be used for solving dense Linear Programming problems; the first three are purely classical whereas the last three are hybrid quantum-classical. It includes only leading-order terms. QLSA stands for a dense Quantum Linear System Algorithm [2]. Note that the algorithms of [25] and [34] can be applied to more general problems concerning Semidefinite Programming. In the table $\mu \le \|M\|_F = O(\sqrt{n})$, which reflects the fact that both [34] and our algorithm share common features even if they were developed independently. In reference [26], theorem 24, it is proven that for many combinatorial problems the term $(Rr/\epsilon) = O(n+m)$, affecting the complexity of [25]. With respect to $L$, for many cases the complexity of the Predictor-Corrector method does not depend on $L$ [36], but as an upper bound $O(L)$ complexity will be common to all Interior Point methods [29]; in contrast, the Multiplicative weight method will not depend on it. Finally, the column 'Parallelizable?' gives the number of quantum or classical processors that can be used in parallel to solve the problem; the time complexity is divided by the corresponding amount.

^a For completeness we want to mention that [35] has recently proposed quantum subroutines to speed up the simplex method.

C. Structure of the paper
The paper has two main sections. The first reviews the Predictor-Corrector algorithm from [1]. It is itself divided into subsections where we explain how to initialize and terminate the algorithm, and the main loop. In the second section we explain the changes we carry out in order to be able to use the QLSA from [2]. In particular, we start with a subsection discussing the condition number, and then we focus on how to prepare the initial quantum states for the QLSA and read out the results using a tomography protocol from [46]. Finally, we explain the QLSA, comment on the possibility of quantizing the termination of the algorithm, and devote two subsections to the complexity of the overall algorithm, its comparison with other alternatives, and the possibility of failure.

We recommend the reader to first understand the Predictor-Corrector algorithm in its classical form, and then take a look at figures 1 and 2 in order to get an overall impression of the algorithm before trying to understand the technical details.
II. THE PREDICTOR-CORRECTOR ALGORITHM
In this section we review the Predictor-Corrector algorithm of Mizuno, Todd and Ye for solving Linear Programming problems [1]. As stated in the original article, we will see that it performs $O(\sqrt{n}L)$ iterations of the main loop in the worst-case scenario, where $n$ is the number of variables and $L$ is the size of the encoding of the input data in bits:

$$L := \sum_i^n \sum_j^m \lceil \log_2(|a_{ij}| + 1) + 1 \rceil, \qquad (1)$$

for $a_{ij}$ the elements of the matrix defining the problem, $A$, which appears in the following equations (2) and (3). Note that $L = O(mn)$. However, in a typical case the number of iterations will not depend on $L$, but will rather be $O(\sqrt{n}\log n)$ [36].

The linear programming problem we want to solve is called the primal problem (Linear Problem, LP): given $A \in \mathbb{R}^{m\times n}$, $c \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$, find $x \in \mathbb{R}^n$ that

minimizes $c^T x$ (2a)
subject to $Ax \ge b$, $x \ge 0$. (2b)

The dual problem (Dual Problem, LD) has the same solution: find $y \in \mathbb{R}^m$ that

maximizes $b^T y$ (3a)
subject to $A^T y \le c$. (3b)

Then, for linear programming problems, the primal-dual gap is 0:

$$b^T y - c^T x = 0. \qquad (4)$$

A usual strategy is to use slack variables to turn all inequality constraints into equality constraints, at the cost of additional constraints. Thus, we can substitute $A^T y \le c$ by $A^T y + s = c$, with $s \ge 0 \in \mathbb{R}^n$ being the slack (dual) variable of the constraint (3).

A. Initialization
According to the prescription of [1], one way to solve the previous problems (2) and (3) is to set up another problem from which the solution of (2) and (3) can be easily obtained. This new problem is homogeneous, in the sense that there is a single non-zero constraint, and self-dual, as its dual problem is itself. Therefore, let $x^0 > 0 \in \mathbb{R}^n$, $s^0 > 0 \in \mathbb{R}^n$, and $y^0 \in \mathbb{R}^m$ be arbitrary initialization variables which will be chosen later on. Then, formulate the Homogeneous Linear Problem (HLP) as

$$\min \theta \qquad (5)$$

such that ($x \ge 0$, $\tau \ge 0$, $\theta \in \mathbb{R}$):

$$\begin{aligned}
+A x - b\tau + \bar{b}\theta &= 0\\
-A^T y + c\tau - \bar{c}\theta &\ge 0\\
+b^T y - c^T x + \bar{z}\theta &\ge 0\\
-\bar{b}^T y + \bar{c}^T x - \bar{z}\tau &= -(x^0)^T s^0 - 1,
\end{aligned} \qquad (6)$$

where

$$\bar{b} := b - A x^0, \quad \bar{c} := c - A^T y^0 - s^0, \quad \bar{z} := c^T x^0 + 1 - b^T y^0. \qquad (7)$$

The last constraint of (6) is used to impose self-duality. It is also important to remark that $\bar{b}$, $\bar{c}$ and $\bar{z}$ indicate the infeasibility of the initial primal and dual points, and the dual gap, respectively.

Recall also that we use slack variables to convert inequality constraints into equality constraints. Those slack variables indicate the amount by which the original constraint deviates from an equality. As we have two inequality constraints, we introduce slack variables $s \in \mathbb{R}^n$ for the second constraint in (6) and $k \in \mathbb{R}$ (in [1] denoted $\kappa$) for the third:

$$-A^T y + c\tau - \bar{c}\theta - s = 0; \quad s \ge 0 \qquad (8)$$
$$b^T y - c^T x + \bar{z}\theta - k = 0; \quad k \ge 0 \qquad (9)$$
$$(s^0)^T x + (x^0)^T s + \tau + k - ((x^0)^T s^0 + 1)\theta = (x^0)^T s^0 + 1. \qquad (10)$$

Once we have defined these variables, theorem 2 of [1] proves that any point fulfilling

$$y = y^0, \quad x = x^0 > 0, \quad s = s^0 > 0, \quad \tau = k = \theta = 1 \qquad (11)$$

is a feasible point, and therefore a suitable set of initialization parameters for our algorithm. A particularly simple choice is

$$y^0 = 0_{m\times 1}, \quad x^0 = 1_{n\times 1} = s^0, \qquad (12)$$

where $1_{n\times 1} = [1, ..., 1]^T$ and $0_{m\times 1} = [0, ..., 0]^T$.

B. Main loop
In this section we explain how to set up an iterative method that allows us to get close to the optimal point, following a path along the interior of the feasible region. The original references are [1, 30]. Begin by defining $X := \mathrm{diag}(x)$ and $S := \mathrm{diag}(s)$. Define also $\mathcal{F}_h$ as the set of feasible points of (HLP), $v = (y, x, \tau, \theta, s, k)$, and $\mathcal{F}_h^0 \subset \mathcal{F}_h$ as those such that $(x, \tau, s, k) > 0$. Finally, define the following central path in (HLP),

$$\mathcal{C} = \left\{ (y, x, \tau, \theta, s, k) \in \mathcal{F}_h^0 : \begin{pmatrix} X s \\ \tau k \end{pmatrix} = \frac{x^T s + \tau k}{n+1}\, 1_{(n+1)\times 1} \right\}, \qquad (13)$$

and its neighbourhood

$$\mathcal{N}(\beta) = \left\{ (y, x, \tau, \theta, s, k) \in \mathcal{F}_h^0 : \left\| \begin{pmatrix} X s \\ \tau k \end{pmatrix} - \mu\, 1_{(n+1)\times 1} \right\| \le \beta\mu, \text{ where } \mu = \frac{x^T s + \tau k}{n+1} \right\}. \qquad (14)$$

Then, theorem 5 of [1] ensures that the central path lies in the feasibility region of (HLP).

In consequence, the algorithm proceeds as follows: start from an interior feasible point $v^0 = (y^0, x^0, \tau^0, \theta^0, s^0, k^0) \in \mathcal{F}_h^0$. Then, recursively, form the following system of equations for the variables $d_v = (d_y, d_x, d_\tau, d_\theta, d_s, d_k)$ and $t = 0, 1, 2, ... \in \mathbb{N}$:

$$\begin{aligned}
A\, d_x - b\, d_\tau + \bar{b}\, d_\theta &= 0\\
-A^T d_y + c\, d_\tau - \bar{c}\, d_\theta - d_s &= 0\\
b^T d_y - c^T d_x + \bar{z}\, d_\theta - d_k &= 0\\
-\bar{b}^T d_y + \bar{c}^T d_x - \bar{z}\, d_\tau &= 0
\end{aligned} \qquad (15a)$$

$$\begin{pmatrix} X_t d_s + S_t d_x \\ \tau_t d_k + k_t d_\tau \end{pmatrix} = \gamma_t \mu_t\, 1_{(n+1)\times 1} - \begin{pmatrix} X_t s_t \\ \tau_t k_t \end{pmatrix}, \qquad (15b)$$

where $\gamma_t$ takes values 0 and 1 for even and odd steps respectively, starting at $t = 0$. The linear system of equations can be written in matrix form as $M_t d_v^t = f_t$, i.e.

$$\begin{pmatrix}
0 & A & -b & \bar{b} & 0 & 0 \\
-A^T & 0 & c & -\bar{c} & -I & 0 \\
b^T & -c^T & 0 & \bar{z} & 0 & -1 \\
-\bar{b}^T & \bar{c}^T & -\bar{z} & 0 & 0 & 0 \\
0 & S_t & 0 & 0 & X_t & 0 \\
0 & 0 & k_t & 0 & 0 & \tau_t
\end{pmatrix}
\begin{pmatrix} d_y \\ d_x \\ d_\tau \\ d_\theta \\ d_s \\ d_k \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ \gamma_t\mu_t 1_{n\times 1} - X_t s_t \\ \gamma_t\mu_t - \tau_t k_t \end{pmatrix}, \qquad (16)$$

where the blocks have sizes $m$, $n$, $1$, $1$, $n$ and $1$. So, the main loop of the algorithm consists in performing the following steps iteratively:

Predictor step:
Solve (16) with $\gamma_t = 0$ for $d_v^t$, where $v^t = (y^t, x^t, \tau^t, \theta^t, s^t, k^t) \in \mathcal{N}(1/4)$, and find the largest step length $\delta$ such that

$$v^{t+1} = v^t + \delta\, d_v^t \qquad (17)$$

is in $\mathcal{N}(1/2)$. Set $t \leftarrow t + 1$.

Corrector step:
Solve (16) with $\gamma_t = 1$ and set

$$v^{t+1} = v^t + d_v^t, \qquad (18)$$

which will be back in $\mathcal{N}(1/4)$. Set $t \leftarrow t + 1$.

C. Termination
Define a strictly self-complementary solution of (HLP), $v^* = (y^*, x^*, \tau^*, \theta^* = 0, s^*, k^*)$, as an optimal solution to (HLP) that fulfills

$$\begin{pmatrix} x^* + s^* \\ \tau^* + k^* \end{pmatrix} > 0. \qquad (19)$$

Theorem 3 in [1] tells us that if we have a strictly self-complementary solution to (HLP), then a solution to (LP) and (LD) exists whenever $\tau^* > 0$, in which case $x^*/\tau^*$ and $(y^*/\tau^*, s^*/\tau^*)$ are the respective solutions. On the other hand, if $\tau^* = 0$, at least one of two things will happen: $c^T x^* < 0$ or $-b^T y^* < 0$, indicating that (LD) or (LP), respectively, is infeasible.

The loop is iterated over $t$ until one of the following two criteria is fulfilled: for $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ small numbers, either

$$(x^t/\tau^t)^T(s^t/\tau^t) \le \epsilon_1 \;\text{ and }\; (\theta^t/\tau^t)\,\|(\bar{b}^T, \bar{c}^T)\| \le \epsilon_2, \qquad (20a)$$

or

$$\tau^t \le \epsilon_3. \qquad (20b)$$

We can see that the two inequalities in (20a) are related to the dual gap being 0 and to $\theta^* = 0$ (necessary conditions for the solution to be optimal), supposing $\tau^* > 0$. Equation (20b) is the procedure to detect $\tau^* = 0$. $\epsilon_1$ and $\epsilon_2$ should therefore be chosen taking into account the precision we are seeking in the optimality of the solution, and the error our calculations will have. In particular, $\epsilon_1$ and $\epsilon_2$ can be taken to be the target error of the algorithm, $\epsilon$. To get to this point we will have to iterate up to $O(L\bar{t}\sqrt{n})$ times, with $\bar{t} = \max[\log((x^0)^T(s^0)/(\epsilon_1\epsilon_2)),\; \log(\|(\bar{b}^T, \bar{c}^T)\|/(\epsilon_2\epsilon_3))]$.

If the termination is due to condition (20b), then we know that there is no solution fulfilling $\|(x^T, s^T)\| \le 1/(2\epsilon_3) - 1$. Therefore one should choose $\epsilon_3$ small enough so that the region we are exploring is reasonable. We will then consider, following [1], that either (LP) or (LD) is infeasible or unbounded.

However, if termination is due to (20a), denote by $\zeta^t$ the index set $\{j \in 1, ..., n : x_j^t \ge s_j^t\}$. Let also $B$ be the columns of $A$ whose index is in $\zeta^t$, and $C$ the rest.

Case 1: If $\tau^t \ge k^t$, solve for $y$, $x_B$ and $\tau$:

$$\min_{y, x_B, \tau} \|y^t - y\|^2 + \|x_B^t - x_B\|^2 + (\tau^t - \tau)^2 \qquad (21a)$$

such that

$$B x_B - b\tau = 0; \quad -B^T y + c_B\tau = 0; \quad b^T y - c_B^T x_B = 0. \qquad (21b)$$

Case 2: If $\tau^t < k^t$, solve for $y$, $x_B$ and $k$:

$$\min_{y, x_B, k} \|y^t - y\|^2 + \|x_B^t - x_B\|^2 + (k^t - k)^2 \qquad (22a)$$

such that

$$B x_B = 0; \quad -B^T y = 0; \quad b^T y - c_B^T x_B - k = 0. \qquad (22b)$$

The result of either of these two calculations will be the output of our algorithm, and the estimate of the solution of the (HLP) problem. In particular, $x^*$ will be the calculated $x_B$ from the least-squares projection together with $x_C = 0$, and $y^*$ will be the calculated $y$, again from the least-squares projection. Calculating the solution to (LP) and (LD) is then straightforward: $x^*/\tau^*$ and $(y^*/\tau^*, s^*/\tau^*)$, respectively.

III. THE QUANTUM ALGORITHM
The aim of this section is to explain how the Quantum Linear System Algorithm (QLSA) can help us run this algorithm efficiently, in the same spirit as, for example, the use of [47] to solve the problem of the Finite Element Method. This is due to the fact that solving (16) is the most computationally expensive part of each step for large matrices. To solve this system we propose leveraging a technique called block encoding or qubitization, introduced in [48]. Using this technique, on which we shall expand further below, we have the following result (theorem 34 from [2]).
Theorem 1 [2]: Let $M$ be an $n' \times n'$ Hermitian matrix (if the matrix is not Hermitian it can be included as a submatrix of a Hermitian one) with condition number $\kappa$, Frobenius norm $\|M\|_F = \sqrt{\sum_{ij} |M_{ij}|^2}$ and spectral norm $\|M\| \le 1$. Let $f$ be an $n'$-dimensional unit vector, and assume that there is an oracle $P_f$ which produces the state $|f\rangle = f/\|f\|$ in time $T_f$. Let also $M$ be encoded in the quantum-accessible data structure indicated in III B. Let

$$d_v = M^{-1} f, \qquad |d\rangle = \frac{d_v}{\|d_v\|}. \qquad (23)$$

Then,

1. We can prepare the state $|d\rangle$ in time complexity

$$\tilde{O}\big((\|M\|_F + T_f)\,\kappa\; \mathrm{poly}\log(n'\epsilon^{-1})\big). \qquad (24)$$

2. We can get an $\epsilon$-multiplicative estimate of $\|M^{-1}|f\rangle\|$ in time

$$\tilde{O}\big((\|M\|_F + T_f)\,\kappa\,\epsilon^{-1}\; \mathrm{poly}\log(n'\epsilon^{-1})\big). \qquad (25)$$

Proof omitted.

In our case the variable $n'$ is the size of the matrix of (16), that is, $n' = 2(m + 2n + 3)$, the 2 coming from the symmetrisation as in the HHL algorithm. For spectral-norm-bounded matrices, $\|M\| \le C$ constant, one has $\|M\|_F = O(\sqrt{n'})$ [43]. Thus, the time complexity of running the algorithm would be $O(\sqrt{n'})$. We are also assuming $m = O(n)$. Notice that since $n$ appears in the number of iterations but $m$ does not, it is convenient to set $m \ge n$ by exchanging the primal and dual problems if needed.

An alternative could be to use a combination of the Hamiltonian simulation for dense matrices of [49] with the Quantum Linear System Algorithm from [50], which would result in slightly different complexity factors. Let us now study how to integrate this algorithm within the Interior-Point algorithm.

A. The condition number $\kappa$

We have seen that the QLSA is linear in $\kappa$. Therefore it is important to check that $\kappa$ is as low as possible. However, preconditioning a dense matrix is much more complicated than preconditioning a sparse one. In fact, we are not aware of any method that allows us to do it without incurring expensive computations in the worst case. For example, the method proposed in [51] is only useful for sparse matrices. Thus, as preconditioning does not seem possible, we might attempt setting an upper bound to $\kappa$ for all steps of the iteration, taking into account that only a small part of the matrix $M_t$ depends on $t$: the entries that depend on $t$ are in the $n+1$ last rows, $2(n+1)$ entries, see (16).
However, if we try to do so, we will see that even though it is possible to upper bound the maximum singular value $\sigma_{\max}(M_t)$ knowing the entries of the last rows, we cannot see a way to lower bound $\sigma_{\min}(M_t)$, so we cannot bound the condition number. In conclusion, we have not been able to bound the condition number from the start, so we have to rely on a rather unknown upper bound $\bar\kappa$. We remark that this shortcoming is common to both our algorithm and the Predictor-Corrector method itself, were we to substitute the QLSA by other iterative methods.

B. Quantum state preparation and quantum-accessible data structure
In order to prepare quantum states there are many options, including [52–54]. However, we are interested here in a method that allows us to prove a quantum advantage for our case. To this end, we introduce the method of [2], which we will additionally need in order to apply the QLSA.
Theorem 2 [2, 46]: Let $M \in \mathbb{R}^{n'\times n'}$ be a matrix. If $w$ is the number of nonzero entries, there is a quantum-accessible data structure of size $O(w \log^2(n'))$, which takes time $O(\log(n'))$ to store or update a single entry. Once the data structure is set up, there are quantum algorithms that can perform the following maps to precision $\epsilon$ in time $O(\mathrm{poly}\log(n'/\epsilon))$:

$$U_M : |i\rangle|0\rangle \to \frac{1}{\|M_{i\cdot}\|}\sum_j M_{ij}\,|ij\rangle; \qquad (26)$$

$$U_N : |0\rangle|j\rangle \to \frac{1}{\|M\|_F}\sum_i \|M_{i\cdot}\|\,|ij\rangle; \qquad (27)$$

where $\|M_{i\cdot}\|$ is the $l_2$-norm of row $i$ of $M$. This means in particular that, given a vector $v$ in this data structure, we can prepare an $\epsilon$-approximation of it, $1/\|v\|\sum_i v_i |i\rangle$, in time $O(\mathrm{poly}\log(n'/\epsilon))$.

Proof: To construct the classical data structure, create $n'$ trees, one for each row of $M$. Then, in leaf $j$ of tree $B_i$ one saves the tuple $(M_{ij}^2, \mathrm{sgn}(M_{ij}))$. Also, intermediate nodes are created (joining nearby branches) so that node $l$ of tree $B_i$ at depth $d$ contains the value

$$B_{i,l} = \sum_{j_1...j_d = l} M_{ij}^2. \qquad (28)$$

Notice that $j_1...j_d$ is a string of values 0 and 1, as is $l$. The root node contains the value $\|M_{i\cdot}\|^2$. An additional tree is created taking the root nodes of all the other trees as its leaves. One can see that the depth of the structure is polylogarithmic in $n'$, and so a single entry of $M$ can be found or updated in time polylogarithmic in $n'$. Now, to apply $U_M$, we perform controlled rotations of the following kind:

$$|i\rangle|l\rangle|0...\rangle \to |i\rangle|l\rangle \frac{1}{\sqrt{B_{i,l}}}\left(\sqrt{B_{i,2l}}\,|0\rangle + \sqrt{B_{i,2l+1}}\,|1\rangle\right)|...\rangle, \qquad (29)$$

except for the last rotation, where the sign of the leaf is included in the coefficients.
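As an illustration (our own sketch, not code from the paper), the per-row tree can be mimicked classically with a heap-ordered array of partial sums of squares; `RowTree` and its method names are ours:

```python
import math

class RowTree:
    """Binary tree over one row of M: leaf j holds M_ij^2 and sgn(M_ij),
    internal node l holds the sum of squares of the leaves below it.
    The root therefore stores ||M_i.||^2, and an update touches O(log n') nodes."""
    def __init__(self, n):
        self.n_leaves = 1 << max(1, math.ceil(math.log2(n)))
        self.sums = [0.0] * (2 * self.n_leaves)   # 1-indexed heap layout
        self.signs = [1.0] * self.n_leaves

    def update(self, j, value):
        """Store entry M_ij = value, refreshing the partial sums on the root path."""
        self.signs[j] = math.copysign(1.0, value) if value else 1.0
        l = self.n_leaves + j
        self.sums[l] = value * value
        l //= 2
        while l >= 1:
            self.sums[l] = self.sums[2 * l] + self.sums[2 * l + 1]
            l //= 2

    def row_norm(self):
        """l2-norm of the stored row, read off the root."""
        return math.sqrt(self.sums[1])

    def branch_probabilities(self, l):
        """Probabilities of the two controlled-rotation branches at node l,
        mirroring (29): B_{i,2l}/B_{i,l} and B_{i,2l+1}/B_{i,l}."""
        tot = self.sums[l]
        return self.sums[2 * l] / tot, self.sums[2 * l + 1] / tot

# usage sketch: store a 4-entry row and read off the root norm
row = RowTree(4)
for j, v in enumerate([3.0, -4.0, 0.0, 12.0]):
    row.update(j, v)
print(row.row_norm())  # 13.0
```

The children of each node give exactly the two squared amplitudes needed for the rotation in (29), which is why a polylogarithmic number of such rotations suffices to prepare the row state.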
It is simple to see that $U_N$ is the same algorithm applied to the last tree, the one that contains $\|M_{i\cdot}\|$ for each $i$. Finally, for a vector we have just one tree, and the procedure is the same.

One may worry about two things. The first is that setting up the database might take too long, since our matrices are dense. However, notice that in $M_t$ only $O(n+m)$ entries depend on $t$, so the rest can be prepared at the beginning of the algorithm with an overall time complexity of $O((n+m)^2)$, up to polylogarithmic factors. This is the same complexity as that of the overall algorithm when the matrices are spectrally bounded. On the other hand, twice per iteration one must update the entries in the last $n+1$ rows of $M_t$ and prepare the data structures for the preparation of the quantum states, which takes time $O(n+m)$; this is fine, since the work complexity in $n+m$ is the same as that needed to read out the result, so it adds no complexity to the result, and it has to be done just once for each linear system of equations. Finally, preparing the states $|f\rangle$ themselves comes at a polylogarithmic cost in both $n+m$ and $\epsilon$, so we do not need to worry about it. This quantum state preparation protocol seems particularly useful when we want to solve the same linear system of equations multiple times in order to read out the entire solution.

C. Readout of the solution of the QLSA.
In the same way that we need a procedure to prepare the quantum state that feeds into the QLSA, we need some way to read out the information in $|d\rangle$, defined as in equation (23). This is a tomography problem where the solution is a pure state with real amplitudes. Thus, one may be inclined to use Amplitude Estimation to read each of the entries. However, this has the problem that the entries of large normalized vectors will be small, making it costly to determine each amplitude with the additive precision provided by Amplitude Estimation. So, instead of that procedure, we propose using the tomography algorithm explained in section 4 of [34]. The complexity of the tomography is $O(n'\epsilon^{-2})$. Theorem 4.3 in [34] proves that if $|\tilde{d}\rangle$ is the output of the tomography algorithm 1, then $\| |\tilde{d}\rangle - |d\rangle \| \le \sqrt{7}\epsilon$ with probability at least $1 - 1/\mathrm{poly}(n')$. Notice that theorem 1 allows us to recover the norm of the solution. The global sign of the solution can be recovered by multiplying a single row of $A$ with the solution and comparing the result against the corresponding entry of $f_t$.

D. Block encoding and Quantum Linear System Algorithm (QLSA)
In this subsection let us briefly review some of the results that are needed for the Quantum Linear System Algorithm. In the first place, let us review what we mean by a block encoding (definition 3 in [2]).
Definition 1. Block encoding [2]: Given an $s$-qubit operator $A$, the unitary operator $U$ is an $(a, \alpha, \epsilon)$-block encoding of $A$ when

$$\|A - \alpha(\langle 0|^{\otimes a} \otimes I)\, U\, (|0\rangle^{\otimes a} \otimes I)\| \le \epsilon. \qquad (31)$$

That means that one can write

$$U = \begin{pmatrix} A/\alpha & \cdot \\ \cdot & \cdot \end{pmatrix}. \qquad (32)$$

Block encodings are useful because they efficiently allow for a variety of linear algebra techniques, including multiplication of block encodings, exponentiation to positive and negative powers, and Hamiltonian simulation. For more information we refer the reader to [2, 55].

Algorithm 1: Vector tomography [34]

procedure Vector tomography

Amplitude estimation:
1. Prepare $N = O(n'\log n'/\epsilon^2)$ copies of $|d\rangle$ and measure them, obtaining $n_i$ times the basis state $|i\rangle$. Define the amplitudes $\sqrt{p_i} = \sqrt{n_i/N}$.
2. Save the vector $|p\rangle = \sum_i \sqrt{p_i}\,|i\rangle$ in the quantum-accessible data structure from theorem 2.

Sign estimation (Swap Test):
1. Create $N = O(n'\log n'/\epsilon^2)$ copies of the state $\frac{1}{\sqrt{2}}|0\rangle|d\rangle + \frac{1}{\sqrt{2}}|1\rangle|p\rangle$.
2. Apply a Hadamard gate to the control qubit of each copy of the state in the previous step. This prepares

$$\frac{1}{2}\sum_i \left[(d_i + \sqrt{p_i})|0, i\rangle + (d_i - \sqrt{p_i})|1, i\rangle\right]. \qquad (30)$$

3. Measure the states in the computational basis, obtaining frequencies $n(b, i)$, where $b = 0, 1$. Set the sign $\sigma_i = +1$ if $n(0, i) > 0.4\, p_i N$; else, $\sigma_i = -1$.

Output the classical vector with entries $\sigma_i\sqrt{p_i}$.

Another important pair of algorithms we are using is Variable Time Amplitude Amplification [56] and Variable Time Amplitude Estimation [2]. These algorithms were developed to reduce the quadratic complexity in the condition number $\kappa$ of the original HHL algorithm [42]. In that algorithm the quadratic complexity appears because of the $\kappa$ complexity of phase estimation and the $\kappa$ cost of amplitude amplification (or estimation). The insight of [56] was to realise that for the eigenvalues $\lambda$ for which phase estimation is costly ($\lambda$ small) the cost of amplitude amplification is low, and vice versa. Thus, he proposed variable-time algorithms, which allow stopping some branches of the algorithm before others.
Since the more expensive branches require less Amplitude Amplification, they stop earlier. The result is a reduction of the cost to linear in $\kappa$. For more information on this complex algorithm we refer the reader to those references. These two elements are the key ideas needed to obtain the result of theorem 1.

E. On quantizing the termination.
If a feasible and optimal solution exists, we have seen that the loop should terminate with either of the procedures (21) or (22). However, it is unclear how to carry out this minimization. What we know is that it can be efficiently calculated, or substituted by any more modern and efficient method if one is found. The cost of carrying out this termination by classical procedures should not be too large: in fact, according to [57], the overall cost is around that of one iteration of the main loop.

However, we can also propose a quantum method to finish. It would consist of using a small Grover subroutine [19] to find all solutions of (21b) or (22b) in a small neighbourhood of the latest calculated point. After that, without reading out the state, one could apply the algorithm described in [58] to find the solution with the smallest distance to the calculated point, as in (21a) or (22a).
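The termination subroutine just sketched, a Grover search over nearby feasible points followed by quantum minimum finding [58], can be emulated classically. In the sketch below (all names are ours) each quantum search below the current threshold is replaced by uniform sampling from the set of better candidates, which reproduces the logic, though of course not the quadratic speed-up.

```python
import numpy as np

rng = np.random.default_rng(3)

def minimum_finding(candidates, dist):
    """Classical emulation of quantum minimum finding over the solutions of
    (21b)/(22b) found by Grover search near the last iterate.

    candidates : list of candidate points; dist(v) : distance to the iterate.
    """
    best = int(rng.integers(len(candidates)))   # random initial threshold
    while True:
        marked = [i for i in range(len(candidates))
                  if dist(candidates[i]) < dist(candidates[best])]
        if not marked:                          # no better candidate left
            return candidates[best]
        best = int(rng.choice(marked))          # result of one Grover search

v_t = np.zeros(2)                               # latest calculated point
pts = [np.array([1.0, 1.0]), np.array([0.2, 0.1]), np.array([2.0, -1.0])]
closest = minimum_finding(pts, lambda v: float(np.linalg.norm(v - v_t)))
print(closest)
```

The loop always terminates at the candidate of minimum distance, since until the threshold reaches it, the marked set is non-empty and contains it.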
F. Complexity
In this section we compare the complexity of our algorithm against other algorithms that can be used to solve Linear Programming problems. In particular, in Table I we compare against the same Predictor-Corrector algorithm using an iterative Classical Linear System Algorithm (conjugate gradient descent [31]), against exact classical methods (such as Gaussian elimination or Cholesky decomposition [32], or the optimal exact algorithm [33]), and against the recent quantum algorithms proposed by Brandão and Svore [22] and by Kerenidis and Prakash [34] for solving Semidefinite Programming problems, a more general class of problems than those studied here (Linear Programming).

Firstly, we must take into account that, since we are using the Predictor-Corrector algorithm [1], by construction there are O(√n L) iterations of the main loop. For dense problems (those we are considering), we should also take into account the complexity of solving two Linear Systems of Equations per iteration. The QLSA we use is described in [2], with complexity O(||M||_F κ̄ polylog((n+m)/ε)). In contrast, the fastest comparable Classical Linear System Algorithm is the conjugate gradient method [31], which has time complexity O((n+m)² κ̄ log(ε⁻¹)) for dense matrices that are not necessarily symmetric and positive definite.

But we also have to take into account other procedures. The preparation of quantum states has work complexity O(n+m) if we include the preparation of the classical data structure, and the tomography requires O((n+m)ε⁻²) repetitions, multiplied by the complexity of the QLSA.

In general, our algorithm has a runtime of O(L√n(n+m)||M||_F κ̄ ε⁻²), where each iteration comes at a cost of O((n+m)||M||_F κ̄ ε⁻²), up to polylogarithmic terms.
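To make the comparison concrete, the two leading-order runtimes just quoted can be tabulated numerically; this is a sketch with polylogarithmic factors dropped and all constants set to 1, and the function names are ours.

```python
import math

def quantum_pc(n, m, L, frob, kappa, eps):
    """Quantum Predictor-Corrector: O(L sqrt(n) (n+m) ||M||_F kappa / eps^2)."""
    return L * math.sqrt(n) * (n + m) * frob * kappa / eps**2

def classical_pc_cg(n, m, L, kappa, eps):
    """Same outer loop solving the systems with conjugate gradient descent:
    O(L sqrt(n) (n+m)^2 kappa log(1/eps))."""
    return L * math.sqrt(n) * (n + m) ** 2 * kappa * math.log(1 / eps)

# For a large dense problem with modest precision and bounded ||M||_F and
# kappa, the quantum runtime is smaller by roughly a factor (n+m):
n = m = 10**6
print(quantum_pc(n, m, L=10, frob=10, kappa=100, eps=0.1)
      < classical_pc_cg(n, m, L=10, kappa=100, eps=0.1))   # True
```

The crossover depends on ε: since the quantum runtime carries ε⁻² while the classical one only carries log(ε⁻¹), the advantage in n is eroded if very high precision is requested.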
Almost all of this is quantum work: the only classical operations we perform are multiplying the vectors needed to find δ in the Predictor step, and recalculating and updating the data structure for M_t and f_t in each round. In the next section, III G, we will see that sometimes a small number of steps of gradient descent will be necessary after the Corrector steps, with cost O((n+m) polylog(n+m)), which we already had. Finally, an additional inexpensive classical postprocessing step after the Predictor step will be needed to ensure the same convergence guarantees as the original Predictor-Corrector algorithm.

It is also worth mentioning that, thanks to the de-quantization algorithm of Ewin Tang [59], it is possible to solve linear systems of equations (and therefore run our Interior-Point algorithm) classically with work complexity polynomial in ||M||_F, the rank k, the condition number κ and ε⁻¹. This classical algorithm is therefore only useful if the rank k of the matrix is low compared to O(n+m) [60, 61]. Notice, however, that we have made no assumption about the rank of the matrix A, and for Linear Programming problems we do not expect it to be low in general.

G. Probability of failure
Finally, we want to analyze the probability of failure of the algorithm. The reason is that the classical Predictor-Corrector algorithm assumes exact arithmetic, whereas we have to take care of the error ε. What is more, the time complexity of the algorithm is quadratic in the associated precision ε⁻¹, so it is computationally expensive to reach high precision. The algorithm may fail because we leave the neighbourhood N(β) in one of the steps. Let us now analyze whether this is in fact possible.

In the Predictor steps failure is not possible. This is because we are moving from N(1/4) to N(1/2), and we classically calculate δ such that this step is performed correctly. Therefore there is no chance of failure here, since in the worst case we can always find δ small enough such that in (17) v_{t+1} ∈ N(1/2) if v_t ∈ N(1/4).

The Corrector steps, which should take us from N(1/2) back to N(1/4), are more delicate. The complexity of the quantum subroutine is quadratic, O(ε′⁻²), in its error parameter ε′ (which forces us to work with a loose error), so there is the possibility of a Corrector step ending up outside N(1/4). A naive solution would be to demand an ε′ small enough that the output is guaranteed to lie in N(1/4), which would require ε′⁻¹ = O(n polylog n). The reason for the n dependence of the precision is that the error ε′ affects each coordinate of x_t or s_t, and there are O(n) of them.

To solve this problem let us introduce a different strategy. Choose a low precision for the Corrector step, ε′ = O(n^{−1/a}), with a sufficiently large. This is 'cheap' in terms of the complexity of the quantum subroutine for a relatively large. We will then shift the result slightly (by an amount ε′′), but in such a way that the new point fulfils ||Xs − μ1|| ≤ βμ, thereby introducing an error of size ε′′ in each coordinate of the result. To do that we can perform one step of gradient descent (or a small number of them), so that each coordinate of the output of the Corrector step is shifted by at most ε′′.

How large is that ε′′? Intuitively, if the error of the Corrector step is at most ε′ in each coordinate, then ε′′ = O(ε′), which we already had anyway. As we calculate in Appendix B this is in fact true, and performing gradient descent with such an ε′′ is sufficient to end up inside N(1/4) up to precision ε. Finally, note that this procedure has time complexity O(n+m), as calculated in Appendix B, so it adds no additional complexity to the algorithm. In Appendix C we also see that, imposing (C9), which has no complexity consequences, our modification does not affect the convergence of the algorithm, and so the number of iterations is the same as in the original Predictor-Corrector.

The conclusion of the above is that, in order to comply both with the requirement of a low complexity in ε′ for the quantum subroutine and with ending up in N(1/4) at the end of the Corrector step, we can O(ε′)-shift the output of the quantum subroutine to make it comply with the second requirement while ε′ remains relatively large. The only condition this imposes is that the final precision we ask of our algorithm, ε, cannot be finer than the one induced by the error ε′ that the Corrector step and the 'shifting' procedure have in the worst case; in other words, ε ≥ O(ε′) = O(n^{−1/a}). This should pose no problem, since in general one does not ask for a greater precision in each variable for a larger problem, which means that ε = O(1) in the variable n. In other words, even if the target error ε for each variable is small, it has a priori no relation with the number of variables n or constraints m. It is reasonable to assume this since one is usually interested in a given precision per variable, independent of their number.

Another way our algorithm could fail is that, as we saw, in each step the tomography has a failure probability of 1/(3 + 2(n+m))^{0.83}. Let us analyze whether that is a problem. First notice that after each step we can easily check that the solution is correct up to ε precision. For example, using the quantum-accessible data structure we can prepare the trial solution and perform a swap test to check that it is ε-close to the actual solution. The swap test may be combined with Amplitude Estimation and the median lemma from [62] to yield an exponentially small failure probability. This means that if we perform c attempts at each step, the probability of failure in each iteration decreases to 1/(3 + 2(n+m))^{0.83c}.

So we have to choose a small constant c such that after O(√n L) iterations the total probability of failure fulfils (√n L)/(3 + 2(n+m))^{0.83c} ≪ 1. Since in many cases the number of iterations will be O(√n log n), taking c = 1 should do; but in general L = O(mn), so we can take c = 4 > 2.5/0.83. In conclusion, the probability of failure of the tomography algorithm should pose no problem.

Finally, in order to decrease the final error of the algorithm even further, in practice it could be a good idea to perform a small (constant) number of iterations classically (doing so will not affect the theoretical complexity). Theoretically, though, this gives no guarantee of success, since the number of additional iterations would be O(√n L (t̄_{ε_a} − t̄_{ε_b})) when we want to lower the error from ε_b to ε_a. Even if the difference is small, there is a dependence on √n that would increase the complexity of the algorithm, making it comparable to the complexity of classical algorithms.

We already explained in Section II C that the number of iterations should be t̄ = max[log((x⁰)ᵀ(s⁰)/ε²), log(||(b̄ᵀ, c̄ᵀ)||/ε²)]. In particular, if one supposes every entry of x⁰, s⁰, b and c to be of order O(1), then t̄_a = log(O(n+1)/ε_a), and similarly for t̄_b and ε_b. This would mean that, to lower the error from ε_b to ε_a without increasing the complexity, one would need

t̄_a − t̄_b = log((n+1)/ε_a) − log((n+1)/ε_b) = log(ε_b/ε_a) = O(n^{−1/2}),   (33)

which implies that

ε_a = ε_b e^{−1/√n}.   (34)

Since lim_{n→∞} e^{−1/√n} = 1, we would need the final error ε_a to be very similar to the one we achieve with the quantum subroutine, ε_b. Or, with the notation we are using: ε_a = O(ε_b) = O(n^{−1/a}).

To summarize all the components of our quantum Predictor-Corrector algorithm and the interrelations among them, we show a diagram in Fig. 1 in the form of a flow chart of actions, from the initialization to the termination of the quantum algorithm providing the solution to the given (LP) and (LD) problems in (2) and (3).

IV. CONCLUSIONS
Quantization of Linear Programming problems has thus far been achieved using multiplicative weight methods, as in the pioneering work of Brandão and Svore for Semidefinite Programming (SDP) problems [22], which are more general than Linear Programming problems. In this work, we have enlarged the range of applicability of quantum algorithms for Linear Programming problems by using Interior Point methods instead. Specifically, our quantum algorithm relies on a type of Interior Point algorithm known as the Predictor-Corrector method, which is very well behaved with respect to feasibility, the optimality conditions of the output solution, and the iteration complexity.

The core of our quantum Interior Point algorithm is the application of block-encoding techniques in a Quantum Linear System Algorithm [2] to an auxiliary system of equations that comprises a homogeneous self-dual primal-dual problem associated to the original Linear Programming problem.

FIG. 1. Flow chart of the algorithm.

FIG. 2. Scheme of the algorithm.

This is the basis of the Predictor-Corrector method, from which many of its good properties derive. In particular, the iteration complexity of the classical part scales as the square root of the size n of the cost function. The advantage of the quantum part of the Predictor-Corrector algorithm then amounts to a faster solution of the linear systems of equations, with complexity O((n+m)√(n+m)) including the readout process, as can be seen in Table I.

Hence, this quantum Predictor-Corrector algorithm is a hybrid algorithm, partially classical and partially quantum. Applying the QLSA is not an easy task if we want to achieve a clear advantage. These algorithms come with several shortcomings, some of which have recently been overcome [51] for sparse linear systems. Also, even though the solution to the system of linear equations can be obtained as a quantum state, it is not easy to extract all the information contained in it: one usually has to be satisfied with partial information from the encoded solution, such as an expectation value of interest or a single entry of the solution vector. Nevertheless, this does not stop us from obtaining a polynomial quantum advantage in the number of variables n of the problem if the matrix is dense and well-conditioned, with m = O(n) and a constant-bounded spectral norm.

ACKNOWLEDGEMENTS
We acknowledge financial support from the Spanish MINECO grants FIS2015-67411P, and the CAM research consortium QUITEMAD-CM, Grant No. S2018/TCS-4342. The research of M.A.M.-D. has been partially supported by the U.S. Army Research Office through Grant No. W911NF-14-1-0103. P.A.M.C. thanks the support of a FPU MECD Grant.

[1] Y. Ye, M. J. Todd, and S. Mizuno, "An O(√nL)-iteration homogeneous and self-dual linear programming algorithm," Mathematics of Operations Research, vol. 19, no. 1, pp. 53-67, 1994.
[2] S. Chakraborty, A. Gilyén, and S. Jeffery, "The power of block-encoded matrix powers: improved regression techniques via faster Hamiltonian simulation," arXiv preprint arXiv:1804.01973, 2018.
[3] E. D. Nering and A. W. Tucker, Linear Programs & Related Problems: A Volume in the Computer Science and Scientific Computing Series. Elsevier, 1992.
[4] M. Padberg, Linear Optimization and Extensions, vol. 12. Springer Science & Business Media, 2013.
[5] K. G. Murty, Linear Programming, vol. 60. Wiley, New York, 1983.
[6] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Pearson Education Limited, 2016.
[7] M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of Machine Learning. MIT Press, 2012.
[8] L. Vandenberghe and S. Boyd, "Semidefinite programming," SIAM Review, vol. 38, no. 1, pp. 49-95, 1996.
[9] M. J. Todd, "Semidefinite optimization," Acta Numerica, vol. 10, pp. 515-560, 2001.
[10] M. Laurent and F. Rendl, Semidefinite Programming and Integer Programming. Centrum voor Wiskunde en Informatica, 2002.
[11] E. de Klerk, Aspects of Semidefinite Programming: Interior Point Algorithms and Selected Applications, vol. 65. Springer Science & Business Media, 2006.
[12] J. Preskill, "Quantum computing in the NISQ era and beyond," arXiv preprint arXiv:1801.00862, 2018.
[13] D. Nigg, M. Mueller, E. A. Martinez, P. Schindler, M. Hennrich, T. Monz, M. A. Martin-Delgado, and R. Blatt, "Quantum computations on a topologically encoded qubit," Science, p. 1253742, 2014.
[14] R. Barends, J. Kelly, A. Megrant, A. Veitia, D. Sank, E. Jeffrey, T. C. White, J. Mutus, A. G. Fowler, B. Campbell, et al., "Superconducting quantum circuits at the surface code threshold for fault tolerance," Nature, vol. 508, no. 7497, p. 500, 2014.
[15] A. D. Córcoles, E. Magesan, S. J. Srinivasan, A. W. Cross, M. Steffen, J. M. Gambetta, and J. M. Chow, "Demonstration of a quantum error detection code using a square lattice of four superconducting qubits," Nature Communications, vol. 6, p. 6979, 2015.
[16] J. Preskill, "Quantum computing and the entanglement frontier," arXiv preprint arXiv:1203.5813, 2012.
[17] S. Aaronson and A. Arkhipov, "The computational complexity of linear optics," in Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, pp. 333-342, ACM, 2011.
[18] P. W. Shor, "Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer," SIAM Review, vol. 41, no. 2, pp. 303-332, 1999.
[19] L. K. Grover, "Quantum mechanics helps in searching for a needle in a haystack," Physical Review Letters, vol. 79, no. 2, p. 325, 1997.
[20] M. A. Nielsen and I. Chuang, Quantum Computation and Quantum Information. Cambridge, Cambridge University Press, 2000.
[21] A. Galindo and M. A. Martin-Delgado, "Information and computation: Classical and quantum aspects," Reviews of Modern Physics, vol. 74, no. 2, p. 347, 2002.
[22] F. G. Brandão and K. M. Svore, "Quantum speed-ups for solving semidefinite programs," in Foundations of Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pp. 415-426, IEEE, 2017.
[23] F. G. Brandão, A. Kalev, T. Li, C. Y.-Y. Lin, K. M. Svore, and X. Wu, "Exponential quantum speed-ups for semidefinite programming with applications to quantum learning," arXiv preprint arXiv:1710.02581, 2017.
[24] J. van Apeldoorn, A. Gilyén, S. Gribling, and R. de Wolf, "Quantum SDP-solvers: Better upper and lower bounds," in Foundations of Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pp. 403-414, IEEE, 2017.
[25] J. van Apeldoorn and A. Gilyén, "Improvements in quantum SDP-solving with applications," arXiv preprint arXiv:1804.05058, 2018.
[26] S. Chakrabarti, A. M. Childs, T. Li, and X. Wu, "Quantum algorithms and lower bounds for convex optimization," arXiv preprint arXiv:1809.01731, 2018.
[27] L. G. Khachiyan, "A polynomial algorithm in linear programming," in Doklady Akademii Nauk SSSR, vol. 244, pp. 1093-1096, 1979.
[28] N. Karmarkar, "A new polynomial-time algorithm for linear programming," in Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, pp. 302-311, ACM, 1984.
[29] F. A. Potra and S. J. Wright, "Interior-point methods," Journal of Computational and Applied Mathematics, vol. 124, no. 1-2, pp. 281-302, 2000.
[30] S. Mizuno, M. J. Todd, and Y. Ye, "On adaptive-step primal-dual interior-point algorithms for linear programming," Mathematics of Operations Research, vol. 18, no. 4, pp. 964-981, 1993.
[31] J. R. Shewchuk et al., "An introduction to the conjugate gradient method without the agonizing pain," 1994.
[32] W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T. Vetterling, et al., Numerical Recipes, vol. 2. Cambridge University Press, 1989.
[33] V. Pan and J. Reif, "Fast and efficient parallel solution of dense linear systems," Computers & Mathematics with Applications, vol. 17, no. 11, pp. 1481-1491, 1989.
[34] I. Kerenidis and A. Prakash, "A quantum interior point method for LPs and SDPs," arXiv preprint arXiv:1808.09266, 2018.
[35] G. Nannicini, "Fast quantum subroutines for the simplex method," arXiv preprint arXiv:1910.10649, 2019.
[36] K. M. Anstreicher, J. Ji, F. A. Potra, and Y. Ye, "Average performance of a self-dual interior point algorithm for linear programming," in Complexity in Numerical Optimization, pp. 1-15, World Scientific, 1993.
[37] A. Aspuru-Guzik, A. D. Dutoi, P. J. Love, and M. Head-Gordon, "Simulated quantum computation of molecular energies," Science, vol. 309, no. 5741, pp. 1704-1707, 2005.
[38] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, "Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets," Nature, vol. 549, no. 7671, p. 242, 2017.
[39] M.-H. Yung, J. Casanova, A. Mezzacapo, J. McClean, L. Lamata, A. Aspuru-Guzik, and E. Solano, "From transistor to trapped-ion computers for quantum chemistry," Scientific Reports, vol. 4, p. 3589, 2014.
[40] C. Hempel, C. Maier, J. Romero, J. McClean, T. Monz, H. Shen, P. Jurcevic, B. Lanyon, P. Love, R. Babbush, et al., "Quantum chemistry calculations on a trapped-ion quantum simulator," arXiv preprint arXiv:1803.10238, 2018.
[41] Y. Cao, J. Romero, J. P. Olson, M. Degroote, P. D. Johnson, M. Kieferová, I. D. Kivlichan, T. Menke, B. Peropadre, N. P. Sawaya, et al., "Quantum chemistry in the age of quantum computing," arXiv preprint arXiv:1812.09976, 2018.
[42] A. W. Harrow, A. Hassidim, and S. Lloyd, "Quantum algorithm for linear systems of equations," Physical Review Letters, vol. 103, no. 15, p. 150502, 2009.
[43] L. Wossnig, Z. Zhao, and A. Prakash, "Quantum linear system algorithm for dense matrices," Physical Review Letters, vol. 120, no. 5, p. 050502, 2018.
[44] D. Coppersmith and S. Winograd, "Matrix multiplication via arithmetic progressions," Journal of Symbolic Computation, vol. 9, no. 3, pp. 251-280, 1990.
[45] S. Arora and S. Kale, "A combinatorial, primal-dual approach to semidefinite programs," in Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pp. 227-236, ACM, 2007.
[46] I. Kerenidis and A. Prakash, "Quantum recommendation systems," in Proceedings of the 8th Innovations in Theoretical Computer Science Conference, 2017.
[47] A. Montanaro and S. Pallister, "Quantum algorithms and the finite element method," Physical Review A, vol. 93, no. 3, p. 032324, 2016.
[48] G. H. Low and I. L. Chuang, "Hamiltonian simulation by qubitization," Quantum, vol. 3, p. 163, 2019.
[49] C. Wang and L. Wossnig, "A quantum algorithm for simulating non-sparse Hamiltonians," arXiv preprint arXiv:1803.08273, 2018.
[50] A. M. Childs, R. Kothari, and R. D. Somma, "Quantum algorithm for systems of linear equations with exponentially improved dependence on precision," SIAM Journal on Computing, vol. 46, no. 6, pp. 1920-1950, 2017.
[51] B. D. Clader, B. C. Jacobs, and C. R. Sprouse, "Preconditioned quantum linear system algorithm," Physical Review Letters, vol. 110, no. 25, p. 250504, 2013.
[52] V. V. Shende, S. S. Bullock, and I. L. Markov, "Synthesis of quantum-logic circuits," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 6, pp. 1000-1010, 2006.
[53] L. Grover and T. Rudolph, "Creating superpositions that correspond to efficiently integrable probability distributions," arXiv preprint quant-ph/0208112, 2002.
[54] Y. R. Sanders, G. H. Low, A. Scherer, and D. W. Berry, "Black-box quantum state preparation without arithmetic," Physical Review Letters, vol. 122, no. 2, p. 020502, 2019.
[55] A. Gilyén, Y. Su, G. H. Low, and N. Wiebe, "Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics," arXiv preprint arXiv:1806.01838, 2018.
[56] A. Ambainis, "Variable time amplitude amplification and quantum algorithms for linear algebra problems," in STACS'12 (29th Symposium on Theoretical Aspects of Computer Science), vol. 14, pp. 636-647, LIPIcs, 2012.
[57] Y. Ye, "On the finite convergence of interior-point algorithms for linear programming," Mathematical Programming, vol. 57, no. 1-3, pp. 325-335, 1992.
[58] L. A. B. Kowada, C. Lavor, R. Portugal, and C. M. De Figueiredo, "A new quantum algorithm for solving the minimum searching problem," International Journal of Quantum Information, vol. 6, no. 03, pp. 427-436, 2008.
[59] E. Tang, "A quantum-inspired classical algorithm for recommendation systems," arXiv preprint arXiv:1807.04271, 2018.
[60] N.-H. Chia, H.-H. Lin, and C. Wang, "Quantum-inspired sublinear classical algorithms for solving low-rank linear systems," arXiv preprint arXiv:1811.04852, 2018.
[61] J. M. Arrazola, A. Delgado, B. R. Bardhan, and S. Lloyd, "Quantum-inspired algorithms in practice," arXiv preprint arXiv:1905.10415, 2019.
[62] D. Nagaj, P. Wocjan, and Y. Zhang, "Fast amplification of QMA," arXiv preprint arXiv:0904.1549, 2009.
Appendix A: Calculation of the error ε′ in the Corrector step

In this appendix we want to calculate the size of the error ε′ we need in order to make the Corrector step successfully output a point within N(1/4). Define x as the concatenation of x_t and τ_t, and s as the concatenation of s_t and k_t. Call x_0 and s_0 the exact arithmetic solutions, so that, ε′ being the error of the quantum subroutine,

x = x_0 + ε′x_1;  s = s_0 + ε′s_1;  ||s_1||², ||x_1||² ≤ n + 1,   (A1)

since each entry will have an error of at most ε′, and each entry of x_1 and s_1 is then, say, at most 1.

Using (14), we can see that

||X_0 s_0 − (x_0ᵀs_0/(n+1)) 1|| ≤ (1/(4√2)) x_0ᵀs_0/(n+1),   (A2)

and we want to calculate how small ε′ needs to be in order to comply with

||X s − (xᵀs/(n+1)) 1|| ≤ (1/4) xᵀs/(n+1).   (A3)

Expanding to leading order in ε′,

||X s − (xᵀs/(n+1)) 1||
= ||X_0 s_0 + ε′(X_1 s_0 + X_0 s_1) − [x_0ᵀs_0 + ε′(x_1ᵀs_0 + x_0ᵀs_1)]/(n+1) 1||
≤ ||X_0 s_0 − (x_0ᵀs_0/(n+1)) 1|| + ε′||X_1 s_0 − (x_1ᵀs_0/(n+1)) 1|| + ε′||X_0 s_1 − (x_0ᵀs_1/(n+1)) 1||.   (A4-A9)

The first term is bounded by our hypothesis (A2), so let us calculate one of the other two terms (calculating one is the same as calculating the other, as they are symmetrical). Let x_i = (x̄_{i,1}, ..., x̄_{i,n+1})ᵀ for i ∈ {0, 1}, and similarly for s_i, and let us expand the expression:

||X_1 s_0 − (x_1ᵀs_0/(n+1)) 1||   (A10)
= [ Σ_{i=1}^{n+1} ( x̄_{1,i} s̄_{0,i} − (1/(n+1)) Σ_{j=1}^{n+1} x̄_{1,j} s̄_{0,j} )² ]^{1/2}   (A11)
= [ Σ_i x̄²_{1,i} s̄²_{0,i} − (2/(n+1)) Σ_{i,j} x̄_{1,i} s̄_{0,i} x̄_{1,j} s̄_{0,j} + ((n+1)/(n+1)²) Σ_{i,j} x̄_{1,i} s̄_{0,i} x̄_{1,j} s̄_{0,j} ]^{1/2}.   (A12-A15)

Now we can see that the second term partially cancels out with the third:

||X_1 s_0 − (x_1ᵀs_0/(n+1)) 1|| = [ ||X_1 s_0||² − (x_1ᵀs_0)²/(n+1) ]^{1/2}.   (A16-A17)

Therefore, we can write

||X s − (xᵀs/(n+1)) 1|| ≤ (1/(4√2)) x_0ᵀs_0/(n+1)
+ ε′ ( [||X_1 s_0||² − (x_1ᵀs_0)²/(n+1)]^{1/2} + [||X_0 s_1||² − (x_0ᵀs_1)²/(n+1)]^{1/2} ).   (A18-A20)

Enforcing (A3) can be done if we choose ε′ such that

(1/(4√2)) x_0ᵀs_0/(n+1) + ε′( [||X_1 s_0||² − (x_1ᵀs_0)²/(n+1)]^{1/2} + [||X_0 s_1||² − (x_0ᵀs_1)²/(n+1)]^{1/2} )
≤ (1/4) xᵀs/(n+1) = (1/4) (x_0 + ε′x_1)ᵀ(s_0 + ε′s_1)/(n+1).   (A21-A23)

To leading order in ε′ that means

(1/(4√2)) x_0ᵀs_0/(n+1) + ε′( [||X_1 s_0||² − (x_1ᵀs_0)²/(n+1)]^{1/2} + [||X_0 s_1||² − (x_0ᵀs_1)²/(n+1)]^{1/2} )
≤ (1/4) x_0ᵀs_0/(n+1) + (ε′/4)(x_1ᵀs_0 + x_0ᵀs_1)/(n+1),   (A24-A26)

or equivalently

((2−√2)/8) x_0ᵀs_0/(n+1) ≥ ε′( [||X_1 s_0||² − (x_1ᵀs_0)²/(n+1)]^{1/2} + [||X_0 s_1||² − (x_0ᵀs_1)²/(n+1)]^{1/2} − (1/4)(x_1ᵀs_0 + x_0ᵀs_1)/(n+1) ),   (A27-A29)

implying

ε′ ≤ ((2−√2)/8) (x_0ᵀs_0/(n+1)) / ( [||X_1 s_0||² − (x_1ᵀs_0)²/(n+1)]^{1/2} + [||X_0 s_1||² − (x_0ᵀs_1)²/(n+1)]^{1/2} − (1/4)(x_1ᵀs_0 + x_0ᵀs_1)/(n+1) ).   (A30)

Let us start by analyzing the denominator. The worst case is that the denominator is very large, forcing the error to be small. The fraction (x_1ᵀs_0 + x_0ᵀs_1)/(n+1) could turn negative, thereby increasing the denominator, but its influence is low due to its denominator n+1: due to (A1), we can expect x_1ᵀs_0 ≤ (n+1) max_i |s̄_{0,i}| = O(n) and x_0ᵀs_1 ≤ (n+1) max_i |x̄_{0,i}| = O(n), so (x_1ᵀs_0 + x_0ᵀs_1)/(n+1) = O(1). Additionally, if all entries of x_0 and s_0 are O(1), it is easy to check that ||X_1 s_0||² = O(n) and ||X_0 s_1||² = O(n). Therefore the denominator will be O(n^{1/2}).

The numerator of (A30) is a bit trickier; here the worst case is when it is small. What seems the biggest problem is that [1] tells us that in the exact solution x_0ᵀs_0 = 0, and it is furthermore divided by n+1. Therefore at the exact solution ε′ would have to be 0, which is what we should expect; what matters is the convergence speed. At the beginning, recall that x⁰ = s⁰ = [1, ..., 1]ᵀ, so at the very beginning x_0ᵀs_0/(n+1) = 1. According to [1], every two iterations x_0ᵀs_0 should decrease at a rate (1 − 1/(2√2 √(n+1))).
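As a numerical illustration of this convergence speed, the following sketch (assuming the per-pair-of-iterations shrink factor (1 − 1/(2√2 √(n+1))) quoted above, whose exact constant depends on [1]; the helper name is ours) shows that after O(L√n t̄) iterations the duality measure approaches a limit that no longer depends on n:

```python
import math

def mu_after(n, L, t_bar):
    """x0^T s0 / (n+1) after O(L sqrt(n) t_bar) iterations, each pair of
    iterations shrinking it by the factor (1 - 1/(2 sqrt(2) sqrt(n+1)))."""
    shrink = 1.0 - 1.0 / (2.0 * math.sqrt(2.0) * math.sqrt(n + 1))
    return shrink ** (L * math.sqrt(n) * t_bar)

# The O(sqrt(n)) iteration count compensates the O(1/sqrt(n)) shrink per
# iteration, so the product tends to exp(-L * t_bar / (2 sqrt(2))):
for n in (10**3, 10**6):
    print(mu_after(n, L=10, t_bar=5))
print(math.exp(-10 * 5 / (2 * math.sqrt(2))))
```

This is the mechanism behind (A31) below: the n-dependence cancels in the exponent, leaving a factor e^{−O(L t̄)}.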
The question is therefore what happens after the O(L√n t̄) iterations of the algorithm. This finally gives us the threshold for ε′ we would like in order to stay inside N(1/4):

ε′ ≤ lim_{n→∞} O(n^{−1/2}) (1 − 1/(2√2 √(n+1)))^{O(L√n t̄)} = O(n^{−1/2}) e^{−O(L t̄)}.   (A31)

Since L = log n in the usual case and t̄ = max[log((x⁰)ᵀ(s⁰)/ε²), log(||(b̄ᵀ, c̄ᵀ)||/ε²)], the error threshold is ε′ = O(n^{−1/2} n^{−polylog n}).

On the other hand, the complexity of the algorithm in the precision is O(ε⁻²), and the difference in complexity in n that we stated in Table I between our algorithm and the 'best classical' algorithm is O(√n). This means that in order to maintain the quantum advantage one would need ε⁻² ≤ O(n^{1/2}), that is, ε ≥ O(n^{−1/4}). Therefore we have seen that setting an ε′ small enough might in general be too expensive computationally for the quantum subroutine.

Appendix B: Gradient descent for shifting the output of the Corrector step
We have seen that setting an ε′ small enough might be too expensive computationally. Therefore let us try something else: choose a low precision on the error for the Corrector step, ε′ = O(n^{−1/a}), and then, once we get the result, perform one gradient-descent step of size ε′′ towards N(1/4). From the definition of N(β) in (14), define

g(x, s) = ||X s − (xᵀs/(n+1)) 1||² − β² (xᵀs/(n+1))².   (B1)

Using the equivalent of (A16), we can rewrite it as

g(x, s) = ||X s − μ1||² − β²μ²   (B2)
= Σ_i x̄²_i s̄²_i − (1/(n+1)) Σ_{i,j=1}^{n+1} x̄_i s̄_i x̄_j s̄_j − (β²/(n+1)²) Σ_{i,j=1}^{n+1} x̄_i s̄_i x̄_j s̄_j   (B3-B4)
= Σ_i x̄²_i s̄²_i − ((β² + (n+1))/(n+1)²) Σ_{i,j=1}^{n+1} x̄_i s̄_i x̄_j s̄_j.   (B5)

Notice that the points in N(β) are exactly those for which g(x, s) ≤ 0. We calculate the gradient, calling B = (β² + (n+1))/(n+1)² = O(n⁻¹):

dg(x, s)/dx̄_k = 2 x̄_k s̄²_k − 2B ( Σ_i x̄_i s̄_i ) s̄_k.   (B6)

Clearly,

dg(x, s)/ds̄_k = 2 s̄_k x̄²_k − 2B ( Σ_i s̄_i x̄_i ) x̄_k.   (B7)

The idea is now, supposing that the exact point satisfies

||X_0 s_0 − (x_0ᵀs_0/(n+1)) 1|| ≤ β (x_0ᵀs_0/(n+1)),   (B8)

to find ε′′ = O(n^{−1/a}) such that

||X s − ε′′(X dg/ds + S dg/dx) − (1/(n+1)) ( xᵀs − ε′′(xᵀ dg/ds + sᵀ dg/dx) ) 1||²
≤ (β²/(n+1)²) ( xᵀs − ε′′(xᵀ dg/ds + sᵀ dg/dx) )².   (B9-B11)

For that, as it will be very useful, first calculate to leading order in ε′′

Σ_i (x̄_i − ε′′ dg/dx̄_i)(s̄_i − ε′′ dg/ds̄_i) = Σ_i x̄_i s̄_i − 2ε′′ Σ_i (s̄²_i + x̄²_i)( s̄_i x̄_i − B Σ_j x̄_j s̄_j ).   (B12)

We can also write, to leading order in ε′′, the value of ||X s||² at the new point:

Σ_i (x̄_i − ε′′ dg/dx̄_i)² (s̄_i − ε′′ dg/ds̄_i)² = Σ_i x̄²_i s̄²_i − 4ε′′ Σ_i x̄_i s̄_i (s̄²_i + x̄²_i)( s̄_i x̄_i − B Σ_j x̄_j s̄_j ).   (B13)
(B13)

The next thing we want to calculate, also to leading order $O(\epsilon'')$, is the new value of $(x^T s)^2$:

\sum_{i,j} \left( \bar{s}_i - \epsilon'' \frac{dg}{d\bar{s}_i} \right) \left( \bar{x}_i - \epsilon'' \frac{dg}{d\bar{x}_i} \right) \left( \bar{s}_j - \epsilon'' \frac{dg}{d\bar{s}_j} \right) \left( \bar{x}_j - \epsilon'' \frac{dg}{d\bar{x}_j} \right) = \sum_{i,j} \bar{x}_i \bar{x}_j \bar{s}_i \bar{s}_j - 4\epsilon'' \sum_{i,j} \bar{x}_j \bar{s}_j (\bar{s}_i^2 + \bar{x}_i^2)\left( \bar{x}_i \bar{s}_i - B \sum_k \bar{x}_k \bar{s}_k \right).   (B14)

We now have all the parts we need to calculate the result we were seeking. Let us check that we can make

g\left( x - \epsilon'' \frac{dg}{dx},\, s - \epsilon'' \frac{dg}{ds} \right) \le 0.   (B15)

Expanding,

g\left( x - \epsilon'' \frac{dg}{dx},\, s - \epsilon'' \frac{dg}{ds} \right) = \sum_i \bar{x}_i^2 \bar{s}_i^2 - B \sum_{i,j=1}^{n+1} \bar{x}_i \bar{s}_i \bar{x}_j \bar{s}_j - 4\epsilon'' \left[ \sum_i \bar{x}_i \bar{s}_i (\bar{s}_i^2 + \bar{x}_i^2)\left( \bar{x}_i \bar{s}_i - B \sum_j \bar{x}_j \bar{s}_j \right) - B \sum_{i,j} \bar{x}_j \bar{s}_j (\bar{s}_i^2 + \bar{x}_i^2)\left( \bar{x}_i \bar{s}_i - B \sum_k \bar{x}_k \bar{s}_k \right) \right].   (B16)

This allows us to calculate the approximate $\epsilon''$ in time $O(n^2)$ such that the point is in $\mathcal{N}(1/4)$. But is this $\epsilon''$ too large? To answer that question expand in $x = x_0 + \epsilon' x_1$ and $s = s_0 + \epsilon' s_1$:

g\left( x - \epsilon'' \frac{dg}{dx},\, s - \epsilon'' \frac{dg}{ds} \right) = \sum_i \bar{x}_{i,0}^2 \bar{s}_{i,0}^2 - B \sum_{i,j=1}^{n+1} \bar{x}_{i,0} \bar{s}_{i,0} \bar{x}_{j,0} \bar{s}_{j,0} + 2\epsilon' \left[ \sum_i \left( \bar{x}_{i,1} \bar{x}_{i,0} \bar{s}_{i,0}^2 + \bar{x}_{i,0}^2 \bar{s}_{i,0} \bar{s}_{i,1} \right) - B \sum_{i,j} \left( \bar{x}_{i,1} \bar{s}_{i,0} \bar{x}_{j,0} \bar{s}_{j,0} + \bar{x}_{i,0} \bar{s}_{i,1} \bar{x}_{j,0} \bar{s}_{j,0} \right) \right] - 4\epsilon'' \left[ \sum_i \bar{x}_i \bar{s}_i (\bar{s}_i^2 + \bar{x}_i^2)\left( \bar{x}_i \bar{s}_i - B \sum_j \bar{x}_j \bar{s}_j \right) - B \sum_{i,j} \bar{x}_j \bar{s}_j (\bar{s}_i^2 + \bar{x}_i^2)\left( \bar{x}_i \bar{s}_i - B \sum_k \bar{x}_k \bar{s}_k \right) \right].
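A minimal sketch of how $\epsilon''$ could be obtained classically from (B16): to leading order $g$ decreases by $4\epsilon'' C$, where $C$ is the bracketed quantity, so $\epsilon'' = g/(4C)$ cancels $g$ to first order. The test point below is an arbitrary perturbation of a perfectly centred pair, not an iterate of the actual algorithm, and the helper name is hypothetical.

```python
import numpy as np

def g(x, s, beta):
    n1 = len(x)
    mu = x @ s / n1
    return np.sum((x * s - mu) ** 2) - beta ** 2 * mu ** 2

def eps2_leading_order(x, s, beta):
    """Leading-order step size from (B16): the shifted point has
    g_new ~ g - 4*eps''*C, so eps'' = g / (4*C) cancels g to first order."""
    n1 = len(x)
    B = (beta ** 2 + n1) / n1 ** 2
    t = x @ s
    w = (s ** 2 + x ** 2) * (x * s - B * t)   # common factor in both sums of (B16)
    C = np.sum(x * s * w) - B * t * np.sum(w)
    return g(x, s, beta) / (4 * C)

beta = 0.25
x = np.ones(6); s = np.ones(6)
x[0] += 0.2; s[1] -= 0.2                      # perturb a centred point out of N(1/4)
B = (beta ** 2 + 6) / 36
t = x @ s
gx = 2 * x * s ** 2 - 2 * B * t * s           # gradient (B6)
gs = 2 * s * x ** 2 - 2 * B * t * x           # gradient (B7)
e2 = eps2_leading_order(x, s, beta)
print(g(x, s, beta) > 0)                                  # True: outside N(1/4)
print(g(x - e2 * gx, s - e2 * gs, beta) < g(x, s, beta))  # True: one step decreases g
```

Higher-order terms mean one step need not land exactly on $g = 0$, matching the discussion below of repeating the shift.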
(B17)

We know by hypothesis that the second line of (B17) is less than 0 (the exact point is in $\mathcal{N}(1/4)$), so we can choose the $\epsilon''$ term to cancel that of $\epsilon'$. Notice that this means that $\epsilon'' = O(\epsilon')$, since both terms in the square brackets are of size $O(n)$. Therefore we are only introducing an error of size $O(\epsilon')$, which is what we already had, but we can now ensure that the output is in $\mathcal{N}(1/4)$: we choose $\epsilon''$ such that, to order $O(\epsilon'')$, we are inside $\mathcal{N}(1/4)$, with $\epsilon'' = O(\epsilon')$.

Since in the expansion we have only considered terms up to $\epsilon''$, there is the possibility that after this shift we are still outside $\mathcal{N}(1/4)$, although at most $O(\epsilon')$ away from it. In that case the procedure can be repeated, each iteration $i$ leaving a residual distance $O(\epsilon''_i)$ which decreases quickly with $i$. Alternatively one could take into account further terms with higher exponents of $\epsilon''$ in the expansions (B12), (B13) and (B14).

Further, one could also wonder whether the gradient step could converge towards local minima of $g(x,s)$ different from those in the central path. However, it is easy to check using (B6) and (B7) that all local minima (or maxima or saddle points) fulfil either $x_i = 0$ and $s_i = 0$, or

x_i s_i - B \sum_j x_j s_j = 0   (B18)

for all $i \in \{1, \ldots, n+1\}$. Substituting those points in the definition of $g(x,s)$ (B5), and noting that at such points all the products $x_i s_i$ are equal, so that $(\sum_i x_i s_i)^2 = (n+1) \sum_i x_i^2 s_i^2$,

g = \sum_{i=1}^{n+1} x_i^2 s_i^2 - \frac{\beta^2 + (n+1)}{(n+1)^2} (n+1) \sum_{i=1}^{n+1} x_i^2 s_i^2 = \left( -\frac{\beta^2}{n+1} \right) \sum_{i=1}^{n+1} x_i^2 s_i^2 \le 0.   (B19)

This means that all those points are within $\mathcal{N}(\beta)$, so the gradient step will be taken towards $\mathcal{N}(\beta)$. With this we conclude that with an $O(\epsilon')$-shift to the output of the already $O(\epsilon')$-precise corrector step we can ensure that the point is in $\mathcal{N}(1/4)$.

Appendix C: Convergence of the algorithm
The legitimacy of the previous procedure rests on the fact that we only have to check that we do not get out of the neighbourhood of the central path. However, a question remains: does the algorithm still converge, with this modification, in $O(\sqrt{n} L)$ steps? In this appendix we analyse this question and find a positive answer. The references we will be using are [30], mainly, and [1].

Let us first review how convergence is studied in the original article [1]. Convergence happens when the duality gap $\mu$ closes. In those references, one proves that the corrector step leaves the gap unchanged, $\mu^{k+1} = \mu^k$, while the gap actually closes in the predictor step, where $\mu^{k+1} = (1-\delta)\mu^k$. Let us see these two things.

For convenience define $x^t$ concatenating $x^t$ and $\tau^t$; $s^t$ as the concatenation of $s^t$ and $\kappa^t$; and similarly concatenate $d_x$ with $d_\tau$, and $d_s$ with $d_\kappa$. The starting equation is

\mu^{t+1}(\delta) = \frac{(x^{t+1})^T s^{t+1}}{n+1} = \frac{(x^t)^T s^t + \delta \left( (x^t)^T d_s + (s^t)^T d_x \right) + \delta^2 d_x^T d_s}{n+1}.   (C1)

Since the exact solution fulfils equation (15b), we would have

X^t d_s + S^t d_x = \gamma^t \mu^t \mathbf{1} - X^t s^t.   (C2)

In practice, since we get an approximate solution, we can substitute the term multiplied by $\delta$ in (C1) with $\gamma^t \mu^t (n+1) - (x^t)^T s^t + (x^t)^T \epsilon_s + (s^t)^T \epsilon_x$, where $\epsilon_x$ and $\epsilon_s$ are the error vectors in the estimation of $d_x$ and $d_s$ respectively.

On the other hand, using equation (15a) for the exact solution, and the second part of theorem 5 of [1], we know that $(d_x)^T d_s = 0$. In our case, though, the equation will not be exact due to errors, and thus the term multiplied by $\delta^2$ in (C1) can be substituted with $d_x^T \epsilon_s + d_s^T \epsilon_x$. Therefore, in the approximate case, we can rewrite (C1) for the corrector step ($\delta = 1$, $\gamma^t = 1$) as

\mu^{t+1} = \frac{(x^t)^T s^t + \left( -(x^t)^T s^t + (x^t)^T \epsilon_s + (s^t)^T \epsilon_x \right)}{n+1} + \frac{d_x^T \epsilon_s + d_s^T \epsilon_x}{n+1} + \mu^t.   (C3)

From this it follows that in the worst case

\mu^{t+1} = \mu^t + O(\epsilon),   (C4)

where $\epsilon$ is the norm of each of the entries of $\epsilon_x$ and $\epsilon_s$, if the entries of $x^t$, $s^t$, $d_x$ and $d_s$ are $O(1)$.

The predictor step is a bit more involved, since $\delta$ is not fixed. The calculation, though, is similar, and we get

\mu^{t+1}(\delta) = (1-\delta)\mu^t + (1+\delta) O(\epsilon),   (C5)

where the term $(1-\delta)\mu^t$ is the exact value we would recover, and $O(\epsilon)$ the part due to the error.
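The expansion (C1) is an exact algebraic identity and can be checked numerically, as can the fact that an exact corrector direction (an entrywise solution of (15b) with $\gamma^t = 1$) changes $\mu$ only through the $d_x^T d_s$ term, which is the error-free content of (C3). A small sketch with random toy data:

```python
import numpy as np

rng = np.random.default_rng(2)
n1 = 8
x = rng.uniform(0.5, 1.5, n1)
s = rng.uniform(0.5, 1.5, n1)
d_x = rng.normal(size=n1)
d_s = rng.normal(size=n1)
delta = 0.3

# (C1): expansion of the duality gap after a step of length delta
lhs = (x + delta * d_x) @ (s + delta * d_s) / n1
rhs = (x @ s + delta * (x @ d_s + s @ d_x) + delta ** 2 * (d_x @ d_s)) / n1
print(np.isclose(lhs, rhs))  # True: (C1) is an identity

# Corrector step (delta = 1, gamma^t = 1): an exact entrywise solution of
# X d_s + S d_x = mu*1 - X s leaves mu^{t+1} = mu^t + d_x^T d_s / (n+1).
mu = x @ s / n1
d_s_exact = (mu - x * s - s * d_x) / x
mu_next = (x + d_x) @ (s + d_s_exact) / n1
print(np.isclose(mu_next, mu + d_x @ d_s_exact / n1))  # True
```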
The question that remains is: how large can we make $\delta$ in the predictor step? To answer it, remember that the condition for the predictor step to end up in $\mathcal{N}(1/2)$ is

\| X^{t+1}(\delta) s^{t+1}(\delta) - \mu^{t+1}(\delta) \mathbf{1} \| \le \frac{1}{2} \mu^{t+1}(\delta).   (C6)

We have already seen that the right-hand side of (C6) is $O(\epsilon)$-approximate to the exact $\mu^{t+1}$. The left-hand side, on the other hand, is

\| X^{t+1}(\delta) s^{t+1}(\delta) - \mu^{t+1}(\delta) \mathbf{1} \| = \left\| (X^t s^t - \mu^t \mathbf{1}) + \delta \left( X^t d_s + S^t d_x - \frac{(x^t)^T d_s + (s^t)^T d_x}{n+1} \mathbf{1} \right) + \delta^2 \left( D_x d_s - \frac{d_x^T d_s}{n+1} \mathbf{1} \right) \right\|,   (C7)

where $D_x$ is the diagonal matrix corresponding to $d_x$. In order to make this work, we will enforce a correction to make sure that (15b) is fulfilled exactly, not only approximately. This means that we have to make $X^t d_s + S^t d_x = -X^t s^t$ (recall that $\gamma^t = 0$ for the predictor step). Let us expand it with the errors:

X^t (d_s + \epsilon_s) + S^t (d_x + \epsilon_x) = -X^t s^t.   (C8)

The terms not containing either of the error vectors cancel out. To enforce that the previous equation is fulfilled exactly, we go one by one over the entries of $x^t$ and $s^t$ (complexity $O(n+1)$) and pick the larger of the two. If the modulus of the entry of $x^t$ is larger, then subtract $\epsilon'$ from $d_{s,i}$ such that

x_i^t (d_{s,i} + \epsilon_{s,i} - \epsilon') + s_i^t (d_{x,i} + \epsilon_{x,i}) = -x_i^t s_i^t \iff x_i^t (\epsilon_{s,i} - \epsilon') + s_i^t \epsilon_{x,i} = 0 \iff \epsilon' = \epsilon_{s,i} + \frac{s_i^t}{x_i^t} \epsilon_{x,i}.   (C9)

Clearly $\epsilon' = O(\epsilon)$, so the introduced error will be of the size of the original one. We proceed equivalently when the modulus of the entry of $s^t$ is larger than that of $x^t$. If both entries are very small, then no correction is necessary.
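The entrywise correction (C9) is straightforward classically. The sketch below (hypothetical helper name `enforce_15b`, toy data) corrects noisy directions so that the predictor version of (15b), $X d_s + S d_x = -Xs$, holds exactly, while shifting each entry by only $O(\epsilon)$.

```python
import numpy as np

def enforce_15b(x, s, dx, ds):
    """Entrywise correction (C9): shift whichever of d_{s,i}, d_{x,i} is multiplied
    by the larger of |x_i|, |s_i|, so that x_i d_{s,i} + s_i d_{x,i} = -x_i s_i
    holds exactly. Dividing by the larger entry keeps each shift eps' = O(eps)."""
    dx, ds = dx.copy(), ds.copy()
    resid = x * ds + s * dx + x * s          # zero for the exact predictor direction
    for i in range(len(x)):
        if abs(x[i]) >= abs(s[i]):
            ds[i] -= resid[i] / x[i]         # eps' = eps_{s,i} + (s_i/x_i) eps_{x,i}
        else:
            dx[i] -= resid[i] / s[i]
    return dx, ds

rng = np.random.default_rng(3)
n1 = 6
x = rng.uniform(0.5, 1.5, n1)
s = rng.uniform(0.5, 1.5, n1)
dx_exact = rng.normal(size=n1)
ds_exact = (-x * s - s * dx_exact) / x       # exact direction, entrywise
eps = 1e-5
dx_noisy = dx_exact + eps * rng.normal(size=n1)
ds_noisy = ds_exact + eps * rng.normal(size=n1)

dx_c, ds_c = enforce_15b(x, s, dx_noisy, ds_noisy)
print(np.allclose(x * ds_c + s * dx_c, -x * s))                          # (15b) now exact
print(np.max(np.abs(np.r_[dx_c - dx_noisy, ds_c - ds_noisy])) < 100 * eps)  # O(eps) shift
```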
Then we rewrite (C7):

\| X^{t+1}(\delta) s^{t+1}(\delta) - \mu^{t+1}(\delta) \mathbf{1} \| = \left\| (X^t s^t - \mu^t \mathbf{1}) + \delta \left( -X^t s^t + \mu^t \mathbf{1} \right) + \delta^2 \left( D_x d_s - \frac{d_x^T \epsilon_s + d_s^T \epsilon_x}{n+1} \mathbf{1} \right) \right\| \le \| (1-\delta)(X^t s^t - \mu^t \mathbf{1}) \| + \delta^2 \| D_x d_s \| + \delta^2 O(\epsilon).   (C10)

For the next steps we follow the procedure indicated in lemma 4 and theorem 1 of [30]. According to them, one chooses $\delta$ such that

\delta^2 \| D_x d_s \| \le \mu^t / 8.   (C11)

Recall that in the last line of (C10), by hypothesis the first term is smaller than $(1-\delta)\mu^t/4$, because the previous step was a corrector step. With the $\delta$ from (C11), (C6) is fulfilled, as proved in lemma 4 of [30]. Then we must calculate the size of $\| D_x d_s \|$. Using the same methods as theorem 1 of [30], we get

\| D_x d_s \| \le \frac{\sqrt{2}}{4} \left( (n+1)\mu^t + \| \bar{\epsilon} \|_1 \right),   (C12)
for $\bar{\epsilon}$ a vector that also has entries of size $O(\epsilon)$; thus $\| \bar{\epsilon} \|_1 = (n+1) O(\epsilon)$. Using (C11) we deduce that $\delta \ge 2^{-3/4} (n+1)^{-1/2} \sqrt{\mu^t / (\mu^t + O(\epsilon))}$. The fact that $\delta \ge O(n^{-1/2})$ is key to obtaining a number of steps that grows as $O(n^{1/2})$, as explained in theorem 1 of [30] and theorem 6 of [1]. Consequently, the step size in the predictor step enjoys almost the same convergence guarantees as in the exact-arithmetic case. Notice that the only caveat is that we will not be able to reduce the duality gap below $\mu = O(\epsilon)$, but that is fine. This concludes our algorithm.

Appendix D: Overall structure of the algorithm

1. Initialization
The initialization procedure consists of preparing the matrix $M$ and the state $f$.

Algorithm 2
Quantum interior point algorithm initialization. procedure Initialization Problem:
Solve the following pair of dual problems:

minimize c^T x, subject to Ax \ge b, x \ge 0,   (D1)

and

maximize b^T y, subject to A^T y \le c.   (D2)

Input:
Sparse matrix $A \in \mathbb{R}^{m \times n}$, sparse vector $c \in \mathbb{R}^n$, vector $b \in \mathbb{R}^m$. Output:
Primal and dual solutions $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$, or a signal that the problem is infeasible. Initialization:
Form the matrix (16). Define $\tau_0 = \kappa_0 = \theta_0 = 1$. Set $x_0 = s_0 = \mathbf{1}_{n \times 1}$ and $y_0 = \mathbf{0}_{m \times 1}$. Calculate $\bar{z}$. Calculate $\bar{b}$ and $\bar{c}$. Set $t = 0$. Create the quantum-accessible classical data structure for $M$.
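The classical bookkeeping of this initialization can be sketched as follows. The definitions of $\bar{b}$, $\bar{c}$ and $\bar{z}$ below follow the standard self-dual-embedding expressions and are meant only as an illustration; the precise quantities are those of [1] and of equation (16), which we do not reproduce here.

```python
import numpy as np

def initialize(A, b, c):
    """Classical bookkeeping of the initialization step of Algorithm 2."""
    m, n = A.shape
    x = np.ones(n)                 # x_0 = s_0 = 1, y_0 = 0
    s = np.ones(n)
    y = np.zeros(m)
    tau = kappa = theta = 1.0      # tau_0 = kappa_0 = theta_0 = 1
    # Initial-infeasibility residuals of the self-dual embedding (illustrative
    # standard expressions; the exact definitions are those of [1]):
    b_bar = b - A @ x
    c_bar = c - A.T @ y - s
    z_bar = c @ x - b @ y + 1.0
    return x, s, y, tau, kappa, theta, b_bar, c_bar, z_bar

A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([1.0, 1.0])
c = np.array([1.0, 1.0])
x, s, y, tau, kappa, theta, b_bar, c_bar, z_bar = initialize(A, b, c)
print(tau == kappa == theta == 1.0 and np.all(x == 1) and np.all(s == 1) and np.all(y == 0))
```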
2. Termination
In the termination we propose one possible way of using Grover's algorithm to implement the termination procedure explained in [1]. Any other classical termination is also possible.
Algorithm 3
Quantum interior point algorithm termination. procedure Termination
In this section we propose a termination technique using the Grover search algorithm [19] and the quantum minimum-finding algorithm [58] to find the optimal solution. We suppose the search space is small enough to allow for this 'brute force' search without affecting the complexity class of the main loop. This technique can nevertheless be substituted by any other efficient classical termination.
if termination of algorithm 4 was due to the 2nd criterion then
(2) or (3) do not have feasible solutions such that $\|(x^{*T}, s^{*T})\| \le 1/(2\epsilon_3) - 1$. The problem is infeasible or unbounded. Check feasibility with the latest available step.
if termination of algorithm 4 was due to the 1st criterion then
if $\tau^t \ge \kappa^t$ then
Use the Grover search algorithm [19] to find all possible solutions to (21b), without reading them out.
Use the minimum-finding algorithm [58] to find the minimum over the possible states.
if $\tau^t < \kappa^t$ then
Use the Grover search algorithm [19] to find all possible solutions to (22b), without reading them out.
Use the minimum-finding algorithm [58] to find the minimum over the possible states.
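As a classical stand-in for this Grover-based termination, one can brute-force a small candidate set of basic solutions and keep the feasible one of minimum cost; Grover search plus quantum minimum finding performs the same search with quadratically fewer oracle queries. The sketch below (hypothetical helper name, toy problem) only fixes the semantics of the search oracle and is not guaranteed to enumerate every vertex of a general polyhedron.

```python
import itertools
import numpy as np

def brute_force_termination(A, b, c, tol=1e-9):
    """Scan candidate basic solutions of min c^T x s.t. Ax >= b, x >= 0 and
    return the feasible candidate of minimum cost (classical 'Grover oracle')."""
    m, n = A.shape
    best, best_val = None, np.inf
    for cols in itertools.combinations(range(n), m):   # candidate bases
        B = A[:, cols]
        if abs(np.linalg.det(B)) < tol:
            continue                                   # singular basis, skip
        xB = np.linalg.solve(B, b)
        if np.any(xB < -tol):
            continue                                   # violates x >= 0
        x = np.zeros(n)
        x[list(cols)] = xB
        if np.all(A @ x >= b - tol) and c @ x < best_val:
            best, best_val = x, c @ x
    return best, best_val

# min x0 + x1  s.t.  x0 + x1 >= 1, x0 >= 0.2, x >= 0; optimum value 1.0
A = np.array([[1.0, 1.0], [1.0, 0.0]])
b = np.array([1.0, 0.2])
c = np.array([1.0, 1.0])
x_opt, val = brute_force_termination(A, b, c)
print(np.isclose(val, 1.0))  # True, e.g. at x = (0.2, 0.8)
```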
3. Main loop
The main loop consists of two steps, called predictor and corrector. Their structure is very similar:
1. Update the data structures for $f^t$ and $M^t$.
2. Prepare $|f\rangle$ and solve $M|d\rangle = |f\rangle$ with the QLSA.
3. Read $|d\rangle \rightarrow d$ and calculate the new vector $v^{t+1} = (y^{t+1}, x^{t+1}, \tau^{t+1}, \theta^{t+1}, s^{t+1}, \kappa^{t+1})$.

Algorithm 4
Quantum interior point algorithm loop. procedure Main Loop
Main loop:
Loop $O(L\sqrt{n})$ times over $t$ until one of the following two criteria is fulfilled, for $\epsilon_1, \epsilon_2, \epsilon_3$ small numbers:
1. $(x^t/\tau^t)^T (s^t/\tau^t) \le \epsilon_1$ and $(\theta^t/\tau^t) \|(\bar{b}^T, \bar{c}^T)\| \le \epsilon_2$.
2. $\tau^t \le \epsilon_3$.
We will have to iterate $O(\bar{t})$ times, with $\bar{t} = \max[\log((x^0)^T s^0 / (\epsilon_1 \epsilon_3)), \log(\|(\bar{b}^T, \bar{c}^T)\| / (\epsilon_2 \epsilon_3))]$.
Update of the data structures: Update the data structures that save $M^t$ and $f^t$ with $\gamma^t = 0$.
Predictor step:
Use theorem 1 as a QLSA to solve (16) $O((n+m)\epsilon^{-2})$ times.
Read the results using the tomography algorithm 1.
Check the classical answer against the quantum state using the swap test. If it fails, iterate the two previous steps.
Rescale the vector using the norm of the solution obtained via theorem 1.
Correct the global sign by multiplying a row of $A$ with the solution and comparing it against one of the entries of $f^t$.
For each coordinate $i$, shift $d_{x,i}$ (respectively $d_{s,i}$) by $\epsilon'$ when $x_i > s_i$ (respectively $s_i > x_i$) so that (15b) is fulfilled exactly.
Use binary search to find the $\delta$ that keeps (17) inside $\mathcal{N}(1/2)$.
Calculate the values of $(y^{t+1}, x^{t+1}, \tau^{t+1}, \theta^{t+1}, s^{t+1}, \kappa^{t+1})$ using (17). $t \leftarrow t + 1$.
Update of the data structures:
Update the data structures that save $M^t$ and $f^t$ with $\gamma^t = 1$. Corrector step:
Use theorem 1 as a QLSA to solve (16) $O((n+m)\epsilon^{-2})$ times. Read the results using the tomography algorithm 1.
Check the classical answer against the quantum state using the swap test. If it fails, iterate the two previous steps.
Rescale the vector using the norm of the solution obtained via theorem 1.
Correct the global sign by multiplying a row of $A$ with the solution and comparing it against one of the entries of $f^t$.
Calculate the values of $(y^{t+1}, x^{t+1}, \tau^{t+1}, \theta^{t+1}, s^{t+1}, \kappa^{t+1})$ using (18).
If the new point $(y^{t+1}, x^{t+1}, \tau^{t+1}, \theta^{t+1}, s^{t+1}, \kappa^{t+1}) \notin \mathcal{N}(1/4)$, use gradient descent with parameter $\epsilon'' = O(\epsilon')$ determined by (B16) to shift it slightly until it is inside $\mathcal{N}(1/4)$.
$t \leftarrow t + 1$.
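The control flow of Algorithm 4 can be sketched classically by letting a dense linear solver stand in for the QLSA, tomography, swap-test and sign/norm-recovery steps. For brevity this sketch runs a Mizuno-Todd-Ye-style predictor-corrector on a small standard-form problem rather than on the self-dual system (16); the neighbourhood parameters 1/2 and 1/4 match the ones used above, and the helper name is hypothetical.

```python
import numpy as np

def mty_loop(A, b, c, x, y, s, tol=1e-8, max_iter=50):
    """Predictor-corrector on min c^T x s.t. Ax = b, x >= 0, started from a
    strictly feasible, centred (x, y, s)."""
    m, n = A.shape
    for _ in range(max_iter):
        mu = x @ s / n
        if mu < tol:
            break
        for gamma, target in [(0.0, 0.5), (1.0, 0.25)]:   # predictor, then corrector
            # Newton system: this is the linear system the QLSA would solve
            M = np.block([[np.zeros((n, n)), A.T, np.eye(n)],
                          [A, np.zeros((m, m)), np.zeros((m, n))],
                          [np.diag(s), np.zeros((n, m)), np.diag(x)]])
            f = np.concatenate([np.zeros(n), np.zeros(m), gamma * mu - x * s])
            d = np.linalg.solve(M, f)
            dx, dy, ds = d[:n], d[n:n + m], d[n + m:]
            if gamma == 0.0:
                # predictor: largest tried step staying positive and in N(1/2)
                delta = 1.0
                while delta > 1e-4:
                    xn, sn = x + delta * dx, s + delta * ds
                    mun = xn @ sn / n
                    if np.all(xn > 0) and np.all(sn > 0) and \
                       np.linalg.norm(xn * sn - mun) <= target * mun:
                        break
                    delta *= 0.9
            else:
                delta = 1.0       # corrector: full step back towards the path
            x, y, s = x + delta * dx, y + delta * dy, s + delta * ds
            mu = x @ s / n
    return x, y, s

# toy problem: min x0 + x1  s.t.  x0 + x1 + x2 = 1 (x2 a slack); optimum value 0
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 1.0, 0.0])
x0 = np.array([0.2, 0.2, 0.6])       # feasible and exactly centred: x0*s0 = 0.3
y0 = np.array([-0.5])
s0 = c - A.T @ y0                    # = [1.5, 1.5, 0.5] > 0
x, y, s = mty_loop(A, b, c, x0, y0, s0)
print(c @ x < 1e-3)  # True: converged to the optimal value 0
```

The gradient-descent fallback of the corrector step is omitted here, since the classical solver is exact and the full corrector step already lands in $\mathcal{N}(1/4)$.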