Scheduling of Operations in Quantum Compiler

Abstract

When scheduling quantum operations, a shorter overall execution time of the resulting schedule yields a better throughput and higher fidelity output. In this paper, we demonstrate that quantum operation scheduling can be interpreted as a special type of job-shop problem. On this basis, we provide its formulation as Constraint Programming while taking into account commutation between quantum operations. We show that this formulation improves the overall execution time of the resulting schedules in practice through experiments with a real quantum compiler and quantum circuits from two common benchmark sets.


Toshinari Itoko
IBM Quantum, IBM Research - Tokyo

Takashi Imamichi
IBM Quantum, IBM Research - Tokyo


I. INTRODUCTION

Although rapid progress in quantum computing device technology has dramatically increased the coherence time of quantum bits (or qubits), the currently available quantum computers remain in the so-called noisy intermediate scale quantum regime [1]. For noisy quantum computers, it is important to schedule the operations on qubits to be as short as possible because this increases the probability of completing all of the operations before any qubit decoheres, thus obtaining computational results with higher fidelity. Even for fault-tolerant quantum computers, shortening the duration of compiled schedules would increase the throughput.

Compilers for quantum computers (or quantum compilers) take a quantum circuit, which is a sequence of quantum operations, as an input program and generate a corresponding sequence of control instructions that are executable on the target hardware. For example, in the case of quantum computers using superconducting qubits, a quantum operation is compiled into several controls (e.g., a microwave pulse) for a certain period of time. In general, any given quantum operation has its own processing time and occupies its acting qubits for the duration as a computational resource. For this reason, scheduling, through which the execution start time of each quantum operation is determined without any overlapping, is an essential task in quantum compilers. We call this task quantum operation scheduling. In this paper, we aim to minimize the overall execution time. In the context of scheduling tasks across multiple resources (qubits, in the case of quantum operation scheduling), the time between the start of the first task and the end of the final task across all resources is known as the makespan of the schedule. Schedule length, overall execution time, and makespan are used interchangeably in this work.

The rough compilation flow considering the quantum operation scheduling task independently is as follows.
1) Gate decomposition: A task decomposes unitary operations called gates with three or more qubits into those with one or two qubits.
2) Local simplification: A task simplifies a specific sequence of gates into one gate (or cancels them out).
3) Qubit routing: A task transforms a given circuit into an equivalent circuit so that all two-qubit gates are executed on limited pairs of qubits (depending on the physical implementation of the quantum computing device).
4) Quantum operation scheduling.
5) Control instruction mapping: A task maps each quantum operation to the corresponding control instructions.

Most of the previous studies on quantum compilers have handled quantum operation scheduling within the context of its before and after tasks, e.g., qubit routing and control instruction mapping [2]–[5]. However, in practice, decomposing an entire compilation job into independent tasks is becoming more common in the software architecture of quantum compilers, similar to that of classical compilers, e.g., [6]. Therefore, we focus on the following research question: How much can we optimize the resulting schedule in quantum operation scheduling by itself?

In this paper, we examine quantum operation scheduling (QOS) and analyze its theoretical properties and practical usefulness. Our main contributions are as follows.
• We show that QOS obtains greater degrees of freedom for optimizing the resulting schedule by further considering the commutativity of particular quantum operations (Section III-A).
• We demonstrate that QOS can be reduced to a special type of job-shop problem that has a disjunctive graph representation (Section III-C) so that we can formulate QOS as Constraint Programming and Mixed Integer Programming, which are common techniques for the job-shop problem or scheduling in general (Section IV).
• We demonstrate through experiments with two common benchmark sets that the consideration of commutativity in QOS reduces the schedule length by up to 7.36% (Section V).

* © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, collecting new collected works for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Full citation of this paper: Toshinari Itoko and Takashi Imamichi. “Scheduling of Operations in Quantum Compiler,” in Proceedings of the International Conference on Quantum Computing and Engineering. IEEE, 2020, pp. 337-344.

II. RELATED WORK

The job-shop problem, also known as job shop scheduling, is a well known optimization problem in computer science and operations research, and many variations of it have been studied [7], [8]. See Section III-B for its definition. Algorithms to solve this problem include exact ones such as branch-and-bound based on a Mixed Integer Programming (MIP) formulation [9], heuristic ones such as shifting bottleneck [10], and meta-heuristic ones such as simulated annealing [11]. In this paper, we mainly focus on the exact algorithm, as we want to determine the effect of optimizing quantum operation scheduling (QOS).

Task scheduling, which is the scheduling of computational tasks on multiple classical processors, has been extensively studied [12], [13]. A special type of task scheduling, Directed Acyclic Graph (DAG) scheduling, which deals with heterogeneous processors [14], [15], is most similar to QOS, but it differs in the way that resource constraints are handled. In DAG scheduling, every task can be executed on any processor with a different cost, i.e., the resource constraint is soft, while in quantum operation scheduling, every quantum operation has fixed qubit operands that are not interchangeable, i.e., the resource constraint is hard.

Qubit routing is a task that transforms a given circuit into an equivalent circuit so that all two-qubit operations in it can be executed on limited pairs of qubits. Schedule length can be approximated by circuit depth, which is the schedule length when assuming all operations have the same unit processing time. Therefore, qubit routing with the objective of minimizing the circuit depth [16], [17] or two-qubit gate depth [18], [19] can be viewed as approximate quantum operation scheduling with qubit routing. Although the algorithms for this may be applicable to scheduling without qubit routing, they provide only approximate solutions, not the exact ones to QOS.

While we define quantum operation scheduling independently of hardware technology, there are several studies on scheduling specialized for quantum computers based on ion trap technology [20], [21]. These works consider a combination of scheduling and qubit routing under a hardware structure model, called macroblocks, and propose heuristic algorithms to solve it.

Several studies have considered the commutation of quantum operations in scheduling [2]–[5]. Venturelli et al. [2] examined the scheduling of quantum operations as a subproblem of qubit routing. They proposed an exact method using a temporal planner and showed it works well for QAOA circuits, which have many commuting gates. Although their method is applicable to quantum operation scheduling without qubit routing, our methods discussed in Section IV are simpler and perform sufficiently well for the specific scheduling problem considered in this paper. Guerreschi and Park [4] proposed a two-step solution that decomposes the problem with qubit routing and solves quantum operation scheduling (without qubit routing) in the first step. They provide a list scheduling heuristic algorithm using upward ranking but not any exact algorithm for scheduling. Other studies have considered quantum operation scheduling as a subtask of qubit routing [3] or control optimization [5]. While they provide practical heuristic algorithms for solving the task that includes scheduling, the exact algorithm for scheduling is not discussed.

III. PROBLEM

A. Quantum Operation Scheduling

We define quantum operation scheduling as the problem of finding a schedule for a given quantum circuit. A quantum circuit is a sequence of quantum operations. Many of them are unitary operations called gates. Each of the quantum operations has acting qubits and its own processing time. A quantum circuit is given as a sequence: e.g., [H(1), CX(1, 2), X(2)]. Here, H(1) denotes a Hadamard gate acting on qubit 1, CX(1, 2) denotes a Controlled-NOT (or CNOT) gate acting on control qubit 1 and target qubit 2, and X(2) denotes a NOT gate acting on qubit 2. Quantum circuits are typically depicted in a circuit diagram, as shown in Fig. 1. For simplicity, we assume all of the operations have the same unit processing time in Fig. 1.

Fig. 1: Diagram representation of a quantum circuit (qubits 1 and 2 on the vertical axis, operations 1 to 3 on the horizontal axis)

If we ignore commutation between gates, the gate dependency graph is linear, i.e., H(1) must precede CX(1, 2) and CX(1, 2) must precede X(2), and we obtain a trivial schedule (makespan = 3), as shown in Fig. 2. We call the graph representing the dependencies among gates in a circuit the dependency graph. In contrast, if we consider that CX(1, 2) and X(2) commute, we have a different dependency graph: H(1) must precede CX(1, 2), but there is no restriction on X(2), so we can obtain a shorter schedule (makespan = 2), as shown in Fig. 3. This is compelling evidence that commutation rules should be considered when scheduling circuit operations.
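The two makespans in the three-gate example can be reproduced with a minimal sketch that starts every operation as early as its predecessors in the dependency graph allow. The function name `asap_schedule` and the string labels are illustrative assumptions, not part of the paper or of any library. The sketch enforces only the precedence edges; for this small example the resulting start times also happen to keep operations on the same qubit from overlapping.

```python
from collections import defaultdict


def asap_schedule(durations, precedence):
    """Return (start_times, makespan), starting each operation as early as
    its predecessors in the dependency graph allow (precedence only)."""
    preds = defaultdict(list)
    for before, after in precedence:
        preds[after].append(before)

    start = {}

    def earliest(op):
        # Memoized longest-path computation over the acyclic dependency graph.
        if op not in start:
            start[op] = max(
                (earliest(p) + durations[p] for p in preds[op]), default=0
            )
        return start[op]

    for op in durations:
        earliest(op)
    makespan = max(start[op] + durations[op] for op in durations)
    return start, makespan


# Unit processing times, as assumed in Fig. 1.
durations = {"H(1)": 1, "CX(1,2)": 1, "X(2)": 1}
# Commutation ignored: the dependency graph is a chain.
standard_dag = [("H(1)", "CX(1,2)"), ("CX(1,2)", "X(2)")]
# CX(1,2) and X(2) commute: only H(1) -> CX(1,2) remains.
extended_dag = [("H(1)", "CX(1,2)")]

_, makespan_standard = asap_schedule(durations, standard_dag)
starts_extended, makespan_extended = asap_schedule(durations, extended_dag)
print(makespan_standard, makespan_extended)  # 3 2
```

In general, an extended dependency graph leaves some same-qubit pairs unordered, so a precedence-only computation like this one is not sufficient and the non-overlap constraint has to be handled explicitly.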

Fig. 2: Standard dependency graph and resulting schedule: (a) standard DAG; (b) schedule with makespan = 3.

Fig. 3: Extended dependency graph and resulting schedule: (a) extended DAG; (b) schedule with makespan = 2.

A schedule is defined by the start times of the operations in a given circuit. Any schedule must satisfy two elementary constraints: precedence and non-overlap. The precedence constraint restricts the execution order of operations to obey a partial order represented as a dependency graph. The non-overlap constraint allows only the processing of one operation on a qubit at a time.

Generally, the supported basic operations and the processing times depend on the target hardware. Hereafter, we assume that basic operations are given and that all circuits have already been decomposed into them. We also assume that each processing time of the basic operations is fixed and given as a parameter. The dependency graph of a provided quantum circuit varies depending on which commutation rules are considered. Taking these details into account, we formally define a quantum operation scheduling problem as follows.

Quantum Operation Scheduling: Given a quantum circuit as a sequence of basic operations, the processing time of each operation, and a set of commutation rules between basic operations, find a schedule that satisfies the precedence and non-overlap constraints with the minimum makespan.
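To make the two constraints in this definition concrete, the following sketch checks whether a candidate schedule (a map from operations to start times) is feasible and computes its makespan. The names `is_valid_schedule` and `makespan`, and the dictionary-based encoding of the circuit, are assumptions made for illustration only.

```python
from itertools import combinations


def is_valid_schedule(start, durations, qubits, dependency_edges):
    """Check the precedence and non-overlap constraints for given start times."""
    end = {op: start[op] + durations[op] for op in start}
    # Precedence: an operation may not start before its predecessor ends.
    for before, after in dependency_edges:
        if end[before] > start[after]:
            return False
    # Non-overlap: operations sharing a qubit must not run at the same time.
    for a, b in combinations(start, 2):
        shares_qubit = bool(set(qubits[a]) & set(qubits[b]))
        overlaps = start[a] < end[b] and start[b] < end[a]
        if shares_qubit and overlaps:
            return False
    return True


def makespan(start, durations):
    """Time between the start of the first and the end of the last operation."""
    return max(start[op] + durations[op] for op in start) - min(start.values())


durations = {"H(1)": 1, "CX(1,2)": 1, "X(2)": 1}
qubits = {"H(1)": (1,), "CX(1,2)": (1, 2), "X(2)": (2,)}
extended_dag = [("H(1)", "CX(1,2)")]
standard_dag = [("H(1)", "CX(1,2)"), ("CX(1,2)", "X(2)")]

schedule = {"H(1)": 0, "CX(1,2)": 1, "X(2)": 0}
print(is_valid_schedule(schedule, durations, qubits, extended_dag))  # True
print(is_valid_schedule(schedule, durations, qubits, standard_dag))  # False
print(makespan(schedule, durations))  # 2
```

The same start times are feasible under the extended dependency graph but violate precedence under the standard one, which is exactly the extra freedom that the commutation rules provide.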

B. Job-shop Problem and its Disjunctive Graph Representation

We review a basic version of the job-shop problem as follows. Let $J = \{J_1, \ldots, J_n\}$ be a set of $n$ jobs and $M = \{M_1, \ldots, M_m\}$ be a set of $m$ machines. Each job $J_j$ has an operation sequence $O_j$ to be processed in a specific order, called the precedence constraint. We denote the $k$-th operation in $O_j$ by $O_{jk}$. Each operation $O_{jk}$ requires exclusive use of a specific machine for its processing time $p_{jk}$, called the non-overlap constraint. A schedule is a set of start (or completion) times $t_{jk}$ for each operation that satisfies both constraints. The objective of the job-shop problem is minimization of the makespan.

The job-shop problem is often represented by a disjunctive graph $G = (V, C \cup D)$, where
• $V$ is a set of nodes representing the operations $O_{jk}$,
• $C$ is a set of conjunctive (directed) edges representing the order of the operations in any job, and
• $D$ is a set of disjunctive edges representing pairs of operations that must be processed on the same machine.

For each node, the processing time and the required machine of its corresponding operation are attached. Conjunctive edges $C$ represent the precedence constraint and disjunctive edges $D$ represent the non-overlap constraint. Note that disjunctive edges whose direction is fixed by some conjunctive edges can be omitted. That means any disjunctive edge can be removed if there exists a path from one end of the edge to the other on the conjunctive graph $(V, C)$. Figure 4 shows an example of a disjunctive graph representing the job-shop problem. The operation $O_{11}$ must be processed on machine $M_1$ and it takes 1 time unit. The disjunctive edge $(O_{12}, O_{13})$ is omitted since its direction is fixed by the conjunctive edge at the same place.

Fig. 4: Disjunctive graph representation of a job-shop problem. Node labels: $O_{11}$ ($M_1$, $p_{11} = 1$), $O_{12}$ ($M_2$, $p_{12} = 1$), $O_{13}$ ($M_2$, $p_{13} = 1$), $O_{21}$ ($M_1$, $p_{21} = 2$), $O_{22}$ ($M_2$, $p_{22} = 1$). Legend: conjunctive edge, disjunctive edge.

On the basis of this disjunctive graph representation, the job-shop problem can be seen as a problem of determining the direction of disjunctive edges while keeping the resulting graph acyclic. This is equivalent to determining the ordering of the operations processed on the same machine, and such ordering yields a unique schedule, called a semi-active schedule [22], by sequencing operations as early as possible. Figure 5 shows two solutions to the job-shop problem defined by the disjunctive graph depicted in Fig. 4. As shown, a different selection of the direction of the disjunctive edges results in a different solution. Among the operations $\{O_{12}, O_{13}, O_{22}\}$, we cannot select $\{(O_{13}, O_{22}), (O_{22}, O_{12})\}$ because it produces a cycle $O_{12} \to O_{13} \to O_{22} \to O_{12}$. In Solution A, the directed edge between $O_{11}$ and $O_{21}$ determines the order of operations processed on machine $M_1$, and the directed edges chosen for the pairs $\{O_{12}, O_{22}\}$ and $\{O_{13}, O_{22}\}$ determine that on machine $M_2$.

Fig. 5: Two solutions to the job-shop problem in Fig. 4 (schedules over time on machines $M_1$ and $M_2$)

C. Disjunctive Graph Representation of Quantum Operation Scheduling

Technically, quantum operation scheduling can be seen as a special type of job-shop problem with the following properties: (1) one job has one operation, (2) a precedence constraint is given as a partial ordering among all operations instead of a total ordering of operations per job, and (3) multiple machines (i.e., qubits) can be occupied by a single operation at the same time. Those properties preserve enough conditions to represent the problem by a disjunctive graph $G = (V, C \cup D)$. For property (1), we need to define the problem on operations without jobs, but this does not change the fact that nodes $V$ represent operations. For properties (2) and (3), we need to modify the definition of conjunctive edges $C$ and disjunctive edges $D$, respectively, as follows.

Quantum operation scheduling can be represented by a disjunctive graph $G = (V, C \cup D)$, where
• $V$ is a set of nodes representing the quantum operations in a given circuit,
• $C$ is a set of conjunctive edges representing dependencies among the operations, i.e., edges of the dependency graph, and
• $D$ is a set of disjunctive edges representing pairs of operations that act on the same qubit and (possibly) commute with one another.

For each operation (i.e., node) $i \in V$, the processing time $p_i$ and the acting qubits are attached. Note that the conjunctive graph $(V, C)$ is a DAG given as a dependency graph. Conjunctive edges $C$ and disjunctive edges $D$ still represent the precedence constraint and non-overlap constraint, respectively. It is known that the dependency graph for any circuit can be computationally constructed under several popular commutation rules [23].

When provided with the dependency graph of a quantum circuit, the disjunctive graph representation can be computationally constructed. In fact, it is possible to define disjunctive edges by all the pairs of nodes that act on the same qubit. This definition is redundant, but there is no problem with including edges (pairs of operations) that do not commute with one another.
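As an illustration of the all-pairs definition, the sketch below builds the redundant disjunctive edges from the acting qubits and then applies the pruning rule from Section III-B, which drops every pair already ordered by a path in the conjunctive graph. The function names and the integer node labels (1 for H(1), 2 for CX(1, 2), 3 for X(2)) are assumptions for this example; this is not the construction used in the experiments.

```python
from itertools import combinations


def has_path(conjunctive_edges, source, target):
    """Depth-first search for a directed path source -> target in (V, C)."""
    succ = {}
    for a, b in conjunctive_edges:
        succ.setdefault(a, []).append(b)
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(succ.get(node, []))
    return False


def all_pairs_disjunctive_edges(qubits):
    """Redundant definition: every pair of operations sharing a qubit."""
    return [
        (a, b)
        for a, b in combinations(sorted(qubits), 2)
        if set(qubits[a]) & set(qubits[b])
    ]


def prune_ordered_pairs(disjunctive_edges, conjunctive_edges):
    """Drop pairs whose order is already fixed by a conjunctive path."""
    return [
        (a, b)
        for a, b in disjunctive_edges
        if not has_path(conjunctive_edges, a, b)
        and not has_path(conjunctive_edges, b, a)
    ]


qubits = {1: {1}, 2: {1, 2}, 3: {2}}  # acting qubits of H(1), CX(1,2), X(2)
redundant = all_pairs_disjunctive_edges(qubits)
print(redundant)                                         # [(1, 2), (2, 3)]
print(prune_ordered_pairs(redundant, [(1, 2)]))          # [(2, 3)]
print(prune_ordered_pairs(redundant, [(1, 2), (2, 3)]))  # []
```

Each pruning step runs a graph search per candidate pair, so the cost grows quickly with circuit size.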
The point is that it must include all of the commuting pairs. Starting from the redundant edges, we can define the minimal disjunctive edges by removing edges that have a path on the conjunctive graph. However, this comes at a significant computation cost. There is an in-between definition that, for each qubit, collects the operations acting on it and splits them into sets of operations that commute with each other within each of the sets. We used this definition for the experiments discussed in Section V. Although this cannot provide the minimal edges because the operations within a set may not commute with each other when considering operations acting on the other qubits, it has fewer edges than the most redundant definition and requires less computation cost than the minimal definition.

With the disjunctive graph representation of quantum operation scheduling, we take the commutation of operations into account on the basis of the difference of the dependency graph. We call the dependency graph that considers only the trivial commutation between operations not sharing their acting qubits the standard DAG and the dependency graph that considers commutation rules in addition to the trivial ones the extended DAG. Figure 6 shows disjunctive graphs representing two quantum operation scheduling problems that have the same operations but different dependency (conjunctive) graphs (standard DAG and extended DAG). In the case of the extended DAG, we can select the order of $O_2$ and $O_3$, while in the standard DAG, all the orders among operations are fixed.

Fig. 6: Disjunctive graphs representing quantum operation scheduling for different dependency graphs: (a) standard DAG; (b) extended DAG. Nodes are labeled with their acting qubits and processing times: $\{1\}$, $p_1 = 1$; $\{1, 2\}$, $p_2 = 1$; $\{2\}$, $p_3 = 1$. Legend: conjunctive edge, disjunctive edge.

It generally holds that if we consider a standard DAG, there are no disjunctive edges, i.e., $D = \emptyset$. This means there is a unique semi-active schedule because each ordering of the operations processed on the same qubit is uniquely determined by the conjunctive DAG. However, if we consider an extended DAG, we have fewer edges in $C$ and some edges in $D$. This creates room for selecting a better ordering of the operations processed on the same qubit, i.e., optimizing the schedule.

IV. FORMULATION
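As a point of reference for the formulations in this section, the disjunctive graph view of Section III-C already yields a brute-force exact method for tiny instances: enumerate every orientation of the disjunctive edges, discard the cyclic ones, and keep the semi-active schedule with the smallest makespan. The sketch below is an illustration under that reading, not the method used in the paper; the function names are hypothetical, it assumes $D$ contains every same-qubit pair not already ordered by $C$, and its running time is exponential in $|D|$.

```python
from itertools import product


def earliest_starts(durations, edges):
    """Semi-active start times for a fully oriented graph, or None if cyclic."""
    succ = {i: [] for i in durations}
    indegree = {i: 0 for i in durations}
    for a, b in edges:
        succ[a].append(b)
        indegree[b] += 1
    start = {i: 0 for i in durations}
    ready = [i for i in durations if indegree[i] == 0]
    processed = 0
    while ready:  # Kahn's topological order with longest-path relaxation
        a = ready.pop()
        processed += 1
        for b in succ[a]:
            start[b] = max(start[b], start[a] + durations[a])
            indegree[b] -= 1
            if indegree[b] == 0:
                ready.append(b)
    return start if processed == len(durations) else None


def solve_by_enumeration(durations, conjunctive, disjunctive):
    """Try both directions of every disjunctive edge; return (makespan, starts)."""
    best = None
    for flips in product((False, True), repeat=len(disjunctive)):
        oriented = [
            (v, u) if flip else (u, v)
            for (u, v), flip in zip(disjunctive, flips)
        ]
        start = earliest_starts(durations, list(conjunctive) + oriented)
        if start is None:
            continue  # this orientation creates a cycle
        makespan = max(start[i] + durations[i] for i in durations)
        if best is None or makespan < best[0]:
            best = (makespan, start)
    return best


# Extended-DAG instance of the three-gate example: 1 = H(1), 2 = CX(1,2), 3 = X(2).
durations = {1: 1, 2: 1, 3: 1}
best_makespan, best_starts = solve_by_enumeration(durations, [(1, 2)], [(2, 3)])
print(best_makespan, best_starts)  # 2 {1: 0, 2: 1, 3: 0}
```

Solver-based formulations search the same space of orderings without enumerating it exhaustively.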

We provide a Constraint Programming (CP) formulation and a Mixed Integer Programming (MIP) formulation for the quantum operation scheduling problem (QOS) defined in the previous section. By solving them, we can find the optimal solution of QOS and analyze how much we can improve the resulting schedule in QOS. In this section, we assume that the disjunctive graph representation of quantum operation scheduling $G = (V, C \cup D)$ for a given circuit has already been constructed.

A. Constraint Programming Formulation

Let $x_i$ be an interval variable describing the start and end time of operation $i \in V$, whose duration is fixed to its processing time $p_i$. Using functions commonly supported by CP solvers, e.g., interval_var, quantum operation scheduling is formulated as a CP:

$$
\begin{aligned}
\text{minimize} \quad & \max \{ \mathrm{end\_of}(x_i) \mid i \in V \} \\
\text{subject to} \quad & \mathrm{end\_before\_start}(x_i, x_j), && \forall (i, j) \in C, \\
& \mathrm{no\_overlap}(x_k, x_l), && \forall (k, l) \in D, \\
& x_i \equiv \mathrm{interval\_var}(\mathrm{duration} = p_i), && \forall i \in V.
\end{aligned}
$$

Here end_of($x_i$) takes the end time of $x_i$, so the objective is the minimization of the makespan. The constraint end_before_start($x_i$, $x_j$) means the end time of $x_i$ must precede the start time of $x_j$, so it represents the precedence constraint. The constraint no_overlap($x_k$, $x_l$) means the interval $x_k$ must not overlap the interval $x_l$, so it represents the non-overlap constraint. Note that, for any operation pair $(i, j) \notin D$, the precedence constraint guarantees no overlap between them. All of end_of, end_before_start, no_overlap, and interval_var are supported in the IBM ILOG CP Optimizer.

B. Mixed Integer Programming Formulation

Let $x_i$ be a variable representing the start time of operation $i \in V$ and $p_i$ be a constant parameter representing its processing time. Let $t$ be the makespan of the schedule defined by $x$. Let $y_{kl}$ be an indicator (Boolean) variable that takes True (1) if operation $k$ precedes operation $l$ and False (0) if $l$ precedes $k$. Using these, quantum operation scheduling is formulated as a MIP:

$$
\begin{aligned}
\text{minimize} \quad & t \\
\text{subject to} \quad & x_i + p_i \le x_j, && \forall (i, j) \in C, \\
& y_{kl} \Rightarrow x_k + p_k \le x_l, && \forall (k, l) \in D, \\
& \neg y_{kl} \Rightarrow x_l + p_l \le x_k, && \forall (k, l) \in D, \\
& x_i + p_i \le t, && \forall i \in V, \\
& 0 \le x_i \in \mathbb{R}, && \forall i \in V, \\
& y_{kl} \in \{0, 1\}, && \forall (k, l) \in D.
\end{aligned}
$$

The inequality $x_i + p_i \le x_j$ represents the precedence constraint. The two inequalities using $\Rightarrow$ represent the non-overlap constraint. Note that the constraints with indicator variables $y$ can be translated into linear constraints by applying the so-called big-M technique. However, recent MIP solvers have the capability to handle such indicator constraints very well, so we leave the formulation with indicator variables.

V. EXPERIMENT

We conducted two experiments. In the first one, we evaluate how much the consideration of commutation between operations improves the schedule in quantum operation scheduling. In the second one, we investigate how much this improvement is affected by the optimization level in a previous task.

A. Common Experimental Settings

Both of the experiments were conducted in a real compiling environment. As the target quantum computing device for compilation, we used the IBM Q Johannesburg, which has 20 qubits (see [24] for the details). We implemented our scheduling algorithms within Qiskit 0.18.0 (Terra 0.13.0), which is an open-source quantum computing software development framework [6].

We first transpiled all of the circuits to make them executable on the IBM Q Johannesburg, i.e., we solved the qubit routing problem to map given circuits onto the device topology. For this, we used the “transpile()” function in Qiskit and set the ibmq_johannesburg backend, fixed the seed_transpiler to 1, the optimization_level to 2, and left the other options as the default. During transpiling, all of the circuits were decomposed into the basis gates $\{u_1, u_2, u_3, CX\}$, which are elementary gates supported by the backend. Here $u_1$, $u_2$, $u_3$ are single-qubit gates and $CX$ is a two-qubit gate. The execution time for each gate (gate length) is provided as the backend properties. Note that it can differ depending on which qubit(s) the gate acts on, e.g., the length of a single-qubit gate on one qubit can differ from that of the same gate on another qubit. We used those as of April 11, 2020.

We then applied our scheduling algorithms to the transpiled circuits. We used the real processing time for each of the basis gates provided as backend properties. We considered three commutation rules on the basis gates, $u_1(i) \leftrightarrow CX(i, j)$, $CX(i, j) \leftrightarrow CX(i, k)$, and $CX(i, k) \leftrightarrow CX(j, k)$, in the construction of the extended DAGs for the transpiled circuits. We used the IBM ILOG CP Optimizer and CPLEX 12.9.0 to solve the scheduling problem based on the CP and MIP formulation described in Section IV, respectively.

B. Improvement by Considering Commutation in Quantum Operation Scheduling

In the first experiment, we quantified the significance of considering the commutation of operations in quantum operation scheduling. Specifically, we evaluated the improvement by comparing the best solutions (makespans) of the formulation constrained by the standard DAG with those by the extended DAG, as shown in Table I.

TABLE I: Comparison of makespans [dt] (dt = 2/9 ns) obtained by the formulation based on the standard DAG (Std-DAG) and those based on the extended DAG (Ext-DAG) for 16 circuits from the RevLib benchmark. For the Ext-DAG formulation, the solutions by the Constraint Programming solver with the time limit of ten seconds are listed. The Qubits and Gates columns list the number of qubits and gates in the input circuits. The Δ column lists the improvement rate from Std-DAG to Ext-DAG.

| Circuit name | Qubits | Gates | Std-DAG | Ext-DAG | Δ |
|---|---|---|---|---|---|
| mini_alu_305 | 10 | 173 | 24,940 | 24,308 | 2.53% |
| qft_10 | 10 | 200 | 24,358 | 24,074 | 1.17% |
| sys6-v0_111 | 10 | 215 | 30,604 | 30,114 | 1.60% |
| rd73_140 | 10 | 230 | 35,848 | 35,484 | 1.02% |
| ising_model_10 | 10 | 480 | 4,210 | 4,210 | 0.00% |
| wim_266 | 11 | 986 | 141,914 | 138,658 | 2.29% |
| sym9_146 | 12 | 328 | 50,458 | 50,170 | 0.57% |
| rd53_311 | 13 | 275 | 40,420 | 40,164 | 0.63% |
| ising_model_13 | 13 | 633 | 4,210 | 4,210 | 0.00% |
| 0410184_169 | 14 | 211 | 40,356 | 39,074 | 3.18% |
| sym6_316 | 14 | 270 | 47,178 | 46,404 | 1.64% |
| rd84_142 | 15 | 343 | 39,812 | 38,490 | 3.32% |
| cnt3-5_179 | 16 | 175 | 19,630 | 19,366 | 1.34% |
| cnt3-5_180 | 16 | 485 | 69,854 | 68,326 | 2.19% |
| qft_16 | 16 | 512 | 50,674 | 50,088 | 1.16% |
| ising_model_16 | 16 | 786 | 4,370 | 4,370 | 0.00% |

For this experiment, we used quantum circuits from the test dataset provided by Zulehner et al. [25], which originated from the RevLib benchmark [26]. We selected 16 circuits with 10–16 qubits and fewer than 1000 gates from among them. We used the as-soon-as-possible heuristic scheduling algorithm implemented in Qiskit to find the unique solutions from the standard DAG formulation (Std-DAG). We also used the CP solver with a 10-sec time limit to find the best possible solutions from the extended DAG formulation (Ext-DAG).

Looking at the Δ column in Table I, i.e., the improvement rates from Std-DAG to Ext-DAG, we can see they were non-negative and varied depending on the circuit structures from 0.00% to 3.32% (median 1.26%). These results demonstrate that the commutation-aware formulation we proposed in Section III-A can improve the resulting schedule length in a practical situation. Although they may look marginal, they may be welcomed by those who have abundant time for compilation and need more optimization.

We also examined the MIP solver (with the same 10-sec time limit) to find solutions of the Ext-DAG; however, all of the solutions were slightly worse than or equal to those by the CP solver. Hence, we omitted these results in Table I. As for the solutions (i.e., makespans) from the standard DAG formulation, we verified that those by the CP and MIP solvers were exactly the same as those by the as-soon-as-possible heuristic scheduling algorithm implemented in Qiskit, as expected.

TABLE II: Difference in improvement rates (Δ column) of makespans from scheduling with standard DAG (Std-DAG) compared to those with extended DAG (Ext-DAG) using the CP solver after applying a naive gate decomposition or optimized gate decomposition.

| Circuit name | Std-DAG (Naive) | Ext-DAG (Naive) | Δ (Naive) | Std-DAG (Optimized) | Ext-DAG (Optimized) | Δ (Optimized) |
|---|---|---|---|---|---|---|
| Mod 5_4 | 6,328 | 5,984 | 5.44% | 6,016 | 5,578 | 7.28% |
| VBE-Adder_3 | 19,126 | 18,852 | 1.43% | 11,416 | 11,148 | 2.35% |
| CSLA-MUX_3 | 22,238 | 21,438 | 3.60% | 19,774 | 18,778 | 5.04% |
| RC-Adder_6 | 28,606 | 27,564 | 3.64% | 20,408 | 18,906 | 7.36% |
| Mod-Red_21 | 33,160 | 32,348 | 2.45% | 28,320 | 27,494 | 2.92% |
| Mod-Mult_55 | 13,942 | 13,740 | 1.45% | 14,478 | 14,070 | 2.82% |
| Toff-Barenco_3 | 12,388 | 12,388 | 0.00% | 4,812 | 4,812 | 0.00% |
| Toff-NC_3 | 9,108 | 9,108 | 0.00% | 3,818 | 3,818 | 0.00% |
| Toff-Barenco_4 | 15,992 | 15,582 | 2.56% | 9,006 | 8,678 | 3.64% |
| Toff-NC_4 | 12,272 | 11,834 | 3.57% | 8,016 | 7,626 | 4.87% |
| Toff-Barenco_5 | 21,600 | 21,378 | 1.03% | 19,424 | 19,138 | 1.47% |
| Toff-NC_5 | 13,536 | 12,984 | 4.08% | 9,262 | 8,920 | 3.69% |
| Toff-Barenco_10 | 90,940 | 89,248 | 1.86% | 59,282 | 57,282 | 3.37% |
| Toff-NC_10 | 43,156 | 42,546 | 1.41% | 28,720 | 28,322 | 1.39% |
| GF( )-Mult | 33,160 | 32,492 | 2.01% | 33,646 | 33,148 | 1.48% |
| GF( )-Mult | 37,948 | 36,480 | 3.87% | 44,702 | 44,030 | 1.50% |
| GF( )-Mult | 63,856 | 61,870 | 3.11% | 64,784 | 64,180 | 0.93% |

C. Performance Variation by Optimization Level of Previous Task

In the second experiment, we investigated how a previous task affects the solution quality in the quantum operation scheduling task. To this end, as a previous task, we picked the gate decomposition task that decomposes gates with three or more qubits into those with one or two qubits. We changed the optimization level in the gate decomposition task and observed how it affects the improvement rates of makespans from scheduling with standard DAG (Std-DAG) compared to those with extended DAG (Ext-DAG), as shown in Table II.

For this experiment, we used 17 circuits from the test dataset provided by Nam et al. [27]. To change the optimization level in the gate decomposition task, we used both their input circuit data (with “_before” suffix in their file names) and output circuit data after the heavy optimization proposed in [27] (with “_after_heavy” suffix) as our input circuits to be scheduled. Those correspond with the Naive gate decomposition column and Optimized gate decomposition column, respectively. Note that, in the Naive case, gates are decomposed by a simple rule-based algorithm implemented in Qiskit before scheduling. As in the previous experiment, we used the as-soon-as-possible heuristic scheduling algorithm implemented in Qiskit to find the unique solutions from the Std-DAG formulation and the CP solver with a 10-sec time limit to find the best possible solutions from the Ext-DAG formulation.

As shown in the two Δ columns in Table II, we can observe clear improvement from Std-DAG to Ext-DAG no matter which gate decomposition algorithm we used before QOS: Min: 0.00%, Median: 2.45%, Max: 5.44% (Naive) and Min: 0.00%, Median: 2.82%, Max: 7.36% (Optimized). This again confirms that our commutation-aware formulation proposed in Section III-A can improve the resulting schedule length in a practical situation.

When we compare the improvement rates (Δ column) from scheduling after the naive gate decomposition with those after the optimized gate decomposition in Table II, there are two key findings. First, they have a similar median: 2.45% (Naive) and 2.82% (Optimized). This suggests that, on average, our commutation-aware scheduling can stably improve the resulting schedule no matter how much circuits have been optimized in a previous task (at least in the gate optimization task). Second, the improvement rates for each individual circuit differ between Naive and Optimized. Specifically, they increase from Naive to Optimized for ten circuits and decrease for five circuits. This suggests that the optimization level of the previous task significantly affects the optimization gain in QOS.

Comparing the makespans in the Std- or Ext-DAG column under naive gate decomposition with those under optimized gate decomposition in Table II, we can see that they decrease for 13 out of 17 circuits, as expected, but increase for four circuits. The latter four exceptional cases stem from negative interference among optimization tasks before scheduling, i.e., between gate decomposition and some tasks done within “transpile()” in Qiskit, and they are not caused by any errors in the scheduling.

VI. DISCUSSION

The basic version of the job-shop problem as a decision problem is known to be NP-complete [28]. Since QOS is a special variant of the job-shop problem, as discussed in Section III-C, it is not necessarily NP-complete. Identifying the theoretical complexity of QOS would be an interesting avenue for future work.

Throughout this paper, we have investigated how to minimize the overall execution time of the resulting schedule. Although this certainly contributes to obtaining computational results with higher fidelity, there should be more direct approaches that attempt to maximize the output fidelity by considering gate-dependent errors. In fact, such approaches have recently been proposed in qubit routing [29]–[31]. Utilizing techniques like this for QOS is also left for future work.

The two formulations (CP/MIP) discussed in Section IV are useful for the theoretical best case analysis because their solvers implement exact algorithms that can find the optimal solution in the long run. They may also be sufficient for certain practical applications, since CP/MIP solvers usually implement problem-agnostic heuristic algorithms to find the best possible solution within a limited time. However, for the use cases where the compilation time is too critical to use CP/MIP solvers, it is worth considering heuristic algorithms specialized for QOS. As an example, we developed a heuristic algorithm based on the Heterogeneous-Earliest-Finish-Time (HEFT) algorithm for task scheduling. We provide its details in the Appendix. Such a heuristic algorithm can complement the CP/MIP-based approach.

VII. CONCLUSION

We investigated quantum operation scheduling, the problem of scheduling the quantum operations in a given circuit so that the total execution time is as short as possible. We demonstrated that quantum operation scheduling can be interpreted as a special type of job-shop problem in which we consider the commutation between quantum operations to make room for optimization. We provided a Constraint Programming formulation and showed through experiments with real circuits and a compiler that solving quantum operation scheduling independently shortened the schedule by a modest amount, up to 7.36%.

ACKNOWLEDGMENT

We thank Dmitri Maslov, Lauren Capelluto, Thomas A. Alexander, and Rudy Raymond for their helpful comments.

REFERENCES

[1] J. Preskill, “Quantum Computing in the NISQ era and beyond,” Quantum, vol. 2, p. 79, 2018.
[2] D. Venturelli, M. Do, E. Rieffel, and J. Frank, “Temporal planning for compilation of quantum approximate optimization circuits,” in Proceedings of the International Joint Conference on Artificial Intelligence, 2017, pp. 89–101.
[3] T. S. Metodi, D. D. Thaker, A. W. Cross, F. T. Chong, and I. L. Chuang, “Scheduling physical operations in a quantum information processor,” in Quantum Information and Computation IV, vol. 6244, International Society for Optics and Photonics. SPIE, 2006, pp. 210–221.
[4] G. G. Guerreschi and J. Park, “Two-step approach to scheduling quantum circuits,” Quantum Science and Technology, vol. 3, no. 4, p. 045003, 2018.
[5] Y. Shi, N. Leung, P. Gokhale, Z. Rossi, D. I. Schuster, H. Hoffmann, and F. T. Chong, “Optimized compilation of aggregated instructions for realistic quantum computers,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
European Journal of Operational Research, vol. 93, no. 1, pp. 1–33, 1996.
[8] J. Zhang, G. Ding, Y. Zou, S. Qin, and J. Fu, “Review of job shop scheduling research and its new perspectives under industry 4.0,” Journal of Intelligent Manufacturing, vol. 30, no. 4, pp. 1809–1830, 2019.
[9] A. S. Manne, “On the job-shop scheduling problem,” Operations Research, vol. 8, no. 2, pp. 219–223, 1960.
[10] J. Adams, E. Balas, and D. Zawack, “The shifting bottleneck procedure for job shop scheduling,” Management Science, vol. 34, no. 3, pp. 391–401, 1988.
[11] P. J. M. van Laarhoven, E. H. L. Aarts, and J. K. Lenstra, “Job shop scheduling by simulated annealing,” Operations Research, vol. 40, no. 1, pp. 113–125, 1992.
[12] H. Topcuoglu, S. Hariri, and M.-Y. Wu, “Performance-effective and low-complexity task scheduling for heterogeneous computing,” IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, 2002.
[13] O. Sinnen, Task scheduling for parallel systems. John Wiley & Sons, 2006.
[14] L.-C. Canon, E. Jeannot, R. Sakellariou, and W. Zheng, “Comparative evaluation of the robustness of DAG scheduling heuristics,” in Grid Computing, 2008, pp. 73–84.
[15] C. Valouxis, C. Gogos, P. Alefragis, G. Goulas, N. Voros, and E. Housos, “DAG scheduling using integer programming in heterogeneous parallel execution environments,” in Proceedings of the Multidisciplinary International Conference on Scheduling: Theory and Applications (MISTA 2013), 2013, pp. 392–401.
[16] D. Maslov, S. M. Falconer, and M. Mosca, “Quantum circuit placement,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 4, pp. 752–763, 2008.
[17] D. Bhattacharjee and A. Chattopadhyay, “Depth-optimal quantum circuit placement for arbitrary topologies,” arXiv preprint arXiv:1703.08540, 2017.
[18] A. Cowtan, S. Dilkes, R. Duncan, A. Krajenbrink, W. Simmons, and S. Sivarajah, “On the qubit routing problem,” in , 2019.
[19] A. M. Childs, E. Schoute, and C. M. Unsal, “Circuit transformations for quantum architectures,” in , 2019.
[20] N. Mohammadzadeh, M. S. Zamani, and M. Sedighi, “Improving latency of quantum circuits by gate exchanging,” in Proceedings of 12th Euromicro Conference on Digital System Design, Architectures, Methods and Tools. IEEE, 2009, pp. 67–73.
[21] T. Bahreini and N. Mohammadzadeh, “An MINLP model for scheduling and placement of quantum circuits with a heuristic solution approach,” ACM Journal on Emerging Technologies in Computing Systems (JETC), vol. 12, no. 3, p. 29, 2015.
[22] T. Yamada and R. Nakano, “Job-shop scheduling,” in Genetic algorithms in engineering systems. The Institution of Electrical Engineers, 1997, ch. 7, pp. 134–160.
[23] T. Itoko, R. Raymond, T. Imamichi, and A. Matsuo, “Optimization of quantum circuit mapping using gate transformation and commutation,” Integration, vol. 70, pp. 43–50, 2020.
[24] A. D. Córcoles, A. Kandala, A. Javadi-Abhari, D. T. McClure, A. W. Cross, K. Temme, P. D. Nation, M. Steffen, and J. Gambetta, “Challenges and opportunities of near-term quantum computing systems,” Proceedings of the IEEE, 2019.
[25] A. Zulehner, A. Paler, and R. Wille, “An efficient methodology for mapping quantum circuits to the IBM QX architectures,” IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems (TCAD), 2018. Implementation with test data set is available at https://github.com/iic-jku/ibm_qx_mapping.
[26] M. Soeken, S. Frehse, R. Wille, and R. Drechsler, “RevKit: An open source toolkit for the design of reversible circuits,” in Proceedings of the International Workshop on Reversible Computation
npj Quantum Information, vol. 4, no. 1, p. 23, 2018.
[28] M. R. Garey and D. S. Johnson, Computers and intractability. W. H. Freeman and Company, 1979.
[29] S. S. Tannu and M. K. Qureshi, “Not all qubits are created equal: a case for variability-aware policies for NISQ-era quantum computers,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019, pp. 987–999.
[30] P. Murali, J. M. Baker, A. Javadi-Abhari, F. T. Chong, and M. Martonosi, “Noise-adaptive compiler mappings for noisy intermediate-scale quantum computers,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019, pp. 1015–1029.
[31] S. Nishio, Y. Pan, T. Satoh, H. Amano, and R. V. Meter, “Extracting success from IBM’s 20-qubit machines using error-aware compilation,” ACM Journal on Emerging Technologies in Computing Systems (JETC), vol. 16, no. 3, pp. 1–25, 2020.

TABLE III: Makespans and their improvement rates (∆) from scheduling with standard DAG (Std-DAG) compared to those with extended DAG (Ext-DAG) using the HEFT heuristic algorithm or the CP solver after applying a naive gate decomposition or an optimized gate decomposition.

| Circuit name | Naive: Std-DAG | Naive: Ext-DAG HEFT (∆) | Naive: Ext-DAG CP (∆) | Optimized: Std-DAG | Optimized: Ext-DAG HEFT (∆) | Optimized: Ext-DAG CP (∆) |
|---|---|---|---|---|---|---|
| Mod 5 4 | 6,328 | 6,076 (3.98%) | 5,984 (5.44%) | 6,016 | 5,670 (5.75%) | 5,578 (7.28%) |
| VBE-Adder 3 | 19,126 | 18,972 (0.81%) | 18,852 (1.43%) | 11,416 | 11,148 (2.35%) | 11,148 (2.35%) |
| CSLA-MUX 3 | 22,238 | 21,444 (3.57%) | 21,438 (3.60%) | 19,774 | 18,914 (4.35%) | 18,778 (5.04%) |
| RC-Adder 6 | 28,606 | 27,584 (3.57%) | 27,564 (3.64%) | 20,408 | 19,136 (6.23%) | 18,906 (7.36%) |
| Mod-Red 21 | 33,160 | 32,640 (1.57%) | 32,348 (2.45%) | 28,320 | 27,570 (2.65%) | 27,494 (2.92%) |
| Mod-Mult 55 | 13,942 | 13,822 (0.86%) | 13,740 (1.45%) | 14,478 | 14,270 (1.44%) | 14,070 (2.82%) |
| Toff-Barenco 3 | 12,388 | 12,388 (0.00%) | 12,388 (0.00%) | 4,812 | 4,998 (− | |
| )-Mult | 33,160 | 32,508 (1.97%) | 32,492 (2.01%) | 33,646 | 33,160 (1.44%) | 33,148 (1.48%) |
| GF( )-Mult | 37,948 | 36,734 (3.20%) | 36,480 (3.87%) | 44,702 | 44,464 (0.53%) | 44,030 (1.50%) |
| GF( )-Mult | 63,856 | 62,174 (2.63%) | 61,870 (3.11%) | 64,784 | 64,278 (0.78%) | 64,180 (0.93%) |

APPENDIX

We show how the Heterogeneous-Earliest-Finish-Time (HEFT) algorithm for task scheduling can be used for quantum operation scheduling with a slight modification. The original HEFT algorithm is designed for scheduling with a soft resource constraint, i.e., every operation can be executed on any processor with a different cost. We adjust it here so that it can work with a hard resource constraint, i.e., every operation has fixed qubit operands that are not interchangeable.

The original HEFT algorithm consists of two phases: an operation prioritizing phase, which computes the priorities of all operations based on upward ranking, and a processor selection phase, which schedules the operation with the highest priority at that moment on the processor that minimizes the operation's finish time [12]. In the processor selection phase, the algorithm considers the possibility of inserting an operation in the earliest idle time-slot between two already scheduled operations. Only the idle time-slots that preserve precedence constraints, i.e., that comply with the dependency graph, are considered in this phase. This insertion-based policy, which allows insertion into an idle time-slot, characterizes the HEFT algorithm.

While keeping this insertion-based policy, we adjust the HEFT algorithm so that every operation is assigned to its fixed qubits (i.e., processors in the original terminology), which means we no longer need to select qubits in the processor selection phase. Note that the adjusted HEFT algorithm must maintain scheduled time-slots across qubits, whereas the original algorithm simply maintains the time-slots per processor. The process flow of the HEFT algorithm for quantum operation scheduling is shown in Algorithm 1.

We conducted experiments to check the solution quality of the adjusted HEFT algorithm with the same benchmark sets and experimental settings as used in Section V. For all of the instances under the formulation with extended DAG (Ext-DAG), the HEFT algorithm always succeeded in finding

Algorithm 1 HEFT algorithm for QOS

G = (V, C): dependency graph of a QOS problem

Compute the upward rank r(u) for each operation u ∈ V by
    $r(u) = d(u) + \max_{v \in \mathrm{succ}(u)} r(v)$
  where succ(u) is the set of immediate successors of u, d(u) is the duration of u, and r(e) = d(e) for any exit operation e.
ready_time(u) = 0 for all u ∈ V.
for all u ∈ V in descending order of r(u) do
    Insert u at the start time t of the earliest idle time-slot (whose duration > d(u)) after ready_time(u).
    for all v ∈ succ(u) do
        ready_time(v) = max(ready_time(v), t + d(u)).
    end for
end for
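The flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments. The function name `heft_schedule` and the dictionary-based inputs `durations`, `qubits`, and `succ` are hypothetical choices made for this illustration. The sketch assumes strictly positive durations, so that descending upward rank is a valid topological order, and it accepts any idle time-slot that is at least as long as d(u), which is an assumption about how the slot condition in the listing should be read.

```python
from functools import lru_cache


def heft_schedule(durations, qubits, succ):
    """Return {operation: start_time} following the outline of Algorithm 1.

    durations: {op: positive duration d(op)}
    qubits:    {op: tuple of qubit ids that op occupies (fixed operands)}
    succ:      {op: list of immediate successors in the dependency graph}
    All three names are hypothetical and chosen only for this sketch.
    """

    @lru_cache(maxsize=None)
    def rank(u):
        # Upward rank: r(u) = d(u) + max over successors; r(e) = d(e) at exits.
        return durations[u] + max((rank(v) for v in succ.get(u, ())), default=0)

    ready = {u: 0 for u in durations}
    busy = {}   # qubit -> list of (start, end) intervals already scheduled
    start = {}

    def is_free(q, t, d):
        # True if qubit q is idle during the whole interval [t, t + d).
        return all(t + d <= s or e <= t for s, e in busy.get(q, []))

    for u in sorted(durations, key=rank, reverse=True):
        d = durations[u]
        # The earliest feasible start is either the ready time or the end
        # of an interval already scheduled on one of the operand qubits.
        candidates = {ready[u]}
        for q in qubits[u]:
            candidates.update(e for _, e in busy.get(q, []) if e >= ready[u])
        t = min(c for c in candidates
                if all(is_free(q, c, d) for q in qubits[u]))
        start[u] = t
        for q in qubits[u]:
            busy.setdefault(q, []).append((t, t + d))
        for v in succ.get(u, ()):
            ready[v] = max(ready[v], t + d)
    return start


# Toy instance: "a" precedes "b"; "c" shares qubit 1 with "b" but is unordered.
durations = {"a": 2, "b": 3, "c": 1}
qubits = {"a": (0,), "b": (0, 1), "c": (1,)}
succ = {"a": ["b"]}
schedule = heft_schedule(durations, qubits, succ)
makespan = max(schedule[u] + durations[u] for u in schedule)
print(schedule, makespan)  # {'a': 0, 'b': 2, 'c': 0} 5
```

In the toy instance, operation "c" has the lowest rank and is scheduled last, yet it is inserted into the idle time-slot on qubit 1 before "b" starts. This is the insertion-based policy described above, applied across the fixed qubits of each operation rather than per processor.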