A Depth-Aware Swap Insertion Scheme for the Qubit Mapping Problem
Chi Zhang, Yanhao Chen, Yuwei Jin, Wonsun Ahn, Youtao Zhang, Eddy Z. Zhang
Chi Zhang∗, Yanhao Chen†, Yuwei Jin†, Wonsun Ahn∗, Youtao Zhang∗, Eddy Z. Zhang†
∗University of Pittsburgh, {chz54, wahn}@pitt.edu
†Rutgers University
Abstract—The rapid progress in the physical implementation of quantum computers has paved the way for tools that help users write quantum programs for any given quantum device. The physical constraints inherent to current NISQ architectures prevent most quantum algorithms from being directly executed on quantum devices. To enable two-qubit gates in the algorithm, existing works focus on inserting SWAP gates to dynamically remap logical qubits to physical qubits. However, their schemes do not consider the depth of the generated quantum circuits. In this work, we propose a depth-aware SWAP insertion scheme for the qubit mapping problem in the NISQ era.
Index Terms—Quantum Computing, Emerging Languages and Compilers, Emerging Device Technologies
I. INTRODUCTION
Quantum computing has exhibited its theoretical advantage over classical computing by showing impressive speedup on applications including large integer factoring [1], database search [2], and quantum simulation [3]. It is considered to be a new computational model that may have a disruptive impact on the future, and it has attracted major interest from a large number of researchers and companies.

With the advent of advanced manufacturing technology, the industry is able to build small-scale quantum computers, known as Noisy Intermediate-Scale Quantum (NISQ) [4] devices. A NISQ device is equipped with dozens to hundreds of qubits. IBM [5] released its 53-qubit quantum computer in October 2019 and has made it available for commercial use. Google [6] released the 72-qubit Bristlecone quantum computer in March 2018. Other companies, including Intel [7], Rigetti [8], and IonQ, have released their quantum computing devices with dozens of qubits. The current NISQ technology may not be perfect, but it is a good first step towards the more powerful quantum devices of the future.

In order to map high-level quantum programs to NISQ devices, it is important to overcome two obstacles. First, to be able to execute a quantum circuit, it is necessary to map logical qubits to physical qubits with respect to architecture and program coupling constraints. Any quantum program can be implemented using a universal gate set [9] of a small number of elementary gates. For instance, the {H, CNOT, S, T} set is a universal gate set, in which the {H, S, T} gates are single-qubit gates and the CNOT gate is a two-qubit gate. A two-qubit gate must be mapped to two qubits that are physically connected. However, in a real quantum architecture, qubits have limited connectivity and not every pair of qubits is connected, as shown in the IBM QX2 architecture in Fig. 1(a). For this reason, a quantum circuit is not directly executable on a NISQ device unless circuit transformation is performed. The common practice is to insert SWAP operations to dynamically remap the logical qubits such that the transformed circuit is hardware-compliant for each (set of) two-qubit gate(s).

Second, it is critical that the depth of a quantum circuit be minimized for the NISQ device. A qubit is volatile and error-prone. It gradually decays over time and may have phase and bit flip errors. It may completely lose its state after a certain period of time, called the coherence time. Quantum error correction (QEC) codes can detect error syndromes and fix them. However, QEC needs a large number of redundant physical qubits. A realistic QEC circuit may need more than 10,000 physical qubits, which is not possible for today's NISQ devices. Without QEC, a program must terminate within a threshold amount of time. The depth of the circuit, which is the number of steps the circuit executes, must therefore be optimized. IBM proposed the metric of quantum volume [10] for evaluating the effectiveness of quantum computers, which accounts for not only the width of the circuit (the number of qubits) but also the depth (how many steps the circuit can execute).

Transforming the logical circuit into a hardware-compliant one will inevitably result in increased gate count and circuit depth. Most previous work on qubit mapping [11]–[16] focuses on minimizing the number of inserted gates, but not the depth of the transformed circuit. However, even if the gate count is small, it does not necessarily mean the depth of the circuit is small, due to the dependences between different gates. We discover that previous work that aims to minimize the number of inserted gates may significantly increase the depth of the circuit (Section IV). For instance, compared with the mapper by Zulehner et al. [14], the Sabre approach by Li et al. [11] reduces the gate count by 1.1%, but increases the depth of the 10-qubit QFT circuit by over 44.5%. The two studies [17], [18] stress the importance of taking into consideration the variability in the qubit (link) error rates, but they do not directly address the issue of the increased circuit depth.

The depth of the circuit, as mentioned above, is critical and determines whether a quantum program is executable on a NISQ device with respect to its physical limits. In this paper, we propose the first depth-aware qubit mapping scheme for quantum circuits running on hardware with arbitrary qubit connectivity. Our depth-aware qubit mapper searches for the mapping that minimizes the transformed circuit depth while keeping the gate count within a reasonable range. Our results show that we can reduce the depth of the transformed circuit by up to 30% compared with the two best known qubit mappers [11], [12], while adding on average less than 3% additional gates over a large set of representative benchmarks.

II. BACKGROUND AND MOTIVATION
A. Quantum Computing Basics
1) Qubit:
A quantum bit, or qubit, is the counterpart of the classical bit in the realm of quantum computing. Unlike a classical bit, which represents either ‘1’ or ‘0’, a qubit can be in a coherent superposition of both states. It is regarded as a two-state quantum system that exhibits the peculiarities of quantum mechanics [9]. An example is the spin of an electron, whose two states can be spin up and spin down.
2) Quantum Gates:
There are two types of basic quantum gates. The first type is the single-qubit gate, a unitary quantum operation that can be abstracted as a rotation around an axis of the Bloch sphere [9], which represents the state space of one qubit. A single-qubit gate can be parameterized using two rotation angles around the axes. There are several elementary single-qubit gates, including the Hadamard (H) gate, the phase (S) gate, and the π/8 (T) gate [9]. The second type is the multi-qubit gate. However, all complex quantum gates can be decomposed into a sequence of the single-qubit gates H, S, T and the two-qubit CNOT gate. Thus we focus only on the two-qubit CNOT gate. The CNOT gate operates on two qubits, which are distinguished as a control qubit and a target qubit. If the control qubit is 1, the CNOT gate flips the state of the target qubit; otherwise, the target qubit remains the same.
3) Quantum Circuit:
A quantum circuit is composed of a set of qubits and a sequence of quantum operations on these qubits. There are various ways to describe quantum circuits. One way is to use the quantum assembly language OpenQASM [19] released by IBM. Another way is to use a circuit diagram, in which qubits are represented as horizontal lines and quantum operations are the different blocks on those lines. Fig. 2(a) shows a simple example of a quantum circuit diagram. A single-qubit gate is denoted as a square on the line, and a CNOT gate is represented by a line connecting two qubits and a circle enclosing a plus sign.
B. Qubit Mapping and Depth-Awareness
To enable the execution of a quantum circuit, the logical qubits in the circuit must be mapped to the physical qubits on the target hardware. When applying a CNOT gate, the two qubits connected by the CNOT gate need to be physically connected to each other. Due to the irregular physical qubit layout of existing devices, it is generally considered impossible to find an initial mapping that makes the entire circuit CNOT-compliant. The common practice is to insert SWAP operations to remap the logical qubits. A SWAP operation exchanges the states of its two input qubits. As shown in Fig. 3, a SWAP operation is implemented using 3 CNOT gates on an architecture with bi-directional links, or 3 CNOT gates plus 4 Hadamard gates on an architecture with single-direction links, where a bi-directional link means both ends of the link can be the control or target qubit, while a single-direction link means only one end can be the control qubit.

Fig. 1. (a) The connectivity structure of IBM QX2; (b) the coupling graph for the logical qubits in the motivation example in Fig. 2.

IBM's Qiskit [15] uses a stochastic method to insert SWAP operations, but it often results in a significant increase in the number of inserted gates and in depth. Existing works [11], [14], [16] are more efficient than IBM's Qiskit mapper: they use efficient heuristics to find the mapping rather than a stochastic method. However, the main objective of these methods is to reduce the gate count. It makes sense to minimize the gate count, but it is more important to focus on the depth of the circuit, since in the NISQ era the depth is equivalent to the estimated execution time. Reducing the depth of the circuit reduces the likelihood of the circuit failing at an early stage.

We show a motivation example in Fig. 2. The hardware model is shown in Fig. 1(a). It has five physical qubits and six bi-directional edges; the connectivity is the same as that of the IBM QX2 architecture except that the links are all bidirectional. A CNOT gate can only be applied on one of these edges.

In the example, the initial mapping between logical qubits (denoted by lower-case q) and physical qubits (denoted by upper-case Q) is shown next to each qubit (line) in Fig. 2. With this initial mapping, the mapper schedules gates one by one until it encounters a (set of) CNOT gate(s) that cannot be scheduled due to physical constraints. We show the interaction of logical qubits in Fig. 1(b), where two logical qubits are connected if there is a CNOT operation between them. When we encounter the CNOT gate marked red in the circuit diagram in Fig. 2 (shown as the dotted line in the logical coupling graph in Fig. 1(b)), the scheduling has to stop, since the gate translates into a CNOT between two physical qubits that share no physical link. SWAP operations are therefore needed. When a SWAP operation is applied, the two input physical qubits exchange their states.

Fig. 2. Motivation example: (a) the original logical circuit; (b) uses 2 SWAPs but the depth of the circuit is not increased; (c) uses only 1 SWAP but the depth of the circuit is increased.

Fig. 2(b) and (c) provide two options for transforming the circuit. Fig. 2(b) inserts 2 SWAPs so that the blocked CNOT acts on two connected physical qubits; the two SWAPs can run in parallel with existing single-qubit gates in the circuit, so the depth of the circuit does not increase. Fig. 2(c) inserts only 1 SWAP to make the same CNOT executable, but this SWAP cannot overlap with existing single-qubit gates in the circuit and increases the depth of the circuit by 3 (assuming we use 3 gates to implement the SWAP operation and each elementary gate takes 1 cycle in this example).

In this example, the two best known approaches, by Zulehner et al. [14] and Li et al. [11], will both choose to insert 1 SWAP, since they only optimize the number of gates inserted into the circuit (or the depth of the inserted gates), but not the depth of the entire transformed circuit. This example stresses the importance of depth-awareness in SWAP insertion schemes and motivates our work.
Fig. 3. Implementation of a SWAP operation
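The three-CNOT implementation of a SWAP on bi-directional links can be checked on computational basis states. The following is a minimal sketch, not taken from the paper: the helper names `cnot` and `swap_via_cnots` are hypothetical, and qubits are modeled as classical bits, which is sufficient here because CNOT is a permutation of basis states. The single-direction variant with 4 Hadamard gates is not simulated.

```python
from itertools import product


def cnot(bits, control, target):
    """Apply a CNOT to a tuple of classical basis bits."""
    out = list(bits)
    out[target] ^= out[control]  # flip the target only if the control is 1
    return tuple(out)


def swap_via_cnots(bits, a, b):
    """SWAP built from 3 CNOTs with alternating control/target roles."""
    bits = cnot(bits, a, b)
    bits = cnot(bits, b, a)
    return cnot(bits, a, b)


# Every two-bit basis state comes out with its bits exchanged.
for bits in product((0, 1), repeat=2):
    assert swap_via_cnots(bits, 0, 1) == (bits[1], bits[0])
```

Because each CNOT costs one cycle in the paper's example, a SWAP that cannot overlap with other gates adds 3 to the depth.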
III. PROPOSED SOLUTION
A. Metric
As our work is a depth-aware SWAP insertion scheme, we first precisely define the metric for characterizing the depth of a circuit. To fully explain the metric, we need to introduce the concepts of the dependency graph and the critical path. The dependency graph represents the precedence relation between quantum gates in a logical quantum circuit. It is defined as follows:
Definition 1.
Dependency Graph: The dependency graph of a quantum circuit C with a set of gates Ψ is a directed acyclic graph $G_\psi = (\Psi, E_\psi)$, $E_\psi \subseteq \Psi \times \Psi$. A directed edge from node $\psi_i$ to node $\psi_j$ exists if and only if the output of gate $\psi_i$ is (part of) the input of gate $\psi_j$ in the quantum circuit C.
The critical path is the longest path in the dependency graph. It is defined as follows:
Definition 2.
Critical Path: Given the dependency graph $G_\psi = (\Psi, E_\psi)$ of a quantum circuit, the critical path is $CP = \max(Path(\psi_i, \psi_j))$ s.t. $\psi_i, \psi_j \in \Psi$ and $\psi_i \neq \psi_j$.
The depth characterizes the number of execution steps of a quantum circuit, which is tantamount to the critical path length of the circuit. The longest path in the dependency graph gives the minimal number of steps the circuit needs for every gate's data dependences to be resolved. Algorithm 1 shows how we calculate the critical path.
Algorithm 1: Calculate the Critical Path of a Circuit
Input: The circuit's dependency graph G(V, E)
Output: The critical path CP
    earliest_start = {};
    CP = 0;
    for n ∈ V in topological order do
        temp = 0;
        for p ∈ n's predecessors do
            if temp < earliest_start[p] + latency[p] then
                temp = earliest_start[p] + latency[p];
            end
        end
        earliest_start[n] = temp;
        if CP < temp + latency[n] then
            CP = temp + latency[n];
        end
    end
    return CP;

We first sort the nodes of the directed acyclic graph in topological order and then process the nodes in that order. For each node, we take the earliest start time of each of its predecessors and add the latency of that predecessor; the maximum of these values is the earliest start time of the node. The maximum over all nodes of the earliest start time plus the node's latency is the critical path length.
We use the critical path length as the metric for ranking different swap insertion options.
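Algorithm 1 can be sketched in a few lines of Python. This is an illustrative version under stated assumptions, not the authors' implementation: a circuit is assumed to be a list of gates in program order, each gate given as a tuple of the qubit indices it acts on; every gate has latency 1 unless a `latency` list is supplied; and program order is used as the topological order. The names `build_dependency_graph` and `critical_path` are hypothetical.

```python
from collections import defaultdict


def build_dependency_graph(gates):
    """Return preds[j]: the set of gates whose output feeds gate j."""
    last_gate_on = {}          # qubit -> index of the latest gate touching it
    preds = defaultdict(set)
    for j, qubits in enumerate(gates):
        for q in qubits:
            if q in last_gate_on:
                preds[j].add(last_gate_on[q])
            last_gate_on[q] = j
    return preds


def critical_path(gates, latency=None):
    """Return (critical path length, earliest start time of each gate)."""
    latency = latency or [1] * len(gates)
    preds = build_dependency_graph(gates)
    earliest_start = {}
    cp = 0
    for n in range(len(gates)):  # program order is a topological order
        temp = max((earliest_start[p] + latency[p] for p in preds[n]), default=0)
        earliest_start[n] = temp
        cp = max(cp, temp + latency[n])
    return cp, earliest_start


# H q0; H q1; CNOT q0,q1; H q2; CNOT q1,q2
example = [(0,), (1,), (0, 1), (2,), (1, 2)]
depth, starts = critical_path(example)
```

In this example the last CNOT must wait for the first CNOT, so the chain H, CNOT, CNOT determines the depth.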
B. Framework Design
With the metric precisely defined in the previous section, we now explain the workflow of our framework and the intuitions behind it.
Before delving into the details of this framework, we need to define the layer and the coupling graph.
[Flowchart: the inputs are the qubit connectivity and the original circuit with its initial layer; the iterative mapper processes a layer, checks whether SWAPs are needed, adds SWAPs if so, and moves to the next layer until the transformed circuit is produced.]
Fig. 4. The Qubit Mapping Framework
Definition 3.
Coupling Graph: The coupling graph of a quantum architecture X with a set of physical qubits Q is a directed graph $G = (Q, E)$, $E \subseteq Q \times Q$. The edge $E_x = (Q_i, Q_j) \in E$ if and only if a CNOT gate can be applied to $Q_i$ and $Q_j$ in X with $Q_i$ being the control qubit and $Q_j$ being the target qubit.
We can divide the set of quantum gates in a circuit into layers, so that all gates in the same layer can be executed concurrently. The formal definition of a layer is:
Definition 4.
Layer: A quantum circuit C can be divided into layers $L = l_1, l_2, l_3, \ldots, l_m$, where $\bigcup_{i=1}^{m} l_i = C$ and $\bigcap_{i=1}^{m} l_i = \emptyset$. The gates in layer $l_i$ can run concurrently and act on distinct sets of qubits.
To divide a circuit into layers, we group the gates that have the same earliest start time (defined in Algorithm 1) into the same layer. The order of the layers is thus determined by the order of the earliest start times.
We use an iterative process to find the mapping. Our framework is depicted in Fig. 4, and the iterative process is explained below. We start by taking as input the coupling graph (also denoted as Qubit Connectivity) and the original circuit's initial layer.
We process the circuit layer by layer. Given a layer, we perform the following steps.
• We check whether the layer is hardware-compliant based on the coupling graph and the qubit mapping in effect before the current layer is scheduled.
• If YES, we move on to the next layer.
• If NO, we invoke our mapping searcher to search for the (set of) swaps that are necessary to resolve the current layer. We consider depth-awareness during the selection of the set of swap gates: we choose the set whose resulting mapping generates the smallest critical path length (described in Section III-C). After we find a hardware-compliant mapping, we move to the next layer.
After all layers are processed, the mapping terminates.
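The layer partitioning and the per-layer compliance check described above can be sketched as follows. This is a hedged illustration rather than the paper's code: gates are assumed to be tuples of logical qubit indices with unit latency, links are assumed bi-directional, and the 4-qubit line architecture, the identity mapping, and the names `split_into_layers` and `is_compliant` are all made up for the example.

```python
from collections import defaultdict


def split_into_layers(gates):
    """Group gates by earliest start time, assuming unit latency."""
    ready_at = defaultdict(int)   # qubit -> first free time step
    layers = defaultdict(list)
    for gate in gates:
        start = max(ready_at[q] for q in gate)
        layers[start].append(gate)
        for q in gate:
            ready_at[q] = start + 1
    return [layers[t] for t in sorted(layers)]


def is_compliant(layer, mapping, coupling):
    """True if every two-qubit gate of the layer sits on a coupling edge."""
    return all(
        frozenset(mapping[q] for q in gate) in coupling
        for gate in layer
        if len(gate) == 2
    )


# Hypothetical line architecture Q0 - Q1 - Q2 - Q3 with bi-directional links.
coupling = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3)]}
mapping = {0: 0, 1: 1, 2: 2, 3: 3}   # logical -> physical
gates = [(0,), (1,), (0, 1), (2,), (1, 2), (0, 3)]

layers = split_into_layers(gates)
status = [is_compliant(layer, mapping, coupling) for layer in layers]
```

Here the third layer contains a CNOT between logical qubits 0 and 3, which sit on unconnected physical qubits, so that layer is the point where the mapping searcher would be invoked.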
C. Circuit Mapping Searcher
Here we describe the specific mapping searcher we use to overcome the coupling constraint for a given layer.
We build our method upon the A-star algorithm for finding valid mappings that minimizes only the number of inserted SWAP gates [14]. We extend it by changing the ranking metric and allowing it to search for feasible mappings that do not necessarily have the smallest SWAP gate count. This helps us search in a way that minimizes the depth while not significantly increasing the gate count.
We rank the swap options by the increase in the critical path length. Since it is an iterative process that handles the gates layer by layer, it is tempting to consider only minimizing the depth of the already processed circuit when deciding which swaps to use.
Fig. 5. (a) Layout of an example architecture with 4 physical qubits; (b) example of a quantum circuit, where the dashed line separates the processed circuit and the remaining circuit; (c) the inserted SWAP overlaps with the remaining circuit instead of the existing processed circuit.
However, the example in Fig. 5 shows that not only the processed circuit but also the remaining circuit can help overlap the SWAPs with existing gates in the circuit without affecting the critical path. As shown in Fig. 5, for the CNOT gate (in red), there is no way to overlap the necessary SWAPs with the processed circuit (the circuit before the dashed line). But when we look after the dashed line, the three single-qubit gates can overlap with the inserted SWAP. This has less impact on the depth of the resulting circuit than the alternative SWAP placement.
Based on this intuition, we design our scheme for choosing the SWAP candidate as in Fig. 6. For each of the hardware-compliant remapping candidates that we acquire from the A-star searcher, we calculate the critical path after merging the candidate (set of) swap(s) with both the processed circuit and the not-yet-processed circuit. We choose the mapping that yields the shortest critical path.
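The selection rule can be illustrated with a small sketch. Everything below is an assumption made for the example rather than the paper's implementation: gates are `(qubits, latency)` pairs, the processed circuit is already expressed on physical qubits, the remaining circuit is expressed on logical qubits, a SWAP costs 3 cycles, and the candidates are assumed to be hardware-compliant already. The function names are hypothetical.

```python
def critical_path(gates):
    """Depth of a list of (qubits, latency) gates given in program order."""
    ready_at = {}
    depth = 0
    for qubits, latency in gates:
        start = max((ready_at.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            ready_at[q] = start + latency
        depth = max(depth, start + latency)
    return depth


def apply_swaps(mapping, swaps):
    """Update a logical -> physical mapping after SWAPs on physical qubits."""
    inv = {p: l for l, p in mapping.items()}
    for a, b in swaps:
        inv[a], inv[b] = inv.get(b), inv.get(a)
    return {l: p for p, l in inv.items() if l is not None}


def depth_with_candidate(pc, swaps, rc_logical, mapping):
    """Critical path of PC + candidate SWAPs + remaining circuit (RC)."""
    new_map = apply_swaps(mapping, swaps)
    s = [((a, b), 3) for a, b in swaps]   # SWAP = 3 CNOTs, 1 cycle each
    rc = [(tuple(new_map[q] for q in qs), lat) for qs, lat in rc_logical]
    return critical_path(pc + s + rc)


def choose_candidate(pc, candidates, rc_logical, mapping):
    return min(candidates, key=lambda s: depth_with_candidate(pc, s, rc_logical, mapping))


# Line architecture Q0 - Q1 - Q2 - Q3, identity initial mapping.
mapping = {0: 0, 1: 1, 2: 2, 3: 3}
pc = [((0,), 1)] * 3          # three single-qubit gates already placed on Q0
rc = [((0, 2), 1)]            # blocked CNOT on logical qubits 0 and 2
candidates = [[(0, 1)], [(1, 2)]]
best = choose_candidate(pc, candidates, rc, mapping)
```

Both candidates use one SWAP, but SWAP(Q1, Q2) runs in parallel with the gates already on Q0, so it gives depth 4 instead of 7 and is selected.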
[Diagram: each hardware-compliant candidate S1, S2, S3, …, Sn (a circuit with added SWAPs) is placed between the processed circuit (PC) and the remaining circuit (RC); for each candidate, CP(X) = critical_path(PC, S_X, RC) is calculated, and the S_X with the smallest CP(X) is selected.]
Fig. 6. Choose SWAP Candidates
D. Optimizations
We optimize our proposed solution in two ways. One is to expand more nodes during the A-star search, and the other is to search into deeper levels.
1) Expand More Nodes:
In the A-star search process, the normal routine is to expand the single node of least cost at each step. Here, we can expand more than one node at each step and thereby increase the search space. The number of nodes expanded at a time can range from 1 to a larger number.
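The effect of expanding more than one node per step can be sketched with a small best-first search. This is an illustration only and does not reproduce the searcher of [14] or the paper's cost function: the cost used here (number of SWAPs so far plus number of still-blocked CNOTs) is a simple stand-in, and `find_candidates`, `width`, and `max_swaps` are hypothetical names.

```python
import heapq
from itertools import count


def find_candidates(mapping, layer, edges, width=1, max_swaps=4):
    """Return the compliant SWAP sequences found in the first successful step.

    mapping: tuple with mapping[logical] = physical
    layer:   list of (control, target) logical qubit pairs
    edges:   list of bi-directional physical links (a, b)
    width:   number of lowest-cost nodes expanded per step
    """
    linked = {frozenset(e) for e in edges}

    def blocked(m):
        return sum(frozenset((m[a], m[b])) not in linked for a, b in layer)

    tie = count()  # tie-breaker so the heap never compares mappings
    frontier = [(blocked(mapping), next(tie), mapping, ())]
    seen = {mapping}
    found = []
    while frontier and not found:
        take = min(width, len(frontier))
        batch = [heapq.heappop(frontier) for _ in range(take)]
        for _, _, m, swaps in batch:
            if blocked(m) == 0:
                found.append(swaps)
                continue
            if len(swaps) == max_swaps:
                continue
            for a, b in edges:
                nxt = tuple(b if p == a else a if p == b else p for p in m)
                if nxt not in seen:
                    seen.add(nxt)
                    cost = len(swaps) + 1 + blocked(nxt)
                    heapq.heappush(frontier, (cost, next(tie), nxt, swaps + ((a, b),)))
    return found


edges = [(0, 1), (1, 2), (2, 3)]          # line Q0 - Q1 - Q2 - Q3
narrow = find_candidates((0, 1, 2, 3), [(0, 2)], edges, width=1)
wide = find_candidates((0, 1, 2, 3), [(0, 2)], edges, width=2)
```

With `width=1` the search stops at the first compliant mapping and returns a single candidate; with `width=2` it also surfaces a second one-SWAP candidate, which a depth-aware ranking can then compare against the first.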
2) Deeper Search:
We increase the depth of the A-star search tree. In the normal case, the search process ends when it finds the first node that minimizes the number of SWAPs, which corresponds to a certain level of the A-star tree. The second optimization that we apply is to continue the search into deeper levels of the A-star tree. The extra search depth is a parameter that we can specify and tune.
By tuning these parameters, more possible nodes are added to our search space. With a larger search space, we have a better chance of escaping a local optimum and reaching the global optimum.
IV. EVALUATION
In this section, we evaluate our depth-aware swap insertion scheme (denoted as DPS) and compare it with the two state-of-the-art qubit mappers. The experiment setup is listed below:
• Benchmarks: We use the quantum circuits from RevLib [20], IBM Qiskit [15], and ScaffCC [21].
• Hardware Model: We use IBM's 20-qubit Q20 Tokyo architecture, which was used in [11]. The qubit connectivity graph is shown in Fig. 7.
• Evaluation Platform: The mapping experiments are conducted on an Intel 2.4 GHz Core i5 machine with 8 GB of 1600 MHz DDR3 memory. The operating system is macOS Mojave. We use IBM's Qiskit [15] to evaluate the depth of the transformed circuit.
• Baselines: We compare our work with the two best known qubit mapping solutions, the work by Zulehner et al. [14] (denoted as Zulehner) and the Sabre qubit mapper from [11] (denoted as Sabre), as well as IBM's stochastic mapper in Qiskit. Since IBM's Qiskit mapper is significantly worse in terms of gate count and depth than all other mappers we evaluate, as also evidenced in the work by Zulehner et al. [14], we do not present Qiskit results.
• Metrics: We compare the depth and gate count of the transformed circuits for all strategies.
Fig. 7. IBM Q20 Tokyo Physical Layout [11]
Table I shows a summary of the experimental results. For gate count, we compare the total gate count of the transformed circuit. For depth, we compare the increase in depth for each benchmark, denoted as “Depth-delta” in Table I. The improvement columns give the ratio between one of the two baselines' depth-delta and our depth-delta. We use the term minimum improvement to denote the improvement over the better of the two baselines, and the term maximum improvement to denote the improvement over the worse of the two baselines.
We discuss our findings from the following three aspects: depth reduction, gate count change, and the trade-off between gate count and depth.
A. Depth Reduction
For depth reduction, as shown in Table I, our proposed solution outperforms the two baselines Zulehner and Sabre. Comparing depth-delta, the added depth of the circuit, our approach outperforms the better of the two baselines by more than 20% and up to 3X. For five out of the twenty-three benchmarks, our improvement on depth-delta is less than 20% compared with the better of the two baselines. However, for these cases, our approach still achieves considerable improvement over the worse of the two baselines. In these cases, it is possible that one of the two baselines happens to achieve very good depth in the transformed circuit, leaving little room for improvement. Our approach is still able to find a good mapping for these benchmarks, and its performance is on par with the better of the two baselines.
B. Gate Count Changes
The primary goal of our depth-aware qubit mapper is to minimize the depth of the circuit. However, we discover that our qubit mapper can sometimes also reduce the gate count. For four out of the twenty-three (17%) benchmarks, our qubit mapper yields the smallest number of gates among all three qubit mappers. For 57% of the benchmarks, our method ranks in the top two of the three qubit mappers in terms of gate count. For the benchmarks where our method yields the largest gate count, the percentage increase in gate count is negligible. On average, our depth-aware qubit mapper adds 3% to the gate count. These results show that our solution does not greatly increase the number of gates while reducing the depth of the circuit.
C. Trade-off between Gate Count and Depth
While all previous works focus on reducing the total gate count (and the depth among the inserted gates themselves) after the qubit mapping transformation, it is crucial to consider the trade-off between the resulting gate count and depth. Sometimes a choice made during the search process that favors a reduced gate count might adversely affect the critical path. In Table I, the Sabre mapper reduces the number of gates for the 10-qubit QFT by 1.1% compared with Zulehner's mapper, but increases the depth by 44.5%. For the sym9_146 benchmark, Sabre reduces the gate count by 3.8% compared with our approach, but increases the depth by 25.5%. Therefore, a small reduction in the gate count may not be worthwhile if it increases the circuit depth significantly.
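The QFT figures quoted above can be reproduced from the qft_10 row of Table I. The short calculation below is a worked check, assuming that the total depth of a transformed circuit is the original depth plus the reported depth-delta; the helper `percent_change` is a hypothetical name.

```python
def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# qft_10 row of Table I
original_depth = 63
gates = {"Zulehner": 266, "Sabre": 263, "DPS": 281}
depth_delta = {"Zulehner": 47, "Sabre": 96, "DPS": 44}

# Sabre versus Zulehner: slightly fewer gates, much larger depth.
gate_change = percent_change(gates["Sabre"], gates["Zulehner"])
depth_change = percent_change(
    original_depth + depth_delta["Sabre"],
    original_depth + depth_delta["Zulehner"],
)

# Improvement ratios: baseline depth-delta divided by DPS depth-delta.
baseline_deltas = (depth_delta["Zulehner"], depth_delta["Sabre"])
min_improvement = min(baseline_deltas) / depth_delta["DPS"]
max_improvement = max(baseline_deltas) / depth_delta["DPS"]

print(f"gate count: {gate_change:.1f}%, depth: {depth_change:+.1f}%")
print(f"improvement: min {min_improvement:.2f}, max {max_improvement:.2f}")
```

Under this reading, the gate count changes by about -1.1% while the depth grows by about 44.5%, and the improvement ratios match the Min and Max columns of the row.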
TABLE I
SUMMARY OF EXPERIMENT RESULTS

Benchmark                            Total Gate     Depth            Depth-delta  Improvement
name                  n  Zulehner  Sabre    DPS  Original  Zulehner  Sabre   DPS    Min   Max
4gt5_75               5       131    122    119        47        44     44    29   1.52  1.52
mini-alu_167          5       435    396    432       162       131    125   119   1.05  1.10
mod10_171             5       361    328    298       139       117     89    39   2.28     3
alu-v2_30             6       804    717    795       285       261    241   201   1.20  1.30
decod24-enable_126    6       533    476    509       190       187    150   141   1.06  1.33
mod5adder_127         6       849    780    858       302       256    256   222   1.15  1.15
4mod5-bdd_287         7        94     94     94        41        18     23    17   1.06  1.35
alu-bdd_288           7       126    117    135        48        36     36    30    1.2   1.2
majority_239          7       915    780    885       344       265    194   182   1.06  1.46
rd53_130              7      1619   1508   1619       569       529    482   384   1.26  1.38
rd53_135              7       419    410    422       159       116    112   109   1.03  1.06
rd53_138              8       186    183    174        56        37     40    21   1.76  1.90
cm82a_208             8       899    944   1007       337       219    295   213   1.03  1.38
qft_10               10       266    263    281        63        47     96    44   1.07  2.18
rd73_140             10       347    329    338        92        84     79    67   1.18  1.25
dc1_220              11      2868   2685   3129      1038       820    697   681   1.02  1.20
wim_266              11      1505   1415   1511       514       431    450   311   1.39  1.45
z4_268               11      4453   4477   4972      1644      1162   1492  1076   1.08  1.39
cycle10_2_110        12      9143   8666  10115      3386      2467   2640  2421   1.02  1.09
sym9_146             12       493    454    472       127       118    138    86   1.37  1.60
adr4_197             13      5299   5017   5530      1839      1439   1599  1210   1.19  1.32
rd53_311             13       467    413    446       124       138    157    87   1.59  1.80
cnt3-5_179           16       325    238    286        61        79     59    43   1.37  1.84

We compare the total gate count generated. For depth, we compare the increased depth for each benchmark, denoted as “Depth-delta” here. The improvement is the ratio of a baseline's depth-delta to DPS's depth-delta. Min/Max represents the improvement over the best/worst baseline.

V. CONCLUSION
The physical layout of contemporary quantum devices imposes limitations on mapping a high-level quantum program to the hardware. It is critical to develop an efficient qubit mapper in the NISQ era. Existing studies aim to reduce the gate count but are oblivious to the depth of the transformed circuit. This paper presents the design of the first depth-aware swap insertion scheme. Experiment results show that our proposed solution generates hardware-compliant circuits with reduced depth compared with state-of-the-art mapping schemes, with negligible overhead of increased gate count.

REFERENCES
[1] P. W. Shor, “Algorithms for quantum computation: Discrete logarithms and factoring,” in Proceedings 35th Annual Symposium on Foundations of Computer Science. IEEE, 1994, pp. 124–134.
[2] L. K. Grover, “A fast quantum mechanical algorithm for database search,” in Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Computing, ser. STOC ’96. New York, NY, USA: ACM, 1996, pp. 212–219. [Online]. Available: http://doi.acm.org/10.1145/237814.237866
[3] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O'Brien, “A variational eigenvalue solver on a photonic quantum processor,” in Nature Communications, vol. 5, no. 1, 2014, p. 4213. [Online]. Available: https://doi.org/10.1038/ncomms5213
[4] J. Preskill, “Quantum computing in the NISQ era and beyond,” Quantum
Physical Review A, vol. 100, no. 3, Sep 2019. [Online]. Available: http://dx.doi.org/10.1103/PhysRevA.100.032328
[11] G. Li, Y. Ding, and Y. Xie, “Tackling the qubit mapping problem for NISQ-era quantum devices,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2019, pp. 1001–1014.
[12] R. Wille, L. Burgholzer, and A. Zulehner, “Mapping quantum circuits to IBM QX architectures using the minimal number of SWAP and H operations,” in Proceedings of the 56th Annual Design Automation Conference 2019. ACM, 2019, p. 142.
[13] A. Zulehner, S. Gasser, and R. Wille, “Exact global reordering for nearest neighbor quantum circuits using A∗,” in International Conference on Reversible Computation. Springer, 2017, pp. 185–201.
[14] A. Zulehner, A. Paler, and R. Wille, “Efficient mapping of quantum circuits to the IBM QX architectures,” in . IEEE, 2018, pp. 1135–1138.
[15] QISKit: Open Source Quantum Information Science Kit, https://qiskit.org/.
[16] M. Y. Siraichi, V. F. d. Santos, S. Collange, and F. M. Q. Pereira, “Qubit allocation,” in Proceedings of the 2018 International Symposium on Code Generation and Optimization. ACM, 2018, pp. 113–125.
[17] S. S. Tannu and M. K. Qureshi, “Not all qubits are created equal: A case for variability-aware policies for NISQ-era quantum computers,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’19. New York, NY, USA: ACM, 2019, pp. 987–999. [Online]. Available: http://doi.acm.org/10.1145/3297858.3304007
[18] P. Murali, J. M. Baker, A. Javadi-Abhari, F. T. Chong, and M. Martonosi, “Noise-adaptive compiler mappings for noisy intermediate-scale quantum computers,” in Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’19. New York, NY, USA: ACM, 2019, pp. 1015–1029. [Online]. Available: http://doi.acm.org/10.1145/3297858.3304075
[19] A. W. Cross, L. S. Bishop, J. A. Smolin, and J. M. Gambetta, “Open quantum assembly language,” arXiv preprint arXiv:1707.03429, 2017.
[20] R. Wille, D. Große, L. Teuber, G. W. Dueck, and R. Drechsler, “RevLib: An online resource for reversible functions and reversible circuits,” in . IEEE, 2008, pp. 220–225.
[21] A. JavadiAbhari, S. Patil, D. Kudrow, J. Heckey, A. Lvov, F. T. Chong, and M. Martonosi, “ScaffCC: a framework for compilation and analysis of quantum computing programs,” in