Many-Qudit representation for the Travelling Salesman Problem Optimisation

Abstract

We present a map from the travelling salesman problem (TSP), a prototypical NP-complete combinatorial optimisation task, to the ground state of a many-qudit system. Conventionally, the TSP is cast into a quadratic unconstrained binary optimisation (QUBO) problem, which can be solved on an Ising machine. The Hilbert space of the corresponding physical system has dimension $2^{N^2}$, where $N$ is the number of cities considered in the TSP. Our proposal provides a many-qudit system with a Hilbert space of dimension $2^{N\log_2 N}$, which is considerably smaller than that of the system resulting from the usual QUBO map. This reduction can yield a significant speedup on quantum and classical computers. We simulate and validate our proposal using variational Monte Carlo with a neural quantum state, solving the TSP in a linear layout for up to almost 100 cities.



Vladimir Vargas-Calderón,* Nicolas Parra-A., and Herbert Vinck-Posada

Grupo de Superconductividad y Nanotecnología, Departamento de Física, Universidad Nacional de Colombia, Bogotá, Colombia

Fabio A. González

MindLab Research Group, Departamento de Ingeniería de Sistemas e Industrial, Universidad Nacional de Colombia, Bogotá, Colombia

(Dated: March 1, 2021)

I. INTRODUCTION

Combinatorial optimisation problems (COPs) aim to find an optimal configuration from a usually finite but intractable set of configurations. The travelling salesman problem (TSP) is one of the most famous COPs and attracts plenty of interest from the scientific community. It is easy to state, but hard to solve: given a list of cities and the distances between them, what is the shortest route to visit them all and return to the origin city? This problem has a great number of applications, most notably in operational research. The TSP is an NP-hard problem, meaning that no known classical algorithm can solve it in polynomial time as a function of the number of cities. In fact, the brute-force way to solve this problem is to consider $(N-1)!/2$ tours. Other efficient heuristic solvers have been built taking advantage of particular topological features of the expected solution, such as having no crossing paths in a TSP defined on a Euclidean plane.

Quantum devices are promising platforms to solve COPs for two main reasons: there could be solutions that are significantly faster than the best classical algorithm for a specific COP, providing a quantum speedup; and there could be solutions that scale in polynomial time with respect to the COP size, possibly providing a strong quantum speedup. Notice that the first reason is concerned with finding solutions of the COP in less time, but the time still scales exponentially with respect to the COP size. Such an advantage of quantum devices over the best classical algorithms for NP-hard problems has already been demonstrated for Ising spin-glasses and for searching a marked item within an unstructured database, among others.

A strategy to solve a COP such as the TSP on a quantum computer is to map the TSP to a Hamiltonian such that the solution tour can be deduced from the ground state of the corresponding Hamiltonian. Usually, the TSP is cast into a quadratic unconstrained binary optimisation (QUBO) problem, which can be easily mapped to an Ising spin-glass model, taking $N^2$ qubits to solve the TSP for $N$ cities. This means that the Hilbert space of the corresponding Ising spin-glass model is of size $2^{N^2}$. In this paper, we propose a different map from the TSP to a physical system composed of qudits instead of qubits, which has a corresponding Hilbert space size of $2^{N\log_2 N}$ for $N$ cities. This reduction of the Hilbert space size is expected to facilitate the problem of finding the ground state (which corresponds to the TSP solution) on both quantum and classical computers.

A future experiment implementing our proposal on a qudit quantum chip is expected to be superior to the best classical algorithms, just as Ising machines have proven to be superior to general-purpose classical computers for the TSP on complementary metal-oxide-semiconductor field programmable gate arrays, on quantum processing units such as the ones developed by D-Wave, and on other devices such as a nuclear-magnetic-resonance quantum simulator. We argue that an implementation of our proposal should also be superior to these Ising machines, this time not because of a quantum speedup (since both Ising machines and qudit quantum chips are quantum machines), but because of the considerable reduction of the Hilbert space size.

The paper is organised as follows. Section II presents the TSP and reviews the usual map to a QUBO problem, or equivalently, an Ising spin-glass problem. In section III we construct the many-qudit Hamiltonian whose ground state solves the TSP. In section IV we validate our proposal using a state-of-the-art classical algorithm for finding the ground state of a many-body problem, namely, the variational Monte Carlo (VMC) method with a neural quantum state (NQS) as a variational ansatz. Finally, we conclude in section V.

II. QUBO FORMULATION OF THE TSP

Conventionally, the TSP can be mapped to a QUBO problem, which is then straightforwardly mapped to an Ising Hamiltonian. In particular, following the explanation by Smelyanskiy et al., we define a binary variable $z_{i,\alpha}$ that is 1 if the $i$-th city is the $\alpha$-th location visited in a tour, and is 0 otherwise.

The length of the tour is $\sum_{i,j,\alpha} d_{i,j} z_{i,\alpha} z_{j,\alpha+1}$, where $d_{i,j}$ is the distance between the $i$-th and the $j$-th city. We must also impose that $\sum_i z_{i,\alpha} = 1$ for any $\alpha$ and $\sum_\alpha z_{i,\alpha} = 1$ for any $i$ to ensure that every city is visited exactly once. These constraints, however, are only useful conceptually. They can be rewritten as $\left(\sum_i z_{i,\alpha} - 1\right)^2 = 0$ and $\left(\sum_\alpha z_{i,\alpha} - 1\right)^2 = 0$, so that finding the minimum-length tour of the TSP is equivalent to minimising the quantity

$$\sum_{i,j,\alpha} d_{i,j} z_{i,\alpha} z_{j,\alpha+1} + \left(\sum_i z_{i,\alpha} - 1\right)^2 + \left(\sum_\alpha z_{i,\alpha} - 1\right)^2, \tag{1}$$

which is a QUBO problem (the two penalty terms are understood to be imposed for every $\alpha$ and every $i$, respectively). If we map the binary variable to a spin/qubit $\sigma$ via $z \mapsto \sigma = 2z - 1$, we obtain the expression of an Ising spin-glass Hamiltonian. Moreover, the ground state of the Hamiltonian is the solution of the TSP, and the corresponding ground energy matches the length of the solution tour. This approach takes $N^2$ qubits, meaning that the Hilbert space's size is $2^{N^2}$.

The QUBO representation of the TSP gives rise to a completely connected graph, with a qubit on each node of the graph, and each edge representing an interaction between the connected nodes. In fig. 1 we depict this situation for a 4-city tour example, where the QUBO mapping induces qubit-qubit interactions between qubits that represent a single city (connections between qubits of the same colour), and between qubits that represent different cities (connections between qubits of different colours). The interpretation of these interactions becomes cumbersome. Instead, the proposal that we explain next is more naturally related to the way of representing the cities in a tour and their interactions.

FIG. 1. Pictorial representation of the TSP map to a system of $4^2$ qubits as a QUBO problem, and to a system of four 4-level qudits, for a toy-example tour with 4 cities. The top arrow diagram shows a tour starting at city 1. Colours encode the position of a city in the tour. Lines connecting physical subsystems represent a coupling between them.

III. MANY-QUDIT FORMULATION OF THE TSP

In this work, we propose to use $N$ $N$-level systems, or qudits of $N$ dimensions, to map the TSP of $N$ cities to the Hamiltonian of a physical system. The corresponding Hilbert space size is $N^N = 2^{N\log_2 N}$, which provides an advantage over the qubit proposal.

We keep the 4-city example shown in fig. 1. We can use four 4-level qudits to encode any tour of 4 cities. Essentially, the tour, which can be described by a string of consecutive cities to be visited, $1 \to 3 \to 2 \to 4 \to 1$, can be encoded in an ordered set of four 4-level qudits, where the first qudit is in the first-level state, the second qudit is in the third-level state, the third qudit is in the second-level state, and the fourth qudit is in the fourth-level state. This is a more natural representation of the tour than the one-hot encoding into qubits produced by the QUBO representation. In fact, from the string representation of the tour, we can assume a quantum analogue problem where the tour is simply depicted as the pure state $|1\rangle \otimes |3\rangle \otimes |2\rangle \otimes |4\rangle$, which is a tensor product of the 4-level occupation of each of the 4 qudits.

This is easily generalised to a TSP of $N$ cities. Let $|n_i\rangle$ be the state of the $i$-th qudit. In this setup, the $i$-th qudit occupation refers to the $i$-th visited city. Therefore, any tour can be represented by a vector $(n_1, n_2, \ldots, n_N)$, where $n_i \neq n_j$ for $i \neq j$, which states that the tour begins at city $n_1$, then continues to city $n_2$, and so on, reaching city $n_N$ and then returning to city $n_1$. As discussed, we assume a quantum analogue problem where the tour vector can be represented as a pure state of $N$ qudits, depicted by a tensor product state of the form $\bigotimes_i |n_i\rangle \equiv |n_1, n_2, \ldots, n_N\rangle \equiv |\boldsymbol{n}\rangle$. This allows us to define the Hamiltonian via its matrix elements as

$$\langle \boldsymbol{n} | \hat{H} | \boldsymbol{n} \rangle = \begin{cases} d_{n_1,n_2} + d_{n_2,n_3} + \ldots + d_{n_N,n_1} & \text{if } (n_1, \ldots, n_N) \text{ is a permutation of } (1, 2, \ldots, N), \\ p & \text{otherwise}, \end{cases} \tag{2}$$

where $p \gg \max\{d_{i,j}\}$ is a term that penalises configurations that do not correspond to valid tours. Such a penalty term can be compared to an effective exclusion principle, where invalid tours cannot exist.
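As a minimal sketch of eq. (2), the diagonal matrix element can be evaluated classically for any configuration. The function name `tour_energy`, the example distance matrix (cities on a line at positions 1 to 4) and the value chosen for the penalty $p$ are illustrative assumptions, not part of the htsp library.

```python
import numpy as np

def tour_energy(n, d, p):
    """Diagonal element <n|H|n> of eq. (2); cities are labelled 1..N."""
    n = np.asarray(n)
    N = d.shape[0]
    # Any configuration that is not a permutation of (1, ..., N) is penalised.
    if sorted(n.tolist()) != list(range(1, N + 1)):
        return float(p)
    idx = n - 1  # zero-based indices into the distance matrix
    # Sum d[n_1, n_2] + d[n_2, n_3] + ... + d[n_N, n_1] (the roll closes the tour).
    return float(d[idx, np.roll(idx, -1)].sum())

# Hypothetical example: 4 cities on a line, d[i, j] = |x_i - x_j|.
x = np.arange(1, 5)
d = np.abs(x[:, None] - x[None, :])
p = 10 * d.max()  # assumed penalty, chosen much larger than any distance

print(tour_energy([1, 3, 2, 4], d, p))  # valid tour 1 -> 3 -> 2 -> 4 -> 1
print(tour_energy([1, 1, 2, 3], d, p))  # repeated city: returns the penalty
```

In this example the tour $1 \to 3 \to 2 \to 4 \to 1$ has length $2 + 1 + 2 + 3 = 8$, whereas the configuration with a repeated city is assigned the penalty.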
Moreover, $\langle \boldsymbol{n} | \hat{H} | \boldsymbol{n}' \rangle = p$ for $\boldsymbol{n} \neq \boldsymbol{n}'$.

A Hamiltonian of this form may arise from a sum of local Hamiltonians which are two-qudit operators, whose matrix elements are

$$\langle i, j | \hat{D} | \ell, m \rangle = d_{i,j}\,\delta_{i,\ell}\,\delta_{j,m} + p'\,(2 - \delta_{i,\ell} - \delta_{j,m}), \tag{3}$$

where $\delta_{i,j}$ is the Kronecker delta and $p' \gg \max\{d_{i,j}\}$. Thus, the Hamiltonian of the $N$-qudit system would be

$$\hat{H} = \hat{D}^{(\mathcal{H}_1 \otimes \mathcal{H}_2)} + \hat{D}^{(\mathcal{H}_2 \otimes \mathcal{H}_3)} + \ldots + \hat{D}^{(\mathcal{H}_N \otimes \mathcal{H}_1)}, \tag{4}$$

where $\mathcal{H}_i$ is the Hilbert space of the $i$-th qudit and $\hat{D}^{(\mathcal{H}_i \otimes \mathcal{H}_j)}$ is the operator in eq. (3) acting on the space $\mathcal{H}_i \otimes \mathcal{H}_j$. Notice that the Hamiltonian in eq. (4) is slightly different from the one presented in eq. (2) because the penalty term becomes a collection of penalty terms, depending on how many repeated occupations there are in the state. Again, by construction, any state $|\boldsymbol{n}\rangle$ corresponding to a valid city tour will have a corresponding energy equal to the tour distance, which is why minimising the energy yields the ground state of the Hamiltonian in eq. (4), which corresponds to the TSP solution.

Our proposed Hamiltonian has a practical downside: in order to build it in a quantum computer, occupation-dependent (non-linear) couplings are needed, which are difficult to engineer with current technology. Nonetheless, qubit-based quantum computers have already been able to reproduce non-linear behaviour, even though this kind of interaction is not native to those computers. Despite the difficulties, the field of qudit-based quantum computers has seen steady progress in recent years towards universal quantum computers, which are promising and relevant for our proposal.

Even though we propose a many-qudit Hamiltonian to solve the TSP, it is possible to map it to a qubit-based computer using binary encoding.
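The binary encoding mentioned above can be sketched as follows, assuming that each qudit level (city label 1 to $N$) is stored as a fixed-width binary number. The helper names `bits_per_qudit`, `encode_tour` and `decode_tour` are hypothetical and are only meant to illustrate the qubit count.

```python
def bits_per_qudit(N):
    """Number of qubits needed to store one N-level qudit (levels 0..N-1)."""
    return max(1, (N - 1).bit_length())

def encode_tour(tour):
    """Map a tour (city labels 1..N) to a flat list of bits, most significant bit first."""
    b = bits_per_qudit(len(tour))
    return [((city - 1) >> k) & 1 for city in tour for k in reversed(range(b))]

def decode_tour(bits, N):
    """Inverse of encode_tour: read N fixed-width binary numbers and shift back to 1..N."""
    b = bits_per_qudit(N)
    return [int("".join(str(x) for x in bits[i * b:(i + 1) * b]), 2) + 1
            for i in range(N)]

bits = encode_tour([1, 3, 2, 4])
print(bits, decode_tour(bits, 4))
# Qubit counts for 96 cities: binary-encoded qudits versus the one-hot QUBO map.
print(96 * bits_per_qudit(96), 96 ** 2)
```

Under this assumed encoding, $N\lceil\log_2 N\rceil$ qubits are used instead of $N^2$. The qubit Hilbert space matches $N^N$ exactly when $N$ is a power of two and is slightly larger otherwise, because some bit patterns do not correspond to any city.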
Such a map preserves the size of the Hilbert space ($2^{N\log_2 N}$), at the cost of not being able to define the TSP as a QUBO problem, but as a higher-order binary optimisation (HOBO) problem. Proposals of physical systems that can solve HOBO problems are also starting to flourish, such as the work by Stroev and Berloff, where the possibility of controlling $k$-body couplings between the binary nodes of a coherent network is suggested.

IV. NUMERICAL VALIDATION

In order to validate our proposal, we solve the Hamiltonian in eq. (4) using a recent and powerful technique for finding the ground state of a many-body physical system. The technique is variational Monte Carlo (VMC) with a variational wavefunction defined by a neural network, also called a neural quantum state (NQS). Details of VMC and NQSs are given in appendix A and appendix B, respectively.

To illustrate the kind of advantage we can get with our proposal, we perform VMC+NQS experiments on two different setups. The first setup corresponds to the QUBO representation of the TSP, mapped to an Ising Hamiltonian, i.e. a qubit Hamiltonian. For this representation we use a restricted Boltzmann machine (RBM) as the NQS because it naturally takes binary variables as input. The second setup corresponds to our many-qudit representation of the TSP. For this representation we use a convolutional neural network (CNN) as the NQS because it naturally expresses translational symmetry, which exists in the TSP. For each of the setups, we consider a toy TSP, where cities are placed on a line with coordinates $x_n = n$. This class of TSP allows us to easily benchmark the solutions obtained because the minimum tour length is $2(N-1)$, where $N$ is the total number of cities considered in the city chain.
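The benchmark value $2(N-1)$ for cities on a line can be checked by brute force for small $N$. The following is a minimal sketch; the function names are illustrative, and the exhaustive search is only feasible for a handful of cities.

```python
from itertools import permutations

def line_tour_length(tour):
    """Closed-tour length for cities on a line at x_n = n (the label is the coordinate)."""
    return sum(abs(a - b) for a, b in zip(tour, tour[1:] + tour[:1]))

def brute_force_min(N):
    """Shortest closed tour over all orderings of N cities on a line."""
    # City 1 is fixed as the starting point: cyclic shifts have the same length.
    return min(line_tour_length((1,) + rest)
               for rest in permutations(range(2, N + 1)))

for N in range(2, 8):
    print(N, brute_force_min(N), 2 * (N - 1))
```

Any closed tour must traverse the segment between the first and the last city at least twice, which is why the minimum equals $2(N-1)$.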

[Figure 2 plot: percentage of experiments that correctly converged, for the two methods (Qubits, Qudits).]

FIG. 2. Percentage of experiments that converged to the desired solution for the many-qubit (red) versus the many-qudit (blue) representation of the TSP, as a function of the number of cities. The lines are shown to guide the eye only.

[Figure 3 plots: energy per city versus time [s], with panels for 12, 16, 19, 22, 25, 29, 34, 40, 46, 53, 61, 71, 82 and 96 cities.]

FIG. 3. Energy convergence as a function of processing time in seconds. The first two rows show the energy convergence of the best experiment for the qubit (orange) and the qudit (blue) representations of the TSP, with respect to the baseline energy, which is $2(N-1)$ for $N$ cities. The bottom panel shows the energy convergence of the best experiments for the qudit representation of the TSP for several other numbers of cities. In all plots, the shaded areas correspond to 2 standard deviations of the energy of each Metropolis-Hastings sample.

Since the VMC+NQS method has hyper-parameters, we performed 400 experiments with different values of those hyper-parameters for each of the two setups, as discussed in detail in appendix C.

Figure 2 shows the percentage of experiments that correctly converge to the ground energy as a function of the number of cities considered in the linear layout. Interestingly, we see a large drop in the percentage of experiments that converged to the expected solution around 40 cities for the qubit representation of the TSP (notice that this corresponds to a system with 1600 qubits, which is indeed a very challenging problem), whereas the drop is located around 70 cities for the qudit representation. Such a drop indicates how rapidly the TSP solvability decreases with the number of cities, exposing its computational hardness. Moreover, we hypothesise that this drop might be connected to a phase transition of the VMC+NQS algorithm when exposed to the TSP, as this behaviour has previously been seen in other algorithms for the TSP.

Another important feature of the experiments is that those that converge take less time in the many-qudit representation than in the many-qubit representation. In fig. 3 we exemplify this fact for some numbers of cities, where the convergence of the best (most accurate and fastest) experiments of the qubit (orange) and qudit (blue) representations of the TSP is shown. Not only are the best experiments of the qudit representation better than those of the qubit representation (except in the cases of 16 and 22 cities), but the difference in performance also grows as the number of cities grows.

V. CONCLUSIONS

We presented a map from the travelling salesman problem (TSP) to the problem of finding the ground state of a many-qudit system. The main feature of this proposal is that the Hilbert space of the system has size $2^{N\log_2 N}$, where $N$ is the number of cities in the TSP. This contribution is likely to provide an advantage over the conventional map of the TSP to a QUBO problem, which can then be easily mapped to an Ising spin-glass Hamiltonian. This conventional representation of the TSP has a Hilbert space of size $2^{N^2}$. Therefore, our proposal significantly reduces the size of the Hilbert space.

The main difficulty in building a quantum device able to simulate our many-qudit system is that we need to control occupation-dependent couplings between the qudits, which demands precise control of non-linear couplings. However, we validate numerically that our proposal yields correct solutions of the TSP for several configurations of cities on a line, on a classical computer with state-of-the-art ground state solvers such as variational Monte Carlo with neural quantum states as variational wavefunctions.

An interesting perspective is that even though the many-qudit representation is more succinct than the QUBO representation (this is seen from the Hilbert space size, $2^{N\log_2 N} = N^N$), it is not the most compact way of encoding all the possible tours. The number of possible tours is of the order of $N! \xrightarrow{N\to\infty} N^N e^{-N} \ll N^N$. Thus, a natural question is: which quantum system can support $N!$ states so that the tour configurations can be mapped one-to-one to these states?

Also, although the aim of this paper is not to provide a classical algorithm competitive with the best algorithms for finding TSP solutions, several network architectures can be tested to provide faster solutions. A promising candidate is the class of transformer-like architectures, which have proven to yield interesting results on the TSP as well as on quantum annealing setups to find the ground state of random Ising spin-glasses. Furthermore, TSP solvers based on ground state finding can be integrated into meta-heuristic solvers, to solve smaller TSP problems with accuracy.

Finally, we highlight that it remains a challenge to study the induced quantum correlations in the many-qudit system (or the Ising spin-glass corresponding to the QUBO representation of the TSP), as it is not clear how these might positively or negatively affect the solution of the TSP. Furthermore, in realistic quantum devices, the impact of dissipation on the solution quality of the TSP might be an interesting phenomenon to take into account, especially with dissipative channels such as qubit dephasing or phonon-assisted tunnelling, which are excitation-preserving and thus maintain a valid tour configuration.

CODE AVAILABILITY

We provide an open-source library to build a Hamiltonian using both the QUBO and the many-qudit representations of the TSP. The library finds the ground state of the respective Hamiltonian, which coincides with the TSP solution. The library, called Hamiltonian Travelling Salesman Problem (htsp), is available at https://gitlab.com/ml-physics-unal/htsp .

* [email protected]

C. H. Papadimitriou and K. Steiglitz, Combinatorial optimization: algorithms and complexity (Courier Corporation, 1998).
J. K. Lenstra and A. H. G. R. Kan, Journal of the Operational Research Society, 717 (1975), https://doi.org/10.1057/jors.1975.151.
R. A. Palhares and M. C. B. Araújo, in (2018) pp. 1421–1425.
S. Chatterjee, C. Carrera, and L. A. Lynch, European Journal of Operational Research, 490 (1996).
R. G. Bland and D. F. Shallcross, Operations Research Letters, 125 (1989).
N. Agatz, P. Bouman, and M. Schmidt, Transportation Science, 965 (2018).
"The traveling salesman problem," in Combinatorial Optimization: Theory and Algorithms (Springer Berlin Heidelberg, Berlin, Heidelberg, 2008) pp. 527–562.
D. B. Fogel, Biological Cybernetics, 139 (1988).
T. Kohonen, Biological Cybernetics, 59 (1982).
A. Bertagnon and M. Gavanelli, Proceedings of the AAAI Conference on Artificial Intelligence, 1412 (2020).
H. Ushijima-Mwesigwa, R. Shaydulin, C. F. A. Negre, S. M. Mniszewski, Y. Alexeev, and I. Safro, ACM Transactions on Quantum Computing (2021), 10.1145/3425607.
T. Albash and D. A. Lidar, Reviews of Modern Physics, 015002 (2018).
A. Papageorgiou and J. F. Traub, Phys. Rev. A, 022316 (2013).
K. Karimi, N. G. Dickson, F. Hamze, M. H. Amin, M. Drew-Brook, F. A. Chudak, P. I. Bunyk, W. G. Macready, and G. Rose, Quantum Information Processing, 77 (2012).
F. Barahona, Journal of Physics A: Mathematical and General, 3241 (1982).
J. Roland and N. J. Cerf, Physical Review A, 042308 (2002).
V. N. Smelyanskiy, E. G. Rieffel, S. I. Knysh, C. P. Williams, M. W. Johnson, M. C. Thom, W. G. Macready, and K. L. Pudenz, arXiv preprint arXiv:1204.2821 (2012).
K. Someya, R. Ono, and T. Kawahara, in (IEEE, 2016) pp. 1–4.
A. Minamisawa, R. Iimura, and T. Kawahara, in (IEEE, 2019) pp. 670–673.
J. Hertz, A. Krogh, and R. G. Palmer, Santa Fe Institute Studies in the Sciences of Complexity; Lecture Notes (1991).
M. A. Kastner, Proceedings of the IEEE, 1765 (2005).
R. H. Warren, Quantum Information Processing, 1781 (2013).
A. Lucas, Frontiers in Physics, 5 (2014).
H. Lin, J. Gubernatis, H. Gould, and J. Tobochnik, Computers in Physics, 400 (1993).
U. Schollwöck, Annals of Physics, 96 (2011).
A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, Nature, 242 (2017).
M. Yamaoka, C. Yoshimura, M. Hayashi, T. Okuyama, H. Aoki, and H. Mizuno, in (IEEE, 2015) pp. 1–3.
R. H. Warren, SN Applied Sciences, 1 (2020).
R. H. Warren, Journal of Advances in Applied Mathematics (2017).
D-Wave Systems Inc., Getting Started with the D-Wave System, D-Wave Systems Inc. (2017).
H. Chen, X. Kong, B. Chong, G. Qin, X. Zhou, X. Peng, and J. Du, Physical Review A, 032314 (2011).
Y. Shi, A. R. Castelli, I. Joseph, V. Geyko, F. R. Graziani, S. B. Libby, J. B. Parker, Y. J. Rosen, and J. L. DuBois, arXiv preprint arXiv:2004.06885 (2020).
Y. Wang, Z. Hu, B. C. Sanders, and S. Kais, Frontiers in Physics, 479 (2020).
X. Wu, S. Tomarken, N. A. Petersson, L. Martinez, Y. J. Rosen, and J. L. DuBois, Physical Review Letters, 170502 (2020).
P. Imany, J. A. Jaramillo-Villegas, M. S. Alshaykh, J. M. Lukens, O. D. Odele, A. J. Moore, D. E. Leaird, M. Qi, and A. M. Weiner, npj Quantum Information, 1 (2019).
N. Stroev and N. G. Berloff, Phys. Rev. Lett., 050504 (2021).
G. Carleo and M. Troyer, Science, 602 (2017).
It does not matter which city is the first one to be visited; the tour length will be the same if we shift the positions of the tour.
I. P. Gent and T. Walsh, Artificial Intelligence, 349 (1996).
Note that the experiments shown in fig. 3 are those that are in a local minimum of the hyper-parameter space.
C. K. Joshi, Q. Cappart, L.-M. Rousseau, T. Laurent, and X. Bresson, arXiv preprint arXiv:2006.07054 (2020).
M. Hibat-Allah, E. M. Inack, R. Wiersema, R. G. Melko, and J. Carrasquilla, arXiv preprint arXiv:2101.10154 (2021).
B. McNaughton, M. Milošević, A. Perali, and S. Pilati, Physical Review E, 053312 (2020).
B. Berghoff, S. Suckow, R. Rölver, B. Spangenberg, H. Kurz, A. Dimyati, and J. Mayer, Applied Physics Letters, 132111 (2008).
J. Kempe, A. Kitaev, and O. Regev, SIAM Journal on Computing, 1070 (2006).
M. Le Bellac, Quantum physics (Cambridge University Press, 2011).
F. Becca and S. Sorella, Quantum Monte Carlo approaches for correlated systems (Cambridge University Press, 2017).
F. P. Laussy, E. del Valle, and C. Tejedor, Phys. Rev. Lett., 083601 (2008).
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, The Journal of Chemical Physics, 1087 (1953).
L. Yang, Z. Leng, G. Yu, A. Patel, W.-J. Hu, and H. Pu, Physical Review Research, 012039 (2020).
Y. Nomura, A. S. Darmawan, Y. Yamaji, and M. Imada, Phys. Rev. B, 205152 (2017).
Y. Nomura, Journal of Physics: Condensed Matter (2021).
V. Vargas-Calderón, H. Vinck-Posada, and F. A. González, Journal of the Physical Society of Japan, 094002 (2020).
There exists a one-to-one correspondence between valid tours in the qubit representation and the qudit representation, namely, a one-hot encoding of the qudit representation yields the qubit representation.
D. P. Kingma and J. Ba, arXiv preprint arXiv:1412.6980 (2014).
I. Hen, J. Job, T. Albash, T. F. Rønnow, M. Troyer, and D. A. Lidar, Physical Review A, 042325 (2015).
The particular layout of our cities-in-a-line problem induces another symmetry resulting in a degeneracy of the ground state, which we do not take into account explicitly by either of the variational wavefunctions. The degeneracy, considering 4 cities, is seen by checking that the tour 1 → → → → → → → → → → → →
T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (2019) pp. 2623–2631.
J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl, in , Vol. 24 (Neural Information Processing Systems Foundation, 2011).
K. Jamieson and A. Talwalkar, in Artificial Intelligence and Statistics (PMLR, 2016) pp. 240–248.

Appendix A: Variational Monte Carlo

In general, the quantum many-body wave function of a physical system can be written as $|\Psi\rangle = \sum_{n_1, n_2, \ldots, n_N} \psi(n_1, n_2, \ldots, n_N)\, |n_1, n_2, \ldots, n_N\rangle \equiv \sum_{\boldsymbol{n}} \psi(\boldsymbol{n})\, |\boldsymbol{n}\rangle$, where $\boldsymbol{n}$ is a set of fermionic or bosonic degrees of freedom. One is usually interested in the ground state, which is a particular $|\Psi\rangle$ (i.e. a particular set of coefficients $\psi(\boldsymbol{n})$) that minimises the expected value of the system's Hamiltonian. Finding the ground state is a QMA problem that becomes exponentially hard with the number of degrees of freedom. Variational Monte Carlo (VMC) is a method that tries to solve this problem by extending the well-known variational method in quantum mechanics to quantum mechanical systems with intractable Hilbert spaces. VMC considers a variational wave function with a set of variational parameters $\theta$, meaning that the coefficients $\psi(\boldsymbol{n})$ are parameterised, i.e. $\psi_\theta(\boldsymbol{n})$. Then, as in the variational method, we minimise the expected value of the Hamiltonian $\langle \Psi_\theta | \hat{H} | \Psi_\theta \rangle / \langle \Psi_\theta | \Psi_\theta \rangle$ with respect to the variational parameters $\theta$. However, this expectation value is practically impossible to compute, so VMC provides a way to approximate it. By using the completeness relation $\sum_{\boldsymbol{n}} |\boldsymbol{n}\rangle \langle \boldsymbol{n}| = \hat{1}$,

$$\langle \hat{H} \rangle_{\Psi_\theta} = \frac{\sum_{\boldsymbol{n}, \boldsymbol{n}'} \psi_\theta^*(\boldsymbol{n})\, \langle \boldsymbol{n} | \hat{H} | \boldsymbol{n}' \rangle\, \psi_\theta(\boldsymbol{n}')}{\sum_{\boldsymbol{n}} |\psi_\theta(\boldsymbol{n})|^2}. \tag{A1}$$

Multiplying the addends in the numerator by $\psi_\theta(\boldsymbol{n}) / \psi_\theta(\boldsymbol{n})$ yields

$$\langle \hat{H} \rangle_{\Psi_\theta} = \frac{\sum_{\boldsymbol{n}, \boldsymbol{n}'} |\psi_\theta(\boldsymbol{n})|^2\, \langle \boldsymbol{n} | \hat{H} | \boldsymbol{n}' \rangle\, \frac{\psi_\theta(\boldsymbol{n}')}{\psi_\theta(\boldsymbol{n})}}{\sum_{\boldsymbol{n}} |\psi_\theta(\boldsymbol{n})|^2}. \tag{A2}$$

The term $|\psi_\theta(\boldsymbol{n})|^2 / \sum_{\boldsymbol{n}} |\psi_\theta(\boldsymbol{n})|^2$ is the probability $P_\theta(\boldsymbol{n})$ of the configuration $\boldsymbol{n}$, which displays the expected value in eq. (A2) as an expectation value of a random variable, i.e. it has the form $\langle \hat{H} \rangle_{\Psi_\theta} = \sum_{\boldsymbol{n}} P_\theta(\boldsymbol{n})\, f_\theta(\hat{H}, \boldsymbol{n})$, which can be approximated by considering only a subset of the configurations $\boldsymbol{n}$ in a sample $\mathcal{M}$. Thus, we have the approximate expectation value of $\hat{H}$ via

$$\langle \hat{H} \rangle \approx \sum_{\boldsymbol{n} \in \mathcal{M}} \sum_{\boldsymbol{n}'} P_\theta(\boldsymbol{n})\, \langle \boldsymbol{n} | \hat{H} | \boldsymbol{n}' \rangle\, \frac{\psi_\theta(\boldsymbol{n}')}{\psi_\theta(\boldsymbol{n})}, \tag{A3}$$

which is a good approximation as long as $\sum_{\boldsymbol{n} \in \mathcal{M}} P_\theta(\boldsymbol{n}) \approx 1$. The sum over $\boldsymbol{n}'$ in eq. (A3) can be performed exactly because $\hat{H}$ is usually a sparse operator, meaning that for fixed $\boldsymbol{n}$, $\langle \boldsymbol{n} | \hat{H} | \boldsymbol{n}' \rangle = 0$ for almost every $|\boldsymbol{n}'\rangle$.

Note that VMC is effectively truncating the Hilbert space basis, which is why any method that builds samples $\mathcal{M}$ can be seen as an automatic truncation algorithm. Truncation is necessary most of the time, even with seemingly simple physical systems such as a qubit interacting with a light mode. In such a system, we can order the states by the number of excitations in the system and truncate the tower of states at a given number of excitations where the ground-state (or even steady-state for open quantum systems) calculation converges. However, in general, it is not straightforward to order the basis with a simple criterion. For this reason, VMC becomes a useful tool, especially for intractable Hilbert spaces, allowing us to discard states that are not relevant for the description of the quantum mechanical system at hand.

In this work we use the Metropolis-Hastings (MH) algorithm to prepare the sample $\mathcal{M}$ as follows. In the first MH iteration we propose an initial state $\boldsymbol{n}_1$. In the $i$-th MH iteration, we propose a new state $\boldsymbol{n}'_i$ from $\boldsymbol{n}_i$ using some update rule. We accept the new state with probability $\left| \psi_\theta(\boldsymbol{n}'_i) / \psi_\theta(\boldsymbol{n}_i) \right|^2$ (capped at 1). If we accept the new state, then $\boldsymbol{n}_{i+1} \leftarrow \boldsymbol{n}'_i$, else $\boldsymbol{n}_{i+1} \leftarrow \boldsymbol{n}_i$. We stop iterating after a fixed number of iterations.

Appendix B: Neural Quantum States

A neural quantum state | Ψ θ (cid:105) deﬁnes a wavefunction through the coeﬃcients { ψ θ ( n ) } that result from the evaluationof a neural network with inputs n and parameters θ . In other words, a quantum state is encoded into a neural network,and the wavefunction coeﬃcient corresponding to the city conﬁguration, or city tour n (in the case of the many-quditsystem with the Hamiltonian given by eq. (4)) is obtained by feeding the tour to the neural network. In this work,we consider two diﬀerent neural networks: one for the qubit representation of the TSP using a spin-glass model, andanother one for the many-qudit representation. For the qubit representation, we will use a binary variable neuralnetwork that has shown outstanding results in many-body problems, namely the Restricted Boltzmann Machine (RBM), and for the qudit representation, we will use a convolutional neural network (CNN). a) b) FIG. 4. Neural quantum states for a) the qubit and b) the qudit representation of the TSP, depicted for 4 cities. The black linewith 4 nodes represents, in both panels, a tour conﬁguration of 4 cities. a) shows an RBM that uses 4 qubits, which matchthe visible neurons of the RBM, and has 3 hidden neurons. b) shows a CNN whose output is sum-reduced and fed into a fullyconnected layer . In both cases, the neural network parameters are complex numbers. As in many relevant physical scenarios, tours are unmodiﬁed by translational symmetry due to periodic boundaryconditions: ˆ T | n , n , . . . , n N (cid:105) = | n , . . . n N , n (cid:105) = e id | n , n , . . . , n N (cid:105) . This also applies to the qubit representation ofthe TSP . This is why a natural choice of neural network for the qudit representation of the TSP is a 1-dimensionalCNN with periodic padding. As per the qubit representation, translational symmetries can be also imposed toRBMs .The speciﬁc architectures of both the RBM and the CNN are shown in ﬁg. 4. 
The coefficients $\psi_{\theta_{\mathrm{RBM}}}(\boldsymbol{\sigma})$ for the qubit representation ($\boldsymbol{\sigma}$ is a vector of $N$ qubit values where $\sigma_i \in \{-1, 1\}$) are directly given by the expression

$$\psi_{\theta_{\mathrm{RBM}}}(\boldsymbol{\sigma}) = e^{\sum_j a_j \sigma_j} \prod_{\ell=1}^{N_H} \left( b_\ell + \sum_j W_{\ell,j} \sigma_j \right), \tag{B1}$$

where $\{ \mathbf{a}, \mathbf{b}, W \}$ are the complex-valued visible bias, hidden bias and connection matrix of the RBM, respectively, and $N_H$ is a hyper-parameter called the number of hidden neurons. On the other hand, the coefficients of the CNN are determined by a 1-dimensional convolutional layer whose output is a matrix with as many rows as cities in the TSP, and as many columns as channels of the convolutional layer (i.e. the number of filters to be applied). More specifically, the output of the convolutional layer is

$$O_{i,f} = g\left( \sum_{k=1}^{K} W_{k,f} \, n_{(i+k) \bmod N} + b_f \right), \tag{B2}$$

where there are a total of $F$ channels, $g$ is the so-called activation function, which we take to be a rectified-linear unit ($g(x) = \max\{0, x\}$), $K$ is the kernel size of each filter $f$, and $W$ and $\mathbf{b}$ are the filter matrix and the bias vector of the convolutional layer. Then, a vector $\mathbf{o}$ is obtained through $o_f = \sum_i O_{i,f}$. This vector is the input to a fully-connected layer with one output neuron, which returns $\log(\psi_{\theta_{\mathrm{CNN}}}(\mathbf{n}))$.

NQSs tend to induce complicated non-linear dependencies between the parameters $\theta$ and the coefficients $\psi_\theta(\mathbf{n})$, which is why techniques based on stochastic gradient descent or stochastic reconfiguration are needed to minimise the Hamiltonian expectation value. In particular, we use the Adam optimiser. This makes VMC an iterative method, where on each Monte Carlo (MC) step a sample is built through MH, and the parameters $\theta$ are updated. Thus, VMC allows us to navigate the Hilbert space, taking into account only the states that have high probability. Each MC step will therefore sample a portion of the Hilbert space of the physical system, and will update the parameters to reduce the Hamiltonian expectation value.
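To make eq. (B2) and the subsequent sum-reduction concrete, the following is a minimal NumPy sketch of the CNN log-amplitude. It is an illustration under stated assumptions, not the implementation used in this work: the parameters are real-valued for simplicity (the text uses complex parameters), sites are indexed from 0, and the function name `cnn_log_psi` as well as the dense-layer weights `v` and bias `c` are hypothetical names introduced here.

```python
import numpy as np

def cnn_log_psi(tour, W, b, v, c):
    """Sketch of log(psi_CNN(n)): periodic 1D convolution (eq. B2),
    ReLU, sum over sites, then a dense layer with one output neuron.

    tour : length-N sequence of city labels n_i
    W    : (K, F) filter matrix, b : (F,) bias vector
    v, c : weights and bias of the dense layer (hypothetical names)
    """
    n = np.asarray(tour, dtype=float)
    N = n.shape[0]
    K, F = W.shape
    # idx[i, k-1] = (i + k) mod N for k = 1..K, i.e. periodic padding
    idx = (np.arange(N)[:, None] + np.arange(1, K + 1)[None, :]) % N
    windows = n[idx]                       # shape (N, K)
    O = np.maximum(0.0, windows @ W + b)   # O_{i,f}, shape (N, F)
    o = O.sum(axis=0)                      # o_f = sum_i O_{i,f}
    return float(o @ v + c)

# Small usage example with random (real) parameters.
rng = np.random.default_rng(0)
N, K, F = 6, 3, 4
W = rng.normal(size=(K, F))
b = rng.normal(size=F)
v = rng.normal(size=F)
c = 0.1
tour = rng.permutation(N) + 1              # a tour over cities 1..N
shifted = np.roll(tour, 2)                 # the same tour, cyclically translated
print(cnn_log_psi(tour, W, b, v, c), cnn_log_psi(shifted, W, b, v, c))
```

Because the convolution wraps around and the site index is summed out, the two printed values should agree up to floating-point rounding. This is the translational invariance that motivates periodic padding.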
The VMC algorithm converges, after repeating MC steps a certain number of times, to a local minimum of the energy, which has been empirically shown to coincide with the global minimum of the energy in many VMC+NQS studies.

Appendix C: Experimental Setup

We construct a TSP instance that allows us to “plant solutions”, which means that there is, by construction, a known ground-state configuration of eq. (4). This is useful for benchmarking purposes, as we do not need to use an exact solver to find the correct solution of the TSP instance. We set $N$ cities to be on a straight line. Each city $i$ has a coordinate $x_i = N$. Without loss of generality, we can set the first city in the line at $x = 1$ to be the first city in the tour, as this only restricts the salesman to be in a (translational) symmetry sector of the TSP. A solution to the TSP of $N$ cities in this setup is straightforward to obtain: $(1, 2, \ldots, N)$ is a tour that solves the TSP.

The RBM and CNN variational wavefunctions, as well as the VMC, MH and Adam algorithms, possess some hyper-parameters which we examine thoroughly. In particular, we have the following hyper-parameters:

• RBM. i) number of hidden units $N_H$.
• CNN. i) number of channels $F$; ii) kernel size of the filter $K$.
• MH. i) number of Markov chains $N_{\mathrm{MC}}$, which indicates the number of parallel MH processes on a single MC step; ii) number of city-swaps $N_S$, which indicates the number of swaps of the MH update rule, meaning that two sites are picked at random from a state $\mathbf{n}$ and then are swapped; iii) maximum length of swap $\ell_S$, which means that in the update rule one site is chosen at random, and the other is also chosen at random but must be at most $\ell_S$ sites away from the first.
• MC. i) sample size $S = |\mathcal{M}|$.
• Adam. i) learning rate $\alpha$, which controls the amount of change in the neural network’s parameters for every MC step.

For benchmarking purposes, each Markov chain is initialised in a state built as follows. Using the distance matrix of the cities, take one city at random. Then, pick the city farthest from the first one, and visit it. Repeat until all cities have been visited.
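The initialisation and the swap-based MH update rule described above can be sketched as follows. This is a hedged illustration, not the code used in this work, and it rests on several assumptions: cities sit at integer coordinates $x_i = i$ on the line, "farthest" is measured from the most recently visited city, swap distances are measured periodically along the tour, the first city is not held fixed, the acceptance probability is the standard Metropolis form $\min(1, |\psi(\mathbf{n}')/\psi(\mathbf{n})|^2)$, and the amplitude is a toy stand-in $\log \psi(\mathbf{n}) = -\beta L(\mathbf{n})/2$ (with $L$ the tour length) in place of a trained NQS. All function names and `beta` are hypothetical.

```python
import numpy as np

def tour_length(tour, x):
    """Length of the closed tour over cities with 1D coordinates x."""
    pos = x[tour]
    return float(np.abs(pos - np.roll(pos, -1)).sum())

def farthest_city_init(x, rng):
    """Random first city, then always visit the farthest unvisited city
    (measured from the last visited city: an assumed reading of the text)."""
    n_cities = len(x)
    current = int(rng.integers(n_cities))
    tour, unvisited = [current], set(range(n_cities)) - {current}
    while unvisited:
        # ties are broken by the larger city index
        current = max(unvisited, key=lambda c: (abs(x[c] - x[current]), c))
        tour.append(current)
        unvisited.remove(current)
    return np.array(tour)

def propose_swaps(tour, n_swaps, max_len, rng):
    """Apply n_swaps swaps; each picks a site at random and a partner
    at most max_len sites away (periodic distance, an assumption)."""
    new = tour.copy()
    n_sites = len(new)
    for _ in range(n_swaps):
        i = int(rng.integers(n_sites))
        offset = int(rng.integers(1, max_len + 1)) * int(rng.choice([-1, 1]))
        j = (i + offset) % n_sites
        new[i], new[j] = new[j], new[i]
    return new

def mh_chain(log_psi, tour, n_iters, n_swaps, max_len, rng):
    """Metropolis-Hastings chain; log_psi returns the real log-amplitude."""
    samples = []
    for _ in range(n_iters):
        cand = propose_swaps(tour, n_swaps, max_len, rng)
        log_ratio = 2.0 * (log_psi(cand) - log_psi(tour))  # log |psi'/psi|^2
        if rng.random() < np.exp(min(0.0, log_ratio)):
            tour = cand
        samples.append(tour.copy())
    return samples

rng = np.random.default_rng(0)
N = 8
x = np.arange(1, N + 1, dtype=float)   # assumed coordinates x_i = i
planted = np.arange(N)                 # tour (1, 2, ..., N) in 0-based labels
beta = 1.0                             # hypothetical toy-amplitude parameter
log_psi = lambda t: -0.5 * beta * tour_length(t, x)

start = farthest_city_init(x, rng)
samples = mh_chain(log_psi, start, n_iters=2000, n_swaps=1, max_len=2, rng=rng)
best = min(samples, key=lambda t: tour_length(t, x))
print(tour_length(start, x), tour_length(best, x), tour_length(planted, x))
```

The printed values compare the length of the deliberately poor initial tour, the shortest tour visited by the chain, and the planted tour, whose length $2(N-1)$ is the minimum for cities on a line under the assumed coordinates.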
We use this initialisation method to have a reproducible tour that is certainly not the shortest tour, and that allows us to benchmark the VMC+NQS technique.

For a chosen number of cities $N$, we comprehensively study the hyper-parameters of our method by performing 400 experiments with different values. The hyper-parameter values for each of the 400 experiments were proposed by the Optuna optimiser, which uses sampling and pruning strategies such as the tree-structured Parzen estimator and the asynchronous successive halving method60