An end-to-end trainable hybrid classical-quantum classifier
Samuel Yen-Chi Chen, Chih-Min Huang, Chia-Wei Hsing, Ying-Jer Kao
Samuel Yen-Chi Chen
E-mail: [email protected]
Computational Science Initiative, Brookhaven National Laboratory
Chih-Min Huang
E-mail: [email protected]
Department of Physics, National Taiwan University, Taipei 10617, Taiwan
Chia-Wei Hsing
E-mail: [email protected]
Department of Physics, National Taiwan University, Taipei 10617, Taiwan
Ying-Jer Kao
E-mail: [email protected]
Department of Physics, National Taiwan University, Taipei 10617, Taiwan
Abstract.
We introduce a hybrid model combining a quantum-inspired tensor network and a variational quantum circuit to perform supervised learning tasks. This architecture allows for the classical and quantum parts of the model to be trained simultaneously, providing an end-to-end training framework. We show that, compared to principal component analysis, a tensor network based on the matrix product state with low bond dimensions performs better as a feature extractor for the input data of the variational quantum circuit in the binary and ternary classification of the MNIST and Fashion-MNIST datasets. The architecture is highly adaptable, and the classical-quantum boundary can be adjusted according to the availability of quantum resources by exploiting the correspondence between tensor networks and quantum circuits.
Submitted to:
Machine Learning: Science and Technology
1. Introduction
Quantum computing (QC) has demonstrated superiority in problems intractable on classical computers [1, 2], such as factorization of large numbers [3] and search in an unstructured database [4]. Recent growth of the quantum volume in noisy intermediate-scale quantum (NISQ) [5] devices has stimulated rapid development in circuit-based quantum algorithms. Due to the noise associated with the quantum gates and the lack of quantum error correction on NISQ devices, performing quantum computation with large circuit depth is currently impossible. It is therefore highly desirable to develop quantum algorithms that are resilient to noise with moderate circuit depth. Variational quantum algorithms [6] are a class of algorithms currently under rapid development in many fields. In particular, quantum machine learning (QML) [7, 8, 9] using variational quantum circuits (VQC) shows great potential in surpassing the performance of classical machine learning (ML). One of the major advantages of VQC-based QML compared to its classical counterpart is the drastic reduction in the number of model parameters, potentially mitigating the problem of overfitting common in classical ML. Moreover, it has been shown that under certain conditions, QML models may learn faster or achieve higher testing accuracies than their classical counterparts [10, 11]. A modern QML architecture typically includes a classical and a quantum part. Famous examples in this hybrid genre include the quantum approximate optimization algorithm [12] and quantum circuit learning [13], where the VQC plays a crucial role as the quantum component with the circuit parameters updated via a classical computer. Various architectures and geometries of VQC have been suggested for tasks ranging from binary classification [11, 14, 15, 13] to reinforcement learning [16, 17, 18].

One of the key challenges in the NISQ era is that available quantum hardware has limited quantum volume and is only capable of executing quantum operations with small circuit depth. This means that most of the datasets commonly used for classical ML tasks are too large for NISQ devices. To process data with input dimension exceeding the number of available qubits, it is necessary to apply dimensional reduction techniques to first compress the input data. For example, in Ref. [19], a pre-trained classical deep convolutional neural network is used to compress high-resolution images into a low-dimensional representation. However, since the pre-trained model there has a huge number of parameters, it is not clear what the contribution of the quantum circuit is to the whole workload.

On the other hand, a major challenge in building a QML model is how to encode high-dimensional classical data into a quantum circuit efficiently. With the limitations imposed by NISQ devices in mind, the encoding process should be designed to consume as few gate operations as possible.
Amplitude encoding is one of the encoding methods that can provide a significant advantage in terms of the number of qubits required to handle the input data. For an N-dimensional vector, amplitude encoding requires only log N qubits; however, the quantum circuit depth to prepare such an encoded state exceeds the current limits of NISQ devices. Other approaches like single-qubit rotations require only a shallow circuit, but it is unclear how to employ such encoding schemes to load high-dimensional data into a quantum circuit. This can be potentially mitigated by preprocessing the input data with classical methods to perform dimension reduction. Principal component analysis (PCA) is a simple dimension reduction method and has been widely used in QML research; yet it lacks the representation power to retain enough information. More powerful and expressive models, such as neural networks and tensor networks, can serve as better feature extractors.
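To make the qubit-count trade-off concrete, the following is a minimal sketch, assuming PennyLane's built-in embedding templates; the feature counts and device choice are illustrative, not from the paper.

```python
import math
import numpy as np
import pennylane as qml

x = np.random.rand(784)                    # a flattened 28x28 image

# Amplitude encoding: 784 values fit into ceil(log2 784) = 10 qubits, but the
# state-preparation circuit is too deep for current NISQ hardware.
n_amp = math.ceil(math.log2(len(x)))
dev_amp = qml.device("default.qubit", wires=n_amp)

@qml.qnode(dev_amp)
def amplitude_encode(features):
    qml.AmplitudeEmbedding(features, wires=range(n_amp),
                           normalize=True, pad_with=0.0)
    return qml.state()

# Rotation encoding: a single layer of one-qubit gates (shallow), but one
# qubit per feature -- 784 qubits for the raw image. Hence the classical
# compression step (PCA or MPS) down to a handful of features.
n_rot = 4
dev_rot = qml.device("default.qubit", wires=n_rot)

@qml.qnode(dev_rot)
def rotation_encode(features):
    qml.AngleEmbedding(features, wires=range(n_rot), rotation="Y")
    return qml.state()

print(amplitude_encode(x).shape)           # (1024,): 10-qubit state
print(rotation_encode(x[:n_rot]).shape)    # (16,): 4-qubit state
```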
2. Tensor Network
Tensor networks (TNs) are efficient representations of data residing in high-dimensional space. Originally developed in the context of condensed matter physics, TNs have gained attention in the deep learning community, for both theoretical understanding and computational efficiency [25]. They have provided new inspiration for machine learning algorithms and showed encouraging success in both discriminative [26, 27, 28, 29] and generative learning tasks [30]. In addition, the quantum entanglement inherent in the formulation of tensor networks points to a new direction in understanding the mechanism of deep neural networks and may provide a better way to design new network architectures [31, 26].

It is common to use graphical notation to express tensor networks. A tensor is represented as a closed shape, typically a circle, with emanating lines representing tensor indices (Fig. 1). A joined line indicates that the corresponding index is contracted, as in the Einstein convention where repeated indices are summed over. The MPS, also known as the tensor train, is the simplest TN and the most widely used tensor network in physics to study low-dimensional quantum systems, and it has recently found application in the field of machine learning [32, 33, 34, 35, 28, 36]. In an MPS, tensors are contracted through the "virtual" indices (the α's in Fig. 1(d)). The dimension of these virtual indices is called the bond dimension and is denoted by χ. In the MPS representation of a quantum wave function, the bond dimension indicates the amount of quantum entanglement the MPS can represent in the bond. In the context of ML, this corresponds to the representation power of the MPS.

In the current study, we choose the MPS as our TN for simplicity; there are other examples of TNs with distinct entanglement structures, such as the tree tensor network (TTN), the multi-scale entanglement renormalization ansatz (MERA) and the projected entangled pair state (PEPS). The successful application of a specific TN can also give insights into the hidden correlations in the data. The quantumness inherent in the TN gives it a great advantage over other architectures in the application of QML.

Figure 1. Graphical Notation for Tensors and Tensor Networks.
Graphical tensor notation for (a) a vector, (b) a matrix, (c) a rank-3 tensor and (d) an MPS. Here we follow the Einstein convention that repeated indices, represented by internal lines in the diagram, are summed over.
Figure 2. MPS as a feature extractor.
Data is encoded into a product state (blue nodes), which is contracted with an MPS (yellow nodes), and a class label is generated.

In particular, since each TN can be mapped to a quantum circuit, it is possible, although the TN is treated classically in the current scheme, to replace the whole or part of the TN component by an equivalent quantum circuit when more qubits are available. This gives the current scheme the flexibility to move the quantum-classical boundary based on the available resources.

We will use the MPS as a feature extractor to compress the input data. Following Ref. [33], we approximate a feature extractor by the MPS decomposition as

$$T^{l}_{i_1 i_2 \cdots i_N} = \sum_{\{\alpha\}} A^{(1)}_{i_1,\alpha_1} A^{(2)}_{i_2,\alpha_1 \alpha_2} A^{(3)}_{i_3,\alpha_2 \alpha_3} \cdots A^{(j),l}_{i_j,\alpha_{j-1} \alpha_{j+1}} \cdots A^{(N)}_{i_N,\alpha_{N-1}}, \qquad (1)$$

illustrated in Fig. 2.
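To make the contraction in Eq. (1) concrete, the following is a minimal, illustrative PyTorch sketch of a trainable MPS feature extractor; the class name, initialization scale and index placement are our own assumptions, not the authors' code. The centre tensor carries the extra output ("label") index, and the product-state features are swept in from both ends.

```python
import torch
import torch.nn as nn

class MPSFeatureExtractor(nn.Module):
    """Trainable MPS mapping N two-dimensional local features to out_dim
    values, in the spirit of Eq. (1): the centre tensor carries the label index."""
    def __init__(self, n_sites, bond_dim, out_dim):
        super().__init__()
        self.n, self.c = n_sites, n_sites // 2   # c: site of the output index
        tensors = []
        for k in range(n_sites):
            left = 1 if k == 0 else bond_dim
            right = 1 if k == n_sites - 1 else bond_dim
            shape = (2, left, out_dim, right) if k == self.c else (2, left, right)
            tensors.append(nn.Parameter(0.5 * torch.randn(*shape)))
        self.A = nn.ParameterList(tensors)

    def forward(self, phi):                      # phi: (batch, n_sites, 2)
        b = phi.shape[0]
        left = torch.ones(b, 1)                  # left boundary environment
        for k in range(self.c):                  # sweep in from the left
            left = torch.einsum("bl,bp,plr->br", left, phi[:, k], self.A[k])
        right = torch.ones(b, 1)                 # right boundary environment
        for k in range(self.n - 1, self.c, -1):  # sweep in from the right
            right = torch.einsum("br,bp,plr->bl", right, phi[:, k], self.A[k])
        # centre tensor: contract physical and bond indices with both
        # environments, leaving only the output ("label") index o
        return torch.einsum("bl,bp,plor,br->bo",
                            left, phi[:, self.c], self.A[self.c], right)

# usage: normalized pixels through the local feature map used later in Eq. (2)
x = torch.rand(32, 784)                          # batch of flattened images
phi = torch.stack([1 - x, x], dim=-1)            # (32, 784, 2)
feats = MPSFeatureExtractor(784, bond_dim=2, out_dim=4)(phi)   # (32, 4)
```

A production implementation would normalize the running environments (and initialize the tensors close to isometries) to avoid the numerical under- or overflow that a naive product over hundreds of sites produces; the sketch omits this for brevity.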
3. Variational Quantum Circuit
Variational quantum circuits originate from a quantum algorithm called the variational quantum eigensolver [37], originally used to compute ground states. This family of algorithms has recently drawn significant attention, and numerous efforts have been made to extend their applications [6]. VQCs have been successfully applied to function approximation [13, 10], classification [13, 15, 38, 39, 40, 19, 14, 41, 42, 11], generative modeling [43, 44, 45, 46, 47], metric learning [48, 49], deep reinforcement learning [16, 17, 50], sequential learning [10, 51], speech recognition [52] and transfer learning [19]. It has been shown that VQCs are more expressive than conventional neural networks [53, 54, 55, 56] with respect to the number of parameters or the learning speed. It has been demonstrated that with a similar number of parameters, VQC-based models outperform classical models on testing accuracies [11], and achieve optimal accuracy in function approximation tasks with fewer training epochs than their classical counterparts [10]. Of particular interest for NISQ applications, it has been shown that such circuits are potentially resilient to noise on quantum hardware [57, 12, 58], and such robustness has been demonstrated empirically on either noisy simulators or real quantum hardware [49, 16]. This strongly suggests that VQC-based architectures are suitable for building ML applications on NISQ devices.

Figure 3. Generic variational quantum circuit architecture. The VQC component used in this work consists of three parts: the encoding part, the variational part with parameters to be learned, and the measurement part, which outputs the Pauli-Z expectation values via multiple runs of the quantum circuit. The quantum measurement is performed on the first k qubits, where k is the number of classes.

The VQC used in this work consists of three parts (Fig. 3). The first part is the encoding part, which consists of the Hadamard gate H and single-qubit rotation gates $R_y(\arctan(x_i))$ and $R_z(\arctan(x_i^2))$, representing rotations along the y-axis and z-axis by the angles $\arctan(x_i)$ and $\arctan(x_i^2)$, respectively. The Hadamard gate H is used to create an unbiased initial state, as described in Appendix A. Notice that the rotation angles $\arctan(x_i)$ and $\arctan(x_i^2)$ are for state preparation and come directly from the input classical data. The data encoding part should be designed with respect to the problem of interest and plays a crucial role in the overall architecture [59]. Potential quantum advantage depends heavily on the encoding scheme together with the hardware limitations incorporated in the design. The second part is the variational part, which consists of CNOT gates used to entangle quantum states from each qubit and $R(\alpha, \beta, \gamma)$ representing the general single-qubit unitary gate with three parameters $\alpha_i$, $\beta_i$ and $\gamma_i$ to be learned. These circuit parameters can be regarded as the weights in classical neural networks. The final part is the measurement part, which outputs the Pauli-Z expectation values via multiple runs of the quantum circuit. The retrieved values (logits) then go through classical processing, such as a softmax, to generate the probability of each possible class. The quantum measurement is performed on the first k qubits, where k is the number of classes.
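As an illustration of the three-part circuit just described, here is a hedged PennyLane sketch (our own reading of Fig. 3; the qubit count, layer count and chain-like entangling pattern are assumptions, not the authors' released code):

```python
import numpy as np
import pennylane as qml

n_qubits, n_classes, n_layers = 4, 3, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def vqc(x, weights):
    # --- encoding part ---
    for i in range(n_qubits):
        qml.Hadamard(wires=i)                    # unbiased initial state
        qml.RY(np.arctan(x[i]), wires=i)
        qml.RZ(np.arctan(x[i] ** 2), wires=i)
    # --- variational part ---
    for layer in range(n_layers):
        for i in range(n_qubits - 1):
            qml.CNOT(wires=[i, i + 1])           # entangle neighbouring qubits
        for i in range(n_qubits):
            a, b, g = weights[layer, i]
            qml.Rot(a, b, g, wires=i)            # general single-qubit rotation
    # --- measurement part: logits from the first k qubits ---
    return [qml.expval(qml.PauliZ(i)) for i in range(n_classes)]

weights = np.random.uniform(0, 2 * np.pi, (n_layers, n_qubits, 3))
logits = vqc(np.random.rand(n_qubits), weights)  # later fed to a softmax
```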
4. Hybrid TN-VQC Architecture
Figure 4 shows the architecture of the hybrid TN-VQC model. The input image of N = 28 × 28 = 784 pixels from MNIST or Fashion-MNIST is flattened into a 784-dimensional vector $x = (x_1, x_2, \ldots, x_N)$, and each component is normalized such that $x_i \in [0, 1]$. Each image is then mapped to a product state

$$x \to |\Phi(x)\rangle = \begin{bmatrix} 1 - x_1 \\ x_1 \end{bmatrix} \otimes \begin{bmatrix} 1 - x_2 \\ x_2 \end{bmatrix} \otimes \cdots \otimes \begin{bmatrix} 1 - x_N \\ x_N \end{bmatrix}, \qquad (2)$$

and further processed by the MPS to generate a compressed representation. The feature vector is then encoded into the quantum circuit using the variational encoding (see Appendix A).
Figure 4. Hybrid TN-VQC Architecture.
The tunable parameters labeled with $\theta_1$ and $\theta_2$ are the parameters for the MPS and VQC, respectively. See text for details.

At the end of the VQC, the quantum measurement is performed to generate the logits for classification. Both the TN and VQC have tunable parameters, labeled as $\theta_1$ and $\theta_2$ respectively in Fig. 4, which are optimized via gradient-descent methods. Gradients of the quantum circuit parameters are calculated using the parameter-shift method (see Appendix B), which avoids the use of finite-difference calculations. This method is similar to the computation of gradients in neural networks; therefore, the end-to-end training of this TN-VQC model follows the standard backpropagation method as in the training of deep neural networks, and no pre-trained classical model is needed.

Since the classical and quantum parts of the model can be trained simultaneously, there is more flexibility in terms of implementation on the quantum hardware. When more qubits are available, one simply increases the dimension of the feature vector out of the MPS to match the input of the VQC and retrains the model. On the other hand, the modular architecture also has the advantage that the classical and quantum parts can be reused independently. For example, it is possible to perform transfer learning by freezing the parameters in the MPS/VQC and training the other part to tackle different types of problems.
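A minimal end-to-end training sketch, assuming the MPSFeatureExtractor class from the Section 2 sketch and PennyLane's TorchLayer wrapper; all names and hyperparameter values here are illustrative assumptions, not the authors' released code.

```python
import torch
import pennylane as qml

n_qubits, n_classes = 4, 3
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def circuit(inputs, weights):
    for i in range(n_qubits):
        qml.Hadamard(wires=i)                        # unbiased initial state
        qml.RY(torch.arctan(inputs[..., i]), wires=i)
        qml.RZ(torch.arctan(inputs[..., i] ** 2), wires=i)
    for i in range(n_qubits - 1):
        qml.CNOT(wires=[i, i + 1])
    for i in range(n_qubits):
        qml.Rot(weights[i, 0], weights[i, 1], weights[i, 2], wires=i)
    return [qml.expval(qml.PauliZ(i)) for i in range(n_classes)]

model = torch.nn.Sequential(
    MPSFeatureExtractor(784, bond_dim=2, out_dim=n_qubits),   # theta_1
    qml.qnn.TorchLayer(circuit, {"weights": (n_qubits, 3)}),  # theta_2
)
opt = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = torch.nn.CrossEntropyLoss()

# one end-to-end step: autograd backpropagates through the simulated VQC
# and the MPS alike (on hardware, the VQC gradients would come from the
# parameter-shift rule of Appendix B)
x = torch.rand(8, 784)                               # normalized pixels
labels = torch.randint(0, n_classes, (8,))
phi = torch.stack([1 - x, x], dim=-1)                # feature map of Eq. (2)
loss = loss_fn(model(phi), labels)                   # logits -> cross entropy
opt.zero_grad(); loss.backward(); opt.step()
```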
5. Experiments and Results
We study the capabilities of the hybrid TN-VQC architecture by performing classification tasks on the standard benchmark datasets MNIST [60] and Fashion-MNIST [61]. Results for the binary classification of MNIST have been presented in Ref. [24]. Here, we perform ternary classification for MNIST and both binary and ternary classification for Fashion-MNIST. As a baseline, we perform the same tasks on a hybrid PCA-VQC model, where the PCA part serves as the simple feature extractor and the VQC as the classifier. As a comparison, we also present results using the MPS as a classifier to demonstrate the role of the VQC in the workload. The computational tools we use for the simulation of variational quantum circuits and tensor networks are PyTorch [62], PennyLane [63] and Qulacs [64]. Details of the simulations, such as the hyperparameters and optimizers for each experiment, are summarized in Appendix C.

Figure 5. Binary classification of class 5 and 7. Results of the binary classification of Fashion-MNIST (class 5 vs 7). Solid line: training accuracy. Dashed line: test accuracy.
We perform binary classification of the Fashion-MNIST dataset (class 5 vs 7), which is a more difficult task than the binary classification of MNIST performed in Ref. [24]. The results from different models are shown in Fig. 5. For the MPS classifier and the MPS-VQC model, a bond dimension of χ = 1 is used. It is evident from both the training and test accuracy that such a bond dimension, while insufficient for the MPS classifier to yield good results, is enough for the MPS-VQC hybrid model to learn properly and reach a test accuracy around 96%. As the number of parameters of the VQC part is far fewer than that of the MPS, this suggests that our VQC possesses greater power in classification and dominates the workload. It is also clear that the MPS, compared to PCA, serves as a better feature extractor for a VQC discriminator. For a more difficult dataset, we expect that a higher bond dimension will be needed.

In the ternary classification, we consider both the MNIST (class 0, 3, 6) and Fashion-MNIST (class 5, 7, 9) datasets. The results for the MPS-VQC model and the baseline PCA-VQC model are shown in Figure 6. Since ternary classification is a more difficult task, a larger bond dimension is required for the MPS part. In terms of performance, all results show that the MPS is superior to PCA as a feature extractor. With χ = 2, MPS-VQC is able to reach a test accuracy over 98% on MNIST and 92% on Fashion-MNIST. Furthermore, the representation power of an MPS feature extractor is tunable via χ, which is an advantage absent in PCA. In the case of Fashion-MNIST, we observe a 2% increase in test accuracy as the bond dimension increases, indicating a better data compression capability due to the increased representation power of the MPS. It is clear that for more complex classification problems, the performance of PCA-VQC should further deteriorate, while that of MPS-VQC can be improved by increasing χ.

Figure 6. Ternary classification of MNIST and Fashion-MNIST. Results of the ternary classification of MNIST (class 0, 3, 6) and Fashion-MNIST (class 5, 7, 9). Solid line: training accuracy. Dashed line: test accuracy.
6. Conclusion
In this work, we present a hybrid quantum-classical classifier by integrating a quantum-inspired tensor network and a variational quantum circuit. Such a hybrid TN-VQC architecture enables researchers to build QML applications capable of dealing with higher-dimensional inputs and potentially to implement these QML models on NISQ devices with a limited number of qubits and shallow circuit depth. We further demonstrate the superiority of this framework by comparing it with the baseline study of a PCA-VQC model on ternary classification tasks on the MNIST and Fashion-MNIST datasets, as well as a binary classification task on the Fashion-MNIST dataset. One clear advantage is that the representation power of the trainable MPS feature extractor is tunable with the bond dimension of the tensors. Our results point to future applications of the hybrid TN-VQC model in different quantum machine learning scenarios and potentially its implementation on NISQ devices. Extension of this architecture to more complicated datasets such as CIFAR-10 should further test the robustness and capability of the model. We note this requires more computing resources and better optimized simulators. The proposed hybrid model can be integrated with various kinds of TNs or VQCs, given suitable encoding methods. For example, one can replace the MPS by other TNs with other entanglement structures such as TTN, MERA and PEPS, whose potential in the supervised learning context has been demonstrated [27, 29, 65]. They may also serve as good feature extractors for datasets that contain special structures and correlations. Another way to build a more sophisticated feature extractor is to stack multiple TN layers.
Appendix A. Encoding into Quantum States
In our hybrid framework, the outputs from the classical parts need to be encoded such that they can be used by the quantum circuit. A general N-qubit quantum state can be represented as

$$|\psi\rangle = \sum_{(q_1, q_2, \ldots, q_N) \in \{0,1\}^N} c_{q_1, \ldots, q_N} \, |q_1\rangle \otimes |q_2\rangle \otimes \cdots \otimes |q_N\rangle, \qquad (A.1)$$

where the $c_{q_1,\ldots,q_N}$ are complex numbers, the amplitudes of each basis state, and $q_i \in \{0, 1\}$. The squared modulus of the amplitude $c_{q_1,\ldots,q_N}$ represents the probability of the measurement resulting in $|q_1\rangle \otimes |q_2\rangle \otimes \cdots \otimes |q_N\rangle$, and the total probability sums to 1, i.e.,

$$\sum_{(q_1, q_2, \ldots, q_N) \in \{0,1\}^N} \| c_{q_1, \ldots, q_N} \|^2 = 1. \qquad (A.2)$$

In this work, we choose the variational encoding method to encode our classical data into the quantum states. The initial quantum state $|0\rangle \otimes \cdots \otimes |0\rangle$ first undergoes the $H \otimes \cdots \otimes H$ operation to create the unbiased state $|+\rangle \otimes \cdots \otimes |+\rangle$, where H is the Hadamard gate. For an n-qubit system, the corresponding unbiased initial state is

$$(H|0\rangle)^{\otimes n} = \underbrace{H|0\rangle \otimes \cdots \otimes H|0\rangle}_{n} = \underbrace{|+\rangle \otimes \cdots \otimes |+\rangle}_{n} = \left[ \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) \right]^{\otimes n} = \frac{1}{\sqrt{2^n}} \left( |0\rangle + |1\rangle \right)^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n - 1} |k\rangle. \qquad (A.3)$$

This initial quantum state first goes through the encoding part, which consists of $R_y$ and $R_z$ rotations. These rotation operations are parameterized by the input vector $\vec{x} = (x_1, x_2, \cdots, x_n)$. On the i-th qubit, with $i = 1, \ldots, n$, $R_y$ rotates the state by an angle of $\arctan(x_i)$ and $R_z$ by $\arctan(x_i^2)$. The encoded state is then processed with the variational quantum circuit with optimizable parameters, as shown in the dashed box in Fig. 3.
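A quick numerical sanity check of Eq. (A.3) (our own illustration, not part of the original appendix): applying a Hadamard to each of n qubits turns $|0\cdots0\rangle$ into the uniform superposition with amplitude $1/\sqrt{2^n}$ on every basis state.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard gate
n = 4
Hn, psi = H, np.array([1.0, 0.0])              # |0> on one qubit
for _ in range(n - 1):                         # build H^{(x)n} and |0...0>
    Hn = np.kron(Hn, H)
    psi = np.kron(psi, [1.0, 0.0])
out = Hn @ psi                                 # the unbiased state
print(np.allclose(out, np.full(2**n, 1 / np.sqrt(2**n))))   # True
```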
Appendix B. Calculation of Gradients of Quantum Functions

Here the models are trained via gradient-descent methods widely used in training deep neural networks. To calculate the gradients with respect to the parameters of the quantum circuits, we employ the parameter-shift method [76, 63, 13]. Consider the expectation value of an observable $\hat{P}$ as a quantum function

$$f(x; \theta_i) = \left\langle 0 \left| U_0^{\dagger}(x) \, U_i^{\dagger}(\theta_i) \, \hat{P} \, U_i(\theta_i) \, U_0(x) \right| 0 \right\rangle = \left\langle x \left| U_i^{\dagger}(\theta_i) \, \hat{P} \, U_i(\theta_i) \right| x \right\rangle, \qquad (B.1)$$

where $x$ is the classical input vector (e.g. the output values from the PCA or MPS parts), $U_0(x)$ is the quantum encoding routine that prepares the classical value $x$ as a quantum state, $i$ is the circuit parameter index for which the gradient is to be evaluated, and $U_i(\theta_i)$ represents the single-qubit rotation generated by the Pauli operators $X, Y, Z$. It can be shown [13] that the gradient of this quantum function $f$ with respect to the parameter $\theta_i$ is

$$\nabla_{\theta_i} f(x; \theta_i) = \frac{1}{2} \left[ f\left(x; \theta_i + \frac{\pi}{2}\right) - f\left(x; \theta_i - \frac{\pi}{2}\right) \right]. \qquad (B.2)$$
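Equation (B.2) can be checked numerically on a one-qubit example; the following PennyLane sketch (our own illustration) compares the shifted-circuit estimate with the analytic derivative of $f(\theta) = \cos\theta$.

```python
import numpy as np
import pennylane as qml

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def f(theta):
    qml.RY(theta, wires=0)
    return qml.expval(qml.PauliZ(0))     # f(theta) = cos(theta)

theta = 0.3
shift = 0.5 * (f(theta + np.pi / 2) - f(theta - np.pi / 2))   # Eq. (B.2)
print(np.allclose(shift, -np.sin(theta)))    # True: matches d(cos)/d(theta)
```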
Appendix C. Computational details

Here we summarize the optimizers and hyperparameters in our experiments.

• MPS classifier: In experiments where the MPS alone is used as a classifier, the optimizer is Adam [77] with a learning rate of 0.001 and a batch size of 100.

• PCA-VQC model: In our baseline model, PCA is used to reduce the input dimension of 28 × 28 = 784 to a four-dimensional vector and is implemented with the Python package scikit-learn [78]. For the VQC classifier, the optimizer is RMSProp [79] with the hyperparameters: learning rate = 0.01, α = 0.99 and ε = 10⁻⁸.

• MPS-VQC hybrid model: In the hybrid TN-VQC architecture, the optimizer is Adam [77] with a learning rate of 0.001.

References
[1] Harrow A W and Montanaro A 2017 Nature 549 203–209
[3] Shor P W 1999 SIAM Review 41 303–332
[4] Grover L K 1997 Physical Review Letters 79 325–328
[5] Preskill J 2018 Quantum 2 79
[6] Cerezo M et al. 2020 arXiv preprint arXiv:2012.09265
[7] Schuld M and Petruccione F 2018 Supervised Learning with Quantum Computers vol 17 (Springer)
[8] Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N and Lloyd S 2017 Nature 549 195–202
[9] Dunjko V and Briegel H J 2018 Reports on Progress in Physics 81 074001
[10] Chen S Y C, Yoo S and Fang Y L L 2020 arXiv preprint arXiv:2009.01783
[11] Chen S Y C, Wei T C, Zhang C, Yu H and Yoo S 2020 arXiv preprint arXiv:2012.12177
[12] Farhi E, Goldstone J and Gutmann S 2014 arXiv preprint arXiv:1411.4028
[13] Mitarai K, Negoro M, Kitagawa M and Fujii K 2018 Physical Review A 98 032309
[14] 2020 arXiv preprint arXiv:2006.12270
[15] Schuld M, Bocharov A, Svore K M and Wiebe N 2020 Physical Review A 101 032308
[16] Chen S Y C, Yang C H H, Qi J, Chen P Y, Ma X and Goan H S 2020 IEEE Access 8 141007–141024
[17] Lockwood O and Si M 2020 Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment vol 16 pp 245–251
[18] Wu S, Jin S, Wen D and Wang X 2020 arXiv preprint arXiv:2012.10711
[19] Mari A, Bromley T R, Izaac J, Schuld M and Killoran N 2019 arXiv preprint arXiv:1912.08278
[20] Östlund S and Rommer S 1995 Phys. Rev. Lett. 75 3537–3540 URL https://link.aps.org/doi/10.1103/PhysRevLett.75.3537
[21] Schollwöck U 2011 Annals of Physics 326 96–192
[22] Orús R 2014 Annals of Physics 349 117–158 ISSN 0003-4916
[23] Huggins W, Patil P, Mitchell B, Whaley K B and Stoudenmire E M 2019 Quantum Science and Technology 4 024001 (Preprint)
[24] Chen S Y C, Huang C M, Hsing C W and Kao Y J 2020 Hybrid quantum-classical classifier based on tensor network and variational quantum circuit (Preprint)
[25] Orús R 2019 Nature Reviews Physics 1 538–550 URL https://doi.org/10.1038/s42254-019-0086-7
[26] Levine Y, Yakira D, Cohen N and Shashua A 2018 Deep learning and quantum entanglement: fundamental connections with implications to network design International Conference on Learning Representations
[27] Stoudenmire E M 2018 Quantum Science and Technology 3 034003
[28] 2019 New Journal of Physics (Preprint)
[29] Reyes J and Stoudenmire M 2020 arXiv (Preprint)
[30] Han Z Y, Wang J, Fan H, Wang L and Zhang P 2018 Phys. Rev. X 8 031012 URL https://link.aps.org/doi/10.1103/PhysRevX.8.031012
[31] Levine Y, Sharir O, Cohen N and Shashua A 2019 Physical Review Letters 122 065301 (Preprint)
[32] Cohen N, Sharir O and Shashua A 2016 On the expressive power of deep learning: a tensor analysis Conference on Learning Theory pp 698–728
[33] Stoudenmire E M and Schwab D J 2016 Advances in Neural Information Processing Systems (Preprint) URL http://arxiv.org/abs/1605.05775
[34] Bengua J A, Phien H N and Tuan H D 2015 Optimal feature extraction and classification of tensors via matrix product state decomposition pp 669–672
[35] Novikov A, Podoprikhin D, Osokin A and Vetrov D 2015 Tensorizing neural networks (Preprint)
[36] Efthymiou S, Hidary J and Leichenauer S 2019 TensorNetwork for machine learning (Preprint)
[37] Peruzzo A, McClean J, Shadbolt P, Yung M H, Zhou X Q, Love P J, Aspuru-Guzik A and O'Brien J L 2014 Nature Communications 5 4213
[38] Havlíček V, Córcoles A D, Temme K, Harrow A W, Kandala A, Chow J M and Gambetta J M 2019 Nature 567 209–212
[39] Farhi E and Neven H 2018 arXiv preprint arXiv:1802.06002
[40] Benedetti M, Lloyd E, Sack S and Fiorentini M 2019 Quantum Science and Technology 4 043001
[41] 2020 arXiv preprint arXiv:2008.12616
[42] Sarma A, Chatterjee R, Gili K and Yu T 2019 arXiv preprint arXiv:1909.04226
[43] Dallaire-Demers P L and Killoran N 2018 Physical Review A 98 012324
[44] 2020 arXiv preprint arXiv:2010.09036
[45] Zoufal C, Lucchi A and Woerner S 2019 npj Quantum Information 5 103
[46] 2018 arXiv preprint arXiv:1807.01235
[47] Nakaji K and Yamamoto N 2020 arXiv preprint arXiv:2010.13727
[48] Lloyd S, Schuld M, Ijaz A, Izaac J and Killoran N 2020 arXiv preprint arXiv:2001.03622
[49] Nghiem N A, Chen S Y C and Wei T C 2020 arXiv preprint arXiv:2010.13186
[50] Jerbi S, Trenkwalder L M, Nautrup H P, Briegel H J and Dunjko V 2019 arXiv preprint arXiv:1910.12760
[51] Bausch J 2020 arXiv preprint arXiv:2006.14619
[52] Yang C H H, Qi J, Chen S Y C, Chen P Y, Siniscalchi S M, Ma X and Lee C H 2020 arXiv preprint arXiv:2010.13309
[53] Sim S, Johnson P D and Aspuru-Guzik A 2019 Advanced Quantum Technologies 2 1900070
[54] Physical Review X
[55] Du Y, Hsieh M H, Liu T and Tao D 2020 Phys. Rev. Research 2 033125 URL https://link.aps.org/doi/10.1103/PhysRevResearch.2.033125
[56] Abbas A, Sutter D, Zoufal C, Lucchi A, Figalli A and Woerner S 2020 arXiv preprint arXiv:2011.00027
[57] Kandala A, Mezzacapo A, Temme K, Takita M, Brink M, Chow J M and Gambetta J M 2017 Nature 549 242–246
[58] New Journal of Physics
[59] Schuld M and Petruccione F 2018 Supervised Learning with Quantum Computers (Cham: Springer International Publishing) pp 139–171 ISBN 978-3-319-96424-9 URL https://doi.org/10.1007/978-3-319-96424-9_5
[60] LeCun Y The MNIST database of handwritten digits http://yann.lecun.com/exdb/mnist/
[61] Xiao H, Rasul K and Vollgraf R 2017 Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms (Preprint)
[62] Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al. 2019 Advances in Neural Information Processing Systems pp 8026–8037
[63] Bergholm V, Izaac J, Schuld M, Gogolin C, Blank C, McKiernan K and Killoran N 2018 arXiv preprint arXiv:1811.04968
[64] Suzuki Y, Kawase Y, Masumura Y, Hiraga Y, Nakadai M, Chen J, Nakanishi K M, Mitarai K, Imai R, Tamiya S et al. 2020 arXiv preprint arXiv:2011.13524
[65] Glasser I, Pancotti N and Cirac J I 2019 From probabilistic graphical models to generalized tensor networks for supervised learning (Preprint)
[66] Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A 2015 Going deeper with convolutions Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition pp 1–9
[67] Simonyan K and Zisserman A 2015 Very deep convolutional networks for large-scale image recognition International Conference on Learning Representations
[68] Blagoveschensky P and Phan A H 2020 arXiv preprint arXiv:2005.14506
[69] Cong I, Choi S and Lukin M D 2019 Nature Physics 15 1273–1278
[70] Quantum Science and Technology
[71] 2020 arXiv preprint arXiv:2009.09423
[72] Kerenidis I, Landman J and Prakash A 2019 arXiv preprint arXiv:1911.01117
[73] Henderson M, Shakya S, Pradhan S and Cook T 2020 Quantum Machine Intelligence 2 2
[74] 2019 arXiv preprint arXiv:1911.02998
[75] Pesah A, Cerezo M, Wang S, Volkoff T, Sornborger A T and Coles P J 2020 arXiv preprint arXiv:2011.02966
[76] Schuld M, Bergholm V, Gogolin C, Izaac J and Killoran N 2019 Physical Review A 99 032331
[77] Kingma D P and Ba J 2014 arXiv preprint arXiv:1412.6980
[78] Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M and Duchesnay E 2011 Journal of Machine Learning Research 12 2825–2830
[79] Tieleman T and Hinton G 2012 Lecture 6.5 – RMSProp COURSERA: Neural Networks for Machine Learning