One qubit as a Universal Approximant

Abstract

A single-qubit circuit can approximate any bounded complex function stored in the degrees of freedom defining its quantum gates. The single-qubit approximant presented in this work is operated through a series of gates that take as their parameterization the independent variable of the target function and an additional set of adjustable parameters. The independent variable is re-uploaded in every gate while the parameters are optimized for each target function. The output state of this quantum circuit becomes more accurate as the number of re-uploadings of the independent variable increases, i.e., as more layers of gates parameterized with the independent variable are applied. In this work, we provide two different proofs of this claim, related to the Fourier series and to the Universal Approximation Theorem for Neural Networks, and we benchmark both methods against their classical counterparts. We further implement a single-qubit approximant in a real superconducting qubit device, demonstrating how the ability to describe a set of functions improves with the depth of the quantum circuit. This work shows the robustness of the re-uploading technique in Quantum Machine Learning.


One qubit as a Universal Approximant

Adrián Pérez-Salinas (1, 2), David López-Núñez (1, 3, 2), Artur García-Sáez (1, 4), P. Forn-Díaz (3, 4) and José I. Latorre (4, 5, 6)

1. Barcelona Supercomputing Center (BSC), Spain
2. Departament de Física Quàntica i Astrofísica and Institut de Ciències del Cosmos (ICCUB), Universitat de Barcelona, Spain
3. Institut de Física d'Altes Energies (IFAE), Spain
4. Qilimanjaro Quantum Tech, Barcelona, Spain
5. Center for Quantum Technologies (CQT), Singapore
6. Quantum Research Centre, Technology Innovation Institute (TII), Abu Dhabi

A single-qubit circuit can approximate any bounded complex function stored in the degrees of freedom defining its quantum gates. The single-qubit approximant is operated through a series of gates that take as their input the independent variable of the target function and an additional set of adjustable parameters. The independent variable is re-uploaded in every gate while the parameters are optimized for each target function. The result of this quantum circuit becomes more accurate as the number of re-uploadings of the independent variable increases. In this work, we provide two different proofs stating that a single-qubit circuit is a universal approximant, first by a direct casting of a series of exponentials to standard Fourier analysis and, second, by an analogous proof for quantum systems of the universal approximation theorem for neural networks. We also benchmark the performance of both methods and compare them to their classical counterparts. We further implement a single-qubit approximant in a real superconducting qubit device, demonstrating how the ability to describe a set of functions improves with the depth of the quantum circuit.

I. INTRODUCTION

A quantum computer can be viewed as a machine that receives some inputs and delivers outputs through the read-out of qubits. The design of the sequence of quantum gates forming the circuit determines the kind of processing which is performed. A fundamental question to pose is whether a quantum circuit can deliver any possible functionality and, if so, what number of qubits and depth are required to achieve a given accuracy when delivering a specific function.

This problem is reminiscent of a series of classical theorems that establish that a given function can be re-expressed as a linear combination of other functions [1–4]. In particular, in classical machine learning the Universal Approximation Theorem (UAT) [3] proves that a neural network with a unique intermediate hidden layer can converge to approximate any continuous function. The accuracy of the approximation increases with the number of neurons in the intermediate layer. It is important to notice that each of these neurons is fed with the original data of the problem. The query complexity of the process increases linearly with the number of neurons. This observation is critical to find an equivalent result in a quantum formulation in order to support progress in quantum machine learning [5–13]. Previous works have already made relevant contributions in this area [14, 15], in particular assessing the universal expressive power of quantum circuits.

In this paper, we present two independent proofs that any bounded complex function can be approximated in a convergent way by a quantum circuit acting on one qubit, constituting a single-qubit universal approximant. This demonstrates the precise representation power of a single-qubit circuit, which increases as more layers are added. The essential element of the present construction is the re-uploading of the input variables along the action of the quantum gates [reuploading-perezsalinas2020]. Thus, in analogy to neural networks, query complexity is attached to accuracy.

The first way to prove this result is to make contact with harmonic analysis. This is a natural step, as single-qubit gates are expandable in Fourier series that can be rearranged to fit existing theorems. This connection between a single-qubit quantum circuit and Fourier analysis then allows us to establish that the former can serve as a universal approximant to any function point by point, that is, in the most demanding way. The second method is analogous to the UAT using a translation into quantum circuits. A series of specific gates leads to an output state that approximates functions uniformly.

The practical way to approximate any function with a quantum circuit requires finding an explicit set of parameters to define the unitary gates. This can be accomplished in a variational way. Compared to Fourier series, this approach brings more power to a quantum computer in practice. The possibility of taking angles which are not restricted to multiples of a given basic frequency provides a larger representation capability for a quantum circuit. This is analogous to neural networks whose weights are not constrained to take specific values.

We provide numerical benchmarks to these theorems computed via classical simulation of quantum computers. Our simulated results show how an increasing query complexity can improve the accuracy of the approximation of a number of test functions. The way a single-qubit circuit can approximate any function in an experimental setup by using a superconducting qubit is explicitly illustrated. Results show how an increasing number of single-qubit gates can progressively approximate a number of test functions, up to a point where the accumulation of errors dominates the experiment.

This article is organized as follows. In Sec. II we introduce the idea of a single-qubit approximant and present a sketch of the proof. Sec. III is devoted to the numerical benchmarks to test the approximation algorithms. The experimental implementation using a superconducting qubit is described in Sec. IV. Results are presented in Sec. V. We leave conclusions for Sec. VI. There is also an appendix with detailed explanations on some aspects of this work.

II. UNIVERSALITY OF THE SINGLE-QUBIT APPROXIMANT

In this section, we propose a simple mapping between functions and qubits. Two different architectures for circuits are created to guarantee the convergence of the approximation in different degrees.

A. Set-up of the problem

The most general representation of a single-qubit quantum state stores a single complex number. That is,

$$|\psi\rangle = \sqrt{1 - f^2}\,|0\rangle + f e^{i\phi}\,|1\rangle, \tag{1}$$

with $f, \phi$ real numbers, $f \in [0, 1]$ and $\phi \in [0, 2\pi)$. Our aim is to encode a complex function within the values $(f, \phi)$ by defining them as $f: \mathbb{R}^m \to [0, 1]$ and $\phi: \mathbb{R}^m \to [0, 2\pi)$. To do so, we design a circuit $U_{f,\phi}(x)$ such that its output state approximates the desired complex function as

$$\left| f(x) e^{i\phi(x)} - \langle 1 | U_{f,\phi}(x) | 0 \rangle \right| < \varepsilon, \tag{2}$$

for arbitrary $\varepsilon > 0$. Note that building an approximation to a complex function whose modulus is bounded by 1 is enough to address any bounded complex function by a simple shifting and re-scaling. In addition, approximating a complex function includes real-valued functions by setting $\phi(x) = 0$. One can also relate real-valued functions to the modulus of the corresponding complex-valued ones.

Definition 1. The $k$-th approximating circuit is defined as

$$U^{(k),s}_{f,\phi} = \prod_{i=1}^{k} U^{s}(x, \vec\theta_i), \tag{3}$$

where $U^{s}(x, \vec\theta)$ is a fundamental gate depending on $x$ and a set of parameters $\vec\theta$, and the index $s$ stands for the model of single-qubit gate used in every case.

The models chosen in our construction will be made explicit later. The expected behavior of this definition is that the error will decrease as the number $k$ increases, that is, as the independent variable is re-uploaded more times. As we shall see, the appropriate choice of the parameters $\theta_i$, which determine the shape of $f(x)$ and $\phi(x)$, allows a systematic approximation of any functionality.

The set of parameters for a given gate, $\theta_i$, is in general made out of a set of angles, so that a more adequate but cumbersome notation is $\vec\theta_i$. The quest for the optimal set of parameters $\{\vec\theta_1, \vec\theta_2, \ldots, \vec\theta_k\}$ is driven by a loss function,

$$\{\vec\theta_1, \vec\theta_2, \ldots, \vec\theta_k\} = \operatorname{argmin}_{\theta} L(\theta; f, \phi, x), \tag{4}$$

such that

$$L \to 0 \;\Longrightarrow\; \langle 1 | U^{(k),s}_{f,\phi} | 0 \rangle \to f(x) e^{i\phi(x)}. \tag{5}$$

Therefore, the proposed quantum procedure for storing functions within the output state of a given circuit belongs to the class of hybrid quantum-classical variational algorithms. Variational algorithms are quantum algorithms whose global structure is defined, but the exact gates are not. These gates depend on some parameters to be found variationally using classical optimization methods, see Fig. 1 and [16, 17].

B. Two theorems on universality

We complete the structure of the algorithm sketched above with the design of the single-qubit gates $U^s$ mentioned in Definition 1. In the following, we present two different choices of single-qubit gates to construct quantum circuits that are able to represent arbitrary complex functions. Each one is based on a different known result from the theory of function approximation, namely Fourier series [1, 2] and the Universal Approximation Theorem (UAT) [3, 4]. The range of applicability of these proposals for quantum circuits and the conditions for universality are thus inherited from the original theorems.

Quantum Fourier series

Fourier series, as a constructive method, permit expressing a great range of target functions defined within an interval as a sum of trigonometric functions with fixed frequencies for real functions, and as a sum of imaginary exponentials $e^{i\omega x}$ in the complex space. The conditions of applicability of this theorem are very broad, since target functions are only required to be integrable with a finite number of finite discontinuities.

Theorem 1. Fourier series [1, 2, 18, 19]
Let $z$ be any function $z: \mathbb{R} \to \mathbb{C}$ with a finite number of finite discontinuities, integrable within an interval $[a, b] \subset \mathbb{R}$ of length $P$. Then

$$z_N(x) = \sum_{n=-N}^{N} c_n\, e^{i \frac{2\pi n x}{P}}, \tag{6}$$

where

$$c_n = \frac{1}{P} \int_{P} z(x)\, e^{-i \frac{2\pi n x}{P}}\, dx, \tag{7}$$

approximates $z(x)$ as

$$\lim_{N \to \infty} z_N(x) = z(x). \tag{8}$$

FIG. 1. Scheme for the hybrid algorithm. The gates $U(x, \vec\theta_i)$ define the operation to be performed by the quantum circuit. All $\vec\theta_i$ are independent of each other. A loss function $L$ is constructed from measurements of the output state. A classical optimizer looks for the set of parameters minimizing $L$.

Now we present an extension of Fourier series to a quantum circuit as made explicit in Def. 1. First, we define the Fourier gate.

Definition 2. Let the fundamental Fourier gate $U^F$ be

$$U^F(x; \underbrace{\omega, \alpha, \beta, \varphi, \lambda}_{\vec\theta}) = R_z(\alpha + \beta)\, R_y(2\lambda)\, R_z(\alpha - \beta)\, R_z(2\omega x)\, R_y(2\varphi), \tag{9}$$

with $\alpha, \beta, \varphi, \lambda, \omega \in \mathbb{R}$.

Note that a single-qubit unitary matrix has three degrees of freedom, which are here fixed by 5 parameters. An intuition behind the role of these parameters is that $\alpha, \beta, \varphi, \lambda$ are related to the coefficients of one Fourier step, while $\omega$ can be identified with the corresponding frequency.

Theorem 2. Quantum Fourier series
Let $f, \phi$ be any pair of functions $f: \mathbb{R} \to [0, 1]$ and $\phi: \mathbb{R} \to [0, 2\pi)$, such that $z(x) = f(x) e^{i\phi(x)}$ is a complex function with a finite number of finite discontinuities, integrable within an interval $[a, b] \subset \mathbb{R}$ of length $P$. Then, there exists a set of parameters $\{\vec\theta_0, \vec\theta_1, \ldots, \vec\theta_N\}$ such that

$$\langle 1 | \prod_{i=0}^{N} U^F(x, \vec\theta_i) | 0 \rangle = z_N(x), \tag{10}$$

with $z_N(x)$ the $N$-term Fourier series.

When the building blocks are the $U^F(x, \vec\theta_i)$ defined in Eq. (9), the unitary operation defined in Eq. (3) generates a total unitary gate that outputs an $N$-term Fourier series when it is applied to an initial $|0\rangle$ state. Taking $|0\rangle$ as the initial state implies no loss of generality, since we can transform $|0\rangle$ into any other initial state by adjusting the first $U^F$ to be a simple rotation. The Fourier series behavior is achieved only if all $\{\vec\theta_i\}$ take the specific values that make the final result match the Fourier series exactly. However, since this procedure relies on quantum-classical variational methods, we will look for the optimal parameters by means of a classical optimizer. This freedom gives room to configurations surpassing the performance of the standard Fourier series, especially for shallow circuits. In exchange, the recipe to construct the Fourier series by performing well-defined calculations is lost.

For details on the proof of this theorem we refer the reader to Appendix A 1.

Quantum UAT

The Universal Approximation Theorem (UAT) demonstrates that any continuous function of an $m$-dimensional variable can be uniformly approximated as a sum of functions with adjustable parameters. The first formulation restricted the functions to be sigmoidal [3], although later works extended the result to any non-constant bounded continuous function [4]. This theorem applies directly to neural networks with one hidden layer.

Theorem 3. Universal Approximation Theorem [3, 4, 20]
Let $I_m$ denote the $m$-dimensional cube $[0, 1]^m$. The space of continuous functions on $I_m$ is denoted by $C(I_m)$, and we use $|\cdot|$ to denote the uniform norm of any function in $C(I_m)$. Let $\sigma$ be any non-constant bounded continuous function. Given a function $f \in C(I_m)$ there exists a function

$$G(\vec x) = \sum_{n=1}^{N} \alpha_n\, \sigma(\vec w_n \cdot \vec x + b_n) \tag{11}$$

such that

$$|G(\vec x) - f(\vec x)| < \varepsilon \quad \forall\, \vec x \in I_m \tag{12}$$

for $\vec w_n \in \mathbb{R}^m$ and $b_n, \alpha_n \in \mathbb{R}$.

This is an existence theorem, and thus it gives no indication of how many terms from Eq. (11) are needed for reaching an accuracy $\varepsilon$. Note that the UAT can be immediately applied to complex functions by substituting the real-valued function $\sigma(\cdot)$ with some complex-valued function. In particular, it works if $\sigma(\cdot) \to e^{i(\cdot)}$. A proof is shown in App. B.

We can now translate the UAT to the proposed quantum circuit by defining the following single-qubit gate.

Definition 3. Let the fundamental UAT gate $U^{UAT}$ be

$$U^{UAT}(\vec x; \underbrace{\vec\omega, \alpha, \varphi}_{\vec\theta}) = R_y(2\varphi)\, R_z(2\,\vec\omega \cdot \vec x + 2\alpha), \tag{13}$$

with $\{\vec\omega, \alpha, \varphi\} \in \{\mathbb{R}^m, \mathbb{R}, \mathbb{R}\}$.

Note that in this case the most general unitary matrix that can be constructed with two rotations is defined by only two degrees of freedom, which are here fixed by 3 parameters. The intuition behind the role of these parameters is that $\vec\omega$ and $\alpha$ are the equivalent of the weights and bias, while $\varphi$ plays the role of the coefficient.

Theorem 4. Quantum UAT
Let $f, \phi$ be any pair of functions $f: \mathbb{R}^m \to [0, 1]$ and $\phi: \mathbb{R}^m \to [0, 2\pi)$, such that $z(\vec x) = f(\vec x) e^{i\phi(\vec x)}$ is a complex continuous function on $I_m$. Then there is an integer $N$ and a set of parameters $\{\vec\theta_1, \vec\theta_2, \ldots, \vec\theta_N\}$ such that

$$\left| f(\vec x) e^{i\phi(\vec x)} - \langle 1 | \prod_{i=1}^{N} U^{UAT}(\vec x, \vec\theta_i) | 0 \rangle \right| < \varepsilon, \tag{14}$$

for any $\varepsilon > 0$.

This theorem is analogous to the classical one, and the proof can be achieved by following the steps developed in Ref. [3]. All the theorems supporting the original formulation of the UAT hold for its quantum version. For more details on the demonstration of the quantum UAT, we refer the reader to App. A 2.
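The gates of Definitions 2 and 3 can be written down directly as 2×2 matrices. The following NumPy sketch is only illustrative and rests on several assumptions that the text does not fix: the rotation conventions $R_z(\theta) = e^{-i\theta Z/2}$ and $R_y(\theta) = e^{-i\theta Y/2}$, the ordering in which the first gate of the list acts first on $|0\rangle$, and the read-out of the encoded value from the $|1\rangle$ amplitude. The function names (`fourier_gate`, `uat_gate`, `circuit_amplitude`) are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def rz(theta):
    # Assumed convention: R_z(theta) = exp(-i theta Z / 2)
    return np.array([[np.exp(-0.5j * theta), 0.0],
                     [0.0, np.exp(0.5j * theta)]])

def ry(theta):
    # Assumed convention: R_y(theta) = exp(-i theta Y / 2)
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def fourier_gate(x, omega, alpha, beta, phi, lam):
    # Eq. (9): R_z(a+b) R_y(2 lam) R_z(a-b) R_z(2 omega x) R_y(2 phi)
    return (rz(alpha + beta) @ ry(2 * lam) @ rz(alpha - beta)
            @ rz(2 * omega * x) @ ry(2 * phi))

def uat_gate(x, omega, alpha, phi):
    # Eq. (13): R_y(2 phi) R_z(2 omega.x + 2 alpha); x may be a scalar or a vector
    return ry(2 * phi) @ rz(2 * np.dot(omega, x) + 2 * alpha)

def circuit_amplitude(gate, x, thetas):
    # <1| prod_i gate(x, theta_i) |0>, with the first entry of `thetas`
    # applied first (an assumption about the ordering of the product).
    state = np.array([1.0, 0.0], dtype=complex)
    for theta in thetas:
        state = gate(x, *theta) @ state
    return state[1]

# One UAT layer acting on |0> gives sin(phi) * exp(-i (omega x + alpha)),
# i.e. a single term of the form alpha_n * sigma(w x + b) with sigma = exp(-i .).
amp = circuit_amplitude(uat_gate, 0.5, [(2.0, 0.1, 0.3)])
print(abs(amp), np.angle(amp))
```

With a single layer the modulus of the amplitude depends only on $\varphi$ and the phase only on $\vec\omega \cdot \vec x + \alpha$, which matches the stated roles of coefficient, weights and bias; deeper circuits mix these contributions.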

Differences between approaches

The quantum universality theorems proposed here inherit the range of applicability, advantages and limitations of their classical counterparts. The Fourier approach is guaranteed to work for all integrable functions with a finite number of finite discontinuities. This range of functions includes, but is not limited to, continuous functions. The UAT only gives support to continuous functions, which is useful from a practical perspective, but it is less robust than the Fourier series.

The Fourier theorem holds for functions depending on one variable. Its extension to multi-dimensional spaces is complicated and requires a space of parameters whose size increases exponentially with the dimension [2]. In the UAT case, by contrast, the use of a multi-variable $\vec x$ arises naturally by adjusting the dimension of the weights.

It is also interesting to note that, in the single-variable case, a simple readjustment of parameters suffices to match the formulations of the quantum Fourier theorem and the quantum UAT. This observation opens up the possibility of extending the quantum Fourier approach to a recipe holding for multi-variable approximations in a simple manner, by extending $\omega \to \vec\omega$. However, if this extension is performed, the classical Fourier theorem is no longer valid, and the consequence of this change is an alternative expression of the quantum UAT.

III. NUMERICAL EXPERIMENTS

In this section we numerically explore the applicability of the theorems explained in Sec. II. We perform two different kinds of benchmarks, dealing with real and complex functions. These benchmarks collect results using both $U^F$ and $U^{UAT}$ gates. In the UAT case, the initial state we take is $|+\rangle$, that is, a Hadamard gate is applied at the beginning of the circuit. The reason for this adjustment is that all parameters then affect the final measurement, while the validity of the results is not compromised. All simulations are performed using the framework QIBO [21]. The code computing the numerical experiments, as well as the final results, can be viewed on GitHub [22].

Z benchmark for real functions

For the first benchmark we consider a single-variable real-valued function $-1 \le f(x) \le 1$ and aim for $\langle Z \rangle \approx f(x)$. The quantum state we want to represent is defined as

$$|\psi(x)\rangle_Z = \sqrt{\frac{1 + f(x)}{2}}\,|0\rangle + e^{i\phi} \sqrt{\frac{1 - f(x)}{2}}\,|1\rangle, \tag{15}$$

where $\phi$ is a phase that may be $x$-dependent, but which is irrelevant at this stage. The $\chi^2$ function that drives the optimization is then

$$\chi^2 = \frac{1}{M} \sum_{j=1}^{M} \left( \langle Z(x_j) \rangle - f(x_j) \right)^2, \tag{16}$$

where $M$ is the total number of samples of $x$.

The Z benchmark will be used to compare quantum and classical results for both the Fourier and UAT approaches. In the classical case, the Fourier approach detailed in Theorem 1 provides a constructive recipe to compute the approximation of $f(x)$. This approximation is the minimal guaranteed performance for an approximation procedure, and can be considered as a lower bound. In the classical UAT case, the $\sigma$ functions are trigonometric functions that serve as pieces to reconstruct the objective real function.

The Z benchmark is first tested against 4 different functions of interest,

$$\mathrm{ReLU}(x) = \max(0, x), \tag{17}$$
$$\tanh(ax) \ \text{for} \ a = 5, \tag{18}$$
$$\mathrm{step}(x) = x/|x|; \ 0 \ \text{if} \ x = 0, \tag{19}$$
$$\mathrm{poly}(x) = |x(1 - x)|, \tag{20}$$

and they are conveniently rescaled to fit the limits $-1 \le f(x) \le 1$. In all cases, $x \in$ […]. The two-dimensional test functions are adjiman, brent, himmelblau and threehump [23]. In the 2D case, the functions are conveniently rescaled to fit the limits $-1 \le f(x) \le 1$, with $(x, y) \in$ […]. A definition of these functions as used in this work can be found in Appendix C.

X − Y benchmark for complex functions

Results from Sec. II ensure that the proposed circuits are able to fit complex functions, and not only real-valued ones. We now propose a tomography-like benchmark testing this capacity. Since a complex function has real and imaginary parts, one needs to measure at least two observables in the qubit space. In this case, we choose the observables to be $\langle X \rangle$ and $\langle Y \rangle$ for the real and imaginary parts, respectively.

The quantum state used here is defined as

$$|\psi(x)\rangle_{XY} = \sqrt{\frac{1 + \sqrt{1 - f(x)^2}}{2}}\,|0\rangle + e^{i g(x)} \sqrt{\frac{1 - \sqrt{1 - f(x)^2}}{2}}\,|1\rangle. \tag{21}$$

This definition permits the identification

$$\langle X \rangle = f(x) \cos(g(x)), \tag{22}$$
$$\langle Y \rangle = f(x) \sin(g(x)), \tag{23}$$

and it is then possible to construct a $\chi^2$ function as

$$\chi^2 = \frac{1}{M} \sum_{j=1}^{M} \left| \langle X(x_j) \rangle + i \langle Y(x_j) \rangle - f(x_j) e^{i g(x_j)} \right|^2. \tag{24}$$

For the X − Y benchmark we test the algorithm against all possible combinations, 16 in total, of real and imaginary parts taken from the functions defined in Eqs. (17) to (20), conveniently renormalized to ensure that $\langle X \rangle^2 + \langle Y \rangle^2 \le 1$. For the sake of comparison, the classical UAT is in this case applied using $\sigma(\cdot) \to e^{i(\cdot)}$.

A. Optimization techniques

An optimization process is required to find the optimal parameters for all quantum cases and in the classical UAT case, while the classical Fourier case does not need any optimization procedure, since the theorem provides a constructive recipe to find the approximations. For the classical case, we use standard techniques that are known to work well in similar cases [24]. Since the number of parameters we deal with in this problem is relatively small compared to the parameter space obtained, for instance, in Deep Learning, the BFGS and L-BFGS algorithms [25, 26] as implemented by scipy [27] are used. These algorithms belong to the gradient-based class of optimizers. They iteratively approximate the Hessian matrix of the loss function numerically to look for an optimal value. The extra L- in the second algorithm stands for a limited-memory version of the first one.

In the quantum scenario, optimization brings more problems that are yet to be solved. In particular, the landscape of the loss function in the parameter space remains unknown, and thus it is not easy to infer what kind of classical optimizers perform well for this particular problem. For this reason we look for the best-fit parameters using the aforementioned L-BFGS algorithm and the genetic option CMA [28, 29]. The quantum approach has been considered using L-BFGS since it is computationally less costly. Genetic algorithms are known to explore vast regions of the parameter space and do not depend on gradients. However, they usually require more function evaluations to converge to the minimum. It is not possible to guarantee that the solution found by any optimization algorithm is the global minimum of a function if its landscape is unknown.
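As a rough illustration of how the Z benchmark and a gradient-based optimizer fit together, the sketch below minimizes the $\chi^2$ of Eq. (16) for a three-layer UAT circuit with SciPy's `L-BFGS-B` method. It is a minimal state-vector simulation written for this example, not the authors' QIBO code, and it assumes the conventions $R_z(\theta) = e^{-i\theta Z/2}$, $R_y(\theta) = e^{-i\theta Y/2}$, a sampling grid on $[-1, 1]$, random initial parameters and finite-difference gradients; the helper names `z_expectation` and `chi2` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def z_expectation(xs, flat_params):
    # <Z> for every sample in xs after |+> and a stack of UAT layers.
    # flat_params holds (omega, alpha, phi) per layer, flattened.
    a = np.full(xs.shape, 1 / np.sqrt(2), dtype=complex)  # amplitude of |0>
    b = a.copy()                                          # amplitude of |1>
    for omega, alpha, phi in np.reshape(flat_params, (-1, 3)):
        half = omega * xs + alpha            # half of the R_z angle 2*omega*x + 2*alpha
        a, b = a * np.exp(-1j * half), b * np.exp(1j * half)   # R_z
        a, b = (np.cos(phi) * a - np.sin(phi) * b,             # R_y(2*phi)
                np.sin(phi) * a + np.cos(phi) * b)
    return np.abs(a) ** 2 - np.abs(b) ** 2

def chi2(flat_params, xs, targets):
    # Eq. (16): mean squared difference between <Z(x_j)> and f(x_j)
    return np.mean((z_expectation(xs, flat_params) - targets) ** 2)

xs = np.linspace(-1.0, 1.0, 31)          # assumed sampling grid
targets = np.tanh(5 * xs)                # test function of Eq. (18)
layers = 3
rng = np.random.default_rng(0)
theta0 = rng.uniform(-1.0, 1.0, size=3 * layers)

result = minimize(chi2, theta0, args=(xs, targets),
                  method="L-BFGS-B", options={"maxiter": 200})
print("initial chi2:", chi2(theta0, xs, targets))
print("final chi2:  ", result.fun)
```

Because the landscape is unknown, a single run like this may end in a local minimum; repeating it from several random `theta0` values and keeping the best outcome is one simple way to mitigate that.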

IV. EXPERIMENTAL IMPLEMENTATION OF THE APPROXIMATION THEOREMS

We implement the single-qubit universal approximant in a superconducting qubit circuit cooled to the base temperature of a dilution refrigerator (20 mK). The qubit is a 3D transmon geometry [31] located inside an aluminum three-dimensional cavity. The lowest mode of the cavity is used to read out the qubit state. The cavity bare frequency, $\omega_c = 2\pi \times$ […].89 GHz, is greatly detuned from the qubit frequency, $\omega_q = 2\pi \times$ […].81 GHz. Hence, there is a qubit state-dependent dispersive shift on the cavity resonance, $2|\chi| = 2\pi \times$ […]. […] $\alpha = -2\pi \times 324$ MHz. Figs. 2a, 2b display, respectively, measurements of the qubit relaxation and spin-echo decay times. The extracted values following exponential fits are $T_1 = 15.$[…] μs and $T_{\mathrm{Echo}} = 12.$[…] μs. These time scales exceed the operation times needed to implement the algorithm up to 6 layers by 2 orders of magnitude.

In order to implement the gate sequences defined in the previous section we follow the correspondence between logical and physical gates shown in Fig. 2c. The Y rotations are realized with microwave pulses sent to the cavity-qubit system. The phase of each pulse is selected at the pulse generator to modify the rotation axis, producing either X or Y rotations as required. The Z rotations are, in turn, virtual [30]. These rotations are added as a phase offset to the following pulses, thus changing the rotation axis. The microwave pulses incorporate a DRAG correction [32, 33], which leads to an error per gate $\epsilon = 0.01$ found with randomized benchmarking [34]. In order to achieve better resolution and shorter operation times, a reset protocol is performed prior to the main sequence [35]. More details on the experimental procedure can be found in Appendix D.

FIG. 2. Coherence times and sequence: a) $T_1$ measurement with exponential fit. b) Spin-echo measurement, $T_E$, with exponential fit. c) Sequence performed in the experiment. Blue boxes represent actual pulses. Logical Y and Z rotations are explicitly shown below the blue boxes [30]. Note that Z pulses do not correspond to any microwave pulse; instead, subsequent pulses change rotation axis, indicated by a prime, $Y^{(N)}_N$.

As detailed earlier, the single-qubit approximant algorithm rotates the qubit state to directly produce the value of a certain mathematical function by measuring the expectation value of the qubit population. The theoretical optimal parameters are found through an optimization method simulating the qubit evolution, assuming closed-system dynamics, for different numbers of layers of the algorithm and for both Fourier and UAT functions [36]. The parameters found theoretically define the angle of all the rotations for a specific $x$ value. The experiment is performed for all different $x$ values considered in the simulations. A specific set of $x$ values for the ReLU function from Eq. (17) can be found in App. D.

V. RESULTS

We devote this section to the results obtained for thebenchmark of the universality theorems. In all resultspresented we provide three diﬀerent ﬁnal values. First,we use classical methods, namely Fourier and UAT, toapproximate a target function. Second, we approximatethe same function using the quantum procedures deﬁnedin this work, simulating the wave function evolution withclassical methods. In both cases, we retain the best out-come obtained with diﬀerent initial conditions used inthe optimization step. Finally, we use the parameters ob-tained using the simulation of the quantum procedure toexecute that circuit in the actual superconducting quan-tum device.We show in Fig. 3 the resulting ﬁt for all four single-variable real-valued functions from Eqs. (17) and sub-sequents using the Z benchmark and 5 layers. Fig. 5also shows a comparison between analogous classical andquantum theorems and its experimental validation. Weobserve that all methods follow the shape of the targetfunction. The classical Fourier approximation return lessaccurate predictions on the value of f ( x ) due to the pe-riodic nature of the model. The quantum Fourier andboth UAT models return predictions following the shapeof the target of function in all diﬀerent values of x . Thisbehaviour can be observed in all diﬀerent benchmarks.An analysis of experimental errors is also depicted at theUAT ReLU plot from Fig. 3, returning low uncertaintiesfor the experimental results. We assume similar resultsfor all other experiments.The values obtained for χ in the Z benchmark aresummarized in Fig. 5 for all diﬀerent models. A generaltrend towards better approximated results with largernumbers of layers is observed. In general, every layergrants the model more ﬂexibility, and thus the capabil-ity of ﬁtting the target function increases. 
However, thespace of parameters also increases and it is harder to ﬁnda good minimum.It is also possible to observe that the quantum Fouriermodel performs better than its classical counterpart.This is due to the fact that the classical Fourier seriesdoes not have tunable parameters, but a constructive wayto build an approximation function that gets asymptot-ically closer to the target one. The N -term Fourier se-ries is just a truncation of the inﬁnite one. Since bothfrequencies and coeﬃcients are ﬁxed, the approximationachieved by means of the N -term Fourier series is pro-vided by the truncation, but it is not the best possibleperformance for a given number and conﬁguration of pa-rameters. On the other hand, the constructive propertiesof Fourier series constitute a lower bound for any approx-imation method based on optimization.In the UAT case, it is not clear which approach returnsbetter results. The classical algorithm performs betterin the poly( x ) case, but the results with the quantummethod improve the classical method in the tanh(5 x )case. Both models present similar trends as the num- f ( x ) tanh(5 x ) step( x ) poly( x ) ReLU( x ) x f ( x ) x x x F o u r i e r U A T ClassicalSimulationExperimentTarget

FIG. 3. Fittings for 4 real-valued functions using the Z benchmark with 5 layers. Blue triangles represent classical models,namely Fourier and UAT, while red dots represent its quantum counterparts computed using a classical simulator. Greensquares are the experimental execution of the optimized quantum model using a superconducting qubit. The target functionis plotted in black for comparison. The analysis for experimental errors is plotted for the ReLU function and the UAT model.Similar uncertainties are expected to occur at other experiments. x y Target xClassical xSimulation xExperiment

Himmelblau(x, y)

ber of layers increases.

Finally, notice that the Fourier quantum approach is not clearly better than the UAT one. This can be viewed as an advantage for the latter, since the number of parameters for the Fourier model grows as 5 × layers, while in the UAT case it grows as 3 × layers. That is, one would expect the Fourier model to perform better since it contains more parameters, but it is clear that the flexibility achieved through the UAT model is enough to represent any function.

The experimental results are also depicted in Fig. 5. It is immediate to see that the experimental realization of the quantum approximation models suffers from circuit noise and sampling uncertainties, which degrade the quantity χ. This is more prominent as more layers are added to the model. As a direct consequence, the approximation of the quantum model to the target function loses accuracy. That being said, even though the model does not return exact results, it preserves the information well enough to reproduce the shape of the target functions, as seen in Fig. 3. It is also possible to see that the inherent sampling uncertainty sets a lower bound on the value of χ obtained through experiments.

We now turn to the two-dimensional Z benchmark. These results comprise only the UAT version of the universality theorem, since this is the only one we have extended to the multi-dimensional case. The simulation provides final results summarized in Figs. 4 and 6. The results provided by all different models for the Himmelblau(x, y) function are depicted in Fig. 4. It is clear that all different executions capture the overall shape of the function, but some differences can be seen in the different drawings. The classical model can return values Z < −1 and Z > 1, and thus there are three clear minima in this case. On the other hand, the quantum simulation cannot clearly distinguish those minima. The experimental execution presents sharp contours because of the inherent noise and sampling uncertainty.

The goodness of the approximation is measured again by means of χ in Fig. 6. As before, we see that a larger number of layers provides better approximations to the target function. In the Adjiman case, the quantum model is more accurate, while the classical one suits all other functions. However, as seen in the one-dimensional Z benchmark, the scaling is similar for both cases.

A representation of the approximation for the function in the X − Y benchmark is depicted in Fig. 7. In that case, the X measurement is related to tanh(5x), the real part, while the Y measurement is related to ReLU(x), the imaginary part. The observations made for the Z benchmark hold in this case.

FIG. 4. Fittings for the 2D function Himmelblau, properly normalized, using the Z benchmark for 5 layers. The blue plot represents the classical UAT model, while the red plot represents its simulated quantum counterpart. The green plot is the experimental execution of the optimized quantum model. The target function is painted in black. In all drawings, the lines correspond to the same levels in the Z axis.

[Figure 5 panels: f(x) = tanh(5x), f(x) = step(x), f(x) = poly(x), f(x) = ReLU(x); horizontal axis: Layers. Legend: Classical Fourier, Simulation Fourier, Experiment Fourier, Classical UAT, Simulation UAT, Experiment UAT.]

FIG. 5. Values of χ for the Z benchmark in all 4 test functions using classical computation (blue scatter), classical simulation of the quantum algorithm (red scatter) and experimental implementation with a superconducting qubit (green scatter). Fourier models are depicted with triangles, while UAT models are crosses.

[Figure 6 panels: Himmelblau(x, y), Brent(x, y), Threehump(x, y), Adjiman(x, y); horizontal axis: Layers. Legend: Classical UAT, Simulation UAT, Experiment UAT.]

FIG. 6. Values of χ for the Z benchmark in all 4 test 2D functions using classical computation (blue scatter), classical simulation of the quantum algorithm (red scatter) and experimental implementation with a superconducting qubit (green scatter). Only UAT models are considered.

[Figure 7 panels: rows Fourier and UAT; columns Real part = tanh(5x) and Imag part = ReLU(x); axes f(x) versus x. Legend: Classical, Simulation, Experiment, Target.]

FIG. 7. Fittings for the complex function f(x) = tanh(5x) + i ReLU(x), properly normalized, using the X − Y benchmark for 5 layers. Blue triangles represent a classical model, while red dots represent its quantum counterpart computed using a classical simulator. Green squares are the experimental execution of the optimized quantum model using a superconducting qubit. The target function is plotted in black for comparison.

[Figure 8 panels: columns Y = tanh(5x), Y = step(x), Y = poly(x), Y = ReLU(x); rows X = tanh(5x), X = step(x), X = poly(x), X = ReLU(x); horizontal axis: Layers. Legend: Classical Fourier, Simulation Fourier, Experiment Fourier, Classical UAT, Simulation UAT, Experiment UAT.]

FIG. 8. Values of χ for the X-Y benchmark in all possible combinations for real and imaginary parts of the 4 test functions using classical computation (blue scatter), classical simulation of the quantum algorithm (red scatter) and experimental implementation with a superconducting qubit (green scatter). Fourier models are depicted with triangles, while UAT models are crosses.

Figure 8 shows χ when the target functions are all possible combinations for real and imaginary parts with the functions described in Eq. (17). In this case it is possible to see a common advantage for the quantum models: both Fourier and UAT quantum approaches present better results than their classical counterparts. In particular, the functions tanh(5x) and ReLU(x) work better in any combination. This reflects the behaviour already observed in Fig. 5, namely that the more layers are used, the better the achievable approximation. Experimental measurements are also depicted in those cases involving the tanh(5x) and ReLU(x) functions for comparison.

VI. CONCLUSIONS

We have shown that a single-qubit circuit has enough flexibility to encode any complex function z(x) in the degrees of freedom of its quantum gates. This universal representation is achieved by acting with a quantum circuit based on a single-qubit gate that depends on input variables and additional parameters that are fixed by machine learning techniques.

This result guarantees that a single-qubit circuit, as defined in this work, is able to store two different and independent real functions. Our present results provide the highest degree of compression of data in a single-qubit state, since there are no more degrees of freedom available in a circuit. In addition, the target functions to be encoded within the quantum circuit have no limitation in the dimensionality of their independent variable.

The proof of universality was shown following two different approaches, which led to two corresponding sets of single-qubit gates. In the first method, we found a link between quantum circuits and Fourier series. We have defined a quantum gate tuned by 5 parameters such that a row of N gates applied to an initial state provides a final state where an N-term Fourier series is encoded. This only holds for a given configuration of all 5N gate parameters. In addition, the independent variable x must be one-dimensional. This universality proof inherits the assumptions of Fourier series, supporting functions with a finite number of finite discontinuities. For the second method, we link the gate description with the Universal Approximation Theorem (UAT). A single-qubit quantum gate depending on 3 input parameters can be applied repeatedly to the input state to achieve a final state whose form is compatible with the UAT. The input state does not compromise the validity of the approximation theorems, but it affects the parameters defining the circuit. This proof holds for any continuous function of one- or multi-dimensional x and guarantees uniform approximations.

We also provide numerical evidence on the flexibility and approximation capabilities of these quantum circuits. The benchmarks have been obtained using simulations and classical minimizers to find optimal parameters for a set of test functions. As benchmarks, we have included 1D and 2D real functions and 1D complex functions. The final results have also been compared to their classical counterparts. In all cases, it is possible to see an equivalent scaling for both classical and quantum methods. This provides numerical evidence that the quantum procedure is comparable to the standard classical ones.

The simulations have later been implemented in an actual quantum device with one superconducting transmon qubit. The experimental results confirm the same trend obtained with the classical simulations. The finite qubit coherence does not seem to impact the results significantly. The parameters already obtained in the simulations have been run on the device to check the consistency of the procedure when applied on real hardware.

The present work can serve as a starting point for studying the representation capability of quantum systems beyond one qubit (see also [14]). It is clear that circuits working with n qubits explore an exponentially large Hilbert space, where entanglement should play a role still to be fully understood.

ACKNOWLEDGEMENTS

We thank Martin Weides and Marco Pfirrman at Glasgow University for fabricating the superconducting transmon qubit device used in this work at the Karlsruhe Institute of Technology (KIT), and Prof. Sergio O. Valenzuela from the Catalan Institute of Nanoscience and Nanotechnology (ICN2) for giving access to the dilution refrigerator during the initial measurement phase. We acknowledge financial support from Secretaria d'Universitats i Recerca del Departament d'Empresa i Coneixement de la Generalitat de Catalunya, co-funded by the European Union Regional Development Fund within the ERDF Operational Program of Catalunya (project QuantumCat, ref. 001-P-001644). P. F.-D. acknowledges support from "la Caixa" Foundation - Junior leader fellowship (ID100010434-LCF/BQ/PR19/11700009), MISTI program (LCF/PR/MIT17/11820008), Ministry of Economy and Competitiveness and Agencia Estatal de Investigación (FIS2017-89860-P; SEV-2016-0588; PCI2019-111838-2), and European Commission (Fet-open AVaQus GA 899561; QuantERA). IFAE is partially funded by the CERCA program of the Generalitat de Catalunya.

[1] P. G. L. Dirichlet, Sur la convergence des séries trigonométriques qui servent à représenter une fonction arbitraire entre des limites données, Journal für die reine und angewandte Mathematik, 157 (1829).
[2] B. Riemann, Über die darstellbarkeit einer function durch eine trigonometrische reihe, Abhandlungen der Königlichen Gesellschaft der Wissenschaften zu Göttingen (1867).
[3] G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems, 303 (1989).
[4] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks, 251 (1991).
[5] A. Pérez-Salinas, A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre, Data re-uploading for a universal quantum classifier, Quantum, 226 (2020).
[6] A. Pérez-Salinas, J. Cruz-Martinez, A. A. Alhajri, and S. Carrazza, Determining the proton content with a quantum computer (2020), arXiv:2011.13934 [hep-ph].
[7] C. Bravo-Prieto, Quantum autoencoders with enhanced data encoding (2020), arXiv:2010.06599 [quant-ph].
[8] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, Quantum circuit learning, Physical Review A (2018).
[9] D. Zhu, N. M. Linke, M. Benedetti, K. A. Landsman, N. H. Nguyen, C. H. Alderete, A. Perdomo-Ortiz, N. Korda, A. Garfoot, C. Brecque, L. Egan, O. Perdomo, and C. Monroe, Training of quantum circuits on a hybrid quantum computer, Science Advances, eaaw9918 (2019).
[10] S. Lloyd, M. Schuld, A. Ijaz, J. Izaac, and N. Killoran, Quantum embeddings for machine learning (2020), arXiv:2001.03622 [quant-ph].
[11] Y. Liu, S. Arunachalam, and K. Temme, A rigorous and robust quantum speed-up in supervised machine learning (2020), arXiv:2010.02174 [quant-ph].
[12] P. Rebentrost, M. Mohseni, and S. Lloyd, Quantum support vector machine for big data classification, Physical Review Letters (2014).
[13] S. Lloyd, M. Mohseni, and P. Rebentrost, Quantum algorithms for supervised and unsupervised machine learning (2013), arXiv:1307.0411 [quant-ph].
[14] M. Schuld, R. Sweke, and J. J. Meyer, The effect of data encoding on the expressive power of variational quantum machine learning models (2020), arXiv:2008.08605 [quant-ph].
[15] T. Goto, Q. H. Tran, and K. Nakajima, Universal approximation property of quantum feature map (2020), arXiv:2009.00298 [quant-ph].
[16] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O'Brien, A variational eigenvalue solver on a photonic quantum processor, Nature Communications (2014).
[17] E. Farhi, J. Goldstone, and S. Gutmann, A quantum approximate optimization algorithm (2014), arXiv:1411.4028 [quant-ph].
[18] L. Carleson, On convergence and growth of partial sums of fourier series, Acta Math., 135 (1966).
[19] P. Turán, Über die fouriersche reihe, in Leopold Fejér Gesammelte Arbeiten I (Birkhäuser Basel, 1970) pp. 302–317.
[20] M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Networks, 861 (1993).
[21] S. Efthymiou, S. Ramos-Calderer, C. Bravo-Prieto, A. Pérez-Salinas, D. García-Martín, A. Garcia-Saez, J. I. Latorre, and S. Carrazza, Qibo: a framework for quantum simulation with hardware acceleration (2020), arXiv:2009.01845 [quant-ph].
[22] A. Pérez-Salinas and D. López-Núñez, Universal-Approximator, https://github.com/UB-Quantic/Universal-Approximator (2021).
[23] M. A. Ardeh, Benchmarkfcns toolbox (2016).
[24] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 2825 (2011).
[25] Numerical Optimization (Springer New York, 2006).
[26] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing, 1190 (1995).
[27] P. Virtanen et al., SciPy 1.0: fundamental algorithms for scientific computing in python, Nature Methods, 261 (2020).
[28] N. Hansen, The CMA evolution strategy: A comparing review, in Towards a New Evolutionary Computation (Springer Berlin Heidelberg) pp. 75–102.
[29] Niko, Youhei Akimoto, Yoshihikoueno, D. Brockhoff, M. Chan, and ARF1, Cma-es/pycma: r3.0.3 (2020).
[30] D. C. McKay, C. J. Wood, S. Sheldon, J. M. Chow, and J. M. Gambetta, Efficient z gates for quantum computing, Phys. Rev. A, 022330 (2017).
[31] H. Paik, D. I. Schuster, L. S. Bishop, G. Kirchmair, G. Catelani, A. P. Sears, B. R. Johnson, M. J. Reagor, L. Frunzio, L. I. Glazman, S. M. Girvin, M. H. Devoret, and R. J. Schoelkopf, Observation of high coherence in josephson junction qubits measured in a three-dimensional circuit qed architecture, Phys. Rev. Lett., 240501 (2011).
[32] F. Motzoi, J. M. Gambetta, P. Rebentrost, and F. K. Wilhelm, Simple pulses for elimination of leakage in weakly nonlinear qubits, Phys. Rev. Lett., 110501 (2009).
[33] J. M. Chow, L. DiCarlo, J. M. Gambetta, F. Motzoi, L. Frunzio, S. M. Girvin, and R. J. Schoelkopf, Optimized driving of superconducting artificial atoms for improved single-qubit gates, Phys. Rev. A, 040305 (2010).
[34] E. Magesan, J. M. Gambetta, and J. Emerson, Scalable and robust randomized benchmarking of quantum processes, Phys. Rev. Lett., 180504 (2011).
[35] K. Geerlings, Z. Leghtas, I. M. Pop, S. Shankar, L. Frunzio, R. J. Schoelkopf, M. Mirrahimi, and M. H. Devoret, Demonstrating a driven reset protocol for a superconducting qubit, Phys. Rev. Lett., 120501 (2013).
[36] A direct optimization of the universal approximant using the qubit is currently being carried out and will be reported in future work.
[37] R. B. Ash, Real Analysis and Probability (Elsevier, 1972).
[38] H. Hahn, Über lineare gleichungssysteme in linearen räumen, Journal für die reine und angewandte Mathematik, 214 (1927).
[39] S. Banach, Sur les fonctionnelles linéaires ii, Studia Mathematica, 223 (1929).
[40] F. Riesz, Démonstration nouvelle d'un théorème concernant les opérations fonctionnelles linéaires, in Annales scientifiques de l'École Normale Supérieure, Vol. 31 (1914) pp. 9–14.
[41] A. J. Weir, General integration and measure, Vol. 2 (CUP Archive, 1974).

Appendix A: Proof of universality theorems

We prove here the results claimed in Theorems 2 and 4.

1. Demonstration for the quantum Fourier series

The quantum circuit proposed in Theorem 2 fulfills the requirement that every new gate plays the role of a new step in the original Fourier series. The proof is based on an inductive procedure and can be decomposed in two steps. First, we show that the first gate of the circuit is equivalent to the 0-th constant Fourier term. Then, we show that if there are N gates in a row forming an N-term Fourier series, then adding a new gate provides an (N + 1)-term Fourier series.

Let the fundamental gate $U^{F}(x; \vec\theta)$ defined in Eq. (9) be

$$U^{F}(x; \vec\theta) = U^{F}(x; \omega, \alpha, \beta, \phi, \lambda) = R_z(\alpha + \beta)\, R_y(2\lambda)\, R_z(\alpha - \beta)\, R_z(2\omega x)\, R_y(2\phi) =$$
$$= \begin{pmatrix} \cos\lambda \cos\phi\, e^{i\alpha} e^{i\omega x} - \sin\lambda \sin\phi\, e^{i\beta} e^{-i\omega x} & -\cos\lambda \sin\phi\, e^{i\alpha} e^{i\omega x} - \sin\lambda \cos\phi\, e^{i\beta} e^{-i\omega x} \\ \sin\lambda \cos\phi\, e^{-i\beta} e^{i\omega x} + \cos\lambda \sin\phi\, e^{-i\alpha} e^{-i\omega x} & -\sin\lambda \sin\phi\, e^{-i\beta} e^{i\omega x} + \cos\lambda \cos\phi\, e^{-i\alpha} e^{-i\omega x} \end{pmatrix}. \tag{A1}$$

It is possible to recast the above choice of fundamental gate using the following redefinition of parameters,

$$a_+ = \cos\lambda \cos\phi\, e^{i\alpha}, \tag{A2}$$
$$a_- = -\sin\lambda \sin\phi\, e^{i\beta}, \tag{A3}$$
$$b_+ = -\cos\lambda \sin\phi\, e^{i\alpha}, \tag{A4}$$
$$b_- = -\sin\lambda \cos\phi\, e^{i\beta}. \tag{A5}$$

A more compact representation of the fundamental gate follows.

Lemma 1.

The fundamental gate can be expressed as

$$U^{F}(x; \omega, \alpha, \beta, \phi, \lambda) = \begin{pmatrix} a_+ e^{i\omega x} + a_- e^{-i\omega x} & b_+ e^{i\omega x} + b_- e^{-i\omega x} \\ -b_-^* e^{i\omega x} - b_+^* e^{-i\omega x} & a_-^* e^{i\omega x} + a_+^* e^{-i\omega x} \end{pmatrix}, \tag{A6}$$

as can be verified by simple substitution from Definition 2.

Note that this expression corresponds to a unitary matrix, due to the relations involved in the definition of the coefficients $a_\pm$ and $b_\pm$. Note also that a unitary matrix has three degrees of freedom, which are here fixed by 5 parameters. An intuition behind the role of these parameters is that $\alpha, \beta, \phi, \lambda$ are related to the coefficients of one Fourier step, that is $a_\pm, b_\pm$, while $\omega$ can be identified with the corresponding frequency.

A total circuit can be constructed by multiplying k fundamental gates to obtain $U^{(k),s}_{f,\varphi}$ as in Definition 1. Starting with this composite gate, we can now prove the main Fourier approximation theorem.

Theorem 5.

There exists a series of k single-qubit gates forming a k-th approximant circuit that delivers a unitary operation where all its coefficients are written as Fourier series.

Proof. The proof of this constructive theorem consists in making contact with harmonic analysis and proceeds by induction.

i) The first circuit consists only of one fundamental gate, chosen with frequency $\omega = 0$, that is

$$U^{F} = \begin{pmatrix} A_0 & B_0 \\ -B_0^* & A_0^* \end{pmatrix}. \tag{A7}$$

This indeed corresponds to the first constant term of a Fourier series.

ii) We now assume that the N-th approximant circuit takes the form

$$\prod_{i=0}^{N} U^{F}_i = \begin{pmatrix} \sum_{n=-N}^{N} A_n e^{i\Omega_n x} & \sum_{n=-N}^{N} B_n e^{i\Omega_n x} \\ -\sum_{n=-N}^{N} B_n^* e^{-i\Omega_n x} & \sum_{n=-N}^{N} A_n^* e^{-i\Omega_n x} \end{pmatrix}, \tag{A8}$$

where the frequencies are $(\Omega_n \pm \omega)$. The result of adding a new fundamental gate corresponds to

$$\prod_{i=0}^{N+1} U^{F}_i = \begin{pmatrix} \sum_{n=-N-1}^{N+1} \tilde A_n e^{i\tilde\Omega_n x} & \sum_{n=-N-1}^{N+1} \tilde B_n e^{i\tilde\Omega_n x} \\ -\sum_{n=-N-1}^{N+1} \tilde B_n^* e^{-i\tilde\Omega_n x} & \sum_{n=-N-1}^{N+1} \tilde A_n^* e^{-i\tilde\Omega_n x} \end{pmatrix}, \tag{A9}$$

where we need to fix the new coefficients and frequencies $\tilde\Omega_n$ in terms of the old ones $\Omega_n$ and the new single-gate frequency $\omega$ added to the circuit. It is easy to see that the addition of a gate changes the frequency in one unit, that is, $\tilde\Omega_n = \Omega_n \pm \omega$. Then, the general structure of the series can be adapted to a Fourier expansion by choosing

$$\Omega_n = (2n + 1)\pi. \tag{A10}$$

After fixing the values that the frequencies must take, it is straightforward to re-arrange terms in the matrix and reach

$$\tilde A_0 = A_0 a_- - B_0^* b_-, \tag{A11}$$
$$\tilde A_{\pm n} = A_{\pm n} a_- - B^*_{\mp n} b_- + A_{\pm(n-1)} a_+ - B^*_{\mp(n-1)} b_+, \tag{A12}$$
$$\tilde A_{\pm(N+1)} = A_{\pm N} a_+ - B^*_{\mp N} b_+, \tag{A13}$$
$$\tilde B_0 = B_0 a_- + A_0^* b_-, \tag{A14}$$
$$\tilde B_{\pm n} = B_{\pm n} a_- + A^*_{\mp n} b_- + B_{\pm(n-1)} a_+ + A^*_{\mp(n-1)} b_+, \tag{A15}$$
$$\tilde B_{\pm(N+1)} = B_{\pm N} a_+ + A^*_{\mp N} b_+. \tag{A16}$$

This provides the explicit connection between approximant circuits and Fourier expansions for the coefficients of the global unitary matrix.
The above constructive theorem is suﬃcient to provethat the output probability of a series of approximantcircuits can reproduce any functionality.
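The equivalence between the rotation product of Eq. (A1) and the compact form of Eq. (A6) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the rotation conventions $R_z(\theta) = \mathrm{diag}(e^{i\theta/2}, e^{-i\theta/2})$ and $R_y(\theta)$ with entries $\cos(\theta/2)$, $\mp\sin(\theta/2)$, which are the conventions that reproduce the matrix entries written in the appendix, and the function names are illustrative.

```python
import numpy as np

def rz(theta):
    # Assumed convention: R_z(theta) = diag(e^{+i theta/2}, e^{-i theta/2}).
    return np.diag([np.exp(1j * theta / 2), np.exp(-1j * theta / 2)])

def ry(theta):
    # Assumed convention: real rotation by theta/2.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def fourier_gate(x, omega, alpha, beta, phi, lam):
    # Product of rotations as written in Eq. (A1).
    return (rz(alpha + beta) @ ry(2 * lam) @ rz(alpha - beta)
            @ rz(2 * omega * x) @ ry(2 * phi))

def fourier_gate_compact(x, omega, alpha, beta, phi, lam):
    # Coefficients of Eqs. (A2)-(A5).
    a_p = np.cos(lam) * np.cos(phi) * np.exp(1j * alpha)
    a_m = -np.sin(lam) * np.sin(phi) * np.exp(1j * beta)
    b_p = -np.cos(lam) * np.sin(phi) * np.exp(1j * alpha)
    b_m = -np.sin(lam) * np.cos(phi) * np.exp(1j * beta)
    ep, em = np.exp(1j * omega * x), np.exp(-1j * omega * x)
    # Compact matrix of Eq. (A6).
    return np.array([
        [a_p * ep + a_m * em, b_p * ep + b_m * em],
        [-np.conj(b_m) * ep - np.conj(b_p) * em,
         np.conj(a_m) * ep + np.conj(a_p) * em],
    ])
```

Calling both functions with the same arguments should return the same unitary matrix up to floating-point error, which makes explicit that each gate contributes the two harmonics $e^{\pm i\omega x}$.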

2. Demonstration for the quantum UAT

An alternative way to design a single-qubit universal approximant is related to the equivalent Universal Approximation Theorem broadly used in neural networks [3]. The idea is to start from a different fundamental gate.

Let the fundamental gate $U^{\mathrm{UAT}}(\vec x; \vec\theta)$ defined in Eq. (13) be explicitly

$$U^{\mathrm{UAT}}(\vec x; \vec\omega, \alpha, \phi) = R_z\big(2(\vec\omega \cdot \vec x + \alpha)\big)\, R_y(2\phi) = \begin{pmatrix} \cos(\phi)\, e^{i(\vec\omega \cdot \vec x + \alpha)} & -\sin(\phi)\, e^{i(\vec\omega \cdot \vec x + \alpha)} \\ \sin(\phi)\, e^{-i(\vec\omega \cdot \vec x + \alpha)} & \cos(\phi)\, e^{-i(\vec\omega \cdot \vec x + \alpha)} \end{pmatrix}. \tag{A17}$$

A total circuit can be constructed by multiplying k fundamental gates to obtain $U^{(k),\mathrm{UAT}}_{f,\varphi}$ as in Definition 1. We can now prove the quantum UAT using this fundamental gate.

Theorem 6.

There exists a series of k single-qubit gates forming a k-th approximant circuit that delivers a unitary operation where all its coefficients are written as an approximation as defined by Theorem 3, UAT.

Proof. Let us take the $U^{\mathrm{UAT}}$ defined in Eq. (A17),

$$R_z(2\vec\omega \cdot \vec x + 2\alpha)\, R_y(2\phi) = \begin{pmatrix} \cos(\phi)\, e^{i(\vec\omega \cdot \vec x + \alpha)} & -\sin(\phi)\, e^{i(\vec\omega \cdot \vec x + \alpha)} \\ \sin(\phi)\, e^{-i(\vec\omega \cdot \vec x + \alpha)} & \cos(\phi)\, e^{-i(\vec\omega \cdot \vec x + \alpha)} \end{pmatrix}.$$

By direct inspection it is straightforward to check that every entry in this matrix can be understood as one term of $\bar f_N$ in Eq. (11). From this definition we can infer the recursive rule that defines all steps. If

$$A_N = \langle 0 | \prod_{n=1}^{N} U^{\mathrm{UAT}}_n | 0 \rangle, \tag{A18}$$
$$B_N = \langle 1 | \prod_{n=1}^{N} U^{\mathrm{UAT}}_n | 0 \rangle, \tag{A19}$$

then the updating rule is

$$A_{N+1} = A_N \cos(\phi_{N+1})\, e^{i\vec\omega_{N+1} \cdot \vec x} e^{i\alpha_{N+1}} - B_N \sin(\phi_{N+1})\, e^{i\vec\omega_{N+1} \cdot \vec x} e^{i\alpha_{N+1}}, \tag{A20}$$
$$B_{N+1} = A_N \sin(\phi_{N+1})\, e^{-i\vec\omega_{N+1} \cdot \vec x} e^{-i\alpha_{N+1}} + B_N \cos(\phi_{N+1})\, e^{-i\vec\omega_{N+1} \cdot \vec x} e^{-i\alpha_{N+1}}. \tag{A21}$$

Having this updating rule in mind, it is possible to write

$$B_N = \sum_{m=0}^{2^N - 1} c_m(\phi_1, \ldots, \phi_N)\, e^{i\delta_m(\alpha_1, \ldots, \alpha_N)}\, e^{i\vec w_m(\vec\omega_1, \ldots, \vec\omega_N) \cdot \vec x}, \tag{A22}$$

where the inner dependencies of $c_m$ are products of sines and cosines of $\phi_n$, and those of $\delta_m$ and $\vec w_m$ are linear combinations of $\alpha_n$ and $\vec\omega_n$.

Let us proceed now as in the proof of the UAT in Ref. [3]. Let us take $S$ as the set of functions of the form $B_N(\vec x)$, and $C_{\mathbb C}(I_m)$ as the set of continuous complex-valued functions on $I_m$, defined as in Theorem 3. We assume that $S \subset C_{\mathbb C}(I_m)$ and that its closure satisfies $\bar S \neq C_{\mathbb C}(I_m)$. We can now apply Theorem 7, known as the Hahn-Banach theorem. This theorem allows us to state that there exists a linear functional $L$ acting on $C_{\mathbb C}(I_m)$ such that

$$L(S) = L(\bar S) = 0, \qquad L \neq 0. \tag{A23}$$

Notice that this theorem is applicable, as there is no restriction to working only with real numbers.

We now invoke Theorem 8, known as the Riesz representation theorem. We can write the functional $L$ as

$$L(h) = \int_{I_m} h(\vec x)\, d\mu(\vec x) \tag{A24}$$

for non-null $\mu \in M(I_m)$ and $\forall h \in C_{\mathbb C}(I_m)$. In particular,

$$L(B_N) = \int_{I_m} B_N(\vec x)\, d\mu(\vec x) = 0, \tag{A25}$$

and thus

$$\int_{I_m} e^{i\vec w_m(\vec\omega_1, \ldots, \vec\omega_N) \cdot \vec x}\, d\mu(\vec x) = 0. \tag{A26}$$

This is the usual Fourier transform of $\mu$. We can conclude by invoking Theorem 9, the Lebesgue Bounded Convergence theorem, that if $\mathrm{FT}(\mu) = 0$, then $\mu = 0$, and we come into a contradiction with the only assumption we made.

The measure of all half-planes being 0 implies that $\mu = 0$. Let us fix $\vec w$, and for a bounded measurable function $h$ we define the linear functional

$$F(h) = \int_{I_m} h(\vec w \cdot \vec x)\, d\mu(\vec x), \tag{A27}$$

which is bounded on $L^\infty(\mathbb R)$ since $\mu$ is a finite signed measure. Let $h$ be an indicator of the half-planes, $h(u) = 1$ if $u \geq -b$ and $h(u) = 0$ otherwise; then

$$F(h) = \int_{I_m} h(\vec w \cdot \vec x)\, d\mu(\vec x) = \mu(\Pi_{\vec w, b}) + \mu(H_{\vec w, b}) = 0. \tag{A28}$$

By linearity, $F(h) = 0$ for any simple function, such as a sum of indicator functions of intervals [37].

In particular, for the bounded measurable functions $s(u) = \sin(u)$ and $c(u) = \cos(u)$, with $u = \vec w \cdot \vec x$, we can write

$$F(c + is) = \int_{I_m} \exp\{i \vec w \cdot \vec x\}\, d\mu(\vec x) = 0. \tag{A29}$$

The Fourier transform of $\mu$ is therefore null, and thus $\mu = 0$.

For the sake of completeness, we now cover the three theorems required for the proof.

Theorem 7 (Hahn-Banach [38, 39]). Set $K = \mathbb R$ or $\mathbb C$. Let $V$ be a $K$-vector space with a seminorm $p : V \to \mathbb R$. If $\phi : U \to K$ is a $K$-linear functional on a $K$-linear subspace $U \subset V$ such that

$$|\phi(x)| \leq p(x) \quad \forall x \in U, \tag{A30}$$

then there exists a linear extension $\psi : V \to K$ of $\phi$ to the whole space $V$ such that

$$\psi(x) = \phi(x) \quad \forall x \in U, \tag{A31}$$
$$|\psi(x)| \leq p(x) \quad \forall x \in V. \tag{A32}$$

Theorem 8 (Riesz Representation [40]). Let $X$ be a locally compact Hausdorff space. For any positive linear functional $\psi$ on $C_c(X)$, there exists a unique regular Borel measure $\mu$ such that

$$\forall f \in C_c(X): \quad \psi(f) = \int_X f(x)\, d\mu(x). \tag{A33}$$

Theorem 9 (Lebesgue Bounded Convergence [41]). Let $\{f_n\}$ be a sequence of complex-valued measurable functions on a measure space $(S, \Sigma, \mu)$. Suppose that $\{f_n\}$ converges pointwise to a function $f$ and is dominated by some integrable function $g(x)$ in the sense

$$|f_n(x)| \leq g(x), \qquad \int_S |g|\, d\mu < \infty; \tag{A34}$$

then

$$\lim_{n \to \infty} \int_S f_n\, d\mu = \int_S f\, d\mu. \tag{A35}$$
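The updating rule of Eqs. (A20) and (A21) can be illustrated with a short numerical sketch. It is written under explicit assumptions that are not stated in this appendix: the initial state is taken to be $|0\rangle$ (so $A_0 = 1$, $B_0 = 0$), each new gate multiplies the state from the left, and the gate matrix is the one written in Eq. (A17). All function names are illustrative.

```python
import numpy as np

def uat_gate(x, omega, alpha, phi):
    # Matrix of Eq. (A17); x and omega may be scalars or vectors.
    theta = np.dot(omega, x) + alpha
    c, s = np.cos(phi), np.sin(phi)
    return np.array([
        [c * np.exp(1j * theta), -s * np.exp(1j * theta)],
        [s * np.exp(-1j * theta), c * np.exp(-1j * theta)],
    ])

def amplitudes_by_recursion(x, omegas, alphas, phis):
    # Assumed initial state |0>: A_0 = 1, B_0 = 0.
    A, B = 1.0 + 0j, 0.0 + 0j
    for w, a, p in zip(omegas, alphas, phis):
        e = np.exp(1j * (np.dot(w, x) + a))
        # Eq. (A20) for A and Eq. (A21) for B, using the old (A, B) on the right.
        A, B = ((A * np.cos(p) - B * np.sin(p)) * e,
                (A * np.sin(p) + B * np.cos(p)) * np.conj(e))
    return A, B

def amplitudes_by_product(x, omegas, alphas, phis):
    # Same amplitudes obtained by applying the gates one after another to |0>.
    state = np.array([1.0, 0.0], dtype=complex)
    for w, a, p in zip(omegas, alphas, phis):
        state = uat_gate(x, w, a, p) @ state
    return state[0], state[1]
```

Both routines should agree for any number of layers and any input dimension, and the resulting pair $(A_N, B_N)$ stays normalized because every gate is unitary.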

3. Link to output of quantum circuits

The previous sections were devoted to proving that specific series of circuits return functionalities able to represent a wide range of functions. In this last step we relate the previous results to the output of quantum circuits.

Theorem 10.

The computational basis output of a single-qubit quantum circuit can provide a convergent approximation to any desired function.

Proof. The output of a k-th approximant circuit can be cast as an approximation expansion of an arbitrary function. It is sufficient to initialize a register in the $|0\rangle$ state and measure the output in the computational basis. It follows that

$$\langle 1 | \prod_{i=0}^{N} U^{s}_i | 0 \rangle = z_N(x), \tag{A36}$$

where $z_N(x)$ can take different forms.

If the fundamental gate is $U^{F}$, then the output is the truncated Fourier series

$$z_N(x) = \sum_{n=-N}^{N} B_n e^{i\pi n x}, \tag{A37}$$

where $B_n$ are free complex coefficients. This result holds for single-variable functions.

If the fundamental gate is $U^{\mathrm{UAT}}$, then the output is a function

$$z_N(\vec x) = \sum_{m=0}^{2^N - 1} c_m(\phi_1, \ldots, \phi_N)\, e^{i\delta_m(\alpha_1, \ldots, \alpha_N)}\, e^{i\vec w_m(\vec\omega_1, \ldots, \vec\omega_N) \cdot \vec x}, \tag{A38}$$

according to Eq. (A22). This result holds for single- and multi-variable functions.

According to Theorems 5 and 6, both expressions can approximate any desired function.

Appendix B: UAT for complex functions

In this Appendix we show that the standard formulation of the UAT supports the approximation of complex functions using $e^{i(\cdot)}$ as the activation function.

Let us follow the approximations according to the UAT of the function

$$z(\vec x) = a(\vec x) + i b(\vec x), \tag{B1}$$

using trigonometric functions as $\sigma(\cdot)$,

$$a(\vec x) = \sum_{j=1}^{N} \alpha_j \cos(\vec w_j \cdot \vec x + a_j), \tag{B2}$$
$$b(\vec x) = \sum_{j=1}^{N} \beta_j \sin(\vec v_j \cdot \vec x + b_j). \tag{B3}$$

Then

$$z(\vec x) = \sum_{j=1}^{N} \alpha_j \cos(\vec w_j \cdot \vec x + a_j) + i \sum_{j=1}^{N} \beta_j \sin(\vec v_j \cdot \vec x + b_j), \tag{B4}$$

and this equation can be rearranged as

$$z(\vec x) = \sum_{j=1}^{N} \left[ \frac{\alpha_j}{2} \left( e^{i(\vec w_j \cdot \vec x + a_j)} + e^{-i(\vec w_j \cdot \vec x + a_j)} \right) + \frac{\beta_j}{2} \left( e^{i(\vec v_j \cdot \vec x + b_j)} - e^{-i(\vec v_j \cdot \vec x + b_j)} \right) \right], \tag{B5}$$

which motivates the UAT formulation for complex functions, analogous to Eq. (11),

$$G(\vec x) = \sum_{n=1}^{N} \gamma_n e^{i\delta_n} e^{i\vec u_n \cdot \vec x}. \tag{B6}$$

Appendix C: 2D functions for benchmark

The deﬁnitions used for the 2-dimensional functions[23] that serve for benchmarking our proposed algorithmsare deﬁne asHimmelblau( x, y ) == ( x + y − + ( x + y − , (C1)Brent( x, y ) == (cid:16) x (cid:17) + (cid:16) y (cid:17) + e − (cid:16) ( x − ) + ( y − ) ) (cid:17) , (C2) Himmelblau z Brentx y z Threehump x y Adjiman

FIG. 9. Graphical representation of 2-dimensional functionsutilized for benchmarking. A regularization is applied to ob-tain values between − Threehump( x, y ) == 2 (cid:18) x (cid:19) − . (cid:18) x (cid:19) + 16 (cid:18) x (cid:19) ++ (cid:18) x (cid:19) (cid:18) y (cid:19) + (cid:18) y (cid:19) , (C3)Adjiman( x, y ) = cos( x ) sin( y ) − xy + 1 , (C4)where a normalization to − ≤ f ( x, y ) ≤ Appendix D: Experimental methods

The experiment was realized in a dilution fridge with a base temperature of approximately 20 mK. The qubit rotation pulses were defined by an arbitrary waveform generator (AWG) and then upconverted with a microwave signal generator to the gigahertz frequency range before being sent to the qubit/cavity system. The signal was low-pass filtered and attenuated by a total of 50 dB before reaching the aluminum cavity. The input port of the cavity was undercoupled while the output port was overcoupled in order to maximize the readout signal amplitude. The outgoing signal was amplified by a cryogenic low-noise amplifier and a second amplification stage at room temperature. The downconversion is performed with the same microwave generator as used in the upconversion of the measurement pulse, guaranteeing phase coherence in the downconversion process. The signal is read out in a digitizer, with an FPGA that demodulates and averages the results before sending the data to the main measurement computer.

Figure 10 shows the total pulse sequence, which includes preparation and measurement pulses in addition to the pulse sequence shown in the main text. The Y rotations are performed through microwave pulses at the qubit frequency while the Z rotations, as already stated, are phase changes in subsequent pulses. An example of the rotation angles for the ReLU(x) function in the 4-layer case is shown in Table I. The readout consists of a cavity tone at the frequency of the cavity for the qubit in the |0⟩ state. High/low transmission corresponds to the qubit being in the ground/excited state, assuming the system does not escape from the computational basis. Each data point requires around 50000 measurements in order to average out the amplifier noise. A reset protocol that drives the qubit into the ground state is implemented prior to each individual sequence. This has two benefits. First, it allows us to start with a qubit state nearly polarized into the ground state. Second, it reduces the overall duration of the experiment, since the waiting time between individual measurements is not limited by the qubit relaxation time.

Both qubit and cavity pulses are generated at 70 MHz and then upconverted to the gigahertz range. The qubit pulses are Gaussian pulses with a total duration of 21 ns. A proper DRAG correction is performed with a resulting error per gate of ε = 0.01, as shown in Fig. 11. The cavity pulse has a total length of around 2 µs. The reset protocol consists of a pulse driving the qubit and a pulse driving the cavity mode, with a total duration of around 2 µs.

Optimal parameters
p_1     p_2     p_3     p_4     p_5     p_6     p_7     p_8     p_9     p_10    p_11    p_12
-2.501  1.685   1.757   2.105   3.822   -1.788  -1.507  -4.640  0.430   1.875   5.038   -1.906

Rotational angles*
Rotation   Z              Y      Z              Y      Z              Y      Z                Y
Angle      p_1 + p_2 x    p_3    p_4 + p_5 x    p_6    p_7 + p_8 x    p_9    p_10 + p_11 x    p_12
x = −…     …              …      …              …      …              …      …                …
x = 0      3.782          1.757  2.105          4.495  4.776          0.430  1.875            4.377
x = 1      5.467          1.757  5.927          4.495  0.136          0.430  0.630            4.377
* Angles between 0 and 2π

TABLE I. Optimal parameters and angles obtained for ReLU(x) and 4 layers. Above, the 12 parameters that define the rotational angles, obtained through simulations. Below, the corresponding angles of the 8 rotations for three different values of x. Note that Y rotations are not x-dependent, hence they are equal for all three x values.

FIG. 10. Complete pulse sequence. First, the reset protocol is performed, which corresponds to two pulses at the cavity and the qubit frequencies, respectively. Note that the qubit pulse is of considerably lower amplitude than the cavity pulse. Also, both pulses have a longer duration than the qubit rotation sequence (timings not to scale). The "Y" pulses are shown to have different amplitudes to determine each rotation angle. Finally, the readout corresponds to a pulse at the cavity frequency which is later read out by a digitizing card.

[Figure 11 plot: vertical axis V (mV). Legend: DRAG, DRAG fit, No DRAG, No DRAG fit.]

FIG. 11. Randomized benchmarking of the DRAG correctedpulses. The ﬁt corresponds to the expression Ap n + B , where A and B have dimensions of voltage, n is the number of Clif-ford gates, and p is the ﬁdelity per gate. (cid:15) = 1 − pp
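The rotation angles listed in Table I can be reproduced from the twelve optimal parameters with a few lines of code. This is a minimal sketch rather than the authors' implementation: it assumes the parameters are grouped three per layer as (Z offset, Z slope multiplying x, Y angle) and that angles are wrapped into the interval between 0 and 2π, a reading that matches the tabulated rows for x = 0 and x = 1. The function name is illustrative.

```python
import numpy as np

# Optimal parameters for ReLU(x) with 4 layers, copied from Table I.
params = [-2.501, 1.685, 1.757, 2.105, 3.822, -1.788,
          -1.507, -4.640, 0.430, 1.875, 5.038, -1.906]

def rotation_angles(params, x):
    """Return the Z, Y, Z, Y, ... angles for input x, wrapped to [0, 2*pi)."""
    angles = []
    for k in range(0, len(params), 3):
        offset, slope, y_angle = params[k:k + 3]
        angles.append((offset + slope * x) % (2 * np.pi))  # x-dependent Z rotation
        angles.append(y_angle % (2 * np.pi))               # x-independent Y rotation
    return np.round(angles, 3)

print(rotation_angles(params, 0.0))
print(rotation_angles(params, 1.0))
```

Only the Z angles change with x, which reflects the remark in the caption of Table I that the Y rotations are equal for all values of x.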