A Scalable Photonic Computer Solving the Subset Sum Problem
Xiao-Yun Xu, Xuan-Lun Huang, Zhan-Ming Li, Jun Gao, Zhi-Qiang Jiao, Yao Wang, Ruo-Jing Ren, H. P. Zhang, Xian-Min Jin
1 Center for Integrated Quantum Information Technologies (IQIT), School of Physics and Astronomy and State Key Laboratory of Advanced Optical Communication Systems and Networks, Shanghai Jiao Tong University, Shanghai 200240, China
2 CAS Center for Excellence and Synergetic Innovation Center in Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
3 School of Physics and Astronomy, Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai 200240, China
4 Institute for Quantum Science and Engineering and Department of Physics, Southern University of Science and Technology, Shenzhen 518055, China
The subset sum problem is a typical NP-complete problem that is hard to solve efficiently in time due to its intrinsic superpolynomial-scaling property. Increasing the problem size results in a vast amount of time being consumed on conventionally available computers. Photons possess the unique features of extremely high propagation speed, weak interaction with the environment and low detectable energy level, and therefore can be a promising candidate to meet the challenge by constructing a photonic computer. However, most optical computing schemes, like Fourier transformation, require very high operation precision and are hard to scale up. Here, we present a chip-built-in photonic computer to efficiently solve the subset sum problem. We successfully map the problem into a waveguide network in three dimensions by using the femtosecond laser direct writing technique. We show that the photons are able to sufficiently dissipate into the network and search all the possible paths for solutions in parallel. In the case of successive primes, the proposed approach exhibits a dominant superiority in time consumption even compared with supercomputers. Our results confirm the ability of light to realize a complicated computational function that is intractable with conventional computers, and suggest the subset sum problem as a good benchmarking platform for the race between photonic and conventional computers on the way towards "photonic supremacy".
Introduction
NP-complete problems [1] are typically defined as the problems solvable in polynomial time on a non-deterministic Turing machine (NTM), which indicates such problems are computationally hard on conventional electronic computers, a general type of deterministic Turing machine. The subset sum problem (SSP), with practical application in resource allocation [2], is a benchmark NP-complete problem [3], and its intractability has been harnessed in cryptosystems resistant to quantum attacks [4, 5]. Given a finite set S of N integers, the SSP asks whether there is a subset of S whose sum is equal to the target T. Apparently, the number of subsets grows exponentially with the problem size N, which leads to an exponential time scaling and thus strongly limits the size of the problem that can be tackled in reality.

Despite the immense difficulty, some researchers attempt to solve NP-complete problems in polynomial time with polynomial resources. A memcomputing machine [6, 7] as powerful as an NTM has been demonstrated, although the ambitious claim is not valid in a realistic environment with inevitable noise [8]. Designs of an NTM, where the magical oracles [1, 9, 10] are realized by simultaneously exploring all computation paths, have been proposed [11, 12]. Though at the cost of space or material, parallel exploration provides an alternative way to decrease time consumption. As time is irreversible, not reusable and completely out of our control, it is reasonable to trade physical resources for it. Besides the above NTM proposals, similar measures have been taken, for instance, in increasingly powerful electronic supercomputers integrating an increasing number of processors [13], and in molecule-based computation utilizing large quantities of DNAs or motor molecules [14-18].

∗ [email protected]
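For reference, the exhaustive check that makes the SSP exponentially hard on a deterministic machine can be sketched in a few lines (an illustrative Python sketch, not code from this work; the example set is arbitrary):

```python
from itertools import chain, combinations

def subset_sum(S, T):
    """Brute-force decision procedure for the SSP.

    Enumerates all 2**N subsets of S, so the running time grows
    exponentially with the problem size N = len(S).
    """
    all_subsets = chain.from_iterable(
        combinations(S, r) for r in range(len(S) + 1))
    for subset in all_subsets:
        if sum(subset) == T:
            return True, subset  # a witness subset
    return False, None

exists, witness = subset_sum([2, 3, 5, 7], 10)
print(exists, witness)  # → True (3, 7)
```

Every deterministic strategy of this kind must, in the worst case, touch an exponential number of subsets; the photonic network described below explores all of them simultaneously instead.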
Furthermore, optimized algorithms have been applied to specific instances [19-21]. Though improvements have been made, conventional electronic computers are ultimately limited by the heat dissipation problem [16], which is also a possible limitation for memcomputing machines consisting of commercial electronic devices [8]. Molecule-based computation is limited by the slow movement [16-18] or the long reaction process [14, 15]. Quantum computation is still hindered by decoherence and scalability [22, 23]. Other proposals remain at the stage of theory [11, 12, 24-26]. However, we notice that photons have been extensively applied in proof-of-principle demonstrations of supercomputing [27] even without quantum speed-up, including NP problems such as prime factorization [28] and NP-complete problems such as the travelling salesman problem [29], the Hamiltonian path problem [30-32] and the dominating set problem [33].

FIG. 1:
Schematic of the design and setup. (a) A power-adjustable and horizontally polarized optical source is guaranteed by the quarter-wave plate (QWP), half-wave plate (HWP) and polarization beam splitter (PBS) in the input unit. The photons at 810 nm are prepared and coupled into the network in the processing unit, then travel to generate all possible subset sums. The evolution results at the output ports are retrieved by the CCD to testify to the existence of the corresponding sums. (b) The abstract network for the specific instance { , , , } is composed of three different kinds of nodes, representing split junctions, pass junctions and converge junctions, respectively. Split junctions (hexagonal nodes) divide the stream of photons into two portions: one portion moves vertically and the other travels diagonally. Pass junctions (circular white nodes) allow the photons to proceed along their initial directions. Converge junctions (circular yellow nodes) play a role in transferring photons from diagonal lines to vertical lines. Though the circular yellow nodes overlap with the hexagonal nodes in the abstract network, they are physically separate, as shown in (a). Photons travelling diagonally from a split junction to the next split junction represent the inclusion of an element in the summation. The value of the element is equal to the number of junctions between two subsequent rows of split junctions, as denoted by the integers on the left. The generated subset sums are equal to the spatial positions of the output signals, as the port numbers denote. (c) The X-Y view of the top left corner of the waveguide network in (a) and the abstract network in (b), composed of the three basic junctions whose X-Z views are shown in (d), (e) and (f), respectively. The split junction is realized by a modified 3D beam splitter where a coupling distance of 10 µm, a coupling length of 1.8 mm and a vertical decoupling distance of 25 µm are deliberately selected, leading to a desirable splitting ratio. The unbalanced output of split junctions, revealed by the intensity distribution in (d), is designed to compensate for the bending loss caused by the subsequent arc_cm and arc_nf in (c). The converge junction is almost a mirror-image split junction except for a different coupling length of 3.3 mm. The residual in port g is small enough to be ignored. A vertical decoupling distance of 25 µm guarantees an excellent pass junction whose extinction ratio is around 24 dB, as the intensity distribution in (f) presents.

In this work, photonic chips serve as processing units to solve the SSP in a physically scalable fashion. Like the current signal in electronic computers or the molecule in molecular computers, the photons contained in the optical source are treated as individual computation carriers. They travel in chips along buried waveguide networks to perform parallel computations. The specific instances of the problem are successfully encoded into the networks according to particular rules. The existence of target sums is judged by the arrival of photons at the corresponding output ports of the networks. We further investigate the scalability and performance in time consumption, showing the photon-enabled advantages.

Results
A. Configuration of the photonic computer for the SSP
The proposed photonic computer solving the SSP can be classified as a non-Von Neumann architecture; see the Supplementary Materials for its role in the evolution of computers. As shown in Fig. 1(a), the photonic computer consists of an input unit, a processing unit and an output unit. The input unit is employed to generate horizontally polarized photons at 810 nm. Photons are then coupled into the processing unit and dissipate into the waveguide network to execute the computation task. After the photons emerge from the processing unit, the evolution results are read out by the output unit. Here the processing unit is analogous to the CPU of an electronic computer, playing a key role in the computation. In the following, we discuss the design of the processing unit from the mathematical and physical-implementation aspects to illuminate its capability of solving the SSP.

The processing unit can be represented by an abstract network composed of nodes and lines, which is primarily based on the proposal of Dan V. Nicolau Jr. et al. [16], while the physical implementation has to be designed to fit integrated photonics. As the network for the specific instance { , , , } in Fig. 1(b) shows, there are three different types of nodes, representing split junctions, pass junctions and converge junctions, respectively. It should be noticed that though the circular yellow nodes overlap with the hexagonal nodes in the abstract network, they are physically separate, as the waveguide network in Fig. 1(a) presents. Once the photons enter the network from the top node, the computation process is activated. The photons are split into two portions at hexagonal nodes (split junctions), travelling vertically and diagonally. When meeting the circular white nodes (pass junctions), the photons proceed along their original directions.
Meanwhile, the circular yellow nodes (converge junctions), located at the end of the diagonal routes which start from the former row of hexagonal nodes, are responsible for transferring photons from diagonal lines to vertical lines before the next splits.

The specific SSP is encoded into the network according to particular arithmetical and scalable rules: (i) The vertical distance (measured as the number of nodes) between two subsequent rows of hexagonal nodes is equal to the value of the corresponding element of the set, as denoted by the integers on the left. (ii) The diagonal routing leads to a horizontal displacement of the photons, whose magnitude is also equal to the integer on the left. The diagonal movement of photons represents the inclusion of the corresponding element in the summation; on the contrary, the vertical movement means the element is excluded from the summation. (iii) The values of the ultimate sums are equal to the spatial positions of the output signals, as denoted by the port numbers. For example, the path highlighted by a translucent gray band reveals that only two of the elements contribute to the corresponding subset sum. Owing to the vast parallelism, the photons arrive at the output ports with all possible subset sums generated.

We fabricate the processing unit in Corning Eagle XG glass with the femtosecond laser direct writing technique (see Materials and Methods). The top left corner of the waveguide network in Fig. 1(a) and the abstract network in Fig. 1(b) is depicted in detail in Fig. 1(c)-1(f). As we can see, the split junction is realized by a modified 3D beam splitter where the two waveguides first couple evanescently (red segment) and then decouple, with one of the waveguides climbing upward and the other proceeding along the initial direction. To avoid extra loss, a vertical decoupling distance of 25 µm is deliberately selected. Meanwhile, the coupling length and coupling distance are set to 1.8 mm and 10 µm, respectively, to achieve the desirable splitting ratio. As the intensity distribution in Fig. 1(d) reveals, the modified beam splitter is deliberately unbalanced, in order to compensate for the bending loss caused by the subsequent arc_cm and arc_nf. The converge junction is almost the mirror image of the split junction except for a different coupling length of 3.3 mm. The photons in path f-g should be completely transferred to path e-h in the ideal case, and the residual induced by imperfect fabrication is minimized: the output intensity at port g is only a small fraction of that at port h. The pass junction is implemented in the form of one waveguide crossing over the other at a vertical decoupling distance of 25 µm. As the output intensity in Fig. 1(f) reveals, the three-dimensional architecture ensures an excellent pass junction whose extinction ratio is around 24 dB.
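The encoding rules (i)-(iii) can be mimicked numerically: under the ideal lossless, balanced-junction model used later as the theoretical benchmark, each row of split junctions halves the intensity, and a path exits at the port whose number equals its subset sum. A minimal sketch (illustrative Python; the element values here are arbitrary examples, not the experimental set):

```python
from collections import Counter

def output_intensities(S):
    """Ideal lossless model of the waveguide network for the set S.

    Each split junction divides the light evenly, so each of the 2**N
    subset-selection paths carries intensity 2**-N; a path exits at the
    port whose number equals the sum of the included elements.
    """
    sums = [0]
    for element in S:
        # at each row of split junctions, a path either stays vertical
        # (element excluded) or shifts diagonally by `element` (included)
        sums = sums + [s + element for s in sums]
    counts = Counter(sums)
    return {port: n / 2 ** len(S) for port, n in sorted(counts.items())}

# A set with distinct subset sums lights up 2**3 = 8 ports,
# each at the theoretical intensity 1/8 = 0.125:
print(output_intensities([1, 2, 4]))
```

A port carries non-zero intensity exactly when some subset sums to its position, which is the read-out rule used in the experiment below.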
Supported by the powerful fabrication capability of the femtosecond laser writing technique [43, 44], we are able to map the abstract network of the SSP into a three-dimensional photonic chip.

B. Experimental demonstration
We demonstrate the computation of the SSP for the specific cases { , , } and { , , , }. As shown in Fig. 2(a) and Fig. 2(c), the evolution results are read out from a one-shot image, where the photons appear as a line of spots. Every single spot is an accepted witness of the existence of the corresponding sum (denoted by the integer below the spot) if the experiments are trusted. Since the problem size involved is not too large, we check the reliability of our experimental results by enumeration and conclude that all the observed spots are supposed to appear and that none of the expected results is absent.

FIG. 2:
Experimental read-out of computing results. (a) Experimental read-out of the evolution results of the case { , , }. Every observable spot certifies the existence of the subset sum denoted by the integer below. (b) Normalized intensity distribution of the case { , , } in experiment and theory. Here an axis break is applied to display data points with a value of zero and the logarithmic coordinate simultaneously. The theoretical results are either zero or 0.125, while the experimental results have a fluctuating distribution. A reasonable threshold can easily be found to classify the experimental outcomes into appearance (beyond the threshold) and absence (below the threshold, highlighted with a slash filling pattern). A wide tolerance band (filled by slashes) allows a wide range of thresholds, with a lower bound of 0.00209 and an upper bound of 0.05891. (c) Experimental read-out of the evolution results of the case { , , , }. (d) Normalized intensity distribution of the case { , , , } in experiment and theory. Theoretical results are either zero or 0.0625. A wide tolerance band (filled by slashes) allows a wide range of thresholds, with a lower bound of 0.00127 and an upper bound of 0.00661.

The reliability of our experiments is further investigated by a comprehensive analysis of the intensity distribution, as presented in Fig. 2(b) and Fig. 2(d). We calculate the theoretical distribution through a lossless model consisting of balanced split junctions, perfect pass junctions and ideal converge junctions. The theoretical outcomes can therefore be regarded as benchmarks of the SSPs. For the case of { , , }, the theoretical result is either zero or 0.125, while it is either zero or 0.0625 in the case of { , , , }. In this theoretical regime, zero intensity indicates that a sum does not exist; otherwise it exists. We apply a threshold to analyze the retrieved intensity for every output port.
A valid appearance can be identified if the intensity goes beyond a reasonable threshold; otherwise an absence can be confirmed (highlighted by the slash pattern). The tolerant intervals of the thresholds applicable in our experiment are presented as slash-filled bands in Fig. 2(b) and Fig. 2(d), straightforwardly revealing the lower and upper bounds. Benefiting from the good signal-to-noise ratio obtained in our experiments, there is a wide tolerance band accepting a large range of thresholds, which implies the high accuracy of our experiments and verifies the feasibility of our approach.

C. Time-consumption budget
Interestingly, we find that the optical source launched into the photonic circuit has a significant influence on the performance of our photonic computer. It should be noticed that the photonic supremacy in time consumption over other schemes is achieved with classical light (a stream of photons), not quantum light. We obtain the same evolution results with both classical light and quantum light, and the heralded single-photon source fails to outperform the classical light. This phenomenon is attributed to the fact that a bunch of photons arrive together in the case of classical light, while a heralded single-photon source only launches one photon at a time. Under such circumstances, it takes longer with quantum light to collect enough signal photons to be distinguished from the environment, leading to a worse performance than classical light and making it more challenging to surpass electronic computers (see Supplementary Materials).

FIG. 3: Time consumption performance. The comparison of estimated computing time between the photonic computer and other competitors in the case of successive primes {2, 3, 5, 7, . . . }. The molecular computation is beaten by the photonic computer at all problem sizes. The electronic competitors working in a brute-force manner are surpassed at N = 6, N = 12 and N = 28, respectively. As the problem size increases, the superiority of the photonic computer is enhanced, with computing times several orders of magnitude shorter than the rivals'.

To show the photon-enabled advantages, we further investigate the time-consumption performance in the case of classical light. Here the computing time is determined by the propagation speed of photons and the longest path in the waveguide network. Owing to the fast movement of flying photons and the compactness of the chip-based networks, it only takes the processing units a fraction of one nanosecond to accomplish the computations in our experiments, which already surpasses many representative electronic computers of recent decades (see Supplementary Materials). Furthermore, the potential of our approach is explored in the context of successive primes by comparing with other competitors (see Materials and Methods for the time estimation of the different approaches), as shown in Fig. 3.
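The scaling comparison of Fig. 3 can be reproduced qualitatively with a toy model: the photonic time grows with the longest path (proportional to the sum of the first N primes), while the brute-force electronic time grows as 2^N. The junction pitch, waveguide speed and FLOPS figure below are round assumed numbers chosen for illustration, not the paper's exact parameters:

```python
def primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    ps, candidate = [], 2
    while len(ps) < n:
        if all(candidate % p for p in ps):
            ps.append(candidate)
        candidate += 1
    return ps

def photonic_time(n, pitch=4e-3, v=2e8):
    """Longest-path travel time: the path traverses sum(primes) junction
    rows; `pitch` (m) is an assumed row spacing and v ~ c/1.5 m/s is the
    speed of light in glass."""
    return sum(primes(n)) * pitch / v

def brute_force_time(n, flops=200e15):
    """Sequential check of 2**n subsets, ~n additions each, on a
    Summit-class (~200 PFLOPS) machine at peak performance."""
    return (2 ** n) * n / flops

def crossover(pitch=4e-3, v=2e8, flops=200e15):
    """Smallest N at which the photonic toy model becomes faster."""
    n = 1
    while photonic_time(n, pitch, v) >= brute_force_time(n, flops):
        n += 1
    return n

print(crossover())  # with these assumed parameters the toy model
                    # crosses over in the same region as Fig. 3
```

Because the photonic time grows only sub-exponentially while the brute-force time doubles with every added element, the crossover point exists for any choice of (reasonable) constants; the constants only shift where it lies.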
It is noticed that the photonic computer has a significant advantage over the molecular computation, which is attributed to the similar time scaling resulting from the similar configuration of the computing networks, and to the superiority of photons over molecules in moving speed, i.e., ∼2 × 10^11 mm/s for 810 nm photons in waveguides versus ∼10^-2 mm/s for actin filaments [16]. Though faster biological molecules have been reported in a recent study [18], they are still on a long journey of chasing after photons.

The time consumption of representative electronic competitors with the conventional Von Neumann architecture, characterized by floating point operations per second (FLOPS) [45], is also presented. It is found that the photonic computer outperforms the state-of-the-art CPU [46] at a small size (N = 6) that is probably accessible in subsequent experimental demonstrations. Compared with the GPU [47], the photonic computer does not exceed it until N = 12. Apparently, it is increasingly challenging to beat an increasingly strong competitor. Nevertheless, the most powerful supercomputer [13], Summit, composed of an enormous number of CPUs and GPUs, can also be surpassed, at a modest size of N = 28. Besides, the superiority of the photonic computer is reinforced with the growth of the problem size, as the trend reveals. Even at a medium size, our approach consumes computing time many orders of magnitude shorter than the molecular and electronic rivals, exhibiting strong competitiveness in solving the SSP in the case of successive primes (see Materials and Methods for the speed-up of our photonic computer).

Discussion
In summary, we demonstrate a photonic computer solving the SSP by mapping the problem into a waveguide network in a three-dimensional architecture. With the demonstrated standardized structure of basic junctions, the regular configuration of the network and the mature femtosecond laser writing technique, the SSP can be encoded into a physical network and conveniently solved in a scalable fashion. The computational power is further analyzed by investigating the time-consumption performance. The results suggest that, for successive primes, photonic computers are very likely to beat the most powerful supercomputer at a problem size accessible in the near future. Other performance metrics, such as signal-to-noise ratio and Fisher information, are also discussed (see Supplementary Materials).

The photon-enabled advantage in solving the SSP can be understood from the unique features of light. First, light is essentially a stream of photons, which can be sufficiently employed to probe all the paths in parallel by being dissipated into a large network with a very small fraction of light in each path (down to the single-photon level). Second, the ultimate speed of flying photons makes the evolution time very short in the designed structures, even for a large and complicated photonic network. Third, photons can be confined in a very limited space with the techniques of integrated photonics, which is beneficial to both the computing speed and scalability. Last but not least, interference is a unique strength of photons, although we do not see its contribution to the speed-up of the proposed photonic computer.
Nevertheless, it can potentially be utilized to achieve a reconfigurable photonic computer for different SSPs in the future (see Supplementary Materials).

Besides the fundamental interest of racing with conventional electronic computers, it would be even more fascinating to map real-life problems into the frame of solving the SSP, which may boost the building of such photonic computers towards industrialization. It is also possible, but still open, to solve other NP problems on this purpose-built photonic computer. In light of the fact that any NP problem can be reduced to an NP-complete problem efficiently [3], any NP problem can in principle be mapped to the proposed network. Therefore, a photonic solution of the SSP implies possible solutions of a wide range of NP problems. Moreover, the photon-enabled unique features may also show their strength in other new computing architectures [48, 49].
Materials and Methods
Photonic chips fabrication.
Waveguide networks in a three-dimensional architecture are written by a femtosecond laser with a repetition rate of MHz, a central wavelength of nm, a pulse duration of fs and a pulse energy of nJ. Before radiating into the borosilicate substrate at a depth of µm, the laser beam is shaped by a cylindrical lens and then focused by a × objective with a numerical aperture of . During fabrication, the translational stage moves in the X, Y and Z directions according to the user-defined programme at a constant speed of mm/s. Careful measurements and characterization of the geometric-parameter dependence of the three types of junction, covering coupling length, coupling distance, decoupling distance and curvature, are carried out to optimize the performance and form the standard elements.

Estimation of computing time.
For both the molecular computation and our approach, the computing time is determined by the moving speed of the computation carrier (i.e., molecules or photons) and the longest path in the network. For example, in the case of { , , , } shown in Fig. 1, the longest path is the one linking to port 23, which represents the sum of all the elements in the set. According to the geometrical parameters and the scaling rules of our waveguide network, it is easy to calculate the length of the longest path. The propagating speed of photons is estimated on the basis of the refractive index of Corning Eagle XG [50] and the refractive-index change induced by femtosecond laser writing [51]. The structural parameters of the molecular computation derive from the experiment by Dan V. Nicolau Jr. et al. [16]. The faster molecules, actin filaments, are chosen for comparison with our approach.

The running time taken by conventional electronic computers to solve the SSP in a brute-force mode, searching the entire solution space consisting of all possible subsets, is estimated by dividing the total number of arithmetic operations by the FLOPS. The FLOPS data adopted in our research are either the peak performance or the theoretical performance of the corresponding electronic machine. Performance degradation [46] in practical scenarios is neglected.

Speed-up of the photonic computer.
Given a set of N elements, the number of subsets grows exponentially with N. According to the definition of the SSP, every possible subset must be verified. If we regard the verification of a subset as a subtask, the number of subtasks, i.e., the number of computation operations, increases at an exponential rate.

For a conventional electronic computer working sequentially, all subtasks are executed in sequence. Therefore, the total computing time is equivalent to the product of the number of computation operations and the unit time taken by a single operation, growing at an exponential rate. For our photonic computer, which works in a parallel mode, all subtasks can be executed simultaneously. In our implementation, each subset is mapped to a path of the photonic circuits. With the light beam (a stream of photons) being split and propagating along all possible paths, all subsets are verified at the same time. In this case, the total computing time only depends on the verification of the largest subset.

Here the verification of the largest subset corresponds to the movement of photons from the input port to the output port through the longest path. As a result, the computing time is equal to the travelling time of photons along the longest path, growing at a sub-exponential rate, slower than that of electronic computers. Moreover, as photons possess an ultra-high propagating speed and the integrated photonic circuit has a compact structure, the computing process is further sped up.

Supplementary Materials
I. The evolution of computers and the role of the non-Von Neumann architecture
II. The influence of the optical source on time consumption
III. Time-consumption performance
IV. Signal-to-noise ratio
V. Fisher information
VI. The role of interference
FIG. S1: The role of the non-Von Neumann architecture.
FIG. S2: Time-consumption performance.
FIG. S3: Signal-to-noise ratio.
Acknowledgments
Funding:
This research is supported by the National Key R&D Program of China (2019YFA0308700, 2017YFA0303700), the National Natural Science Foundation of China (61734005, 11761141014, 11690033, 11774222), the Science and Technology Commission of Shanghai Municipality (17JC1400403), and the Shanghai Municipal Education Commission (2017-01-07-00-02-E00049). X.-M.J. acknowledges additional support from a Shanghai talent program.
Author contributions:
X.-M.J. and H.-P.Z. conceived the project. X.-M.J. supervised the project. X.-Y.X. and X.-M.J. designed the experiment. X.-Y.X. fabricated the photonic chips. X.-Y.X., X.-L.H., Z.-M.L., J.G., Z.-Q.J., Y.W., R.-J.R. and X.-M.J. performed the experiment and analyzed the data. X.-Y.X. and X.-M.J. wrote the paper with input from all the other authors.
Competing interests:
The authors declare that they have no competing interests.
Data and materials availability:
All data needed to evaluate the conclusions in the paper are present in the paper and the Supplementary Materials. Additional data are available from the authors upon request.

[1] M. R. Garey, D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman, New York, 1979).
[2] A. Darmann, G. Nicosia, U. Pferschy, J. Schauer, The Subset Sum game. Eur. J. Oper. Res., 539-549 (2014).
[3] R. M. Karp, "Reducibility among combinatorial problems" in Complexity of Computer Computations (The IBM Research Symposia Series, Springer, Boston, MA, 1972), pp. 85-103.
[4] T. Okamoto, K. Tanaka, S. Uchiyama, "Quantum public-key cryptosystems" in Advances in Cryptology-CRYPTO 2000 (Springer, Berlin, Heidelberg, 2000), pp. 147-165.
[5] A. Kate, I. Goldberg, Generalizing cryptosystems based on the subset sum problem. Int. J. Inf. Secur., 189-199 (2011).
[6] F. L. Traversa, M. Di Ventra, Universal memcomputing machines. IEEE Transactions on Neural Networks and Learning Systems, 2702-2715 (2015).
[7] M. Di Ventra, Y. V. Pershin, The parallel approach. Nat. Phys., 200-202 (2013).
[8] F. L. Traversa, C. Ramella, F. Bonani, M. Di Ventra, Memcomputing NP-complete problems in polynomial time using polynomial resources and collective states. Sci. Adv., e1500031 (2015).
[9] A. Church, Review: A. M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2 s. vol. 42 (1936-1937), pp. 230-265. Journal of Symbolic Logic, 42-43 (1937).
[10] J. W. Dawson Jr., Review: The Essential Turing: Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life plus The Secrets of Enigma, by Alan M. Turing (author) and B. Jack Copeland (editor). Rev. Mod. Log., 179-181 (2007).
[11] A. Currin, K. Korovin, M. Ababi, K. Roper, D. B. Kell, P. J. Day, R. D. King, Computing exponentially faster: implementing a non-deterministic universal Turing machine using DNA. J. R. Soc. Interface. Jpn. J. Appl. Phys., 5839-5841 (1998).
[15] C. V. Henkel, T. Bäck, J. N. Kok, G. Rozenberg, H. P. Spaink, DNA computing of solutions to knapsack problems. Biosystems, 156-162 (2007).
[16] D. V. Nicolau Jr., M. Lard, T. Korten, F. C. M. J. M. van Delft, M. Persson, E. Bengtsson, A. Månsson, S. Diez, H. Linke, D. V. Nicolau, Parallel computation with molecular-motor-propelled agents in nanofabricated networks. Proc. Natl. Acad. Sci. U.S.A., 2591-2596 (2016).
[17] G. Heldt, Ch. Meinecke, S. Steenhusen, T. Korten, M. Groß, G. Domann, F. Lindberg, D. Reuter, St. Diez, H. Linke, St. E. Schulz, "Approach to combine electron-beam lithography and two-photon polymerization for enhanced nano-channels in network-based biocomputation devices" in (SPIE, 2018), vol. 10775.
[18] F. C. M. J. M. van Delft, G. Ipolitti, D. V. Nicolau Jr., A. Sudalaiyadum Perumal, O. Kašpar, S. Kheireddine, S. Wachsmann-Hogiu, D. V. Nicolau, Something has to give: scaling combinatorial computing by biological agents exploring physical networks encoding NP-complete problems. Interface Focus, 20180034 (2018).
[19] E. Horowitz, S. Sahni, Computing partitions with applications to the knapsack problem. J. ACM, 277-292 (1974).
[20] D. Pisinger, Linear time algorithms for knapsack problems with bounded weights. Journal of Algorithms, 1-14 (1999).
[21] K. Koiliaris, C. Xu, "A faster pseudopolynomial time algorithm for subset sum" in Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SIAM, 2017), pp. 1062-1072.
[22] W. L. Chang, T. T. Ren, M. Feng, L. C. Lu, K. W. Lin, M. Guo, "Quantum algorithms of the subset-sum problem on a quantum computer" in (IEEE, 2009), pp. 54-57.
[23] T. D. Ladd, F. Jelezko, R. Laflamme, Y. Nakamura, C. Monroe, J. L. O'Brien, Quantum computers. Nature, 45-53 (2010).
[24] M. Oltean, O. Muntean, Solving the subset-sum problem with a light-based device. Nat. Comput., 321-331 (2009).
[25] M. Hasan, S. Hossain, M. M. Rahman, M. S. Rahman, Solving the generalized subset sum problem with a light based device. Nat. Comput., 541-550 (2011).
[26] A. G. Rudi, S. Jalili, A parallel optical implementation of arithmetic operations. Optics & Laser Technology, 173-182 (2013).
[27] H. J. Caulfield, S. Dolev, Why future supercomputing requires optics. Nat. Photon., 261-263 (2010).
[28] K. Nitta, N. Katsuta, O. Matoba, An optical parallel system for prime factorization. Jpn. J. Appl. Phys., 09LA02 (2009).
[29] N. T. Shaked, S. Messika, S. Dolev, J. Rosen, Optical solution for bounded NP-complete problems. Appl. Opt., 711-724 (2007).
[30] K. Wu, J. G. de Abajo, C. Soci, P. P. Shum, N. I. Zheludev, An optical fiber network oracle for NP-complete problems. Light Sci. Appl., e147 (2014).
[31] M. R. Vázquez, V. Bharadwaj, B. Sotillo, S. Z. A. Lo, R. Ramponi, N. I. Zheludev, G. Lanzani, S. M. Eaton, C. Soci, Optical NP problem solver on laser-written waveguide platform. Opt. Express, 702-710 (2018).
[32] S. Dolev, H. Fitoussi, Masking traveling beams: optical solutions for NP-complete problems, trading space for time. Theoretical Computer Science, 837-853 (2010).
[33] S. Goliaei, S. Jalili, J. Salimi, Light-based solution for the dominating set problem. Appl. Opt., 6979-6983 (2012).
[34] M. Tillmann, B. Dakić, R. Heilmann, S. Nolte, A. Szameit, P. Walther, Experimental boson sampling. Nat. Photon., 540-544 (2013).
[35] J. B. Spring, B. J. Metcalf, P. C. Humphreys, W. S. Kolthammer, X. M. Jin, M. Barbieri, A. Datta, N. Thomas-Peter, N. K. Langford, D. Kundys, J. C. Gates, B. J. Smith, P. G. R. Smith, I. A. Walmsley, Boson sampling on a photonic chip. Science, 798-801 (2013).
[36] M. A. Broome, A. Fedrizzi, S. Rahimi-Keshari, J. Dove, S. Aaronson, T. C. Ralph, A. G. White, Photonic boson sampling in a tunable circuit. Science, 794-798 (2013).
[37] A. Crespi, R. Osellame, R. Ramponi, D. J. Brod, E. F. Galvão, N. Spagnolo, C. Vitelli, E. Maiorino, P. Mataloni, F. Sciarrino, Integrated multimode interferometers with arbitrary designs for photonic boson sampling. Nat. Photon., 545-549 (2013).
[38] J. Carolan, J. D. A. Meinecke, P. J. Shadbolt, N. J. Russell, N. Ismail, K. Wörhoff, T. Rudolph, M. G. Thompson, J. L. O'Brien, J. C. F. Matthews, A. Laing, On the experimental verification of quantum complexity in linear optics. Nat. Photon., 621-626 (2014).
[39] J. Carolan, C. Harrold, C. Sparrow, E. Martín-López, N. J. Russell, J. W. Silverstone, P. J. Shadbolt, N. Matsuda, M. Oguma, M. Itoh, G. D. Marshall, M. G. Thompson, J. C. F. Matthews, T. Hashimoto, J. L. O'Brien, A. Laing, Universal linear optics. Science, 711-716 (2015).
[40] J. Feldmann, M. Stegmaier, N. Gruhler, C. Ríos, H. Bhaskaran, C. D. Wright, W. H. P. Pernice, Calculating with light using a chip-scale all-optical abacus. Nat. Commun., 1256 (2017).
[41] H. Tang, X. F. Lin, Z. Feng, J. Y. Chen, J. Gao, K. Sun, C. Y. Wang, P. C. Lai, X. Y. Xu, Y. Wang, L. F. Qiao, A. L. Yang, X. M. Jin, Experimental two-dimensional quantum walk on a photonic chip. Sci. Adv., eaat3174 (2018).
[42] H. Tang, C. Di Franco, Z. Y. Shi, T. S. He, Z. Feng, J. Gao, K. Sun, Z. M. Li, Z. Q. Jiao, T. Y. Wang, M. S. Kim, X. M. Jin, Experimental quantum fast hitting on hexagonal graphs. Nat. Photon., 754-758 (2018).
[43] A. Szameit, F. Dreisow, T. Pertsch, S. Nolte, A. Tünnermann, Control of directional evanescent coupling in fs laser written waveguides. Opt. Express, 1579-1587 (2007).
[44] R. Osellame, G. Cerullo, R. Ramponi, Femtosecond Laser Micromachining: Photonic and Microfluidic Devices in Transparent Materials (vol. 123 of Topics in Applied Physics, Springer, Berlin, Heidelberg, 2012).
[45] FLOPS, https://en.wikipedia.org/wiki/FLOPS.
[46] P. Gepner, D. L. Fraser, V. Gamayunov, "Evaluation of the 3rd generation Intel Core processor focusing on HPC applications" in
Proceedings of the International Conference on Parallel andDistributed Processing Techniques and Applications
J. Opt. Soc. Am. B ,1629-1636 (2019).[49] L. M. Caligiuri, T. Musha, Quantum hyper-computing bymeans of evanescent photons. J. Phys.: Conf. Ser.
Opt. Express , 9443-9458 (2008).[52] F. Yang, R. Nair, M. Tsang, C. Simon, A. I. Lvovsky, Fisher in-formation for far-field linear optical superresolution via homo-dyne or heterodyne detection in a higher-order local oscillatormode. Phys. Rev. A , 063829 (2017). Supplemental Materials: A Scalable Photonic Computer Solving the Subset Sum ProblemSupplementary Note 1: The evolution of computers and the role of non-Von Neumann architecture
The revolution brought by advanced electronic computers is prevalent in modern life, involving education, finance, transportation, scientific research, medical care and so on. However, such scenarios were hard to imagine in the days when people calculated with manual abacuses or electromechanical devices. It is the development of computing devices that has improved human computing power, leading to higher working efficiency and propelling the prosperity of society.

Looking back at the history of electronic computers, they have evolved from crude models into sophisticated machines. The first general-purpose electronic computer was built with vacuum tubes in the 1950s and weighed 30 tons. Electronic computers then developed with the emergence of transistors and thrived with the innovation of integrated circuits over the past decades, following Moore's law towards higher performance and more compact size, and boosting the improvement of human computing power. Recent electronic computers achieve performance comparable to a mouse brain, far beyond the level of early machines such as UNIVAC.
Supplementary Figure 1:
The role of non-Von Neumann architecture.
Various non-Von Neumann computing architectures have been successfully applied to solve problems intractable for the conventional Von Neumann architecture.
However, Moore's law has already faltered in recent years and is believed to come to an end in the near future. Due to the inevitable heat-dissipation problem, traditional electronic computers are ultimately limited. Meanwhile, with the miniaturization of integrated circuits, transistors will become unreliable as the quantum tunneling effect emerges. Human computing power that counts on electronic computers is finding it increasingly difficult to make a breakthrough. Even today, the most powerful supercomputer operates an order of magnitude slower than the human brain, suggesting that the ultimate limit is hardly dramatically beyond the human brain.

Furthermore, conventional electronic computers based on the Von Neumann architecture are inherently not powerful enough to solve NP-complete problems efficiently. As we depict in FIG. S1, many combinatorial problems in practice can be classified into corresponding categories according to their computational complexity. Problems belonging to the subset P can be efficiently solved by conventional computers while other NP problems and
Supplementary Note 2: The influence of optical source on time consumption
We applied a coherent laser to launch photons into the photonic computer to solve the SSP. As Fig. 3 in the main text exhibits, the photonic computer shows a supremacy over the other competitors in time consumption as the problem size grows.

Given the quantum supremacy achieved in other settings such as boson sampling (Ref. 34-39) and quantum walks (Ref. 41), we attempt to figure out the influence of quantum light on the time consumption of solving the SSP with our photonic computer. Here we consider a heralded single-photon source. We find that the same evolution results are obtained with both classical light and quantum light, and that the heralded single-photon source fails to outperform the classical light. This phenomenon is attributed to the fact that a bunch of photons arrive together in the case of classical light, while a heralded single-photon source launches only one photon at a time. Under such circumstances, it takes longer with quantum light to accumulate enough signal photons to be distinguished from the environment, resulting in a worse performance and making it more challenging to surpass electronic computers.

Assuming that the environment noise is equivalent to m photons, at least m + 1 signal photons must be accumulated to be distinguished from the environment. In a lossless case, the computing time taken by quantum light is at least m times that taken by classical light. In a practical case with inevitable loss, the computing time taken by quantum light is further increased.

Supplementary Note 3: Time-consumption performance
Besides the comparison of time consumption displayed in Fig. 3 in the main text, we further compare the photonic computer with representative electronic computers that were developed over the past decades and once regarded as state-of-the-art. As FIG. S2 reveals, our first photonic demonstration (N=4) is comparable to today's electronic computer (Intel Core i7-3770k), and indeed outperforms the other representative electronic computers that emerged in the first several decades after the electronic computer was invented in the 1950s. Furthermore, the photonic computer has a lower growth rate of computing time than all its electronic counterparts, which dominantly determines the ultimate winner of this race. It has taken conventional computers over half a century to evolve from crude models to state-of-the-art machines, undergoing several generations of development. We believe that, to some degree, the performance of our first implementation reasonably exhibits the strengths and potential of the photonic computer.
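For context, the baseline that any exact electronic solver must beat in the worst case is the exhaustive search over all 2^N subsets (faster pseudopolynomial algorithms exist, cf. Refs. 19-21). The following minimal sketch, with a made-up instance built from the first few primes, illustrates this exponentially growing search space; it is not the experimental procedure.

```python
from itertools import combinations

def subset_sum(values, target):
    """Exhaustively check all 2^len(values) subsets and return the first
    subset summing to `target`, or None if no such subset exists."""
    for r in range(len(values) + 1):
        for combo in combinations(values, r):
            if sum(combo) == target:
                return combo
    return None

print(subset_sum([2, 3, 5, 7], 10))   # a witness exists: the answer is YES
print(subset_sum([2, 3, 5, 7], 11))   # no witness: the answer is NO
```

Each added element doubles the number of subsets to examine, which is the superpolynomial scaling the photonic network sidesteps by exploring all paths in parallel.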
Supplementary Note 4: Signal-to-noise ratio
In the proposed photonic computer, the output signal going through the longest path appears to be the weakest. Therefore, the signal-to-noise ratio (SNR) is evaluated by analyzing the signal traveling along the longest path. According to the definition of SNR,
Supplementary Figure 2:
Time-consumption performance.
The comparison of time-consumption performance between our photonic computer and representative electronic computers developed over the years. These electronic computers were first launched in the 1950s, 1975, 2000 and 2012, respectively.

we have
SNR = 10 log(Sig/Noi) = −10 log(In/Sig) + 10 log(In/Noi),

where Sig represents the power of the weakest signal, Noi is the equivalent power of the environmental noise and In is the input power. We assume a stable environment, i.e., the environmental noise Noi is invariant. The latter term, 10 log(In/Noi), is independent of the problem size N and determines the upper bound of the SNR, which grows with increasing In. The former term, −10 log(In/Sig), is related to the problem size N. The expression can then be simplified as

SNR = F(N) + C,

where C = 10 log(In/Noi) and F(N) = −10 log(In/Sig). Based on the design of the photonic computer, we have

F(N) = c1 N + c2 S,

where S is the sum of the first N primes and grows at a sub-exponential rate. The constants c1 and c2 are determined by the specific parameters of the photonic computer, such as the propagation loss, the splitting ratio of the beam splitters and the size of the basic modules. Here c1 and c2 are estimated to be −3.212 and −0.0252, respectively. Therefore, we have

SNR = −3.212 N − 0.0252 S + C.

Taking a silicon detector as an example, with a 100 Hz dark count and a 1 MHz count rate, the noise in one shot is down to 0.0001. The corresponding SNR for different input powers is exhibited in FIG. S3, decreasing at a sub-exponential rate. It should be noticed that the upper bound C has no influence on the decreasing rate of the SNR.

Supplementary Note 5: Fisher information
Supplementary Figure 3:
Signal-to-noise ratio.
The signal-to-noise ratio of our photonic computer in the case of different input powers.

Since the longest path suffers the largest loss, the number of trials depends on the probability of a single photon arriving at the corresponding output port of the longest path. On such an occasion, we consider a coin-toss model to evaluate the Fisher information, i.e., the variable X has only two outcomes. Here X denotes a trial: X = 1 represents that a photon reaches the target output port, which happens with probability θ, and X = 0 corresponds to the opposite situation, which takes place with probability 1 − θ. Therefore, the probability density function is

Prob(X|θ) = θ^X (1 − θ)^(1−X).

According to the definition of Fisher information (Ref. 52), the corresponding Fisher information carried by one trial is expressed as
Inf(θ) = 1/(θ(1 − θ)).

Since the Fisher information is additive, the Fisher information contained in M independent trials is M/(θ(1 − θ)). Finally, we acquire the lower bound on the variance of the estimate of θ as

Var(θ) ≥ 1/(M Inf(θ)) = θ(1 − θ)/M.

As θ is related to the problem size N, we have

Var(N) ≥ 1/(M Inf(θ(N))) = θ(N)(1 − θ(N))/M.
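As a numerical sanity check (not part of the experiment), the coin-toss bound can be verified by simulation: for M Bernoulli trials with success probability θ, the hit-counting estimator θ̂ = hits/M attains a variance close to θ(1 − θ)/M. The values of θ and M below are arbitrary illustrative choices, not the experimental parameters.

```python
import random

def empirical_variance(theta, m_trials, repeats=5000, seed=7):
    """Simulate `repeats` experiments of `m_trials` coin tosses each and
    return the empirical variance of the estimator theta_hat = hits/M."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        hits = sum(rng.random() < theta for _ in range(m_trials))
        estimates.append(hits / m_trials)
    mean = sum(estimates) / repeats
    return sum((e - mean) ** 2 for e in estimates) / repeats

theta, m_trials = 0.05, 400                  # illustrative values only
cramer_rao = theta * (1 - theta) / m_trials  # lower bound from the text
print(empirical_variance(theta, m_trials), cramer_rao)
```

The printed empirical variance should sit near θ(1 − θ)/M ≈ 1.19 × 10⁻⁴, confirming that this simple estimator saturates the Cramér-Rao bound.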
Based on the above analysis of the SNR, we obtain

θ(N) = 10^(F(N)/10) = 10^((−3.212 N − 0.0252 S)/10),

where S is the sum of the first N primes.

Supplementary Note 6: The role of interference
Influence on ultimate outcome.
Based on the structure of the proposed photonic circuit, different light beams stemming from the same split junctions may meet again somewhere during their travel to the output ports. Therefore, in the case of coherent light, there is a possibility of interference, which might induce a fluctuation of the intensity distribution at the output ports.

However, interference can hardly influence the ultimate outcome. Since the SSP is a decision problem whose kernel lies in giving an answer of YES or NO, a fluctuation of the intensity distribution can hardly invert the outcome, in light of a few facts: (i)
A high signal-to-noise ratio, up to tens of dB even for a relatively large problem size, can be provided, as analyzed in Supplementary Note 4, which lays a strong foundation for tolerating the fluctuation induced by interference. (ii)
Completely destructive or constructive interference, of course unwanted in our experiment, requires highly precise control of the phase, a harsh requirement that is of great difficulty to meet due to inevitable fabrication imperfections. (iii)
The influence of interference can be greatly weakened and even eliminated by applying broadband optical sources with an ultra-short coherence length, such as superluminescent diodes (e.g., SLD-35-HP from SUPERLUM and SLD830S-A10W from THORLABS), whose coherence length can be as short as a few microns.
Contribution to the speed-up.