Miniaturizing neural networks for charge state autotuning in quantum dots
Stefanie Czischek, Victor Yon, Marc-Antoine Genest, Marc-Antoine Roux, Sophie Rochette, Julien Camirand Lemyre, Mathieu Moras, Michel Pioro-Ladrière, Dominique Drouin, Yann Beilliard, Roger G. Melko
Department of Physics and Astronomy, University of Waterloo, Ontario, N2L 3G1, Canada
Institut Interdisciplinaire d'Innovation Technologique (3IT), Université de Sherbrooke, Sherbrooke J1K 0A5, Canada
Laboratoire Nanotechnologies Nanosystèmes (LN2) – CNRS UMI-3463 – 3IT, Sherbrooke J1K 0A5, Canada
Institut quantique, Université de Sherbrooke, Sherbrooke J1K 2R1, Canada
Département de Physique, Université de Sherbrooke, Sherbrooke J1K 2R1, Canada
Perimeter Institute for Theoretical Physics, Waterloo, Ontario, N2L 2Y5, Canada
(Dated: January 12, 2021)

A key challenge in scaling quantum computers is the calibration and control of multiple qubits. In solid-state quantum dots, the gate voltages required to stabilize quantized charges are unique for each individual qubit, resulting in a high-dimensional control parameter space that must be tuned automatically. Machine learning techniques are capable of processing high-dimensional data – provided that an appropriate training set is available – and have been successfully used for autotuning in the past. In this paper, we develop extremely small feed-forward neural networks that can be used to detect charge-state transitions in quantum dot stability diagrams. We demonstrate that these neural networks can be trained on synthetic data produced by computer simulations, and robustly transferred to the task of tuning an experimental device into a desired charge state. The neural networks required for this task are sufficiently small as to enable an implementation in existing memristor crossbar arrays in the near future. This opens up the possibility of miniaturizing powerful control elements on low-power hardware, a significant step towards on-chip autotuning in future quantum dot computers.
I. INTRODUCTION
Solid-state quantum dots (QDs) are one of several promising candidates for qubits, the basic building blocks of quantum computers [1–8]. They are engineered in semiconductor devices by the electrostatic confinement of single charge carriers (electrons or holes), and precisely tuned to a few-carrier regime where quantum effects dominate. Even single-dot devices require the tuning of multiple gates for the control of reservoirs, dots, and tunnel barriers [2]. The relationship between applied gate voltages and physical properties of a QD is highly complex and device-specific, requiring significant calibration and tuning. Thus, a key challenge in scaling up QD architectures to act as multi-qubit devices will be tuning within the high-dimensional space of gate voltages. This is a highly non-trivial control problem that cannot be accomplished without significant automation [9].

An automated process of finding a range of gate voltages in which a QD is in a specific carrier configuration is called autotuning. Due to the variability inherent in different QD devices, autotuning naturally benefits from a data-driven approach. Several compelling machine learning strategies have recently been introduced for from-scratch QD tuning [10], coarse tuning into charge regimes [11–15], fine tuning couplings between multiple dots [16, 17], or performing autonomous measurements [18, 19]. These studies have demonstrated robust effectiveness in identifying electronic states and charge configurations, and automating the precise tuning of gates. Common algorithms in supervised learning, such as support vector machines, deep, and convolutional neural networks, are sufficiently powerful for the complex characterization tasks involved in this automation.

∗ [email protected]
Coupled with the advent of community training data sets for tasks such as state recognition [20] and charge transition identification [21], the enterprise of QD autotuning may well be the first demonstration of a large-scale qubit control problem tamed by machine learning.

Like many elements of modern microprocessor design, an important consideration in quantum computers will be integrating control technologies "on-chip" – i.e. on or near the physical qubits inside the cryostat. This requires consideration of energy budgets to limit thermal dissipation in the various control tasks, the performance costs of transferring data, and the benefits of miniaturizing various control elements [22–25]. In autotuning of QDs, one of the simplest control tasks involves the identification of charge state transition lines in two-dimensional stability diagrams. While previous works use image analysis algorithms [26] or deep (convolutional) neural networks [11, 15], in this paper we explore whether this task can be performed by extremely small feed-forward neural networks. Mixed-signal vector-matrix multiplication accelerators based on crossbar arrays of emerging memory technologies (e.g. memristors [27]) provide the possibility to implement sufficiently small neural networks on miniaturized hardware with low power consumption [28–33]. We find that neural networks with input layers as small as 5 × 5 patches of the stability diagram can identify charge state transitions with sufficient accuracy. The patches can then be shifted around the stability diagram in order to tune a quantum dot into a desired charge state starting from an arbitrary position. One can increase the success rate of this autotuning procedure by considering arrays of connected patches.
By finding the minimal array size that provides high success rates, we show that the experimental measurement costs can be significantly reduced by covering only small regions of the stability diagram. The number of parameters in the feed-forward neural networks required for this autotuning procedure is sufficiently small to make their implementation possible in present-day memristor crossbar arrays [28, 29, 33]. Further, we demonstrate that these parameters can be trained on synthetic (simulated) stability diagrams, while achieving excellent performance in classifying transition lines from real experimental silicon metal-oxide-semiconductor quantum dot devices. Our work thus opens the possibility of taking advantage of the high speed and energy efficiency of memristor crossbar arrays [32, 33], which could be integrated as part of a larger on-chip control system for QD autotuning in the near future.

II. SUPERVISED LEARNING FOR STABILITY DIAGRAMS
In order to pursue a machine learning approach to the problem of quantum dot (QD) autotuning, we need to define training and testing data sets, and an appropriate neural network architecture. Our end goal will be to use supervised learning to classify transition lines in small patches of the stability diagrams of silicon metal-oxide-semiconductor QD devices. Since experimental data for such stability diagrams is expensive to obtain, we aim to train our neural network architectures on synthetic data, obtained from a simulation package developed by M.-A. Genest [21]. This synthetic data approximates experimental diagrams with noise effects after some signal processing, and provides "ground truth" labels for the charge stability sectors which can be used for training.

The machine learning architecture that we propose is a simple feed-forward neural network (FFNN), with a single output neuron that indicates whether a transition line is found in the input patch of data or not. In the following, we explore FFNNs with a very limited number of trainable parameters, roughly commensurate with the number of memristors expected in high-density crossbar arrays available in the near future.
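To make the scale of these networks concrete, a classifier of this size can be sketched in a few lines of NumPy (an illustrative sketch, not the paper's PyTorch implementation; the layer sizes of 25 inputs and 10 hidden neurons are examples taken from the setups considered below):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyFFNN:
    """Minimal feed-forward net: N input pixels -> M hidden -> 1 output."""
    def __init__(self, n_in=25, n_hidden=10, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (1, n_hidden))
        self.b2 = np.zeros(1)

    def predict(self, patch):
        """patch: flattened binary pixels; returns 0 (no transition) or 1."""
        h = sigmoid(self.W1 @ patch + self.b1)
        p = sigmoid(self.W2 @ h + self.b2)[0]
        return int(round(p))

net = TinyFFNN()
patch = np.zeros(25)        # an empty 5x5 patch, flattened
label = net.predict(patch)  # untrained output, either 0 or 1
```

The entire model is two small weight matrices and two bias vectors, which is what makes a crossbar-array implementation plausible.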
A. Data: experimental and synthetic
Our experimental data is obtained from a silicon metal-oxide-semiconductor QD device, as previously discussed in [34]. The upper panel of Fig. 1(a) shows the corresponding setup of gate electrodes, creating a potential landscape along the green arrow. Two plunger gates at the QD and a connected reservoir (R) are denoted as G1 and G2. The resulting potential landscape for the carriers (electrons or holes) is sketched in the lower panel of Fig. 1(a), where the low-potential island corresponds to the QD in which carriers are trapped. We consider a unipolar QD device, which can only trap either electrons or holes. While all following statements hold equally for holes, we focus on electrons throughout the paper. The number of electrons trapped in the single QD defines its charge state, which can be tuned most efficiently via the gate G1 controlling the depth of the QD potential. Neighboring gates, such as G2 which defines the electron reservoir connected to the QD, can additionally affect the QD charge state through cross-capacitance effects [34].

Transitions between different charge states can be measured via a single-electron transistor (SET) which is tuned such that a measured current I_SET is sensitive to potential changes. Changing the charge state of the QD by adding or removing an electron causes abrupt jumps in I_SET. The current I_SET can be measured as a function of the voltages applied at the QD plunger gate G1 and at the electron reservoir gate G2, while keeping all other gate voltages fixed, providing two-dimensional charge stability diagrams [2]. Transitions between different charge states appear as distinct lines in the stability diagram and must be identified in order to tune the QD [11, 15, 26, 34].

The upper panel of Fig. 1(b) shows a measured stability diagram where an additional oscillating background, caused by cross-capacitance effects of G1 and G2 acting on the SET, impedes detection of the transition lines. This background is specific to the considered measurement setup [34] and caused by the absence of a compensation procedure such as dynamic feedback control [35]. It can be removed via a signal processing algorithm as discussed in [26], transforming the stability diagram into a binary image, see the lower panel of Fig. 1(b).

After the signal processing, noise effects still appear in the transformed diagrams. This noise is due to imperfections in the experimental setup, such as the SET coupling to external charges which are not part of the quantum dot [26, 34]. It is these noisy transition lines, which lie in the stability diagrams after signal processing, that we aim to detect with the FFNN of the next section. Detecting the transition lines in the binary image is a general approach and can be applied to any experimental setup after a setup-specific signal processing algorithm has been applied to the raw measurement outcome.

FIG. 1. Steps towards autotuning experimental QD devices with artificial neural networks. (a) A scanning electron micrograph of the considered experimental single-QD device is shown in the upper panel. The QD is controlled via the electrostatic gate G1 and connected to an electron reservoir (R) whose electronic density is controlled via gate G2. The SET is located next to the QD and enables the detection of transition lines. Along the green arrow a potential landscape is created which is tuned to trap carriers in the QD. A sketch of the potential is shown in the lower panel, where the potential at the QD and at the reservoir can be tuned via gates G1 and G2, respectively, influencing the number of carriers trapped in the QD. (b) The experimentally measured stability diagram (upper panel) contains a strong oscillating background, where transition lines can be seen as sudden jumps. By applying a signal processing algorithm, the background can be removed and a binary stability diagram shows the transition lines with additional noise effects (lower panel). (c) A numerical algorithm is used to create synthetic stability diagrams which simulate the experimental diagrams after the signal processing. Realistic noise effects are applied to transition lines in the diagram (upper panel), while the ground truth data (lower panel) can be extracted containing the transition lines without noise. (d) A small feed-forward neural network is trained on patches of the synthetically created stability diagrams to detect transition lines and is then applied on patches of experimental stability diagrams after signal processing. The network input is given by the pixels in the small patch and the output is a single neuron telling whether a transition line is detected in the patch or not. We consider different numbers of hidden neurons in one or two layers between input and output, with the intention of miniaturizing the neural network and with it the computational complexity.

It would be preferable to have a large database of labelled experimental stability diagrams with which to train supervised machine learning strategies such as the considered FFNN. Since a suitable database is not available, however, we propose to train the networks on synthetic data. We use the numerical algorithms discussed in [21] to simulate post-signal-processed stability diagrams, which include noise effects similar to those found in experimental measurements; see App. A for details. This enables us to create a large data set of synthetic stability diagrams, which include the clear presence of transition lines [see Fig. 1(c)]. This can then be transformed into a training set for supervised learning on small patches of pixels, which include the ground-truth labels corresponding to the presence or absence of a transition line in the patch. To discuss this further, we must first examine in more detail the specific neural network architecture used for patch classification.
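The construction of such a training set from a noisy diagram and its ground truth can be sketched as follows (a hypothetical NumPy sketch; the labeling rule — a patch is positive if its ground-truth window contains any transition-line pixel — is our assumption for illustration, not necessarily the rule used by the simulation package [21]):

```python
import numpy as np

def extract_patches(diagram, ground_truth, patch_size=5, n_patches=100, seed=0):
    """Sample square patches at random positions of a (noisy) synthetic
    stability diagram; label each patch from the noiseless ground truth."""
    rng = np.random.default_rng(seed)
    H, W = diagram.shape
    patches, labels = [], []
    for _ in range(n_patches):
        r = rng.integers(0, H - patch_size + 1)
        c = rng.integers(0, W - patch_size + 1)
        patches.append(diagram[r:r + patch_size, c:c + patch_size].ravel())
        # Label 1 if the ground truth has any transition-line pixel here.
        labels.append(int(ground_truth[r:r + patch_size, c:c + patch_size].any()))
    return np.array(patches), np.array(labels)

# Toy example: one vertical transition line at column 20, plus a noise pixel.
truth = np.zeros((50, 50), dtype=int)
truth[:, 20] = 1
noisy = truth.copy()
noisy[10, 35] = 1
X, y = extract_patches(noisy, truth, n_patches=200)
```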
B. Neural network training strategy
Motivated by on-chip integration of the autotuning inside the cryostat, we wish to explore the performance of supervised learning tasks on the smallest possible artificial neural networks. Therefore, in this section we restrict ourselves to small FFNNs with one or two hidden layers between the input and the output layer, as illustrated in Fig. 1(d). We furthermore restrict the number of hidden neurons to small values, which limits the complexity of the network. In order to know which small network architectures are useful, we explore their performance for a simple classification task on the smallest possible patches of the binary stability diagram after signal processing. The input neurons of the network correspond to pixels in the patch, and a single output neuron is used for a binary classification of whether a transition line is detected or not.

As experimental stability diagrams contain large areas without transition lines [see Fig. 1(b)], any data set naturally contains an overhead of empty patches compared to patches with transition lines. To compensate, we add a weighting towards patches with transition lines in the training procedure. When evaluating the performance of the FFNN classifier in the next section, we consider the total accuracy on the full test set, as well as the accuracies for correctly classifying only patches with and without transition lines.

Finally, by shifting patches across the diagram via an algorithm driven by the classification outcome, the quantum dot can be tuned into any desired charge state [15, 26]. As only charge transition lines can be detected but not the absolute charge value, this shifting algorithm needs to find the regime where the quantum dot is empty, which is reached when no more transition lines can be found. From this empty reference point the QD can be filled with the desired number of electrons by crossing the corresponding number of transition lines.
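The weighting towards the rare transition-line class can be realized, for instance, as a weighted binary cross-entropy (a minimal NumPy sketch; the weight value of 5.0 is an arbitrary placeholder, not the value used in this work):

```python
import numpy as np

def weighted_bce(p, y, w_transition=5.0):
    """Binary cross-entropy with extra weight on the rare 'transition' class.
    p: predicted probabilities in (0, 1); y: binary labels (1 = transition)."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    weights = np.where(y == 1, w_transition, 1.0)
    return float(np.mean(-weights * (y * np.log(p) + (1 - y) * np.log(1 - p))))

# A confident mistake on a transition patch (p=0.1, y=1) is penalized
# five times harder than the same mistake on an empty patch (p=0.9, y=0).
loss_transition = weighted_bce(np.array([0.1]), np.array([1.0]))
loss_empty = weighted_bce(np.array([0.9]), np.array([0.0]))
```

An equivalent effect can be obtained by oversampling the minority class when assembling mini-batches.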
We discuss this patch-shifting algorithm as a step towards on-chip autotuning in Sec. III B.

FIG. 2. Classification accuracy as a function of the patch size N = L × L pixels. The total test accuracy (blue) is considered together with the accuracy for correctly classifying empty patches (orange) or patches including a transition line (green). Five feed-forward neural networks with a single hidden layer of 10 neurons are trained over 500 epochs for each data point. Accuracies are averaged over training epochs 450 to 500 to ensure convergence for larger patch sizes. The results of all five networks are averaged with error bars denoting standard deviations.

III. RESULTS
In this section, we begin by discussing the detection of transition lines in small patches of the stability diagram using small FFNNs. In the following, we train the network on a data set of 80000 patches which are extracted from 800 numerically created synthetic stability diagrams at random positions. For the training we use the Adam optimizer [36] on a cross-entropy loss function, applying sigmoid activation functions in the hidden and output layers. The output corresponds to the probability of detecting a transition line and we round it to zero or one to obtain a binary outcome. The trained network is then tested on 2700 patches extracted from random positions of 27 experimentally measured stability diagrams after signal processing. These neural network classifications are used to define a shifting algorithm for small patches, capable of autotuning a single QD into a desired charge state when starting from a random position in the stability diagram.
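The three accuracies reported in the following — total, empty patches only, and transition-line patches only — can be computed as in this small sketch (the example predictions are hypothetical):

```python
import numpy as np

def split_accuracies(pred, y):
    """Total accuracy plus per-class accuracies for empty (y=0)
    and transition-line (y=1) patches."""
    pred, y = np.asarray(pred), np.asarray(y)
    total = float(np.mean(pred == y))
    no_line = float(np.mean(pred[y == 0] == 0))
    line = float(np.mean(pred[y == 1] == 1))
    return total, no_line, line

# Hypothetical classifications of 6 patches (4 empty, 2 with a line).
y    = [0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 1, 0]
total, no_line, line = split_accuracies(pred, y)
# total = 4/6, no_line = 3/4, line = 1/2
```

Reporting the per-class accuracies separately is what exposes the class imbalance: a classifier that always answers "no transition" already reaches a high total accuracy.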
A. Detection of transition lines
To begin our supervised learning procedure, we find a suitable patch size by analyzing the accuracy reached by a network consisting of a single hidden layer with ten hidden neurons. Fig. 2 shows the accuracies reached when classifying patches of N = L × L pixels with varying L. The total accuracy and the accuracy for correctly classifying empty patches always reach high values, while the accuracy for correctly classifying patches containing a transition line is lower and shows a strong dependence on the patch size. We find the best performance for L = 5, where all three accuracies are high.

FIG. 3. Classification accuracy with different network setups. In all panels the total accuracy (blue), as well as the accuracy for correctly classifying patches with (green) and without (orange) transition lines are shown, where the accuracy is plotted on all y-axes. (a) Accuracy as a function of the network architecture as stated in Tab. I for a patch of size L = 5. For each data point five feed-forward neural networks are trained for 200 epochs and the accuracies over the last 50 epochs are averaged, with standard deviations shown by the error bars. The accuracies are compared to results of a pixel classifier with a threshold of 2 pixels [dashed lines, blue area in (b)] and a threshold of 3 pixels [solid lines, gray area in (b)]. (b) Accuracy of the pixel classifier on the experimental stability diagrams after signal processing as a function of the threshold pixel number. High values for all accuracies are reached for a threshold of 2 pixels (blue area), as well as for a threshold of 3 pixels (gray area). (c) Classification accuracy as a function of training epochs in a single run using network setup (2) (solid lines) and (6) (dashed lines), and a patch of size L = 5. Convergence is found after less than 100 training epochs, justifying a training over 200 epochs for L = 5.
Therefore, we focus on this patch size, which dictates the size of the input layer of the FFNNs in the following.

With the size of the input and output layers thus fixed, the total number of learnable parameters in the FFNN is dictated by the number of hidden units. In Fig. 3(a) we analyze the dependence of the classification accuracy on different hidden unit architectures. We consider different network setups as summarized in Tab. I, consisting of a single [(1)-(3)] or two [(4)-(6)] hidden layers of different sizes. In addition, to ensure that the small number of hidden neurons does not have a significant limiting effect on the expressivity of the network, we compare these results to a network with two hidden layers of 100 neurons each [(6)], presumed to be representative of the asymptotic limit. We train all network setups for 200 epochs on the synthetic training set.

TABLE I. Feed-forward neural network architectures considered in this manuscript. The number of input neurons in all networks is given by the number of pixels in the considered patch and is L × L = 25 for L = 5. Each network has a single output neuron telling whether a transition line is found in the patch or not. Between the input and the output layer we consider a single [(1)-(3)] or two [(4)-(6)] hidden layers, with M1 hidden neurons in the first and M2 hidden neurons in the second layer. The number of learnable parameters corresponds to 25 × M1 + M1 connecting weights and M1 biases for setups (1)-(3), and 25 × M1 + M1 × M2 + M2 connecting weights and M1 + M2 biases for setups (4)-(6).

Setup  Hidden layers  Hidden neurons        Parameters
(1)    1              M1 = 5                135
(2)    1              M1 = 10               270
(3)    1              M1 = 15               405
(4)    2              M1 = 5,  M2 = 10      200
(5)    2              M1 = 10, M2 = 5       320
(6)    2              M1 = 100, M2 = 100    12800

As illustrated in Fig. 3(a), the total accuracy and the accuracy for correctly classifying empty patches reaches ∼
96% for all architectures in Tab. I. However, a clear dependence on the network architecture is observed in the accuracy for classifying patches with transition lines. Interestingly, optimal results are found for networks with small numbers of learnable parameters [(1) and (2)], affirming that we are not limited by the restricted number of hidden neurons. The accuracy decreases for larger network setups with more learnable parameters.

In addition, we compare the accuracies reached by the neural network classifier with results from a simple classifier based on the number of bright pixels in the patch. If the number of bright pixels crosses a certain threshold, the patch is classified as containing a transition line. Fig. 3(b) shows the accuracies reached with this pixel classifier as a function of the threshold pixel number. Good performance is found when choosing the threshold at 2 or 3 pixels. In Fig. 3(a) we add the results of the pixel classifier for a direct comparison and observe that small networks, especially architectures (1) and (2), outperform the pixel classifier. This further justifies the use of FFNNs to learn to detect structure in the small patches.

Finally, to confirm that training the networks for 200 epochs is sufficient, Fig. 3(c) shows the accuracies achieved as a function of the number of training epochs, for network architectures (2) and (6) trained on a 5 × 5 patch. Convergence is found after fewer than 100 training epochs, with ∼96% for all testing accuracies. Given that this high test accuracy for classifying experimental data occurs for a small and simple network structure, particularly one that is trained on simulated synthetic data, we argue that the results are quite promising. Therefore, we emphasize this architecture in the next section, to define a shifting algorithm for small patches, with the end goal of autotuning a QD into a desired charge state in the stability diagram.
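As a consistency check, the parameter counts of Tab. I follow directly from the layer sizes (25 × M1 + M1 weights plus M1 biases for one hidden layer, and 25 × M1 + M1 × M2 + M2 weights plus M1 + M2 biases for two); a short sketch reproducing them:

```python
def n_params(hidden):
    """Learnable parameters of a FFNN with 25 inputs, one output neuron,
    and the given hidden-layer sizes (weights plus hidden-neuron biases,
    following the counting convention of Tab. I)."""
    sizes = [25] + list(hidden) + [1]
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    biases = sum(hidden)  # Tab. I counts biases of hidden neurons only
    return weights + biases

setups = {1: [5], 2: [10], 3: [15], 4: [5, 10], 5: [10, 5], 6: [100, 100]}
counts = {k: n_params(v) for k, v in setups.items()}
# counts == {1: 135, 2: 270, 3: 405, 4: 200, 5: 320, 6: 12800}
```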
B. Tuning the device with a patch shiftingalgorithm
In this section we consider a shifting algorithm which we develop for the small patches considered above. With this algorithm we tune a single QD into a desired charge state when starting at a random position in the stability diagram. The full algorithm is discussed in detail in App. B and depends on the classification outcome of the FFNN for the patch at each position. As the transition lines only provide information about changes in the charge state but not the absolute charge of the QD, we first need to find a reference point in the diagram [11, 15, 26, 34]. Similar to [15, 26] we use the empty QD as a reference point, which is reached when no more transition lines are detected while shifting the patch to the left.

From this empty reference point, the QD can be filled with the desired number of electrons by crossing the corresponding number of transition lines, as identified by the neural network strategy discussed in the previous section. Below, we focus on finding the single-electron regime, where the QD can be interpreted as a qubit. An example of the shifting algorithm finding the single-electron regime is shown in Fig. 4(a), where a patch of 5 × 5 pixels is shifted across the stability diagram.

FIG. 4. Shifting the patch into the single-electron regime. (a) Sketch of the shifting algorithm with a 5 × 5 patch. (b) Success rates for finding the empty regime and the single-electron regime for the different network architectures of Tab. I.

The empty reference point is found successfully in ∼98% of the diagrams, while the success rate for finding the single-electron regime is rather small.
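The essence of the procedure — shift towards lower voltages until no line is detected, then cross the desired number of lines — can be caricatured in one dimension (a toy sketch with a perfect stand-in classifier and made-up positions; the actual two-dimensional algorithm is given in App. B):

```python
import numpy as np

def autotune_1d(diagram, start, patch=5, margin=25, n_electrons=1):
    """Toy 1-D sketch of the patch-shifting algorithm. `diagram` is a binary
    array along one gate-voltage axis (1 = transition-line pixel) and a
    perfect classifier stands in for the FFNN. `margin` must exceed the
    typical line spacing, otherwise a gap between lines is mistaken for the
    empty regime (the real algorithm of App. B is more careful)."""
    pos = start
    # 1. Shift towards lower voltages until no line is detected nearby:
    #    this position is taken as the empty-QD reference point.
    while pos > 0 and diagram[max(0, pos - margin):pos + patch].any():
        pos -= 1
    # 2. Shift back towards higher voltages, counting crossed lines.
    crossed = 0
    prev_on_line = False
    while pos < len(diagram) - 1 and crossed < n_electrons:
        pos += 1
        on_line = bool(diagram[pos])
        if prev_on_line and not on_line:  # just left a transition line
            crossed += 1
        prev_on_line = on_line
    return pos  # index inside the desired n-electron charge region

# Toy diagram: transition lines at indices 30, 50 and 70.
diagram = np.zeros(100, dtype=int)
diagram[[30, 50, 70]] = 1
pos = autotune_1d(diagram, start=60)  # lands between lines 30 and 50
```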
FIG. 5. Shifting patch arrays to find the single-electron regime. (a) Shifting algorithm for an array of 4 × 4 patches of 5 × 5 pixels each. (b) Success rates as a function of the array size K. Arrays are quadratic arrangements of patches with K stating the number of patches in one direction. We consider 25 experimentally measured stability diagrams after signal processing and average the success rates over ten start points for each diagram with error bars denoting standard deviations. Initial points are chosen randomly but are the same for all array sizes. Empty reference points are always found with good accuracy while the success rates for the single-electron regime show a strong dependence on the array size.

These failures can largely be traced to quasi-periodic gaps in the transition lines caused by the experimental measuring procedure, see App. A for details. These gaps are larger than the classified patch, so that transition lines can be missed. While striving to keep the input of the FFNN small, this issue can be overcome, for example, by extending the algorithm to couple adjacent patches. To illustrate this, we consider an array of K × K patches, which reduces the risk of the patch translation algorithm hitting a gap in the transition lines, while only slightly increasing the computational costs and preserving the high accuracy reached in the classification of small patches. Each patch in the array is classified individually by our small FFNN, and the shifting algorithm in this case now depends on whether a transition line is found in any of the individual patches, see App. B for details.

Fig. 5(a) illustrates the shifting algorithm for an array of 4 × 4 patches, and Fig. 5(b) shows the success rates as a function of the array size K, where the same data as in Fig. 4(b) is used. An FFNN with architecture (2) is used for the classification of the 5 × 5 patches. The success rate for finding the single-electron regime increases significantly from K = 1 (single patch) to K = 2, and a further increase is found for larger K. At K = 4 the success rate reaches its maximum and saturates before it appears to decrease again for K ≥ 8. The tuning into the single-electron regime is successful in ∼75% of the diagrams for K = 4, which is a relatively large success rate. For different patch sizes L with similarly high classification accuracies (see Fig. 2), we find similar success rates if we adapt the array size K such that the total number of (L × K) pixels in the array is kept constant. We hope this result will encourage the further exploration of small FFNNs to autotune single QD devices.

IV. DISCUSSION AND OUTLOOK
We have demonstrated that the complex task of autotuning a quantum dot (QD) into a target charge state can be robustly achieved by harnessing the power of very small feed-forward neural networks (FFNNs). Such FFNNs are the workhorses of supervised machine learning, and enable a data-driven approach akin to transfer learning, where networks trained on simulated data can be used for classification of experimental QD data. We have shown in particular that a classification approach, where such small neural networks are trained to detect the presence or absence of charge transition lines in small patches of stability diagrams, can be transferred with excellent accuracy to experimental stability diagrams obtained from silicon metal-oxide-semiconductor quantum dot devices. Further, by combining this classification approach with an algorithm that shifts arrays of small input patches, we have shown that a single QD can successfully be autotuned into the single-electron regime, when starting from a random configuration of gate voltages. With our small FFNNs, we reached classification accuracies, as well as final success rates, which are comparable to previous results with deep (convolutional) neural networks [11, 15]. The success of small input patches further suggests that the experimental cost of measuring the stability diagram can be significantly reduced. Indeed, with our shifting algorithm we are able to consider smaller regions than in comparable works [15, 26].

We have found that the FFNNs used for this task require a very small number of learnable parameters to achieve a maximal classification accuracy. This suggests that such neural networks could be implemented on state-of-the-art memristor crossbar arrays [28, 29, 33]. Hence, our work provides a first step towards developing an energy-efficient autotuning device, which could conceivably be implemented in an on-chip control system for QDs. In order to further pursue this idea, the performance of real memristor crossbar arrays, which is expected to be influenced by imperfections [37, 38], will need to be carefully analyzed on the classification task at hand.

It is particularly important to emphasize that throughout this work we have trained the small FFNNs on input patches extracted from synthetic stability diagrams generated with a numerical simulation package [21]. Due to the relative expense involved in obtaining experimental data on QDs, the success of this transfer learning approach is an important step in further developing our machine learning strategy. It also exposes the clear opportunity for creating a larger community data set for training neural networks to detect charge transition lines in QDs, similar to [20]. Eventually, such data sets – whether synthetic or experimental – will play an important role in standardizing and benchmarking machine-learning based QD control and autotuning.

Finally, we remark that in this work we applied our charge transition line detection algorithm on stability diagrams that have already undergone significant signal processing according to [26]. With our successful machine learning approach, this signal processing step becomes computationally more expensive than the classification task. It would therefore be interesting to further refine our data-driven strategy, to eventually enable a similar neural-network based classification to occur directly on raw experimental data. It is feasible that significant miniaturization of such tasks could lead to a highly efficient on-chip control system for autotuning quantum dots in the near future.
DATA AND IMPLEMENTATION
The code to create synthetic stability diagrams is available at https://github.com/Marc-AntoineGenest/Quantum-dots-simulator-and-image-processing-toolbox, and the data set used for training is available at https://github.com/quantumdata/QD-charge-classification. The feed-forward neural networks are implemented and trained using PyTorch [39] and NumPy [40]; figures are created using Matplotlib [41].

ACKNOWLEDGMENTS

We thank Z. Bandic for many enlightening discussions. We thank our collaborators at Sandia National Laboratories for providing the samples used for the development of our algorithm. RGM is supported by the Canada Research Chair (CRC) program and the Perimeter Institute for Theoretical Physics. Research at Perimeter Institute is supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Colleges and Universities. This research was undertaken thanks in part to funding from the Canada First Research Excellence Fund. The work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), the NSERC Alliance program, PROMPT Québec and Fonds de recherche Nature et technologies of Québec.

Appendix A: Data creation
The training of the feed-forward neural network to detect transition lines in patches of stability diagrams requires a large training data set. Since it is hard to create such a large set of experimentally measured stability diagrams, we create the training set numerically. However, the reported classification accuracies are obtained by applying the trained network to an experimentally measured test data set.

To obtain stability diagrams from the experimental device, the quantum dot is coupled to a single-electron transistor (SET). A current I_SET running through the SET is measured while tuning the voltages at two gate electrodes to obtain two-dimensional diagrams. The remaining gate electrodes are kept at fixed voltages, as two-dimensional diagrams bring many advantages for the autotuning procedure compared to one-dimensional single-gate scans [26].

The current I_SET shows an oscillatory behavior in which transition lines appear as jumps in the oscillations [26, 34]. Since the transition lines are hard to extract from this raw data, we separate them from the oscillatory current via a signal processing algorithm as discussed in [26]. In this algorithm, the raw signal is first sent through a high-pass filter to remove background effects. Afterwards, the frequency of the oscillations is extracted via a Hilbert transform. At charge state transitions, which appear as jumps in the oscillations, the frequency shows negative peaks which can be identified by a threshold method, where the threshold is adapted to the obtained frequency distribution. This algorithm provides a binary mapping of the stability diagrams, in which the transition lines are extracted from the raw data.

While those binary diagrams already contain the extracted charge transition lines needed to tune the device into a desired charge state, they also contain noisy pixels due to imperfections in the experimental measurement procedure and in the signal processing algorithm.
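The signal processing chain described above (high-pass filter, Hilbert transform, adaptive threshold on negative frequency peaks) can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation of [26]: the moving-average high-pass, the window size, and the mean-minus-three-sigma threshold rule are illustrative assumptions.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via FFT (discrete Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:(n + 1) // 2] = 2  # double positive frequencies
    if n % 2 == 0:
        h[n // 2] = 1      # keep the Nyquist bin for even n
    return np.fft.ifft(spectrum * h)

def extract_transitions(i_set, window=51, n_sigma=3.0):
    """Map a 1D SET current trace to a binary transition mask."""
    # 1. Crude high-pass filter: subtract a moving-average background.
    kernel = np.ones(window) / window
    filtered = i_set - np.convolve(i_set, kernel, mode="same")
    # 2. Instantaneous frequency from the analytic signal's phase.
    phase = np.unwrap(np.angle(analytic_signal(filtered)))
    freq = np.diff(phase)
    # 3. Threshold adapted to the frequency distribution: strongly
    #    negative outliers mark charge transitions.
    threshold = freq.mean() - n_sigma * freq.std()
    return freq < threshold
```

Applied to every gate-voltage trace of a two-dimensional scan, such a mask yields the binary stability diagrams used throughout this work.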
The main imperfections that appear are the following:

• The SET couples to an external charge: an additional line appears in the diagram which is not a transition line.

• The quantum dot couples to an external charge: the transition lines experience a sudden jump and continue at a shifted position.

• Measurements are performed too fast: if the tunneling rate is too low, the electrons do not get through the barrier during the measurement, leading to a spreading and fading of transition lines at low voltages.

• When the derivative of the oscillating signal is close to zero, which is the case at the top and bottom of the oscillations, the negative peaks in the frequency do not appear: quasi-periodic gaps are found in the transition lines.

• The measurement sensitivity can decrease: spots of noisy pixels appear.

All these effects make the detection of transition lines in the binary diagrams harder, which is why we use the feed-forward neural network to detect them.

The network performance is tested on experimental diagrams after signal processing, but we train it on numerically created synthetic diagrams which simulate the noise effects discussed above. The synthetic training data can be generated efficiently, faster than experimental data, and additionally comes with ground truth data containing only the transition lines without noise. This ground truth data is necessary to perform the supervised training of the network, where labels need to be provided.

To create the synthetic training data, we use the algorithm discussed in [21]. First, ideal noiseless diagrams are created which contain the transition lines simulating a given experimental setup. The positions of the transitions are calculated via the Thomas-Fermi approximation and hence show a realistic orientation and spacing [11, 12, 21]. This simulation directly provides binary diagrams which are similar to the experimental diagrams after signal processing.
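The ideal binary diagrams produced at this stage are then corrupted to resemble experimental data. As one concrete example, a uniform background-noise step that brightens random pixels (the noise probability here is an assumed placeholder, not the value used in our data set) might look like:

```python
import numpy as np

def add_background_noise(diagram, p=0.01, seed=None):
    """Return a copy of a binary (0/1) diagram with each pixel
    turned bright independently with probability p."""
    rng = np.random.default_rng(seed)
    noisy = diagram.copy()
    noisy[rng.random(diagram.shape) < p] = 1  # flip random pixels on
    return noisy
```

Analogous functions can inject the other imperfections listed above (spurious lines, line jumps, fading, and quasi-periodic gaps) into the ideal diagrams before training.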
The noise effects discussed above are then added numerically to the ideal data, so that synthetic diagrams close to the experimental ones are created. Additionally, we add a uniform background noise by turning each pixel bright with probability 0.

Appendix B: Shifting algorithm
The quantum dot (QD) device is tuned into the single-electron regime by shifting the classified small patch over the stability diagram. The algorithm for the shifting procedure depends on the classification outcome of the considered patch and can be divided into two parts. First, the empty reference point of the QD needs to be found, from which the device can be tuned into the single-electron regime in a second step. This algorithm is similar to [15, 26], but needs to be adapted for the small patch sizes.

In the following we generally consider an array of K × K patches as in the main text, where the special case of a single patch corresponds to K = 1. Each patch has a size of L × L pixels.

To find the empty regime we start with a patch array at a random position in the stability diagram. We then shift the array K × L pixels to the left and K × L pixels upwards as long as no transition line is detected. While the upper corner of the diagram is not reached, we apply periodic boundary conditions in the x-direction when searching for the first transition line.

If a line is detected in one patch of the array, we follow the line by shifting the array one pixel to the left and L pixels upwards. If the upper boundary of the diagram is reached, we shift the patch L pixels to the left and lose the line. To avoid ending up in a wrong regime due to misclassifications of noise, we only declare the transition line as found if the same patch detects the line ten steps in a row. Otherwise, if the line is not detected anymore, we continue shifting the patch diagonally with periodic boundary conditions until the next transition line is detected.

When the first transition line is found, we continue following the line until it fades out and is not detected anymore. We then shift the array K × L pixels to the left until any patch detects the next line, which we follow again until it fades out.
When no line is detected after moving 40/K steps of K × L pixels to the left, no transition lines are found anymore and the dot is empty. We define this position as the empty reference point.

During the whole procedure of finding the empty-dot regime, the process is terminated if the upper left corner of the diagram is reached, and this corner is defined as being in the empty regime. All shifts are only applied as long as the boundaries of the stability diagram are not reached. If a shift would cross one of these boundaries, it is not applied in this direction.

Having found the empty-dot regime, the next task is to fill the dot with a single electron, which is done by shifting the array until one transition line is crossed. Since the empty reference point is to the left of all transition lines, the array is shifted K × L pixels to the right and two pixels down as long as no line is detected.

If any patch in the array detects a transition line, we follow the line by shifting the array one pixel to the right and L pixels downwards. When the bottom boundary of the diagram is reached, we continue shifting the patch L pixels to the right. We apply the same procedure as before to avoid ending up in the wrong regime due to misclassified noise and only declare detecting a line when a single patch detects it five times in a row. If this is not the case, we continue shifting the array until the next line is detected. Since the shifting towards the reference point moves the patch to the upper part of the diagram, the chances of hitting the bottom boundary before finding a transition line are very low.

When the first transition line is found starting from the empty regime, we move the array diagonally 2K × L pixels to the right and downwards, and with this find a position in the single-electron regime. If a different charge state is desired, the procedure can be iterated analogously until the desired number of transition lines is crossed.
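The second step of the shifting procedure can be condensed into a short sketch. Here `classify` is a stand-in for the trained FFNN applied to the patch array at a given position, the five-detections-in-a-row confirmation follows the text, and the coordinate bookkeeping and corner rule are simplified illustrative assumptions rather than the exact implementation.

```python
def fill_single_electron(classify, start, K, L, width, height):
    """Starting from the empty reference point `start`, shift a K x K
    array of L x L patches until one transition line is crossed, then
    move diagonally into the single-electron regime."""
    x, y = start
    confirmations = 0
    while confirmations < 5:
        if classify(x, y):
            # A patch sees a line: follow it one pixel right, L pixels down.
            confirmations += 1
            x, y = min(x + 1, width - 1), min(y + L, height - 1)
        else:
            # No line yet: shift K*L pixels right and two pixels down.
            confirmations = 0
            x, y = min(x + K * L, width - 1), min(y + 2, height - 1)
        if x >= width - 1 and y >= height - 1:
            break  # corner reached: declared the single-electron regime
    # Diagonal move of 2*K*L pixels past the confirmed transition line.
    return min(x + 2 * K * L, width - 1), min(y + 2 * K * L, height - 1)
```

The first step (locating the empty reference point) follows the same pattern with the shift directions reversed, periodic boundary conditions in x, and a ten-step confirmation counter.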
The shifting process for filling the dot is terminated whenever a corner of the diagram is reached, and the position at this corner is defined as the single-electron regime.

Note that the terminations due to reaching corners, which we apply in both steps of the shifting algorithm, are specific to the case of having finite pre-measured stability diagrams. Generally, when directly tuning the experimental device according to the shifting algorithm, the voltage limitations are not expected to be reached.

[1] D. Loss and D.P. DiVincenzo, "Quantum computation with quantum dots," Phys. Rev. A, 120–126 (1998).
[2] R. Hanson, L.P. Kouwenhoven, J.R. Petta, S. Tarucha, and L.M.K. Vandersypen, "Spins in few-electron quantum dots," Rev. Mod. Phys., 1217–1265 (2007).
[3] M. Veldhorst, J.C.C. Hwang, C.H. Yang, A.W. Leenstra, B. de Ronde, J.P. Dehollain, J.T. Muhonen, F.E. Hudson, K.M. Itoh, A. Morello, and A.S. Dzurak, "An addressable quantum dot qubit with fault-tolerant control-fidelity," Nat. Nanotechnol., 981–985 (2014).
[4] M. Veldhorst, C.H. Yang, J.C.C. Hwang, W. Huang, J.P. Dehollain, J.T. Muhonen, S. Simmons, A. Laucht, F.E. Hudson, K.M. Itoh, A. Morello, and A.S. Dzurak, "A two-qubit logic gate in silicon," Nature, 410–414 (2015).
[5] R. Maurand, X. Jehl, D. Kotekar-Patil, A. Corna, H. Bohuslavskyi, R. Laviéville, L. Hutin, S. Barraud, M. Vinet, M. Sanquer, and S. De Franceschi, "A CMOS silicon spin qubit," Nat. Commun., 13575 (2016).
[6] K. Takeda, J. Kamioka, T. Otsuka, J. Yoneda, T. Nakajima, M.R. Delbecq, S. Amaha, G. Allison, T. Kodera, S. Oda, and S. Tarucha, "A fault-tolerant addressable spin qubit in a natural silicon quantum dot," Sci. Adv., e1600694 (2016).
[7] J. Yoneda, K. Takeda, T. Otsuka, T. Nakajima, M.R. Delbecq, G. Allison, T. Honda, T. Kodera, S. Oda, Y. Hoshi, N. Usami, K.M. Itoh, and S. Tarucha, "A quantum-dot spin qubit with coherence limited by charge noise and fidelity higher than 99.9%," Nat. Nanotechnol., 102–106 (2018).
[8] T.F. Watson, S.G.J. Philips, E. Kawakami, D.R. Ward, P. Scarlino, M. Veldhorst, D.E. Savage, M.G. Lagally, M. Friesen, S.N. Coppersmith, M.A. Eriksson, and L.M.K. Vandersypen, "A programmable two-qubit quantum processor in silicon," Nature, 633–637 (2018).
[9] A. Frees, J.K. Gamble, D.R. Ward, R. Blume-Kohout, M.A. Eriksson, M. Friesen, and S.N. Coppersmith, "Compressed optimization of device architectures for semiconductor quantum devices," Phys. Rev. Applied, 024063 (2019).
[10] H. Moon, D.T. Lennon, J. Kirkpatrick, N.M. van Esbroeck, L.C. Camenzind, Liuqi Yu, F. Vigneau, D.M. Zumbühl, G.A.D. Briggs, M.A. Osborne, D. Sejdinovic, E.A. Laird, and N. Ares, "Machine learning enables completely automatic tuning of a quantum device faster than human experts," Nat. Commun., 4161 (2020).
[11] S.S. Kalantre, J.P. Zwolak, S. Ragole, X. Wu, N.M. Zimmerman, M.D. Stewart Jr., and J.M. Taylor, "Machine learning techniques for state recognition and auto-tuning in quantum dots," npj Quantum Inf., 6 (2019).
[12] J.P. Zwolak, T. McJunkin, S.S. Kalantre, J.P. Dodson, E.R. MacQuarrie, D.E. Savage, M.G. Lagally, S.N. Coppersmith, M.A. Eriksson, and J.M. Taylor, "Autotuning of double-dot devices in situ with machine learning," Phys. Rev. Applied, 034075 (2020).
[13] J. Darulová, M. Troyer, and M.C. Cassidy, "Evaluation of synthetic and experimental training data in supervised machine learning applied to charge state detection of quantum dots," arXiv:2005.08131 [cond-mat.mes-hall] (2020).
[14] J. Darulová, S.J. Pauka, N. Wiebe, K.W. Chan, G.C. Gardener, M.J. Manfra, M.C. Cassidy, and M. Troyer, "Autonomous tuning and charge-state detection of gate-defined quantum dots," Phys. Rev. Applied, 054005 (2020).
[15] R. Durrer, B. Kratochwil, J.V. Koski, A.J. Landig, C. Reichl, W. Wegscheider, T. Ihn, and E. Greplova, "Automated tuning of double quantum dots into specific charge states using neural networks," Phys. Rev. Applied, 054019 (2020).
[16] J.D. Teske, S.S. Humpohl, R. Otten, P. Bethke, P. Cerfontaine, J. Dedden, A. Ludwig, A.D. Wieck, and H. Bluhm, "A machine learning approach for automated fine-tuning of semiconductor spin qubits," Appl. Phys. Lett., 133102 (2019).
[17] N.M. van Esbroeck, D.T. Lennon, H. Moon, V. Nguyen, F. Vigneau, L.C. Camenzind, L. Yu, D.M. Zumbühl, G.A.D. Briggs, D. Sejdinovic, and N. Ares, "Quantum device fine-tuning using unsupervised embedding learning," New J. Phys., 095003 (2020).
[18] D.T. Lennon, H. Moon, L.C. Camenzind, Liuqi Yu, D.M. Zumbühl, G.A.D. Briggs, M.A. Osborne, E.A. Laird, and N. Ares, "Efficiently measuring a quantum device using machine learning," npj Quantum Inf., 79 (2019).
[19] V. Nguyen, S.B. Orbell, D.T. Lennon, H. Moon, F. Vigneau, L.C. Camenzind, L. Yu, D.M. Zumbühl, G.A.D. Briggs, M.A. Osborne, D. Sejdinovic, and N. Ares, "Deep reinforcement learning for efficient measurement of quantum devices," arXiv:2009.14825 [cond-mat.mes-hall] (2020).
[20] J.P. Zwolak, S.S. Kalantre, X. Wu, S. Ragole, and J.M. Taylor, "Qflow lite dataset: A machine-learning approach to the charge states in quantum dot experiments," PLoS One, 1–17 (2018).
[21] M.-A. Genest, Implémentation d'une méthode d'identification de l'occupation électronique d'une boîte quantique grâce à des techniques d'apprentissage profond, Master's thesis, Université de Sherbrooke (2020).
[22] L.M.K. Vandersypen, H. Bluhm, J.S. Clarke, A.S. Dzurak, R. Ishihara, A. Morello, D.J. Reilly, L.R. Schreiber, and M. Veldhorst, "Interfacing spin qubits in quantum dots and donors—hot, dense, and coherent," npj Quantum Inf., 34 (2017).
[23] B. Patra, R.M. Incandela, J.P.G. van Dijk, H.A.R. Homulle, L. Song, M. Shahmohammadi, R.B. Staszewski, A. Vladimirescu, M. Babaie, F. Sebastiano, and E. Charbon, "Cryo-CMOS circuits and systems for quantum computing applications," IEEE J. Solid-State Circuits, 309–321 (2018).
[24] L. Geck, A. Kruth, H. Bluhm, S. van Waasen, and S. Heinen, "Control electronics for semiconductor spin qubits," Quantum Sci. Technol., 015004 (2019).
[25] S.J. Pauka, K. Das, R. Kalra, A. Moini, Y. Yang, M. Trainer, A. Bousquet, C. Cantaloube, N. Dick, G.C. Gardner, M.J. Manfra, and D.J. Reilly, "A cryogenic interface for controlling many qubits," arXiv:1912.01299 [quant-ph] (2019).
[26] M. Lapointe-Major, O. Germain, J. Camirand Lemyre, D. Lachance-Quirion, S. Rochette, F. Camirand Lemyre, and M. Pioro-Ladrière, "Algorithm for automated tuning of a quantum dot into the single-electron regime," Phys. Rev. B, 085301 (2020).
[27] L. Chua, "Memristor-the missing circuit element," IEEE Trans. Circuit Theory, 507–519 (1971).
[28] A. Amirsoleimani, F. Alibart, V. Yon, J. Xu, M.R. Pazhouhandeh, S. Ecoffey, Y. Beilliard, R. Genov, and D. Drouin, "In-memory vector-matrix multiplication in monolithic complementary metal–oxide–semiconductor-memristor integrated circuits: Design choices, challenges, and perspectives," Adv. Intell. Syst., 2000115 (2020).
[29] C. Sung, H. Hwang, and I.K. Yoo, "Perspective: A review on memristive hardware for neuromorphic computation," J. Appl. Phys., 151903 (2018).
[30] F. Alibart, E. Zamanidoost, and D.B. Strukov, "Pattern classification by memristive crossbar circuits using ex situ and in situ training," Nat. Commun., 2072 (2013).
[31] F. Merrikh Bayat, M. Prezioso, B. Chakrabarti, H. Nili, I. Kataeva, and D. Strukov, "Implementation of multilayer perceptron network with highly uniform passive memristive crossbar circuits," Nat. Commun., 2331 (2018).
[32] M. Hu, C.E. Graves, C. Li, Y. Li, N. Ge, E. Montgomery, N. Davila, H. Jiang, R.S. Williams, J.J. Yang, Q. Xia, and J.P. Strachan, "Memristor-based analog computation and neural network classification with a dot product engine," Adv. Mater., 1705914 (2018).
[33] A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, "Memory devices and applications for in-memory computing," Nat. Nanotechnol., 529–544 (2020).
[34] S. Rochette, M. Rudolph, A.-M. Roy, M.J. Curry, G.A. Ten Eyck, R.P. Manginell, J.R. Wendt, T. Pluym, S.M. Carr, D.R. Ward, M.P. Lilly, M.S. Carroll, and M. Pioro-Ladrière, "Quantum dots with split enhancement gate tunnel barrier control," Appl. Phys. Lett., 083101 (2019).
[35] C.H. Yang, W.H. Lim, F.A. Zwanenburg, and A.S. Dzurak, "Dynamically controlled charge sensing of a few-electron silicon quantum dot," AIP Adv., 042111 (2011).
[36] D.P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv:1412.6980 [cs.LG] (2014).
[37] G.C. Adam, A. Khiat, and T. Prodromakis, "Challenges hindering memristive neuromorphic hardware from going mainstream," Nat. Commun., 5267 (2018).
[38] C. Wang, D. Feng, W. Tong, J. Liu, Z. Li, J. Chang, Y. Zhang, B. Wu, J. Xu, W. Zhao, Y. Li, and R. Ren, "Cross-point resistive memory: Nonideal properties and solutions," ACM Trans. Des. Autom. Electron. Syst., 4 (2019).
[39] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, "PyTorch: An imperative style, high-performance deep learning library," in Advances in Neural Information Processing Systems 32, edited by H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Curran Associates, Inc., 2019) pp. 8024–8035.
[40] C.R. Harris, K.J. Millman, S.J. van der Walt, R. Gommers, P. Virtanen, D. Cournapeau, E. Wieser, J. Taylor, S. Berg, N.J. Smith, R. Kern, M. Picus, S. Hoyer, M.H. van Kerkwijk, M. Brett, A. Haldane, J. Fernández del Río, M. Wiebe, P. Peterson, P. Gérard-Marchant, K. Sheppard, T. Reddy, W. Weckesser, H. Abbasi, C. Gohlke, and T.E. Oliphant, "Array programming with NumPy," Nature, 357–362 (2020).
[41] J.D. Hunter, "Matplotlib: A 2D graphics environment," Comput. Sci. Eng. 9, 90–95 (2007).