Scaling of Multi-contact Phase Change Device for Toggle Logic Operations

Abstract

Scaling of two-dimensional six-contact phase change devices that can perform toggle logic operations is analyzed through 2D electrothermal simulations with dynamic materials modeling, integrated with CMOS access circuitry. Toggle configurations are achieved through a combination of isolation of some contacts from others using amorphous regions and coupling between different regions via thermal cross-talk. Use of thermal cross-talk as a coupling mechanism in a multi-contact device in the memory layer allows implementation of analog routing and digital logic operations at a significantly lower transistor count, with the added benefit of non-volatility. Simulation results show approximately linear improvement in peak current and voltage requirements with thickness scaling.



R. S. Khan,* N. H. Kan’an, J. Scoggin, H. Silva, and A. Gokirmak

Department of Electrical and Computer Engineering, University of Connecticut, Storrs, Connecticut 06269, USA


I. Introduction

Phase change memory (PCM) is a state-of-the-art high-density, high-speed, high-endurance non-volatile memory technology that is scalable to ~4 nm. PCM fills the performance-capacity gap between DRAM and flash memory, with speeds comparable to DRAM and capacity comparable to flash memory. PCM utilizes the large resistivity contrast (~10 at 300 K) between the disordered amorphous (high resistivity) and ordered crystalline (low resistivity) phases of chalcogenide materials such as Ge₂Sb₂Te₅ (GST) to store information. In recent years, there has been a growing interest in the collocation of memory and processor, as well as in-memory computation, to eliminate the performance bottleneck in von Neumann architectures, especially for machine learning applications. The dense multi-layer crossbar architecture of PCM is compatible with conventional CMOS circuitry through back-end-of-line integration, providing the opportunity to integrate 100s of GBs of high-speed non-volatile memory on top of the processor (FIG. 1). The CMOS real estate needed for memory access scales with the memory capacity. Hence, approaches that reduce the CMOS footprint for routing and logic functions are highly desired. Multi-contact phase change devices integrated with CMOS can significantly reduce the area requirements and provide the additional benefit of non-volatility. The multi-contact phase change logic devices utilize self-heating between pairs of contacts to electrically isolate or connect other contacts, through melting and quenching a portion of the phase change material (T_melt ~ 873 K for GST) or heating the material above the glass transition (T_glass ~ 420 K for GST). Thermal cross-talk between the different regions of the phase change material is utilized as a coupling mechanism to implement logic functions with fewer CMOS elements.
Multiplexers or flip-flops can be implemented by integrating a single multi-contact device with 5 MOSFETs, whereas sixteen transistors are needed to implement a flip-flop and fourteen transistors are needed to implement a 2-to-1 multiplexer using conventional CMOS circuitry. Scaling of the phase change element reduces the current requirements for the MOSFETs, further reducing the CMOS real estate, improving speed, and reducing power requirements. In this work, we report the results of a computational study on a 6-contact phase change device, integrated with 5 nMOSFETs, configured as a toggle flip-flop or a multiplexer (FIG. 3). The effects of scaling on power consumption and speed are reported as well.

II. Simulation Setup

We use 2-D finite element simulations in COMSOL Multiphysics, following the framework described in Woods et al., to demonstrate the functionality and study the scalability of multi-contact devices and the complementing electronic circuitry. The model simultaneously captures amorphization-crystallization dynamics, heat diffusion, electrical current flow, and thermoelectric effects. Grain orientation and crystallinity are simultaneously tracked using a crystal density vector $\vec{CD}$. The components of $\vec{CD}$, $CD_1$ and $CD_2$, track grain orientation. The one-norm $\|\vec{CD}\| = CD_1 + CD_2$ tracks local crystallinity: $\|\vec{CD}\| = 1$ or 0 corresponds to fully crystalline or fully amorphous states, respectively. A rate equation is used to track local crystallinity:

$$\frac{\mathrm{d}CD_i}{\mathrm{d}t} = \mathrm{Nucleation} + \mathrm{Growth} + \mathrm{Amorphization}, \tag{1}$$

where $CD_i$ is a component of $\vec{CD}$. The nucleation term initiates nucleation at random locations. The growth term is responsible for increasing $\|\vec{CD}\|$ to 1 once nucleation is initiated. The amorphization term rapidly brings $\|\vec{CD}\|$ to 0 at locations where the temperature is greater than the melting temperature. All three terms in equation (1) depend on material phase and temperature. Equation (1) is coupled with electric current and heat transfer physics:

$$\nabla \cdot J = \nabla \cdot (-\sigma \nabla V - \sigma S \nabla T) = 0, \tag{2}$$

$$d\,C_P \frac{\mathrm{d}T}{\mathrm{d}t} - \nabla \cdot (k \nabla T) = -\nabla V \cdot J - \nabla \cdot (J S T) + Q_H, \tag{3}$$

where $J$ is the current density, $\sigma$ is the electrical conductivity, $V$ is the electric potential, $S$ is the Seebeck coefficient, $d$ is the mass density, $C_P$ is the specific heat, and $k$ is the thermal conductivity. $Q_H$ accounts for the latent heat of phase change ($\Delta H_{a,c}$):

$$Q_H = \frac{\mathrm{d}\|\vec{CD}\|}{\mathrm{d}t} \times \Delta H_{a,c}(T) \times d. \tag{4}$$

We use GST material parameters for the phase change material, including temperature and phase dependent electrical conductivity, Seebeck coefficients, thermal conductivities, and specific heats. Electrical conductivity is also field dependent. A constant mass density of 6.2 g/cm³ is used. Although GST parameters are used here, the device concept can be implemented with other phase change materials as well. Fixed out-of-plane depths of 20 nm, 10 nm, and 5 nm are used for the simulations, as indicated.

FIG. 1. Schematic of PCM crossbar array integrated on top of CPU. The phase change and CMOS logic part control routing of data and logic operations.

* Corresponding author. Email: [email protected]

III. Device configuration

The thermal boundary conditions, together with the thickness and properties of the phase change material, determine the thermal losses and the time scales for heating and cooling, and hence the device speed and power requirements. The GST element is assumed to be passivated by a SiO₂ layer, and the thermal boundary conditions are set to 293 K at the SiO₂ boundaries, located 250 nm from the center (10x the GST radius). In experimental phase change devices, a SiN layer is typically deposited under the oxide passivation (FIG. 2). n-channel MOSFETs with width x length = 120 nm x 22 nm are used as access devices, sized to provide sufficient current for melting and amorphization. We used TiN contacts with 10 nm radius, distributed uniformly around the circular GST patch (FIG. 3). Three of the contacts are configured for writing (W1, W2, and W3) and the other three for reading (R1, R2, and R3), forming two write paths (W1W2 and W1W3) and two read paths. All six contacts are accessed using nFETs. The gates of the read nFETs are connected to V_read (V_read is high during read), and the gates of the write nFETs are connected to V_write (V_write is high during write). The write circuitry is identical for the flip-flop and the multiplexer implementations. Contacts W2 and W3 are electrically shorted and connected to ground through a single nFET. W1 is connected to V_DD (positive supply voltage) using another nFET. A short pulse at V_write results in one of the write paths being amorphized, which gives the different device configurations. The read circuitry is different for the two implementations: for the flip-flop, outputs Q and Qʹ are taken across resistors R_L connected to two of the read contacts, whereas for the multiplexer, inputs X1 and X2 are applied to two of the read contacts and the output Y is taken across a resistor connected to the remaining read contact.

FIG. 2. Fabrication schematics for a bottom-contacted multi-contact device: (a) growth of SiO₂ on Si, (b) via formation for 10 nm radius bottom contacts using optical lithography and reactive ion etching (RIE), (c) metal (TiN) deposition and planarization, (d) GST sputter deposition, (e) patterning of 25 nm radius GST patch using optical lithography and RIE, (f) deposition of Si₃N₄ for passivation.

FIG. 3. Schematic of six-contact toggle flip-flop (a) and toggle multiplexer (b). The distinct regions in the GST represent different crystalline grains prior to initialization of the device.

IV. Results and Discussion

We start with the device in a fully crystalline state (FIG. 4a). To initialize the device to one of the configurations, a write pulse is applied (V_write is high). The two write paths (W1W2 and W1W3) begin drawing current (FIG. 5 and FIG. 8, top); one of the paths draws a progressively larger proportion of the current (FIG. 4b) and eventually melts due to thermal runaway. The 'winning' path is determined by the path resistance mismatch, which in turn is determined by the initial random grain map and process variations. Adding a small series resistance to one of the paths deterministically sets the first 'winner'. W1W3 melts in this case (FIG. 4c) and amorphizes after the pulse is terminated (FIG. 4d). The device takes ~50 ns to thermalize (return to room temperature), after which a read operation is performed. One read path is now blocked by the amorphous strip W1W3, while the other read path is crystalline and thus draws much more current during read (FIG. 4e). When a subsequent write pulse is applied, W1W2 draws most of the current and eventually melts because W1W3 is initially amorphous (FIG. 4f). As W1W2 melts, the amorphous GST in the W1W3 path is heated above the crystallization temperature (FIG. 4g) and crystallizes: W1W2 is now amorphous and W1W3 is crystalline (FIG. 4h), resulting in a toggled state in which the previously conducting read path is blocked and the previously blocked read path is crystalline. The newly unblocked read path now draws substantially more current than the blocked one during read.

A. Toggle Flip-flop

In the toggle flip-flop, the output voltage toggles between low and high each time the input is high. If the input is low, the output remains in its current state; if the input is high, the output switches from low to high, or from high to low. In our proposed device, the input is V_write. When V_write is high, one of the write paths melts and amorphizes while the other crystallizes, resulting in a toggled state. As the read path that is blocked by the amorphous region draws much less current than the opposite read path, the output voltages Q and Qʹ assume opposite values (when Q is high Qʹ is low, and vice versa). The amorphized path and the output voltages thus toggle with each write pulse (FIG. 5). Q and Qʹ can be connected to a comparator or to the gate of another transistor for rail-to-rail output. The output voltages Q and Qʹ depend on the read resistor R_L. A higher value of R_L results in higher output voltages but a reduced output ratio (FIG. 6). A higher R_L reduces both V_GS and V_DS for the nFET. For the parameters used in this study and a GST out-of-plane thickness of 20 nm, the device can be read 5 ns after the write operation with a 10x contrast between the read terminals, and the contrast increases to 60x after the device cools down to room temperature (FIG. 7). The ratio of the output voltages increases during cool-down because the resistivity of amorphous GST decreases exponentially with temperature, so the amorphous path becomes more resistive as it cools. Thus, the cooling rate, the read resistors, and the sensitivity of the comparator all influence the speed at which the device can operate.

FIG. 4. Snapshots of electro-thermal simulation of the proposed device during initialization (a-d) and the first toggle operation (e-i). The CD map (left) shows the crystallinity profile of the device at different time steps, with molten and amorphous material indicated by pink and peach, respectively. The temperature map (center) shows the temperature throughout the device. The conductivity map (right) shows the conductivity of GST, where conductivity is lowest for amorphous (dark blue) and highest for melt (dark red).

B. Toggle Multiplexers

In a multiplexer, one of several input signals is forwarded to the output based on the control pin configuration: the control pin decides which input goes to the output. For the proposed toggle multiplexer, two signals (X1 and X2) are applied as inputs and there is one output (Y). V_write is the control signal. Application of a write pulse (V_write high) toggles the input signal that goes to the output. With the application of the first write pulse, one write path melts and amorphizes, blocking the read path of one input and connecting the output Y to the other input. Y then follows the connected input, whether it is high or low, irrespective of the state of the blocked input (FIG. 8). With the consecutive write pulse, the other write path is amorphized, blocking the previously connected read path and switching Y to the other input, irrespective of the state of the input that is now blocked. Thus, each write pulse toggles the input signal that is forwarded to the output.

C. Discussion

As the device thickness is scaled down, the GST volume that must be amorphized decreases, which reduces energy and power consumption. The simulations confirm this: the maximum power consumption of the device decreases from 585 µW to 92 µW as the thickness is scaled from 20 nm down to 5 nm. The supply voltage V_DD, the write current I_write (FIG. 9), and the energy consumption all decrease with scaling. Parameters for different thicknesses are compared in Table I.

The speed of the device is determined by the distance between write contacts (a shorter distance between contacts results in faster re-crystallization of amorphous paths, and thus higher speed), the placement of the thermal anchors (the closer the thermal anchors, the less time it takes to cool down to room temperature, though at the cost of additional power for write operations), and the size of the write FETs (larger FETs provide more current at the cost of an increased footprint). Although we simulated the device concept with a 25 nm radius GST patch, the device can be scaled even further, limited by the fabrication of the TiN contacts and by the heat transfer within the GST patch, which must allow for sufficient but not excessive thermal cross-talk. A smaller GST patch would reduce the power consumption and required FET size as well, or increase the speed of the device for the same FET size, at a reliability cost in terms of thermal cross-talk control. For example, the minimum time required to successfully amorphize a write path decreases from 6 ns to 5 ns as the GST radius is decreased from 35 nm to 25 nm.

For both the flip-flop and the multiplexer, the I_write through the path that is recrystallizing (I_W1W3 for the first write pulse after initialization and I_W1W2 for the second write pulse) increases during the pulse duration. This is the combined effect of the decrease in amorphous resistivity and the recrystallization of the previously amorphous write path due to the increase in temperature. With the resistivity of the path decreasing, it starts drawing current. If the pulse is not terminated in due time, both paths will eventually melt, resulting in device failure (the device can be returned to an operational state by an annealing/set pulse). Thus, careful consideration of pulse duration is needed to ensure proper operation of the device. We did not observe any degradation in the simulated output voltages with repeated write pulses. In experimental devices, void formation at write contacts during higher cycle counts may be a reliability concern.

FIG. 5. Currents in the two write paths W1W2 and W1W3 (top) and output voltages (bottom) during the initialization pulse followed by two write pulses for the toggle flip-flop. The output voltages (Q and Qʹ) toggle between V_low and V_high after each write pulse. The value of R_L is 10 kΩ.

FIG. 6. Output voltages for different values of R_L for the toggle flip-flop (spheres, left axis). Low and high values are represented using red and blue spheres, respectively. The right axis (stars) shows the ratio of V_high to V_low.

FIG. 7. Ratio of V_high to V_low for the toggle flip-flop during 5 ns read pulses after amorphization. The first read is performed 5 ns after the termination of the write pulse. The change in ratio during the read pulse is due to the change in temperature of the device. The ratio stabilizes after 40 ns following the termination of the write pulse.

Even in ideal circumstances, the proposed device will be slower than its CMOS implementation: ~0.3 ns for CMOS writes vs. ~10 ns for the proposed device (FIG. 7, assuming the comparator has the sensitivity to detect a ratio of ~30x between output voltages). The configurations presented in FIG. 3 do not provide rail-to-rail functionality, which can be addressed by altering the access circuitry or the device configuration. For example, using an inverter instead of a resistor in the read circuit for the toggle multiplexer can achieve rail-to-rail voltage (FIG. 10). A conventional CMOS toggle flip-flop and 2-to-1 multiplexer require 16 and 14 transistors, respectively. The presented device uses 5 transistors to achieve similar functionality, thus reducing the FET count significantly and reducing the necessary CMOS real estate by ~2x.
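The toggle behaviour and the over-long-pulse failure mode discussed above can be summarized in a small behavioural model. The following Python sketch is only an abstraction of the logic, not of the electrothermal physics. The class name `ToggleDevice`, its methods, and in particular the mapping between the amorphous write path and the logic values (which path corresponds to Q = 1, and which input the multiplexer forwards) are assumptions made for illustration; the text above does not fix that mapping.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

WRITE_PATHS = ("W1W2", "W1W3")


@dataclass
class ToggleDevice:
    """Hypothetical behavioural model: tracks which write path is amorphous.

    amorphous is None while the device is fully crystalline (before
    initialization or after an annealing/set pulse).
    """
    amorphous: Optional[str] = None
    failed: bool = False

    def write(self, first_winner: str = "W1W3", overlong: bool = False) -> None:
        """Apply one write pulse (V_write high)."""
        if first_winner not in WRITE_PATHS:
            raise ValueError("unknown write path: " + first_winner)
        if self.failed:
            return  # modelling assumption: a failed device ignores further pulses
        if overlong:
            self.failed = True  # both paths melt if the pulse is not terminated in time
            return
        if self.amorphous is None:
            # Initialization: the path with the resistance advantage wins.
            self.amorphous = first_winner
        else:
            # Toggle: the crystalline path melts, the amorphous one recrystallizes.
            self.amorphous = WRITE_PATHS[1 - WRITE_PATHS.index(self.amorphous)]

    def anneal(self) -> None:
        """Annealing/set pulse: return to the fully crystalline state."""
        self.amorphous = None
        self.failed = False

    def _require_valid(self) -> None:
        if self.failed or self.amorphous is None:
            raise RuntimeError("device is not in a valid toggle configuration")

    def flip_flop_outputs(self) -> Tuple[int, int]:
        """Return (Q, Q'). Assumed mapping: Q = 1 when W1W2 is amorphous."""
        self._require_valid()
        q = 1 if self.amorphous == "W1W2" else 0
        return q, 1 - q

    def mux(self, x1: int, x2: int) -> int:
        """Return Y. Assumed mapping: Y = X1 when W1W3 is amorphous, else X2."""
        self._require_valid()
        return x1 if self.amorphous == "W1W3" else x2


if __name__ == "__main__":
    dev = ToggleDevice()
    for pulse in range(3):
        dev.write()
        print(pulse, dev.amorphous, dev.flip_flop_outputs(), dev.mux(1, 0))
```

In this abstraction the only stored state is the identity of the amorphous write path, which reflects the non-volatility of the device: no supply is needed to hold it between pulses.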

V. Summary

A computational analysis of a six-contact device with toggle flip-flop or multiplexing functionality, using only 5 transistors (compared to 16 and 14 in a conventional flip-flop and multiplexer, respectively) and offering non-volatility, has been presented. 2-D finite element simulations were performed using temperature dependent material parameters and accounting for thermoelectric effects. The pulse duration and required transistor sizes depend on device dimensions, contact spacing, and the phase change material. The presented approach of integrating a phase change element with CMOS brings additional functionality (e.g. logic, routing) to the memory layer to free up precious CPU resources. The devices require no power dissipation to hold information and hence are also suitable for intermittent power applications.

Table I. Write/read pulse parameters

Out of plane thickness (nm)                        20      10      5
V_DD (V)                                           3       2.2     1.65
Peak write voltage (V)                             3       2.2     1.65
Rise/fall time for read and write pulses (ns)      1       1       1
Read/write pulse duration (ns)                     5       5       5
V_read during read operation (V)                   0.5     0.5     0.5
Peak write current (µA)                            ~193    101     56
Peak read current (µA)                             6       ~4.7    ~4.2
Maximum power (µW)                                 585     222     92
Write energy (pJ)                                  ~2.9    ~1.1    ~0.46

FIG. 8. Write currents (top), input voltages (middle), and output voltages (bottom) for the toggle multiplexer. For read operations, four combinations of X1 and X2 are tested (X1, X2): (0, 0), (0, 1), (1, 0), (1, 1). The output follows the state of one of the inputs (determined by the write pulse) regardless of the state of the other input.

FIG. 9. Peak current during write operation (red spheres, left axis) and supply voltage (blue spheres, right axis) for different out-of-plane GST thicknesses. Both the current and the voltage decrease with decreasing thickness.
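The entries of Table I can be cross-checked with simple arithmetic. The sketch below assumes that the maximum power is approximately the product of V_DD and the peak write current, and that the write energy is approximately the maximum power multiplied by the 5 ns pulse duration (a rectangular-pulse estimate). Both relations are assumptions made here for the check, not statements of how the tabulated values were obtained. The function and variable names are illustrative.

```python
# Values transcribed from Table I, keyed by out-of-plane thickness in nm.
TABLE_I = {
    20: {"v_dd": 3.0, "i_write_uA": 193.0, "p_max_uW": 585.0, "e_write_pJ": 2.9},
    10: {"v_dd": 2.2, "i_write_uA": 101.0, "p_max_uW": 222.0, "e_write_pJ": 1.1},
    5: {"v_dd": 1.65, "i_write_uA": 56.0, "p_max_uW": 92.0, "e_write_pJ": 0.46},
}
PULSE_DURATION_NS = 5.0


def estimate_power_uW(v_dd: float, i_write_uA: float) -> float:
    """Assumed estimate: P = V_DD * I_peak (V * uA gives uW)."""
    return v_dd * i_write_uA


def estimate_energy_pJ(p_uW: float, t_ns: float) -> float:
    """Assumed estimate: E = P * t (uW * ns gives fJ; divide by 1000 for pJ)."""
    return p_uW * t_ns * 1e-3


if __name__ == "__main__":
    for thickness_nm, row in TABLE_I.items():
        p = estimate_power_uW(row["v_dd"], row["i_write_uA"])
        e = estimate_energy_pJ(p, PULSE_DURATION_NS)
        # Peak write current per nm of thickness, a rough linearity indicator.
        i_per_nm = row["i_write_uA"] / thickness_nm
        print(
            f"{thickness_nm:>2} nm: P ~ {p:.1f} uW (table {row['p_max_uW']}), "
            f"E ~ {e:.2f} pJ (table {row['e_write_pJ']}), "
            f"I/thickness ~ {i_per_nm:.2f} uA/nm"
        )
```

Under these assumptions the estimates agree with the tabulated power and energy to within about 2%, and the peak write current per unit thickness stays within roughly 10 to 11 µA/nm, consistent with the approximately linear scaling reported.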

Acknowledgements

Raihan Khan performed the simulations, analysis and writing of the manuscript supported by the U.S. National Science Foundation (NSF) through ECCS 1711626 award. Nadim H. Kan’an formulated the idea and performed preliminary simulations supported by NSF ECCS CAREER 1150960 award. Ali Gokirmak and Helena Silva contributed to the design of experiments, analysis and writing of the manuscript.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

References

B.C. Lee, P. Zhou, J. Yang, Y. Zhang, B. Zhao, E. Ipek, O. Mutlu, and D. Burger, IEEE Micro, 143 (2010).
S. Raoux, G.W. Burr, M.J. Breitwisch, C.T. Rettner, Y.C. Chen, R.M. Shelby, M. Salinga, D. Krebs, S.H. Chen, H.L. Lung, and C.H. Lam, IBM J. Res. Dev., 465 (2008).
G.W. Burr, M.J. BrightSky, A. Sebastian, H.-Y. Cheng, J.-Y. Wu, S. Kim, N.E. Sosa, N. Papandreou, H.-L. Lung, H. Pozidis, E. Eleftheriou, and C.H. Lam, IEEE J. Emerg. Sel. Top. Circuits Syst., 146 (2016).
F. Xiong, E. Yalon, A. Behnam, C.M. Neumann, K.L. Grosse, S. Deshmukh, and E. Pop, in (2016), pp. 4.1.1-4.1.4.
W. Kim, M. Brightsky, T. Masuda, N. Sosa, S. Kim, R. Bruce, F. Carta, G. Fraczak, H.Y. Cheng, A. Ray, Y. Zhu, H.L. Lung, K. Suu, and C. Lam, in Tech. Dig. - Int. Electron Devices Meet. IEDM (2016), pp. 4.2.1-4.2.4.
H.-S.P. Wong, S. Raoux, S. Kim, J. Liang, J.P. Reifenberg, B. Rajendran, M. Asheghi, and K.E. Goodson, Proc. IEEE, 2201 (2010).
A. Sebastian, M. Le Gallo, G.W. Burr, S. Kim, M. Brightsky, and E. Eleftheriou, J. Appl. Phys., 111101 (2018).
J. Backus, Commun. ACM, 613 (1978).
D. Kau, S. Tang, I. V. Karpov, R. Dodge, B. Klehn, J.A. Kalb, J. Strand, A. Diaz, N. Leung, J. Wu, S. Lee, T. Langtry, K.W. Chang, C. Papagianni, J. Lee, J. Hirst, S. Erra, E. Flores, N. Righos, H. Castro, and G. Spadini, in Tech. Dig. - Int. Electron Devices Meet. IEDM (2009).
N. Kan’an, H. Silva, and A. Gokirmak, IEEE J. Electron Devices Soc., 72 (2016).
N.H. Kanan, Phase Change Devices for Nonvolatile Logic, University of Connecticut, 2017.
N.H. Kan’an, H. Silva, and A. Gokirmak, in Device Res. Conf. - Conf. Dig. DRC (2014), pp. 145–146.
M. Cassinerio, N. Ciocchini, and D. Ielmini, Adv. Mater., 5975 (2013).
R.S. Khan, N.H. Kan’an, J. Scoggin, H. Silva, and A. Gokirmak, in Device Res. Conf. - Conf. Dig. DRC (2019), pp. 99–100.
R.S. Khan, N. Kan’an, J. Scoggin, Z. Woods, L. Adnane, A. Gorbenko, A. Gokirmak, and H. Silva, in Mater. Res. Soc. Fall Meet. (Boston, MA, 2017), p. EM07.13.02.
R.S. Khan, N.H. Kan’an, J. Scoggin, Z. Woods, H. Silva, and A. Gokirmak, in Mater. Res. Soc. Spring Meet. (Phoenix, AZ, 2019), p. EP08.08.05.
K. Cil, F. Dirisaglik, L. Adnane, M. Wennberg, A. King, A. Faraclas, M.B. Akbulut, Y. Zhu, C. Lam, A. Gokirmak, and H. Silva, IEEE Trans. Electron Devices, 433 (2013).
A. Redaelli, M. Boniardi, A. Ghetti, U. Russo, C. Cupeta, S. Lavizzari, A. Pirovano, and G. Servalli, in Tech. Dig. - Int. Electron Devices Meet. IEDM (2013), pp. 30.4.1-30.4.4.
COMSOL Multiphysics 5.3 (COMSOL Inc., Burlington, MA, 2017).
Z. Woods and A. Gokirmak, IEEE Trans. Electron Devices, 4466 (2017).
Z. Woods, J. Scoggin, A. Cywar, L. Adnane, and A. Gokirmak, IEEE Trans. Electron Devices, 4472 (2017).
J. Scoggin, R.S. Khan, H. Silva, and A. Gokirmak, Appl. Phys. Lett., 193502 (2018).
A. Faraclas, G. Bakan, L. Adnane, F. Dirisaglik, N.E. Williams, A. Gokirmak, and H. Silva, IEEE Trans. Electron Devices, 372 (2014).
L. Adnane, F. Dirisaglik, A. Cywar, K. Cil, Y. Zhu, C. Lam, A.F.M. Anwar, A. Gokirmak, and H. Silva, J. Appl. Phys., 125104 (2017).
A. Faraclas, N. Williams, A. Gokirmak, and H. Silva, IEEE Electron Device Lett., 1737 (2011).
W.K. Njoroge, H.-W. Wöltgens, and M. Wuttig, J. Vac. Sci. Technol. A Vacuum, Surfaces, Film., 230 (2002).
S. Brown and Z. Vranesic, McGraw Hill High. Educ. (2005).

FIG. 10. Toggle multiplexer with modified read circuitry for rail-to-rail output voltage (a). Complement of output voltage Y for different X1 and X2 values (b). Simulation is performed in LTspice.