Featured Researches

Emerging Technologies

A Framework to Explore Workload-Specific Performance and Lifetime Trade-offs in Neuromorphic Computing

Neuromorphic hardware with non-volatile memory (NVM) can implement machine learning workload in an energy-efficient manner. Unfortunately, certain NVMs such as phase change memory (PCM) require high voltages for correct operation. These voltages are supplied from an on-chip charge pump. If the charge pump is activated too frequently, its internal CMOS devices do not recover from stress, accelerating their aging and leading to negative bias temperature instability (NBTI) generated defects. Forcefully discharging the stressed charge pump can lower the aging rate of its CMOS devices, but makes the neuromorphic hardware unavailable to perform computations while its charge pump is being discharged. This negatively impacts performance such as latency and accuracy of the machine learning workload being executed. In this paper, we propose a novel framework to exploit workload-specific performance and lifetime trade-offs in neuromorphic computing. Our framework first extracts the precise times at which a charge pump in the hardware is activated to support neural computations within a workload. This timing information is then used with a characterized NBTI reliability model to estimate the charge pump's aging during the workload execution. We use our framework to evaluate workload-specific performance and reliability impacts of using 1) different SNN mapping strategies and 2) different charge pump discharge strategies. We show that our framework can be used by system designers to explore performance and reliability trade-offs early in the design of neuromorphic hardware such that appropriate reliability-oriented design margins can be set.

Read more
Emerging Technologies

A Graph Partitioning Algorithm with Application in Synthesizing Single Flux Quantum Logic Circuits

In this paper, a new graph partitioning problem is introduced. The depth of each part is constrained, i.e., the node count in the longest path of the corresponding sub-graph is no more than a predetermined positive integer value p. An additional constraint is enforced such that each part contains only nodes selected from consecutive levels in the graph. The problem is therefore transformed into a Depth-bounded Levelized Graph Partitioning (DLGP) problem, which is solved optimally using a dynamic programming algorithm. As an example application, we have shown that DLGP can effectively generate timing-correct circuit solutions for Single Flux Quantum (SFQ) logic, which is a magnetic-pulse-based, gate-level pipelined superconductive computing fabric. Experimental results confirm that DLGP generates circuits with considerably lower path balancing overheads compared with a baseline full-path-balancing approach. For example, the balancing overhead (a critical measure of quality metric) for the SFQ circuit realization in terms of D-Flip-Flop count is reduced by 3.61 times on average for 10 benchmark circuit, given p=5.

Read more
Emerging Technologies

A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism

Memristive neural networks (MNNs), which use memristors as neurons or synapses, have become a hot research topic recently. However, most memristors are not compatible with mainstream integrated circuit technology and their stabilities in large-scale are not very well so far. In this paper, a hardware friendly MNN circuit is introduced, in which the memristive characteristics are implemented by digital integrated circuit. Through this method, spike timing dependent plasticity (STDP) and unsupervised learning are realized. A weight sharing mechanism is proposed to bridge the gap of network scale and hardware resource. Experiment results show the hardware resource is significantly saved with it, maintaining good recognition accuracy and high speed. Moreover, the tendency of resource increase is slower than the expansion of network scale, which infers our method's potential on large scale neuromorphic network's realization.

Read more
Emerging Technologies

A Hybrid FeMFET-CMOS Analog Synapse Circuit for Neural Network Training and Inference

An analog synapse circuit based on ferroelectric-metal field-effect transistors is proposed, that offers 6-bit weight precision. The circuit is comprised of volatile least significant bits (LSBs) used solely during training, and non-volatile most significant bits (MSBs) used for both training and inference. The design works at a 1.8V logic-compatible voltage, provides 10^10 endurance cycles, and requires only 250ps update pulses. A variant of LeNet trained with the proposed synapse achieves 98.2% accuracy on MNIST, which is only 0.4% lower than an ideal implementation of the same network with the same bit precision. Furthermore, the proposed synapse offers improvements of up to 26% in area, 44.8% in leakage power, 16.7% in LSB update pulse duration, and two orders of magnitude in endurance cycles, when compared to state-of-the-art hybrid synaptic circuits. Our proposed synapse can be extended to an 8-bit design, enabling a VGG-like network to achieve 88.8% accuracy on CIFAR-10 (only 0.8% lower than an ideal implementation of the same network).

Read more
Emerging Technologies

A Logic Simplification Approach for Very Large Scale Crosstalk Circuit Designs

Crosstalk computing, involving engineered interference between nanoscale metal lines, offers a fresh perspective to scaling through co-existence with CMOS. Through capacitive manipulations and innovative circuit style, not only primitive gates can be implemented, but custom logic cells such as an Adder, Subtractor can be implemented with huge gains. Our simulations show over 5x density and 2x power benefits over CMOS custom designs at 16nm [1]. This paper introduces the Crosstalk circuit style and a key method for large-scale circuit synthesis utilizing existing EDA tool flow. We propose to manipulate the CMOS synthesis flow by adding two extra steps: conversion of the gate-level netlist to Crosstalk implementation friendly netlist through logic simplification and Crosstalk gate mapping, and the inclusion of custom cell libraries for automated placement and layout. Our logic simplification approach first converts Cadence generated structured netlist to Boolean expressions and then uses the majority synthesis tool to obtain majority functions, which is further used to simplify functions for Crosstalk friendly implementations. We compare our approach of logic simplification to that of CMOS and majority logic-based approaches. Crosstalk circuits share some similarities to majority synthesis that are typically applied to Quantum Cellular Automata technology. However, our investigation shows that by closely following Crosstalk's core circuit styles, most benefits can be achieved. In the best case, our approach shows 36% density improvements over majority synthesis for MCNC benchmark.

Read more
Emerging Technologies

A Low-Power Domino Logic Architecture for Memristor-Based Neuromorphic Computing

We propose a domino logic architecture for memristor-based neuromorphic computing. The design uses the delay of memristor RC circuits to represent synaptic computations and a simple binary neuron activation function. Synchronization schemes are proposed for communicating information between neural network layers, and a simple linear power model is developed to estimate the design's energy efficiency for a particular network size. Results indicate that the proposed architecture can achieve 0.61 fJ per classification per component (neurons and synapses) and outperforms other designs in terms of energy per % accuracy.

Read more
Emerging Technologies

A Machine Learning Accelerator In-Memory for Energy Harvesting

There is increasing demand to bring machine learning capabilities to low power devices. By integrating the computational power of machine learning with the deployment capabilities of low power devices, a number of new applications become possible. In some applications, such devices will not even have a battery, and must rely solely on energy harvesting techniques. This puts extreme constraints on the hardware, which must be energy efficient and capable of tolerating interruptions due to power outages. Here, as a representative example, we propose an in-memory support vector machine learning accelerator utilizing non-volatile spintronic memory. The combination of processing-in-memory and non-volatility provides a key advantage in that progress is effectively saved after every operation. This enables instant shut down and restart capabilities with minimal overhead. Additionally, the operations are highly energy efficient leading to low power consumption.

Read more
Emerging Technologies

A Model of an Oscillatory Neural Network with Multilevel Neurons for Pattern Recognition and Computing

The current study uses a novel method of multilevel neurons and high order synchronization effects described by a family of special metrics, for pattern recognition in an oscillatory neural network (ONN). The output oscillator (neuron) of the network has multilevel variations in its synchronization value with the reference oscillator, and allows classification of an input pattern into a set of classes. The ONN model is implemented on thermally-coupled vanadium dioxide oscillators. The ONN is trained by the simulated annealing algorithm for selection of the network parameters. The results demonstrate that ONN is capable of classifying 512 visual patterns (as a cell array 3 * 3, distributed by symmetry into 102 classes) into a set of classes with a maximum number of elements up to fourteen. The classification capability of the network depends on the interior noise level and synchronization effectiveness parameter. The model allows for designing multilevel output cascades of neural networks with high net data throughput. The presented method can be applied in ONNs with various coupling mechanisms and oscillator topology.

Read more
Emerging Technologies

A Multilayer Neural Network Merging Image Preprocessing and Pattern Recognition by Integrating Diffusion and Drift Memristors

With the development of research on novel memristor model and device, neural networks by integrating various memristor models have become a hot research topic recently. However, state-of-the-art works still build such neural networks using drift memristor only. Furthermore, some other related works are only applied to a few individual applications including pattern recognition and edge detection. In this paper, a novel kind of multilayer neural network is proposed, in which diffusion and drift memristor models are applied to construct a system merging image preprocessing and pattern recognition. Specifically, the entire network consists of two diffusion memristive cellular layers for image preprocessing and one drift memristive feedforward layer for pattern recognition. Experimental results show that good recognition accuracy of noisy MNIST is obtained due to the fusion of image preprocessing and pattern recognition. Moreover, owing to high-efficiency in-memory computing and brief spiking encoding methods, high processing speed, high throughput, and few hardware resources of the entire network are achieved.

Read more
Emerging Technologies

A Noise Filter for Dynamic Vision Sensors using Self-adjusting Threshold

Neuromorphic event-based dynamic vision sensors (DVS) have much faster sampling rates and a higher dynamic range than frame-based imagers. However, they are sensitive to background activity (BA) events which are unwanted. we propose a new criterion with little computation overhead for defining real events and BA events by utilizing the global space and time information rather than the local information by Gaussian convolution, which can be also used as a filter. We denote the filter as GF. We demonstrate GF on three datasets, each recorded by a different DVS with different output size. The experimental results show that our filter produces the clearest frames compared with baseline filters and run fast.

Read more

Ready to get started?

Join us today