Network


Latest external collaboration at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Khadeer Ahmed is active.

Publication


Featured research published by Khadeer Ahmed.


international symposium on neural networks | 2014

Accelerating pattern matching in neuromorphic text recognition system using Intel Xeon Phi coprocessor

Khadeer Ahmed; Qinru Qiu; Parth Malani; Mangesh Tamhankar

Neuromorphic computing systems refer to computing architectures inspired by the working mechanism of the human brain. The rapidly falling cost and increasing performance of state-of-the-art computing hardware allow large-scale implementation of machine intelligence models with neuromorphic architectures and open the opportunity for new applications. One such hardware platform is the Intel Xeon Phi coprocessor, which delivers over a teraFLOP of computing power with 61 integrated processing cores. How to efficiently harness such computing power to achieve real-time decision making and cognition is one of the key design considerations. This paper presents an optimized implementation of the Brain-State-in-a-Box (BSB) neural network model on the Xeon Phi coprocessor for pattern matching in the context of intelligent text recognition of noisy document images. From a scalability standpoint on a High Performance Computing (HPC) platform, we show that efficient workload partitioning and resource management can double the performance of this many-core architecture for neuromorphic applications.
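
The BSB recall step at the heart of this pattern-matching workload is an iterative matrix-vector update with component-wise saturation. The sketch below is a minimal NumPy illustration of that recall loop and of matching a noisy input against several stored patterns; it is not the paper's optimized Xeon Phi implementation, and the feedback coefficients `alpha` and `lam`, the Hebbian storage of the weight matrices, and the dot-product scoring are illustrative assumptions.

```python
import numpy as np

def bsb_recall(A, x0, alpha=0.3, lam=1.0, max_iters=100):
    """Brain-State-in-a-Box recall: iterate x <- S(alpha*A*x + lam*x),
    where S saturates each component to [-1, 1]."""
    x = x0.copy()
    for _ in range(max_iters):
        x_next = np.clip(alpha * (A @ x) + lam * x, -1.0, 1.0)
        if np.allclose(x_next, x):   # converged to a stable corner of the "box"
            break
        x = x_next
    return x

# Toy matching: each candidate character has its own stored pattern/weight matrix;
# the noisy input is recalled against every candidate and scored by similarity.
rng = np.random.default_rng(0)
patterns = [np.sign(rng.standard_normal(16)) for _ in range(4)]
A_mats = [np.outer(p, p) / p.size for p in patterns]   # simple Hebbian storage (assumed)
noisy = patterns[2] + 0.4 * rng.standard_normal(16)
scores = [patterns[i] @ bsb_recall(A, noisy) for i, A in enumerate(A_mats)]
print("best match:", int(np.argmax(scores)))
```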


signal processing systems | 2016

A Neuromorphic Architecture for Context Aware Text Image Recognition

Qinru Qiu; Zhe Li; Khadeer Ahmed; Wei Liu; Syed Faisal Habib; Hai Helen Li; Miao Hu

Although existing optical character recognition (OCR) tools can achieve excellent performance in text image detection and pattern recognition, they usually require a clean input image. Most of them do not perform well when the image is partially occluded or smudged. Humans are able to tolerate much worse image quality during reading because perception errors can be corrected by knowledge of word- and sentence-level context. In this paper, we present a brain-inspired information processing framework for context-aware Intelligent Text Recognition (ITR) and its acceleration using a memristor-based crossbar array. The ITR system has a bottom layer of massively parallel Brain-State-in-a-Box (BSB) engines that give fuzzy pattern matching results and an upper layer of statistical-inference-based error correction. Optimizations on each layer of the framework are introduced to improve system performance. A parallel architecture is presented that incorporates the memristor crossbar array to accelerate the pattern matching. Compared to a traditional multicore microprocessor, the accelerator has the potential to provide tremendous area and power savings and speedups of more than 8,000 times.
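
The crossbar acceleration rests on the fact that a memristor array computes a matrix-vector product in the analog domain: weights become device conductances, input voltages drive the rows, and each column current sums the products. The following is a purely behavioral sketch of that idea under assumed conductance bounds and a differential (two-array) scheme for signed weights; it does not model the devices or parameters used in the paper.

```python
import numpy as np

def weights_to_conductance(W, g_min=1e-6, g_max=1e-4):
    """Map signed weights onto two crossbars of positive conductances
    (one array holds the positive parts, the other the negative parts)."""
    scale = (g_max - g_min) / np.abs(W).max()
    g_pos = g_min + scale * np.clip(W, 0, None)
    g_neg = g_min + scale * np.clip(-W, 0, None)
    return g_pos, g_neg, scale

def crossbar_mvm(g_pos, g_neg, v_in, scale):
    """Behavioral crossbar: each column current is the sum of V*G down the column;
    the differential readout recovers the signed dot product."""
    i_pos = v_in @ g_pos
    i_neg = v_in @ g_neg
    return (i_pos - i_neg) / scale

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 4))      # 8 inputs x 4 stored patterns
v = rng.standard_normal(8)           # input "voltage" vector
g_pos, g_neg, s = weights_to_conductance(W)
print(np.allclose(crossbar_mvm(g_pos, g_neg, v, s), v @ W))  # matches the digital MVM
```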


signal processing systems | 2014

Neuromorphic acceleration for context aware text image recognition

Qinru Qiu; Zhe Li; Khadeer Ahmed; Hai Helen Li; Miao Hu

Although existing optical character recognition (OCR) tools can achieve excellent performance in text image detection and pattern recognition, they usually require a clean input image. Most of them do not perform well when the image is partially occluded or smudged. Humans are able to tolerate much worse image quality during reading because perception errors can be corrected by knowledge of word- and sentence-level context. In this paper, we present a brain-inspired information processing framework for context-aware Intelligent Text Recognition (ITR) and its acceleration using a memristor-based crossbar array. The ITR system has a bottom layer of massively parallel Brain-State-in-a-Box (BSB) engines that give fuzzy pattern matching results and an upper layer of statistical-inference-based error correction. The framework works robustly in noisy environments. A parallel architecture is presented that incorporates the memristor crossbar array to accelerate the pattern matching. Compared to a traditional microprocessor, the accelerator has the potential to provide tremendous area and power savings and speedups of more than 8,000 times.


international symposium on neural networks | 2016

Simulation of Bayesian learning and inference on distributed stochastic spiking neural networks

Khadeer Ahmed; Amar Shrestha; Qinru Qiu

The ability of neural networks to perform pattern recognition, classification and associative memory is essential to applications such as image and speech recognition, natural language understanding and decision making. In spiking neural networks (SNNs), information is encoded as sparsely distributed trains of spikes, which allows learning through the spike-timing dependent plasticity (STDP) property. SNNs can potentially achieve very large scale implementation and distributed learning due to the inherent asynchronous and sparse inter-neuron communications. In this work, we develop an efficient, scalable and flexible SNN simulator that supports learning through STDP. The simulator targets biologically inspired neuron models used for computation rather than biologically realistic models. A Bayesian neuron model for SNNs that is capable of online and fully distributed STDP learning is introduced. The function of the simulator is validated using two networks representing two different applications: unsupervised feature extraction and inference-based sentence construction.
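
A common way to realize such a Bayesian neuron is to let it fire as a Poisson process whose rate grows exponentially with the membrane potential, and to update each weight on a postsynaptic spike depending on whether its input was active. The sketch below illustrates only that general idea; it is not the simulator described here, and the rate constant, time step, learning rate and the exact STDP form are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, dt = 32, 1e-3            # 32 input channels, 1 ms time step (assumed)
w = rng.normal(0.0, 0.1, n_in) # synaptic weights
r0, lr = 20.0, 0.01            # base firing rate (Hz) and learning rate (assumed)

def step(x_spikes, w):
    """One step of a stochastic (Bayesian-style) neuron: the firing probability
    grows exponentially with the membrane potential."""
    u = w @ x_spikes                          # membrane potential from input spikes
    p_fire = 1.0 - np.exp(-r0 * np.exp(u) * dt)
    return rng.random() < p_fire

def stdp(w, x_spikes, fired, lr=lr):
    """Simplified STDP: on a postsynaptic spike, potentiate synapses whose inputs
    were active and depress the silent ones; exp(w) keeps weights bounded."""
    if fired:
        w += lr * (x_spikes - np.exp(w))
    return w

for t in range(1000):                         # drive with random 50 Hz Poisson inputs
    x = (rng.random(n_in) < 50.0 * dt).astype(float)
    w = stdp(w, x, step(x, w))
```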


international symposium on neural networks | 2016

Probabilistic inference using stochastic spiking neural networks on a neurosynaptic processor

Khadeer Ahmed; Amar Shrestha; Qinru Qiu; Qing Wu

Spiking neural networks are rapidly gaining popularity for their ability to perform efficient computation akin to the way a brain processes information. They have the potential to achieve low cost and high energy efficiency due to the distributed nature of neural computation and the use of low-energy spikes for information exchange. A stochastic spiking neural network can naturally be used to realize Bayesian inference. IBM's TrueNorth is a neurosynaptic processor that has more than 1 million digital spiking neurons and 268 million digital synapses with less than 200 mW peak power. In this paper, we present the first work that converts an inference network to a spiking neural network that runs on the TrueNorth processor. Using inference-based sentence construction as a case study, we discuss algorithms that transform an inference network to a spiking neural network, and a spiking neural network to TrueNorth corelet designs. In our experiments, the sentences constructed by the TrueNorth spiking neural network have a matching accuracy of 88% while consuming an average power of 0.205 mW.
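
One simple way to run such an inference step on a rate-coded spiking substrate is to map each candidate's probability to a firing rate and pick the winner by counting spikes over a fixed window. The sketch below shows that mapping only; it is not TrueNorth corelet code, and the window length and maximum rate are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def infer_by_spike_count(log_probs, window_ms=200, max_rate=200.0):
    """Map each candidate's probability to a Poisson firing rate and select the
    candidate that emits the most spikes within the observation window."""
    probs = np.exp(log_probs - np.max(log_probs))
    probs /= probs.sum()
    rates = max_rate * probs                    # spikes/s per candidate
    dt = 1e-3
    counts = np.zeros_like(rates)
    for _ in range(int(window_ms)):
        counts += rng.random(rates.shape) < rates * dt
    return int(np.argmax(counts))

# Toy "sentence construction" step: choose the next word among 4 candidates.
log_p_next_word = np.log(np.array([0.05, 0.60, 0.25, 0.10]))
print(infer_by_spike_count(log_p_next_word))    # most often prints 1
```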


international symposium on neural networks | 2017

Stable spike-timing dependent plasticity rule for multilayer unsupervised and supervised learning

Amar Shrestha; Khadeer Ahmed; Yanzhi Wang; Qinru Qiu

Spike-Timing Dependent Plasticity (STDP), the canonical learning rule for spiking neural networks (SNNs), is gaining tremendous interest because of its simplicity, efficiency and biological plausibility. However, to date, multilayer feed-forward networks of spiking neurons have either been only partially trained using STDP, pre-trained using traditional deep neural networks and then converted to deep spiking neural networks, or limited to two-layer networks in which STDP-learnt features are manually labelled. In this work, we present a low-cost, simplified, yet stable STDP rule for layer-wise unsupervised and supervised training of a multilayer feed-forward SNN. We propose to approximate the Bayesian neuron using a Stochastic Integrate-and-Fire (SIF) neuron model and introduce a supervised learning approach using teacher neurons to train the classification layer with one neuron per class. An SNN with multiple layers of spiking neurons, including both the feature extraction and classification layers, is trained for handwritten digit classification using the proposed STDP rule. Our method achieves comparable or better accuracy on the MNIST dataset than manually labelled two-layer networks with the same hidden layer size. We also analyze the parameter space to provide rationales for parameter fine-tuning and present additional methods to improve resilience to noise and input intensity variations. We further propose a Quantized 2-Power Shift (Q2PS) STDP rule, which reduces the implementation cost in digital hardware while achieving comparable performance.
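
The Q2PS idea is to constrain each weight update to a power of two so that digital hardware can apply it with a bit shift rather than a multiply. Below is one plausible, hedged reading of that quantization layered on a much-simplified STDP step with an optional teacher signal; the exact update rule, bounds and teacher mechanism in the paper may differ.

```python
import numpy as np

def quantize_pow2(delta, min_exp=-8):
    """Round each update magnitude to the nearest power of two (>= 2**min_exp),
    so hardware can apply it with a shift; zero updates stay zero."""
    mag = np.abs(delta)
    exp = np.where(mag > 0, np.round(np.log2(np.maximum(mag, 2.0**min_exp))), min_exp)
    exp = np.clip(exp, min_exp, 0)
    return np.sign(delta) * 2.0**exp * (mag > 0)

def stdp_update(w, pre_active, post_fired, teacher=None, lr=0.01):
    """Simplified layer-wise STDP step. If a teacher signal is given (supervised
    classification layer), it overrides the postsynaptic firing decision."""
    fired = teacher if teacher is not None else post_fired
    if not fired:
        return w
    delta = lr * (pre_active.astype(float) - 0.5)   # potentiate active, depress silent
    return np.clip(w + quantize_pow2(delta), -1.0, 1.0)

rng = np.random.default_rng(4)
w = rng.normal(0, 0.1, 16)
pre = rng.random(16) < 0.3
w = stdp_update(w, pre, post_fired=True)                  # unsupervised step
w = stdp_update(w, pre, post_fired=False, teacher=True)   # teacher-forced step
```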


ieee computer society annual symposium on vlsi | 2016

System Design for In-Hardware STDP Learning and Spiking Based Probabilistic Inference

Khadeer Ahmed; Amar Shrestha; Yanzhi Wang; Qinru Qiu

The emerging field of neuromorphic computing offers a possible pathway for approaching the brain's computing performance and energy efficiency for cognitive applications such as pattern recognition, speech understanding and natural language processing. In spiking neural networks (SNNs), information is encoded as sparsely distributed spike trains, enabling learning through the spike-timing dependent plasticity (STDP) mechanism. SNNs can potentially achieve ultra-low power consumption and distributed learning due to the inherent asynchronous and sparse inter-neuron communications. Several inroads have been made in SNN implementations; however, there is still a lack of computational models that lead to hardware implementations of large-scale SNNs with STDP capabilities. In this work, we present a set of neuron models and neuron circuit motifs that form SNNs capable of in-hardware, fully distributed STDP learning and spiking-based probabilistic inference. Functions such as efficient Bayesian inference and unsupervised Hebbian learning are demonstrated on the proposed SNN system design. A highly scalable and flexible digital hardware implementation of the neuron model is also presented. Experimental results on two different applications, unsupervised feature extraction and inference-based sentence construction, demonstrate the proposed design's effectiveness in learning and inference.
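
A digital realization of such a stochastic neuron can be sketched as an integer membrane potential that accumulates weighted input spikes, leaks, and fires by comparison against a pseudo-random threshold each tick. The behavioral model below is only an illustration of that motif; the bit widths, decay and PRNG choices are invented and do not reflect the paper's hardware design.

```python
import numpy as np

class DigitalStochasticNeuron:
    """Behavioral model of a digital stochastic integrate-and-fire neuron:
    integer membrane potential, linear leak, and a fire decision made by
    comparing the potential against a pseudo-random threshold each tick."""
    def __init__(self, n_in, seed=0, decay=1, bits=12):
        self.rng = np.random.default_rng(seed)
        self.w = self.rng.integers(0, 64, n_in)     # unsigned fixed-point weights (assumed)
        self.v = 0
        self.decay = decay
        self.v_max = 2**bits - 1

    def tick(self, in_spikes):
        self.v += int(self.w @ in_spikes)           # accumulate weighted input spikes
        self.v = max(self.v - self.decay, 0)        # leak
        fire = self.rng.integers(0, self.v_max) < self.v   # stochastic threshold
        if fire:
            self.v = 0                              # reset on spike
        return int(fire)

neuron = DigitalStochasticNeuron(n_in=8, seed=5)
spikes = [neuron.tick((np.random.default_rng(t).random(8) < 0.5).astype(int))
          for t in range(100)]
print(sum(spikes), "spikes in 100 ticks")
```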


asilomar conference on signals, systems and computers | 2011

A novel approach for simulation, measurement and representation of surface EMG (sEMG) signals

Anvith Katte Mahabalagiri; Khadeer Ahmed; Fred H. Schlereth

In this paper, we describe new methods for the simulation, measurement and representation of sEMG signals. With regard to simulation, we choose a 2-D state-space model and suggest a 3-D model which can account for inhomogeneities, nonlinearities and memory in the medium and which can be hardware-accelerated through an FPGA. With regard to measurement, we use surface electrodes with a new amplifier circuit topology, which mitigates the effects of pickup and artifacts. With regard to representation, we describe a method for using wavelets which shows promise for isolating signals of interest.
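
As a rough illustration of the wavelet-based representation, the sketch below decomposes a synthetic burst-like signal with PyWavelets and soft-thresholds the coefficients to isolate the bursts; the wavelet family ('db4'), decomposition level and threshold are assumptions, since the abstract does not specify them.

```python
import numpy as np
import pywt   # PyWavelets; the paper does not name a library, this is just for illustration

# Synthetic stand-in for an sEMG recording: band-limited bursts plus noise.
fs = 1000                                   # sample rate in Hz (assumed)
t = np.arange(0, 2.0, 1.0 / fs)
burst = np.sin(2 * np.pi * 80 * t) * (np.sin(2 * np.pi * 1.5 * t) > 0.7)
semg = burst + 0.2 * np.random.default_rng(6).standard_normal(t.size)

# Multilevel wavelet decomposition, then keep only large coefficients to
# isolate the signal of interest (wavelet family and threshold are assumptions).
coeffs = pywt.wavedec(semg, 'db4', level=5)
thresh = 0.5 * np.std(coeffs[-1])
denoised = pywt.waverec([pywt.threshold(c, thresh, mode='soft') for c in coeffs], 'db4')
print(denoised.shape, semg.shape)
```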


ieee high performance extreme computing conference | 2016

Distributed and configurable architecture for neuromorphic applications on heterogeneous cluster

Khadeer Ahmed; Qinru Qiu; Mangesh Tamhankar

With the proliferation of application-specific accelerators, the use of heterogeneous clusters is rapidly increasing. Consisting of processors with different architectures, a heterogeneous cluster aims at providing different performance and cost tradeoffs for different types of workloads. In order to achieve peak performance, software running on a heterogeneous cluster needs to be designed carefully to provide enough flexibility to exploit this variety. We propose a design methodology to modularize complex software applications with data dependencies. Software applications designed in this way have the flexibility to be reconfigured for different hardware platforms to facilitate resource management, and feature high scalability and parallelism. Using a neuromorphic application as a case study, we present the concept of modularization and discuss the management, scheduling and communication of the modules. We also present experimental results demonstrating the improvements and the effects of system scaling on throughput.
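
The modularization pattern can be illustrated with pipeline stages connected by queues, where each stage could in principle be mapped to a different processor by configuration. The sketch below is a toy, thread-based version of that pattern, not the framework described in the paper; the stage names and work functions are invented.

```python
import threading, queue

def module(work_fn, q_in, q_out):
    """Generic module: pull an item from the input queue, process it, and push
    the result downstream. A None item is the shutdown signal and is forwarded."""
    while True:
        item = q_in.get()
        if item is None:
            if q_out is not None:
                q_out.put(None)
            break
        if q_out is not None:
            q_out.put(work_fn(item))

# Hypothetical two-stage pipeline (e.g. preprocess -> pattern match); each stage
# could be mapped to a different node or accelerator by a configuration.
q_raw, q_clean, q_result = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=module, args=(lambda s: s.strip(), q_raw, q_clean)),
    threading.Thread(target=module, args=(lambda s: s.upper(), q_clean, q_result)),
]
for s in stages:
    s.start()
for line in [" noisy line one ", " noisy line two ", None]:   # None terminates the pipeline
    q_raw.put(line)
for s in stages:
    s.join()
while True:
    result = q_result.get()
    if result is None:
        break
    print(result)
```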


international conference on computer aided design | 2017

A spike-based long short-term memory on a neurosynaptic processor

Amar Shrestha; Khadeer Ahmed; Yanzhi Wang; David P. Widemann; Adam Moody; Brian Van Essen; Qinru Qiu

Collaboration


Dive into Khadeer Ahmed's collaboration.

Top Co-Authors

Adam Moody, Lawrence Livermore National Laboratory
Brian Van Essen, Lawrence Livermore National Laboratory
David P. Widemann, Lawrence Livermore National Laboratory
Hai Helen Li, University of Pittsburgh
Zhe Li, Syracuse University