Nader Rafla | Researchain

Publication

Featured research published by Nader Rafla.

international midwest symposium on circuits and systems | 2006

A Study of Finite State Machine Coding Styles for Implementation in FPGAs

Nader Rafla; Brett LaVoy Davis

Finite State Machines (FSMs) are one of the more complex structures found in almost all digital systems today. Hardware Description Languages are used for high-level digital system design. VHDL (VHSIC Hardware Description Language) allows FSMs to be written in several different coding styles. A coding style must therefore be chosen to achieve specific performance goals and to minimize resource utilization for implementation in a reconfigurable computing environment such as an FPGA. This paper is a study of the tradeoffs that can be made by changing coding styles. A comparative study of three different FSM coding styles is presented to show their impact on performance and resource utilization for the most commonly used encoding methods for FPGA designs. The results show that one particular coding style saves resources and significantly improves performance over the others, while the others show consistent performance regardless of the resource utilization outcome.
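The abstract does not name the encoding methods it compares. As a rough illustration only, the sketch below assumes three encodings that are widely used for FPGA state machines (binary, Gray, and one-hot) and shows how each assigns codes to the states. The function names are hypothetical and are not taken from the paper.

```python
def binary_codes(n_states):
    """Sequential binary codes: uses the fewest flip-flops."""
    width = max(1, (n_states - 1).bit_length())
    return [format(i, f"0{width}b") for i in range(n_states)]


def gray_codes(n_states):
    """Gray codes: consecutive states differ in exactly one bit."""
    width = max(1, (n_states - 1).bit_length())
    return [format(i ^ (i >> 1), f"0{width}b") for i in range(n_states)]


def one_hot_codes(n_states):
    """One-hot codes: one flip-flop per state, exactly one bit set."""
    return [format(1 << i, f"0{n_states}b") for i in range(n_states)]


if __name__ == "__main__":
    for name, fn in [("binary", binary_codes),
                     ("gray", gray_codes),
                     ("one-hot", one_hot_codes)]:
        codes = fn(4)
        print(f"{name:8s} flip-flops={len(codes[0])} codes={codes}")
```

For a four-state machine, the binary and Gray encodings need two flip-flops while one-hot needs four. One-hot typically trades extra registers for simpler next-state logic, which is the kind of resource versus performance tradeoff the study examines.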

workshop on microelectronics and electron devices | 2011

A non-volatile memory array based on nano-ionic Conductive Bridge Memristors

Steve Wald; R. Jacob Baker; Maria Mitkova; Nader Rafla

Much excitement has been generated over the potential uses of chalcogenide glasses and other materials in circuits as “memristors” or as non-volatile memories. The memristor is the fourth passive two-terminal electronic device, postulated by Leon Chua in 1971 and rediscovered in 2008. Our Conductive Bridge Memristor (CBM) changes its resistance in response to current passing through it by building up or dissolving a conductive molecular bridge in an otherwise insulating chalcogenide film. This paper outlines the design and simulation of a non-volatile memory using an array of CBM devices integrated with CMOS access transistors and read/write access circuitry. We have designed and simulated a large memory array layout using CBM devices accessed by an NMOS transistor and CMOS row/column read and write drivers. The design uses a folded-cascode op-amp configured to integrate current on the column as a strategy for sensing the device resistance. Each CBM device is connected to the array through a single minimum-size NMOS transistor. The design has been simulated using a SPICE model for the PMC (Programmable Metallization Cell) [7]. We demonstrate the feasibility of accessing the device for read without exceeding the write threshold, and discuss the tradeoff of speed vs. array size associated with this technique. Plans are being developed to fabricate the design on a MOSIS multi-project wafer with BEOL processing for the CBM devices.

midwest symposium on circuits and systems | 2008

Reducing power consumption in FPGAs by pipelining

Steve Bard; Nader Rafla

Reducing the logic levels in digital hardware designs can dramatically reduce power consumption of field-programmable gate arrays (FPGAs). In this study, logic levels were varied by applying different degrees of pipelining to five types of circuits: a parity circuit, two multipliers, an adder-based design, a sine-cosine generator, and an encryption circuit. Power was measured to the core logic of a 90-nm FPGA for each design. Results show that reducing the logic levels in a parity circuit can cut dynamic switching power by nearly a third, with no area expense. They also indicate that introducing pipeline registers can cut power by 44 percent to 83 percent in the other designs. In most cases, the reduction can be achieved with little or no area expense. In other cases, a noteworthy area tradeoff is required. The reduction can be attributed to the pipeline registers' ability to curb the number of useless signal transitions, or glitches. Reducing logic levels can reduce glitches by orders of magnitude, according to the results. The power-reduction techniques could be applied to many digital logic circuits and would be especially effective in compute-intensive designs.

international midwest symposium on circuits and systems | 2011

Hardware implementation of context switching for hard real-time operating systems

Nader Rafla; Deepak Kumar Gauba

Nowadays more and more embedded real-time applications use multithreading. The benefits of multithreading include better throughput, improved responsiveness, and ease of development. However, these benefits come with costs and pitfalls which are unacceptable for a typical hard real-time system. These costs are mainly caused by scheduling and context switching between threads. While different scheduling algorithms have been developed to improve the overall system performance, context switching still consumes a lot of processor resources and presents a major overhead, especially for hard real-time applications. In this paper, we propose a new approach to improve the overall performance of embedded systems that use multithreading by moving the context switching component of the Real-Time Operating System (RTOS) to the processor hardware itself. This technique saves the processor clock cycles otherwise spent on context switching, which are a very important resource for hard real-time embedded systems.

international midwest symposium on circuits and systems | 2010

A reconfigurable pattern matching hardware implementation using on-chip RAM-based FSM

Nader Rafla; Indrawati Gauba

The use of synthesizable reconfigurable IP cores has increasingly become a trend in System on Chip (SoC) designs because of their flexibility and powerful functionality. The market introduction of multi-featured platform FPGAs equipped with embedded memory and processor blocks has further expanded the possibility of utilizing dynamic reconfiguration to improve overall system adaptability to meet varying product requirements. In this paper, a reconfigurable hardware implementation for pattern matching using a Finite State Machine (FSM) is proposed. The FSM design is RAM-based and is reconfigured on the fly by altering memory contents only. An embedded processor is used for orchestrating run-time reconfiguration. Experimental results show that the system can reconfigure itself based on a new incoming pattern and perform the text search without the need for a host processor. Results also show that each search iteration is executed in one clock cycle and that the maximum achievable clock frequency is independent of search pattern length.
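The idea of a table-driven matcher that is reconfigured by rewriting memory can be sketched in software. The example below is a minimal illustration, not the paper's design: it assumes a Knuth-Morris-Pratt style automaton as the way to derive the table, and the names `build_table` and `search` are hypothetical. The transition table plays the role of the RAM contents, and each input character costs one table lookup, mirroring the one-iteration-per-clock behavior described above.

```python
def build_table(pattern):
    """Build the transition table ("RAM contents") for a pattern.

    table[state][symbol] gives the next state. Symbols that do not occur
    in the pattern are not stored; they always lead back to state 0.
    """
    alphabet = sorted(set(pattern))
    m = len(pattern)
    table = [{c: 0 for c in alphabet} for _ in range(m + 1)]
    if m == 0:
        return table
    table[0][pattern[0]] = 1
    fallback = 0  # state reached by the longest proper border so far
    for state in range(1, m + 1):
        # On a mismatch, behave like the fallback state would.
        for c in alphabet:
            table[state][c] = table[fallback][c]
        if state < m:
            # On a match, advance to the next state.
            table[state][pattern[state]] = state + 1
            fallback = table[fallback][pattern[state]]
    return table


def search(table, text):
    """Scan text with one table lookup per character; return match starts."""
    accept = len(table) - 1
    state = 0
    hits = []
    for i, ch in enumerate(text):
        state = table[state].get(ch, 0)
        if state == accept:
            hits.append(i - accept + 1)
    return hits


if __name__ == "__main__":
    table = build_table("aba")
    print(search(table, "ababa"))
    # "Reconfigure" for a new pattern by replacing the table contents only.
    table = build_table("bb")
    print(search(table, "abbb"))
```

The scanning loop never changes when the pattern changes; only the table does. Its per-character cost is also a single lookup regardless of pattern length, which is consistent with the abstract's observation that the clock frequency does not depend on the length of the search pattern.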

midwest symposium on circuits and systems | 2007

Evolvable Reconfigurable Hardware framework for edge detection

Nader Rafla

Systems on Reconfigurable Chips contain rich resources of logic, memory, and processor cores on the same fabric. This platform is suitable for implementation of Evolvable Reconfigurable Hardware Architectures (ERHA). ERHA is based on the idea of combining reconfigurable Field Programmable Gate Arrays (FPGAs) with genetic algorithms (GAs) that perform the reconfiguration operation. This architecture is a suitable candidate for implementing early-stage image processing operators such as filtering and edge detection. However, there are still fundamental issues that need to be solved regarding the on-chip reprogramming of the logic. This paper presents a framework for implementing an evolvable hardware architecture for edge detection on a Xilinx Virtex-4 chip. Some preliminary results are discussed.

international midwest symposium on circuits and systems | 2013

An automated embedded computer vision system for object measurement

Nicholas Pauly; Nader Rafla

Automation is common in many industries. Some tasks, such as object measurement, can easily be automated. The equipment required to perform automated measurements may be cost prohibitive and may prevent widespread deployment in automated point of sale kiosks or product inspection applications. This paper focuses on the development of an optical measurement device that is inexpensive relative to existing systems. The device consists of an image sensor, several laser line projectors, and a microcontroller. The hardware design and configuration of the device is discussed along with the algorithms used to process the captured images and perform the measurement tasks. The conclusion includes a discussion of the performance of the optical measurement device.

international midwest symposium on circuits and systems | 2009

On-chip intrinsic evolution methodology for sequential logic circuit design

Fan Xiong; Nader Rafla

This paper focuses on the application of Virtual Reconfigurable Circuit (VRC) design methodology and intrinsic evolution for the design of small sequential circuits and their implementation on a single programmable chip with an embedded hardcore processor. The evolutionary algorithm is developed in software that runs on the embedded processor. The fitness function is calculated in hardware and is used to guide the evolution process. This new method is applied to the development of a 3-bit sequence detector, and the evolved architecture is implemented on a Xilinx Virtex-II Pro device. Simulations were run on the evolved architecture and on the same circuit designed using a conventional Hardware Description Language (HDL). Both designs showed the same functional behavior. Synthesis results show that the new method can be used to successfully implement small sequential circuits in a reconfigurable hardware environment.

international symposium on signal processing and information technology | 2005

Real-time 3D image visualization system for digital video on a single chip

Nader Rafla

Implementation of a real-time image visualization system on a reconfigurable chip (FPGA) is proposed. The system utilizes an innovative stereoscopic image capture, processing, and visualization technique. Implementation is done as a two-stage process. In the first stage, the stereo pair is captured using two image sensors. The captured images are then synchronized and sent to the second stage for fusion. A controller module is developed, designed, and placed on the FPGA for this purpose. The second stage is used for reconstruction and visualization of the 3D image. An innovative technique employing a dual-processor architecture on the same single FPGA is developed for this purpose. The whole system is placed on a single PCB, resulting in a fast processing time and the ability to view 3D video in real time. The system is simulated, implemented, and tested on real images. Results show that this system is a low-cost solution for efficient 3D video visualization using a single chip.

field programmable gate arrays | 2018

HexCell: a Hexagonal Cell for Evolvable Systolic Arrays on FPGAs: (Abstract Only)

Fady Hussein; Luka Daoud; Nader Rafla

This paper presents a novel cell architecture for evolvable systolic arrays. HexCell is a tile-able processing element with a hexagonal shape that can be implemented and dynamically reconfigured on field-programmable gate arrays (FPGAs). The cell contains a functional unit, three input ports, and three output ports. It supports two concurrent configuration schemes: dynamic partial reconfiguration (DPR), where the functional unit is partially reconfigured at run time, and virtual reconfiguration circuit (VRC), where the cell output port bypasses one of the input data or selects the functional unit output. Hence, HexCell combines the merits of DPR and VRC, including resource-awareness, reconfiguration speed, and routing flexibility. In addition, the cell structure supports pipelining and data synchronization for achieving high throughput for data-intensive applications like image processing. A HexCell is represented by a binary string (chromosome) that encodes the cell's function and the output selections. Our developed evolvable HexCell array supports more inputs and outputs, offers a variety of possible datapaths, and reconfigures faster than the state-of-the-art systolic array while maintaining the same resource utilization. Moreover, by using the same genetic algorithm on the two systolic arrays, results show that the HexCell array has higher throughput and can evolve faster than the state-of-the-art array.
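To make the chromosome idea concrete, the sketch below decodes a binary string into a function choice and three output selections. The abstract does not give the actual bit layout or function set, so everything specific here is an assumption made for illustration: an 8-bit chromosome, a 2-bit function field, a 2-bit select per output port (values 0 to 2 bypass the corresponding input, value 3 selects the functional unit), and the four placeholder functions in `FUNCTIONS`.

```python
# Hypothetical function set; the real HexCell functions are not given
# in the abstract.
FUNCTIONS = [
    lambda a, b, c: max(a, b, c),
    lambda a, b, c: min(a, b, c),
    lambda a, b, c: (a + b + c) // 3,
    lambda a, b, c: abs(a - c),
]


def decode(chromosome):
    """Split an assumed 8-bit chromosome into (function id, output selects)."""
    if len(chromosome) != 8 or set(chromosome) - {"0", "1"}:
        raise ValueError("expected a string of 8 binary digits")
    func_id = int(chromosome[0:2], 2)
    selects = [int(chromosome[2 + 2 * i:4 + 2 * i], 2) for i in range(3)]
    return func_id, selects


def evaluate(chromosome, inputs):
    """Compute the three cell outputs for a tuple of three inputs."""
    func_id, selects = decode(chromosome)
    fu_out = FUNCTIONS[func_id](*inputs)
    # Select values 0..2 bypass an input; 3 picks the functional unit output.
    candidates = list(inputs) + [fu_out]
    return [candidates[s] for s in selects]


if __name__ == "__main__":
    # Function 0 (max); outputs select: functional unit, input 0, input 2.
    print(evaluate("00110010", (5, 9, 2)))
```

In this toy layout, changing the first two bits corresponds loosely to swapping the functional unit (the DPR side), while changing the select bits corresponds to rerouting outputs (the VRC side). A genetic algorithm would mutate and recombine such strings across all cells in the array.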
