Semih Aslan
Texas State University
Publication
Featured research published by Semih Aslan.
international midwest symposium on circuits and systems | 2012
Semih Aslan; Sufeng Niu; Jafar Saniie
In this paper, an improved fixed-point hardware design of QR decomposition, specifically optimized for Xilinx FPGAs, is introduced. A Givens Rotation algorithm is implemented by using a folded systolic array and the CORDIC algorithm, making the design well suited to high-speed FPGA or ASIC implementations. We improve the internal cell structure so that the system can run at 246 MHz with a throughput of nearly 24M updates per second on a Virtex-5 FPGA. The matrix size can be easily scaled up.
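As a rough illustration of how CORDIC can drive a Givens rotation, the sketch below triangularizes a small matrix using only shift-and-add style micro-rotations. It is a sequential floating-point model written for this page, not the paper's folded systolic array or its fixed-point cell; the names `cordic_rotate_rows`, `givens_triangularize`, and `ITERATIONS` are hypothetical.

```python
import math

ITERATIONS = 24  # assumed iteration count; each step adds about one bit of accuracy

# The micro-rotations stretch vectors by a constant gain, removed at the end.
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(ITERATIONS))


def cordic_rotate_rows(row_a, row_b, col):
    """Rotate two rows together so that row_b[col] is driven to (nearly) zero."""
    if row_a[col] < 0:
        # CORDIC vectoring converges only for x >= 0, so pre-rotate by 180 degrees.
        row_a = [-v for v in row_a]
        row_b = [-v for v in row_b]
    xs, ys = list(row_a), list(row_b)
    for i in range(ITERATIONS):
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        d = -1.0 if ys[col] > 0 else 1.0  # direction that pushes ys[col] toward zero
        # The same micro-rotation is applied to every column pair of the two rows.
        xs, ys = (
            [x - d * y * shift for x, y in zip(xs, ys)],
            [y + d * x * shift for x, y in zip(xs, ys)],
        )
    return [x / GAIN for x in xs], [y / GAIN for y in ys]


def givens_triangularize(matrix):
    """Return the upper-triangular factor R of a QR decomposition (Q is not formed)."""
    r = [[float(v) for v in row] for row in matrix]
    for col in range(len(r[0])):
        for row in range(col + 1, len(r)):
            r[col], r[row] = cordic_rotate_rows(r[col], r[row], col)
    return r


R = givens_triangularize([[3, 1], [4, 2]])
```

In hardware the multiplications by `shift` become wire shifts and the division by `GAIN` becomes a single constant scaling, which is why the approach maps well to FPGA fabric.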
electro information technology | 2009
Semih Aslan; Erdal Oruklu; Jafar Saniie
The QR factorization is used in many signal processing and communication applications such as echo cancellation, adaptive beamforming and multiple-input multiple-output (MIMO) systems. However, the division, square root and inverse square root operations required by the QR algorithm are very difficult to implement because they are computationally slow and area-consuming arithmetic operations. This paper presents a unified hardware architecture for fast, area-efficient QR factorization based on the Householder transformation. Newton-Raphson and Goldschmidt algorithms are used for the fast division, square root and inverse square root blocks. By using a unified architecture, area and power requirements for QR factorization are reduced without decreasing overall speed. The design and implementation of the proposed hardware are presented with synthesis results based on FPGA hardware.
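The iterations named above can be sketched in a few lines. The example below is a floating-point model under my own assumptions (normalization with `math.frexp`, a linear initial guess, fixed step counts); it does not reproduce the paper's unified datapath, and all function names are hypothetical.

```python
import math


def newton_reciprocal(d, steps=4):
    """Approximate 1/d for d > 0 with the Newton-Raphson step x <- x * (2 - m * x)."""
    mantissa, exponent = math.frexp(d)  # d = mantissa * 2**exponent, mantissa in [0.5, 1)
    x = 48.0 / 17.0 - 32.0 / 17.0 * mantissa  # common linear initial guess on [0.5, 1)
    for _ in range(steps):
        x = x * (2.0 - mantissa * x)  # error is squared at every step
    return math.ldexp(x, -exponent)


def goldschmidt_divide(n, d, steps=6):
    """Approximate n/d for d > 0 by scaling numerator and denominator until den -> 1."""
    mantissa, exponent = math.frexp(d)
    num, den = n, mantissa
    for _ in range(steps):
        f = 2.0 - den
        num *= f  # the two multiplications are independent, so hardware can pipeline them
        den *= f
    return math.ldexp(num, -exponent)


def newton_inv_sqrt(a, steps=8):
    """Approximate 1/sqrt(a) for a > 0 with the step y <- y * (1.5 - 0.5 * m * y * y)."""
    mantissa, exponent = math.frexp(a)
    if exponent % 2:  # make the exponent even so it can be halved exactly
        mantissa *= 2.0
        exponent -= 1
    y = 1.0  # crude initial guess; hardware would typically use a small lookup table
    for _ in range(steps):
        y = y * (1.5 - 0.5 * mantissa * y * y)
    return math.ldexp(y, -exponent // 2)


def newton_sqrt(a):
    """sqrt(a) = a * (1/sqrt(a)), reusing the inverse square root block."""
    return a * newton_inv_sqrt(a)
```

Sharing one multiplier-based core across all three operations is the kind of reuse a unified architecture can exploit, since none of the loops needs a dedicated divider.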
instrumentation and measurement technology conference | 2013
Sufeng Niu; Semih Aslan; Jafar Saniie
In this paper, we present a high-performance adaptive FIR filter hardware architecture. In particular, the RLS (Recursive Least Square) algorithm for adaptive signal processing is explored based on QR decomposition, which is accomplished by using the Givens Rotation algorithm. The Givens Rotation algorithm is implemented using a systolic array and a LUT-based Newton's method. This architecture is suitable for high-speed FPGA or ASIC designs. It also addresses the tradeoff between throughput and latency. As a case study, this QR design is tested using a Xilinx XC5VLX110T FPGA. The findings show that the system is capable of running the QR decomposition at up to 200 MHz with a latency of 56 clock cycles.
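To make the QR-based RLS update concrete, here is a minimal software sketch that identifies a two-tap FIR filter. It assumes a forgetting factor of 0.99, a small diagonal initialization `delta`, and exact `math.hypot` in place of the LUT-based Newton's method; the class `QrdRls` and the test signal are illustrative only and are not taken from the paper.

```python
import math
import random


def givens(a, b):
    """Return (c, s) such that c*a + s*b = hypot(a, b) and -s*a + c*b = 0."""
    r = math.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)


class QrdRls:
    def __init__(self, taps, forgetting=0.99, delta=1e-3):
        self.n = taps
        self.root_lambda = math.sqrt(forgetting)
        # Upper-triangular factor R, started as a small multiple of the identity.
        self.r = [[delta if i == j else 0.0 for j in range(taps)] for i in range(taps)]
        self.p = [0.0] * taps  # rotated copy of the desired signal

    def update(self, x, d):
        """Fold one regressor x (latest `taps` input samples) and desired sample d into R, p."""
        x = list(x)
        for i in range(self.n):
            # Age the stored row, then rotate the new data row into it.
            self.r[i] = [self.root_lambda * v for v in self.r[i]]
            self.p[i] *= self.root_lambda
            c, s = givens(self.r[i][i], x[i])
            for j in range(i, self.n):
                rij, xj = self.r[i][j], x[j]
                self.r[i][j] = c * rij + s * xj
                x[j] = -s * rij + c * xj
            self.p[i], d = c * self.p[i] + s * d, -s * self.p[i] + c * d

    def weights(self):
        """Solve R w = p by back-substitution."""
        w = [0.0] * self.n
        for i in reversed(range(self.n)):
            acc = self.p[i] - sum(self.r[i][j] * w[j] for j in range(i + 1, self.n))
            w[i] = acc / self.r[i][i]
        return w


# Usage: recover a known (made-up) 2-tap filter from noise-free data.
rng = random.Random(0)
true_w = [0.5, -0.25]
rls = QrdRls(taps=2)
prev = 0.0
for _ in range(200):
    cur = rng.uniform(-1.0, 1.0)
    regressor = [cur, prev]
    rls.update(regressor, sum(w * v for w, v in zip(true_w, regressor)))
    prev = cur
estimated = rls.weights()
```

Each pass of the inner loop corresponds to one row of rotation cells, which is what makes the update a natural fit for a systolic array.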
electro information technology | 2010
Christophe Desmouliers; Semih Aslan; Erdal Oruklu; Jafar Saniie; F. Martinez Vallina
The objective of this work is to design and implement an Image and Video Processing Platform (IVPP) on FPGAs using PICO-based HLS. This hardware/software co-design platform has been implemented on a Xilinx Virtex-5 FPGA. The video interface blocks are done in RTL and the initialization phase is done using a MicroBlaze processor, allowing the support of multiple video resolutions. This paper discusses the architectural building blocks, showing the flexibility of the proposed platform. This flexibility is achieved by using a new design flow based on PICO. IVPP allows custom processing blocks to be plugged into the platform architecture without modifying the front-end (capturing video data) and back-end (displaying processed output). This paper presents several examples of video processing applications, such as a Canny edge detector, motion detector and object tracking, that have been realized using IVPP for real-time video processing.
IET Computers and Digital Techniques | 2012
Christophe Desmouliers; Erdal Oruklu; Semih Aslan; Jafar Saniie; Fernando Martinez Vallina
In this study, an image and video processing platform (IVPP) based on field programmable gate arrays (FPGAs) is presented. This hardware/software co-design platform has been implemented on a Xilinx Virtex-5 FPGA using high-level synthesis and can be used to realise and test complex algorithms for real-time image and video processing applications. The video interface blocks are done in register transfer languages and can be configured using the MicroBlaze processor, allowing the support of multiple video resolutions. The IVPP provides the required logic to easily plug in the generated processing blocks without modifying the front-end (capturing video data) and the back-end (displaying processed output data). The IVPP can be a complete hardware solution for a broad range of real-time image/video processing applications including video encoding/decoding, surveillance, detection and recognition.
international midwest symposium on circuits and systems | 2012
Semih Aslan; Erdal Oruklu; Jafar Saniie
A flexible and efficient fixed-to-floating-point conversion tool is presented for digital signal processing and communication systems. Fixed-point numbers are heavily used in digital systems because they require less hardware, verification time and design effort compared to floating-point number systems. However, floating-point numbers offer better precision. Some digital designs may use a hybrid number system wherein fixed- and floating-point numbers can be used together to improve accuracy. The proposed design tool converts fixed-point numbers to floating-point numbers, including the IEEE-754 floating-point standard format. This tool generates Verilog RTL code and its testbench, which can be implemented in FPGA and VLSI systems. The proposed design tool can increase productivity by reducing the design and verification time. The generated design has been implemented on Xilinx Virtex-5 FPGAs and compared to conventional fixed-to-floating conversion tools.
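The conversion itself reduces to sign extraction, leading-one detection, and a shift. The following Python behavioral model shows those steps for IEEE-754 single precision; it is a sketch written for this page rather than the Verilog the tool generates. It assumes the result falls in the normal exponent range and truncates any bits beyond the 24-bit significand; the function names are hypothetical.

```python
import struct


def fixed_to_float32_bits(raw, frac_bits):
    """Convert a signed fixed-point value raw / 2**frac_bits to IEEE-754 single bits."""
    if raw == 0:
        return 0
    sign = 1 if raw < 0 else 0
    magnitude = -raw if raw < 0 else raw
    msb = magnitude.bit_length() - 1  # leading-one detector
    exponent = msb - frac_bits + 127  # biased exponent (assumed to stay within 1..254)
    if msb > 23:
        mantissa = magnitude >> (msb - 23)  # truncates low-order bits, no rounding
    else:
        mantissa = magnitude << (23 - msb)
    mantissa &= 0x7FFFFF  # drop the hidden leading one
    return (sign << 31) | (exponent << 23) | mantissa


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a Python float, for checking results."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Usage: 0x0180 in a format with 8 fractional bits is 1.5.
value = bits_to_float(fixed_to_float32_bits(0x0180, 8))
```

A rounding stage (for example round-to-nearest-even) would be needed for inputs wider than 24 bits if exact agreement with a software cast is required.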
international ultrasonics symposium | 2009
Erdal Oruklu; Semih Aslan; Jafar Saniie
In this study, we have examined time-frequency (T/F) distributions (the Gabor transform, Wigner-Ville distribution, Choi-Williams distribution and Wavelet transform) for improved flaw detection performance in ultrasonic nondestructive testing applications. A new methodology is presented for each T/F distribution method, with the necessary steps to achieve maximum flaw echo visibility enhancement. This methodology describes i) mapping the ultrasonic signal to the T/F domain, ii) projection from the T/F representation back to the time domain, and iii) interpretation of the signal using order statistics. These steps include choosing the optimal time and frequency window sizes (based on the Heisenberg principle) and the appropriate post-processing detection method to minimize the effect of null observations. To demonstrate the validity of the methods, we discuss and draw an analogy between T/F distributions and the conventional Split-Spectrum Processing flaw detection method. The analytical and experimental studies verify the feasibility of the T/F techniques for NDE applications.
international midwest symposium on circuits and systems | 2012
Spenser Gilliland; Jafar Saniie; Semih Aslan
In this paper, we present a Reconfigurable Ultrasonic System-on-Chip Hardware (RUSH) platform for real-time signal analysis and image processing. The platform is designed to directly process the full range of ultrasound from 20 kHz to 20 MHz. The project aims to make it simple to effectively develop and implement algorithms in embedded software and reconfigurable hardware. This provides the user with an opportunity to explore the full design space including software only, hardware only, and hardware/software co-design. The RUSH platform provides high-speed access to a 12-bit ADC controlled by a Xilinx FPGA. Access to the ultrasound data and custom IP cores is available through a gigabit Ethernet connection managed by an embedded Linux-based operating system running on a MicroBlaze processor instantiated in the FPGA fabric.
international midwest symposium on circuits and systems | 2010
Semih Aslan; Christophe Desmouliers; Erdal Oruklu; Jafar Saniie
Matrix operations are required in many complex algorithms in digital, image and video processing applications. The conventional method is usually used to implement matrix multiplications for small matrices. However, with the development of VLSI technology and FPGAs, there is an increasing demand for high-speed, low-power and low-area matrix multiplication systems for large matrices. The design and verification process of an area-efficient and high-throughput matrix multiplication operator can be time consuming and complex. In this study, an efficient hardware design tool is developed to generate a matrix multiplication operator based on user input parameters using three different approaches (Conventional, Strassen, and Hybrid). The Strassen algorithm can be used to reduce the area of the matrix multiplication system; nevertheless, this method increases the memory requirements and decreases the accuracy of the results. The proposed Hybrid-Strassen matrix multiplication algorithm increases the precision of the operation while reducing the area of the overall system compared to a conventional approach. The generated design has been implemented on Xilinx Virtex-5 FPGAs but can be synthesized for VLSI implementations.
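For reference, the Strassen recursion replaces the eight block multiplications of the conventional method with seven, at the cost of extra additions and temporaries. The sketch below shows that recursion with a fallback to conventional multiplication below a `cutoff` size. The abstract does not define the paper's Hybrid scheme, so the cutoff here is only one plausible way to mix the two methods and should not be read as the authors' algorithm. Square matrices whose size is a power of two are assumed.

```python
def conventional(a, b):
    """Textbook matrix product using n**3 scalar multiplications."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m):
    """Return the four quadrants (top-left, top-right, bottom-left, bottom-right)."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b, cutoff=2):
    """Multiply square power-of-two matrices with seven recursive block products."""
    if len(a) <= cutoff:
        return conventional(a, b)
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    m1 = strassen(add(a11, a22), add(b11, b22), cutoff)
    m2 = strassen(add(a21, a22), b11, cutoff)
    m3 = strassen(a11, sub(b12, b22), cutoff)
    m4 = strassen(a22, sub(b21, b11), cutoff)
    m5 = strassen(add(a11, a12), b22, cutoff)
    m6 = strassen(sub(a21, a11), add(b11, b12), cutoff)
    m7 = strassen(sub(a12, a22), add(b21, b22), cutoff)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

The intermediate sums such as `add(a11, a22)` are the source of the extra storage and of the wider dynamic range that costs precision in fixed-point arithmetic, which matches the tradeoff described above.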
international midwest symposium on circuits and systems | 2013
Sufeng Niu; Sizhou Wang; Semih Aslan; Jafar Saniie
In this paper, an embedded hardware and software system design and implementation for the QR Decomposition Recursive Least Square (QRD-RLS) algorithm using Givens Rotation are presented. Furthermore, hardware and software design optimizations are introduced for the Givens Rotation-based method. The computation performance is compared between a hardware implementation running on a Xilinx Virtex-5 FPGA and a software design running on two different processors (an Intel i7 processor and an ARM embedded processor) for solving least square problems. The challenges for hardware optimization and the software algorithm are also presented.