Seyyed Mahdi Najmabadi

Publication

Featured research published by Seyyed Mahdi Najmabadi.

international symposium on parallel and distributed processing and applications | 2015

High throughput hardware architectures for asymmetric numeral systems entropy coding

Seyyed Mahdi Najmabadi; Zhe Wang; Yousef Baroud; Sven Simon

In this paper, two new hardware-based entropy coding architectures for asymmetric numeral systems are introduced, as entropy encoding is one of the major phases of a compression algorithm. The proposed architectures are based on tabled asymmetric numeral systems (tANS). tANS combines the speed advantage of table-based approaches (e.g. Huffman encoding) with the higher compression rate of arithmetic encoding. Both proposed architectures have been synthesized for a state-of-the-art FPGA, and the synthesis results show high encoding throughput. The architectures are capable of encoding one symbol per clock cycle. Their performance depends on the number of symbols in the alphabet and ranges from 146 to 290 mega symbols per second (Msps).
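
As an illustration of the table-based coding the abstract refers to, the following is a minimal software sketch of tANS encoding and decoding. It is not the hardware architecture from the paper: the function names, the symbol-spreading step, and the bit-stack representation are all assumptions chosen for brevity, and the symbol frequencies are assumed to be already normalized to sum to the table size.

```python
def build_tans_tables(freqs, table_bits):
    """Build tANS tables; freqs (symbol -> count) must sum to 2**table_bits."""
    size = 1 << table_bits
    assert sum(freqs.values()) == size
    # Spread symbols over the table; any odd step visits every slot once.
    # (Illustrative spreading rule, not taken from the paper.)
    step = ((size >> 1) + (size >> 3) + 3) | 1
    spread = [None] * size
    pos = 0
    for sym in sorted(freqs):
        for _ in range(freqs[sym]):
            spread[pos] = sym
            pos = (pos + step) % size
    next_sub = dict(freqs)  # per-symbol sub-state, runs over [f, 2f)
    decode_table, encode_table = [], {}
    for slot, sym in enumerate(spread):
        sub = next_sub[sym]
        next_sub[sym] += 1
        decode_table.append((sym, sub))
        encode_table[(sym, sub)] = size + slot
    return encode_table, decode_table


def tans_encode(symbols, freqs, encode_table, table_bits):
    """Encode symbols in reverse order; returns (final_state, bit_stack)."""
    state = 1 << table_bits
    bits = []
    for sym in reversed(symbols):
        # Shift out low bits until the state fits the symbol's range [f, 2f).
        while state >= 2 * freqs[sym]:
            bits.append(state & 1)
            state >>= 1
        state = encode_table[(sym, state)]  # one table lookup per symbol
    return state, bits


def tans_decode(state, bits, decode_table, table_bits, count):
    """Decode count symbols, popping bits in the reverse order of emission."""
    size = 1 << table_bits
    bits = list(bits)
    out = []
    for _ in range(count):
        sym, state = decode_table[state - size]
        out.append(sym)
        while state < size:  # refill the state back into [size, 2*size)
            state = (state << 1) | bits.pop()
    return out
```

Each encoding step reduces to a short bit shift plus one table lookup, which is the property that makes a one-symbol-per-cycle hardware pipeline plausible.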

computational science and engineering | 2013

Stream Processing of Scientific Big Data on Heterogeneous Platforms -- Image Analytics on Big Data in Motion

Seyyed Mahdi Najmabadi; Michael Klaiber; Zhe Wang; Yousef Baroud; Sven Simon

High-performance image analytics is an important challenge for big data processing, as image and video data make up a huge portion of big data, generated for example by the vast number of image sensors worldwide. This paper presents a case study for image analytics, namely parallel connected component labeling (CCL), which is one of the first steps of image analytics in general. It is shown that a high-performance CCL implementation can be obtained on a heterogeneous platform if parts of the algorithm are processed simultaneously on a fine-grain parallel field programmable gate array (FPGA) and a multi-core processor. The proposed highly efficient architecture and implementation are suitable for processing big image and video data in motion and significantly reduce the amount of memory required by the hardware architecture for typical image sizes.
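
For readers unfamiliar with CCL, the following is a minimal sequential sketch of the classic two-pass algorithm with a union-find structure. It is only a software baseline under assumed conventions (4-connectivity, a binary image stored as a list of rows, the hypothetical function name `label_components`); it does not reproduce the FPGA and multi-core partitioning described in the abstract.

```python
def label_components(image):
    """Two-pass 4-connected labeling; returns (label_image, component_count)."""
    parent = [0]  # parent[0] stands for the background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # new label pointing to itself
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = min(l for l in (up, left) if l)
                if up and left:
                    union(up, left)

    # Second pass: replace provisional labels by consecutive final labels.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

The first pass only needs the previous row and the current row of labels, which hints at why streaming implementations can keep memory requirements low.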

Computation | 2017

Analyzing the Effect and Performance of Lossy Compression on Aeroacoustic Simulation of Gas Injector

Seyyed Mahdi Najmabadi; Philipp Offenhäuser; Moritz Hamann; Guhathakurta Jajnabalkya; Fabian Hempert; Colin W. Glass; Sven Simon

Computational fluid dynamics simulations involve large state data, leading to performance degradation due to data transfer times, while requiring large disk space. To alleviate the situation, an adaptive lossy compression algorithm has been developed, which is based on regions of interest. This algorithm uses prediction-based compression and exploits the temporal coherence between subsequent simulation frames. The difference between the actual value and the predicted value is adaptively quantized and encoded. The adaptation is in line with user requirements, which consist of the acceptable inaccuracy, the regions of interest and the required compression throughput. The data compression algorithm was evaluated with simulation data obtained by the discontinuous Galerkin spectral element method. We analyzed the performance, compression ratio and inaccuracy introduced by the lossy compression algorithm. The post-processing analysis shows high compression ratios with reasonable quantization errors.
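
The prediction-and-quantization idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the predictor is simply the previously reconstructed frame, frames are flat lists of floats, the region of interest is a set of indices with its own tighter error bound, and the entropy coding of the quantized residuals is omitted. All function and parameter names are hypothetical.

```python
def compress_frame(frame, prev_recon, tol, roi, roi_tol):
    """Quantize residuals against the previous reconstructed frame.

    tol and roi_tol are absolute error bounds (> 0) outside and inside
    the region of interest. Returns (codes, reconstruction).
    """
    codes, recon = [], []
    for i, (value, pred) in enumerate(zip(frame, prev_recon)):
        bound = roi_tol if i in roi else tol
        step = 2.0 * bound                 # uniform quantizer, error <= step / 2
        q = round((value - pred) / step)   # integer code to be entropy coded
        codes.append(q)
        recon.append(pred + q * step)
    return codes, recon


def decompress_frame(codes, prev_recon, tol, roi, roi_tol):
    """Rebuild the frame from the codes and the previous reconstruction."""
    recon = []
    for i, (q, pred) in enumerate(zip(codes, prev_recon)):
        bound = roi_tol if i in roi else tol
        recon.append(pred + q * (2.0 * bound))
    return recon
```

Predicting from the reconstructed frame rather than the original one keeps encoder and decoder in sync, so quantization errors do not accumulate from frame to frame.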

international conference on systems signals and image processing | 2015

Visually lossless image compression extension for JPEG based on just-noticeable distortion evaluation

Zhe Wang; Sven Simon; Yousef Baroud; Seyyed Mahdi Najmabadi

A visually lossless image encoding extension for JPEG is presented. Such an extension enables an efficient implementation of perceptual coding by reusing existing widespread software libraries and hardware IP cores for JPEG. For any pixel in a decoded image, the proposed algorithm guarantees a maximum distortion bounded by the just-noticeable distortion (JND) threshold measured based on the input image. Perceptual coding is performed in three steps: (1) standard transform domain coding, (2) spatial domain distortion visibility analysis by a JND model and (3) spatial domain residual coding. Such a scheme has been implemented in this work as an extension for JPEG based on a low complexity JND model. The encoder determines whether a pixel block in a standard JPEG output image contains distortions beyond the visibility threshold given by the JND model. If so, the locations and the values of such distortions are encoded as side information. Quantization step sizes for the distortion values, i.e. perceptual residuals, are chosen based on the visibility threshold. Experimental results show that in terms of compression efficiency, the proposed perceptual encoding extension outperforms the standard JPEG encoder by 50% for a visually lossless compression of images.
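
Steps (2) and (3) can be illustrated with a small per-pixel sketch. It makes several assumptions that are not stated in the abstract: pixels and JND thresholds are integers held in flat lists, the thresholds are available to both encoder and decoder, the quantizer is the near-lossless rule with step 2t + 1 for threshold t, and the function names are hypothetical. The block-level decision and the actual coding of the side information are left out.

```python
def encode_residuals(original, decoded, jnd):
    """Return side information [(index, code)] for visible JPEG distortions."""
    side = []
    for i, (orig, dec, thr) in enumerate(zip(original, decoded, jnd)):
        err = orig - dec
        if abs(err) > thr:                    # distortion exceeds the threshold
            step = 2 * thr + 1                # step size derived from the threshold
            code = (abs(err) + thr) // step   # rounds |err| to the nearest multiple
            side.append((i, code if err > 0 else -code))
    return side


def apply_residuals(decoded, jnd, side):
    """Correct the decoded pixels using the side information."""
    out = list(decoded)
    for i, code in side:
        out[i] += code * (2 * jnd[i] + 1)
    return out
```

With this quantizer, every corrected pixel ends up within its threshold of the original value, while pixels whose JPEG error was already invisible cost no side information.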

field programmable custom computing machines | 2016

Online Bandwidth Reduction Using Dynamic Partial Reconfiguration

Seyyed Mahdi Najmabadi; Zhe Wang; Yousef Baroud; Sven Simon

Online compression of I/O data streams in custom computing machines will enhance the effective network bandwidth of computing systems as well as storage bandwidth and capacity. In this paper, a self-adaptive dynamic partial reconfigurable architecture for online compression is proposed. The proposed architecture will bring new possibilities in online compression due to its adaptivity to dynamic environments. In this study, network traffic and input data distribution are considered as two dynamic behaviors. The degree of improvement provided by the architecture depends on data distribution, bandwidth, and available resources. Our analysis shows an improvement of up to 20% in compression ratios in comparison to non-adaptive approaches.

2016 International Conference on FPGA Reconfiguration for General-Purpose Computing (FPGA4GPC) | 2016

A self-adaptive dynamic partial reconfigurable architecture for online data stream compression

Seyyed Mahdi Najmabadi; Zhe Wang; Yousef Baroud; Sven Simon

Online compression of I/O data streams in general-purpose computing will enhance the effective I/O bandwidth of processors, the bandwidth of the computer network, as well as the storage capacity and the read/write performance of the storage. In this paper, a self-adaptive dynamic partial reconfigurable architecture for the online compression of data streams is introduced. The proposed architecture will bring new possibilities in online compression due to its adaptivity to different factors such as current data bandwidth, data statistics and the level of available resources. The architecture consists of multiple partially reconfigurable regions that are reconfigured dynamically at run time with suitable compression or decompression IP cores based on the above-mentioned factors. The main goal of the adaptive online compression of the data stream is to provide maximum decompression throughput. The degree of improvement depends on the network throughput and the available resources. Our analysis shows an improvement of up to 40% in decompression throughput in comparison to non-adaptive approaches.

signal processing algorithms architectures arrangements and applications | 2017

A resource-efficient monitoring architecture for hardware accelerated self-adaptive online data stream compression

Seyyed Mahdi Najmabadi; Prajwala Pandit; Trung-Hieu Tran; Sven Simon

In this paper, a novel scalable and resource-efficient architecture capable of monitoring the compressibility of a data stream with various entropy encoding algorithms is proposed. The self-adaptive architecture determines the best compression technique among many techniques which may be selected to encode an online data stream. This information can be used to reconfigure an adaptive encoding architecture dynamically at runtime to provide a high compression ratio. We have compared two hardware architectures that model the same functionality but perform the processing of the input data differently. This paper contributes a resource-efficient self-adaptive way of selecting the best lossless data compression method in hardware, independent of the end application. The processing architecture which uses soft-core processors provides approximately 35% resource savings as compared to the hardware implementation of processing modules in VHDL. Our experimental results show that the overall compression achieved by using self-adaptive architectures is around 14% better than that provided by the best compression technique in a non-adaptive system.
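
The selection idea, picking the best technique per portion of the stream, can be mimicked in software. The sketch below is only an analogy under clear assumptions: it uses the general-purpose `zlib`, `bz2` and `lzma` codecs from the Python standard library as stand-ins for the entropy encoders monitored in hardware, it measures compressibility by actually compressing each block, and the names `CODECS`, `compress_adaptive` and `decompress_adaptive` are hypothetical.

```python
import bz2
import lzma
import zlib

# Candidate codecs: name -> (compress, decompress). Stand-ins, not the
# entropy encoders evaluated in the paper.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compress_adaptive(blocks):
    """Compress each block with every codec and keep the smallest result."""
    out = []
    for block in blocks:
        candidates = {name: comp(block) for name, (comp, _) in CODECS.items()}
        best = min(candidates, key=lambda name: len(candidates[name]))
        out.append((best, candidates[best]))
    return out


def decompress_adaptive(tagged_blocks):
    """Undo compress_adaptive using the codec name stored with each block."""
    return [CODECS[name][1](payload) for name, payload in tagged_blocks]
```

A non-adaptive system corresponds to fixing one codec for the whole stream; by construction, the adaptive choice is never larger per block than any single fixed codec.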

conference on design and architectures for signal and image processing | 2017

Hardware-based architecture for asymmetric numeral systems entropy decoder

Seyyed Mahdi Najmabadi; Harsimran Singh Tungal; Trung-Hieu Tran; Sven Simon

In this paper, two novel hardware architectures based on the tabled asymmetric numeral systems decoding algorithm are proposed. In the proposed architectures, the decoding throughput is highly dependent on how much the data is compressed at encoding time. The synthesis results presented here show that the throughput of the parallel architecture can reach up to 200 MB/s. The benchmarks show that the parallel architecture running on a Xilinx Kintex FPGA provides higher throughput than the same algorithm running on a Core i3 CPU.

picture coding symposium | 2016

Low complexity perceptual image coding by just-noticeable difference model based adaptive downsampling

Zhe Wang; Yousef Baroud; Seyyed Mahdi Najmabadi; Sven Simon

A pixel domain algorithm for low complexity perceptual image coding is proposed. The algorithm exploits a combination of downsampling, predictive coding and a just-noticeable difference (JND) model. Downsampling is performed adaptively on the input image based on regions-of-interest (ROI) identified by measuring the downsampling distortions against the visibility thresholds given by the JND model. The downsampled pixel is encoded if the differences are within the JND thresholds; otherwise, the original pixels are encoded with a quantization parameter selected based on the JND model. Noise shaping is employed to suppress potential visual artifacts due to quantization error propagation. The coding error at any pixel location can be guaranteed to be within the corresponding JND threshold. Experimental results show improved rate-distortion performance and visual quality over JPEG-LS as well as reduced rates compared with other standard codecs like JPEG 2000 at the same PSPNR.

Journal of Real-time Image Processing | 2017