Javed Absar
STMicroelectronics
Publication
Featured research published by Javed Absar.
international conference on parallel architectures and compilation techniques | 2015
Riyadh Baghdadi; Ulysse Beaugnon; Albert Cohen; Tobias Grosser; Michael Kruse; Chandan Reddy; Sven Verdoolaege; Adam Betts; Alastair F. Donaldson; Jeroen Ketema; Javed Absar; Sven Van Haastregt; Alexey Kravets; Anton Lokhmotov; Róbert Dávid; Elnar Hajiyev
Programming accelerators such as GPUs with low-level APIs and languages such as OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic parallelization and domain-specific languages (DSLs) have been proposed to hide complexity and regain performance portability. We present PENCIL, a rigorously-defined subset of GNU C99, enriched with additional language constructs, that enables compilers to exploit parallelism and produce highly optimized code when targeting accelerators. PENCIL aims to serve both as a portable implementation language for libraries, and as a target language for DSL compilers. We implemented a PENCIL-to-OpenCL backend using a state-of-the-art polyhedral compiler. The polyhedral compiler, extended to handle data-dependent control flow and non-affine array accesses, generates optimized OpenCL code. To demonstrate the potential and performance portability of PENCIL and the PENCIL-to-OpenCL compiler, we consider a number of image processing kernels, a set of benchmarks from the Rodinia and SHOC suites, and DSL embedding scenarios for linear algebra (BLAS) and signal processing radar applications (SpearDE), and present experimental results for four GPU platforms: AMD Radeon HD 5670 and R9 285, NVIDIA GTX 470, and ARM Mali-T604.
high performance embedded architectures and compilers | 2008
Praveen Raghavan; Andy Lambrechts; Javed Absar; Murali Jayapala; Francky Catthoor; Diederik Verkest
Modern mobile devices need to be extremely energy efficient. Due to the growing complexity of these devices, energy-aware design exploration has become increasingly important. Current exploration tools often do not support energy estimation, or require the design to be very detailed before an estimate is possible. It is important to get early feedback on both performance and energy consumption during all phases of the design and at higher abstraction levels. This paper presents a unified optimization and exploration framework, from source-level transformation to processor architecture design. The proposed retargetable compiler and simulator framework can map applications to a range of processors and memory configurations, simulate them, and report detailed performance and energy estimates. An accurate energy modeling approach is introduced that estimates the energy consumption of the processor and memories at a component level, which can help to guide the design process. Fast energy-aware architecture exploration is illustrated using an example processor. The flow is demonstrated using a representative wireless benchmark on two state-of-the-art processors and on a processor with advanced low-power extensions for memories. The framework also supports exploration of various novel low-power extensions and their combinations. We show that a unified framework enables fast feedback on the effect of source-level transformations of the application code on the final cycle count and energy consumption.
IEEE Transactions on Consumer Electronics | 2004
Chiew Tong Lau; A. Benjamin Premkumar; Javed Absar; Sapna George
MPEG-AAC is the current state of the art in audio compression technology. The CD quality promised at bit rates as low as 64 kbps makes AAC a strong candidate for high-quality, low-bandwidth audio streaming applications over wireless networks. Besides this low bit rate requirement, the codec must be able to run on personal wireless handheld devices with their inherent low-power characteristics. While the AAC standard is definite enough to ensure that a valid AAC stream is correctly decodable by all AAC decoders, it is flexible enough to accommodate variations in implementation, suited to the different resources available and application areas. This paper reviews various implementation techniques of the encoder. We then propose our method for an optimized software implementation of MPEG-AAC (LC profile). The coder is able to perform the encoding task using half the processing power of a standard implementation without significant degradation in quality, as shown by both a subjective listening test and an ITU-R compliant quality-testing program (OPERA).
languages, compilers, and tools for embedded systems | 2014
Ulysse Beaugnon; Alexey Kravets; Sven Van Haastregt; Riyadh Baghdadi; David Tweed; Javed Absar; Anton Lokhmotov
We present VOBLA, a domain-specific language designed for programming linear algebra libraries. VOBLA is compiled to PENCIL, a domain-independent intermediate language designed for efficient mapping to accelerator architectures such as GPGPUs. PENCIL is compiled to efficient, platform-specific OpenCL code using techniques based on the polyhedral model. This approach addresses both the programmer productivity and performance portability concerns associated with accelerator programming. We demonstrate our approach by using VOBLA to implement a BLAS library. We have evaluated the performance of OpenCL code generated using our compilation flow on ARM Mali, AMD Radeon, and AMD Opteron platforms. The generated code is currently on average 1.9x slower than highly hand-optimized OpenCL code, but on average 8.1x faster than straightforward OpenCL code. Given that VOBLA coding takes significantly less effort than hand-optimizing OpenCL code, we believe our approach leads to improved productivity and performance portability.
ACM Transactions on Design Automation of Electronic Systems | 2009
Praveen Raghavan; Murali Jayapala; Andy Lambrechts; Javed Absar; Francky Catthoor
Modern mobile devices need to be extremely energy efficient. Due to the growing complexity of these devices, energy-aware design exploration has become increasingly important. Current exploration tools often do not support energy estimation, or require the design to be very detailed before estimation is possible. It is important to get early feedback on both performance and energy consumption during all phases of the design and at higher abstraction levels. This article presents a unified optimization and exploration framework that covers the design space from source-level transformation to processor architecture. The proposed retargetable compiler and simulator framework can map applications to a range of processors and memory configurations, simulate them, and report detailed performance and energy estimates. An accurate and consistent energy modeling approach is introduced that estimates the energy consumption of the processor and memories at a component level, which can help to guide the design process. Fast energy-aware architecture exploration is illustrated by modeling both state-of-the-art processors and other architectures. Various design trade-offs are also illustrated on academic as well as industrial benchmarks from both the wireless communication and multimedia domains. We also illustrate a design space exploration on different applications and show that there is a large trade-off space between application performance, energy consumption, and area. We show that the proposed framework is consistent, accurate, and covers a large design space, including various novel low-power extensions, in a unified framework.
international symposium on circuits and systems | 2004
Saman S. Abeysekera; Kabi Prakash Padhi; Javed Absar; Sapna George
In this paper, we enumerate the specific advantages of using a sinusoidal representation for scaling of audio signals. Most existing systems scale signals using time domain algorithms. Here, we discuss a frequency domain technique to scale signals by any desired factor. We also compare the computational savings of the proposed algorithm with traditional time domain methods.
Archive | 2015
Riyadh Baghdadi; Albert Cohen; Tobias Grosser; Sven Verdoolaege; Anton Lokhmotov; Javed Absar; Sven Van Haastregt; Alexey Kravets; Alastair F. Donaldson
high performance embedded architectures and compilers | 2015
Riyadh Baghdadi; Javed Absar; Ulysse Beaugnon; Adam Betts; Albert Cohen; Róbert Dávid; Alastair F. Donaldson; Tobias Grosser; Sven Van Haastregt; Elnar Hajiyev; Jeroen Ketema; Alexey Kravets; Michael Kruse; Anton Lokhmotov; Chandan Reddy; Sven Verdoolaege
Archive | 2015
Riyadh Baghdadi; Albert Cohen; Tobias Grosser; Sven Verdoolaege; Javed Absar; Sven Van Haastregt; Alexey Kravets; Anton Lokhmotov; Alastair F. Donaldson
acm conference on systems programming languages and applications software for humanity | 2014
Ulysse Beaugnon; Riyadh Baghdadi; Javed Absar; Adam Betts; Albert Cohen; Alastair F. Donaldson; Tobias Grosser; Sven Van Haastregt; Yabin Hu; Jeroen Ketema; Alexey Kravets; Anton Lokhmotov; Sven Verdoolaege