Is this you? Create Your Porfile

Ami Marowka

Shenkar College of Engineering and Design

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Ami Marowka is active.

Explore More

Publication

Featured researches published by Ami Marowka.

Communications of The ACM | 2007

Parallel computing on any desktop

Ami Marowka

Parallelization lets applications exploit the high throughput of new multicore processors, and the OpenMP parallel programming model helps developers create multithreaded applications.

IEEE Distributed Systems Online | 2008

Think Parallel: Teaching Parallel Programming Today

Ami Marowka

Parallel computing is rapidly entering mainstream computing, and multicore processors can now be found in the heart of supercomputers, desktop computers, and laptops. Consequently, applications will increasingly need to be parallelized to fully exploit the multicore processor throughput gains that are becoming available. Unfortunately, writing parallel code is more complex than writing serial code. An introductory parallel computing course aims to introduce students to this technology shift and to explain that parallelism calls for a different way of thinking and new programming skills. The course covers theoretical topics and offers practical experience in writing parallel algorithms on state-of-the-art parallel computers, parallel programming environments, and tools.

international conference on algorithms and architectures for parallel processing | 2008

Performance of OpenMP Benchmarks on Multicore Processors

Ami Marowka

The appearance of Multicore processors brings high performance computing to the desktop and opens the doors of mainstream computing for parallel computing. This paradigm shift leads the integration of parallel programming standards for high-end shard-memory machine architectures into desktop programming environments. In this paper we present a performance study of these new systems. We evaluate the performance of an OpenMP shared-memory programming model that is integrated into Microsoft Visual Studio C++ 2005 and Intel C++ compilers on a multicore processor. We benchmarked using the NAS OpenMP high-level applications benchmarks and the EPCC OpenMP low-level benchmarks. We report the basic timings, scalability, and runtime profiles of each benchmark and analyze the running results.

international symposium on parallel and distributed computing | 2005

Execution model of three parallel languages: OpenMP, UPC and CAF

Ami Marowka

The aim of this paper is to present a qualitative evaluation of three state-of-the-art parallel languages: OpenMP, Unified Parallel C (UPC) and Co-Array Fortran (CAF). OpenMP and UPC are explicit parallel programming languages based on the ANSI standard. CAF is an implicit programming language. On the one hand, OpenMP designs for shared-memory architectures and extends the base-language by using compiler directives that annotate the original source-code. On the other hand, UPC and CAF designs for distribute-shared memory architectures and extends the base-language by new parallel constructs. n nWe deconstruct each language into its basic components, show examples, make a detailed analysis, compare them, and finally draw some conclusions.

2008 Advanced Software Engineering and Its Applications | 2008

Towards High-Level Parallel Programming Models for Multicore Systems

Ami Marowka

Parallel programming represents the next turning point in how software engineers write software. Multicore processors can be found today in the heart of supercomputers, desktop computers and laptops. Consequently, applications will increasingly need to be parallelized to fully exploit multicore processors throughput gains now becoming available. Unfortunately, writing parallel code is more complex than writing serial code. This is where the threading building blocks (TBB) approach enters the parallel computing picture. TBB helps developers create multithreaded applications more easily by using high-level abstractions to hide much of the complexity of parallel programming, We study the programmability and performance of TBB by evaluating several practical applications. The results show very promising performance but parallel programming with TBB is still tedious and error-prone.

international symposium on parallel and distributed computing | 2007

Routing Speedup in Multicore-Based Ad Hoc Networks

Ami Marowka

The integration of multicore processors into wireless mobile devices is creating new opportunities to enhance the speed and scalability of message routing in ad hoc networks. In this paper we study the impact of multicore technology on routing speed and node efficiency, and draw conclusions regarding the measures that should be taken to conserve energy and prolong the lifetime of a network. We formally define three metrics and use them for performance evaluation: time-to-destination (T2D), average routing speedup (ARS), and average-node-efficiency (ANE). The T2D metric is the time a message takes to travel to its destination in a loaded traffic network. ARS measures the average routing speed gained by a multicore-based network over a single-core based network, and ANE measures the average efficiency of a node, or the number of active cores. These benchmarks show that routing speedup in networks with multicore nodes increases linearly with the number of cores and significantly decrease traffic bottlenecks, while allowing more routings to be executed simultaneously. The average node efficiency, however, decreases linearly with the number of cores per node. Power-aware protocols and energy management techniques should therefore be developed to turn off the unused cores.

International Journal of Parallel, Emergent and Distributed Systems | 2009

Bsp2omp: A Compiler For Translating Bsp Programs To Openmp

Ami Marowka

The convergence of the two widely used parallel programming paradigms, shared- memory and distributed- shared-memory parallel programming models, into a unified parallel programming model is crucial for parallel computing to become the next mainstream programming paradigm. We study the design differences and the performance issues of two parallel programming models: a shared- memory programming model (OpenMP) and a distributed- shared programming model (BSP). The study was carried out by designing a compiler for translating BSP parallel programs to an OpenMP programming model called BSP20MP. Analysis of the compiler outcome, and of the performance of the compiled programs, show that the two models are based on very similar underlying principles and mechanisms.

international parallel and distributed processing symposium | 2008

BSP2OMP: A compiler for translating BSP programs to OpenMP

Ami Marowka

international parallel and distributed processing symposium | 2008

10th Workshop on Advances in Parallel and Distributed Computational Models - APDCM'08

Oscar H. Ibarra; Koji Nakano; Jacir Luiz Bordim; Akihiro Fujiwara; Anu G. Bourgeois; Satoshi Fujita; Shuichi Ichikawa; Yasushi Inoguchi; Chuzo Iwamoto; Xiaohong Jiang; Hirotsugu Kakugawa; Ami Marowka; Susumu Matsumae; Eiji Miyano; Mitsuo Motoki; Hirotaka Ono; Sanguthevar Rajasekaran; Ivan Stojmenovic; Yasuhiko Takenaga; Jerry L. Trahan; Jose Alberto Fernandez Zepeda; Jingyuan Zhang; Joseph JáJá; Arnold L. Rosenberg; Sartaj Sahni; Jie Wu; Pen Chung Yew; Albert Y. Zomaya

Parallel and distributed computing offer the promise to deliver the computing power necessary to solve many important problems whose requirements exceed the capabilities of the most powerful existing computers. Aiming to fulfill this promise, recent years have seen a flurry of activity in the arena of parallel and distributed computing which evolved into novel and robust computing models. These models reflect advances in computational devices and environments such as optical interconnects, programmable logic arrays, networks of workstations, radio communications, mobile computing, DNA computing, quantum computing, sensor networks, etc. In addition, practical experience with both parallel computers and distributed data communication networks has brought about an understanding of their potential and limitations which, in turn, have fostered the development of sophisticated algorithms. It is very encouraging see that the advent of these models, combined with the availability of efficient algorithms, has led to significant advances in the resolution of various difficult problems of practical interest.

IEEE Distributed Systems Online | 2008