Publication


Featured research published by Subramaniam Maiyuran.


International Symposium on Microarchitecture | 2011

SIMD re-convergence at thread frontiers

Gregory Frederick Diamos; Benjamin Ashbaugh; Subramaniam Maiyuran; Andrew Kerr; Haicheng Wu; Sudhakar Yalamanchili

Hardware and compiler techniques for mapping data-parallel programs with divergent control flow to SIMD architectures have recently enabled the emergence of new GPGPU programming models such as CUDA, OpenCL, and DirectX Compute. The impact of branch divergence can be quite different depending upon whether the program's control flow is structured or unstructured. In this paper, we show that unstructured control flow occurs frequently in applications and can lead to significant code expansion when executed using existing approaches for handling branch divergence. This paper proposes a new technique for automatically mapping arbitrary control flow onto SIMD processors that relies on the concept of a Thread Frontier, which is a bounded region of the program containing all threads that have branched away from the current warp. This technique is evaluated on a GPU emulator configured to model i) a commodity GPU (Intel Sandybridge), and ii) custom hardware support not realized in current GPU architectures. It is shown that this new technique performs identically to the best existing method for structured control flow, and re-converges at the earliest possible point when executing unstructured control flow. This leads to i) between 1.5 – 633.2% reductions in dynamic instruction counts for several real applications, ii) simplification of the compilation process, and iii) the ability to efficiently add high-level unstructured programming constructs (e.g., exceptions) to existing data-parallel languages.


Archive | 2001

Memory access latency hiding with hint buffer

Subramaniam Maiyuran; Vivek Garg; Mohammad A. Abdallah; Jagannath Keshava


Archive | 1998

Method and system for optimizing write combining performance in a shared buffer structure

Salvador Palanca; Vladimir Pentkovski; Niranjan L. Cooray; Subramaniam Maiyuran; Angad Narang


Archive | 1998

Method and apparatus for prefetching data into cache

Salvador Palanca; Niranjan L. Cooray; Angad Narang; Vladimir Pentkovski; Steve Tsai; Subramaniam Maiyuran; Jagannath Keshava; Hsien-Hsin Lee; Steve Spangler; Suresh N. Kuttuva; Praveen Mosur


Archive | 2000

System and method for cache sharing

Jagannath Keshava; Vladimir Pentkovski; Subramaniam Maiyuran; Salvador Palanca; Hsin-Chu Tsai


Archive | 2000

Opportunistic sharing of graphics resources to enhance CPU performance in an integrated microprocessor

Subramaniam Maiyuran; Vivek Garg; Jagannath Keshava; Salvador Palanca


Archive | 2002

Method and apparatus for cache replacement for a multiple variable-way associative cache

Salvador Palanca; Subramaniam Maiyuran


Archive | 1998

Efficient utilization of write-combining buffers

Vladimir Pentkovski; Hsien-Cheng E Hsieh; Hsien-Hsin Lee; Subramaniam Maiyuran


Archive | 2003

Method and apparatus for a stew-based loop predictor

Subramaniam Maiyuran; Peter J. Smith; Stephan Jourdan


Archive | 1998

Synchronization of weakly ordered write combining operations using a fencing mechanism

Salvador Palanca; Vladimir Pentkovski; Subramaniam Maiyuran; Lance E. Hacking; Roger A. Golliver; Shreekant S. Thakkar
