Narayan Ranganathan
Intel
Publication
Featured research published by Narayan Ranganathan.
Architectural Support for Programming Languages and Operating Systems | 1998
Narayan Ranganathan; Manoj Franklin
Recent fascination for dynamic scheduling as a means for exploiting instruction-level parallelism has introduced significant interest in the scalability aspects of dynamic scheduling hardware. In order to overcome the scalability problems of centralized hardware schedulers, many decentralized execution models have recently been proposed and investigated. The crux of all these models is to split the instruction window across multiple processing elements (PEs) that do independent scheduling of instructions. The decentralized execution models proposed so far can be grouped under three categories, based on the criterion used for assigning an instruction to a particular PE. They are: (i) execution unit dependence based decentralization (EDD), (ii) control dependence based decentralization (CDD), and (iii) data dependence based decentralization (DDD). This paper investigates the performance aspects of these three decentralization approaches. Using a suite of important benchmarks and realistic system parameters, we examine performance differences resulting from the type of partitioning as well as from specific implementation issues such as the type of PE interconnect. We found that with a ring-type PE interconnect, the DDD approach performs the best when the number of PEs is moderate, and that the CDD approach performs best when the number of PEs is large. The currently used approach, EDD, does not perform well for any configuration. With a realistic crossbar, performance does not increase with the number of PEs for any of the partitioning approaches. The results give insight into the best way to use the transistor budget available for implementing the instruction window.
High Performance Parallelism Pearls, Volume 2: Multicore and Many-core Programming Approaches | 2016
Prashanth Thinakaran; Diana Guttman; Mahmut T. Kandemir; Meenakshi Arunachalam; Rahul Khanna; Praveen Yedlapalli; Narayan Ranganathan
This chapter presents an image-matching application that can take advantage of many-core architectures. Different parallelization strategies are explored that can take advantage of inter- and intraimage parallelism. The two main metrics that determine the application performance, tree creation time and search time, were studied in the context of scalability. Important insights obtained from a profiler-based analysis help identify the challenges in scalability of DB threads. The scalability with respect to increasing DBThreads with optimal KD-trees is shown to lead to 5.8× speedup in create time and 2.8× speedup in search time in the case of 120 threads when compared to single-threaded Xeon Phi performance.
Archive | 2013
Robert C. Swanson; Mahesh S. Natu; Rahul Khanna; Murugasamy K. Nachimuthu; Sarathy Jayakumar; Anil S. Keshavamurthy; Narayan Ranganathan
Archive | 2011
Ashok Raj; Narayan Ranganathan; Mohan J. Kumar; Theodros Yigzaw
Archive | 2011
Robert C. Swanson; Mallik Bulusu; Robert Bruce Bahnsen; Narayan Ranganathan; David Lombard
Archive | 2012
Narayan Ranganathan; Mahesh S. Natu; Mohan Kumar; Sarathy Jayakumar
Archive | 2017
Ramamurthy Krithivas; Narayan Ranganathan; Mohan J. Kumar; John C. Leung
Archive | 2016
Sarathy Jayakumar; Ashok Raj; John G. Holm; Narayan Ranganathan; Mohan Kumar; Sergiu D. Ghetie
Archive | 2013
Ashok Raj; Mohan J. Kumar; Narayan Ranganathan
Archive | 2016
Ramamurthy Krithivas; Narayan Ranganathan; Mohan J. Kumar; John Leung