Peter J. Wilson
Freescale Semiconductor
Publication
Featured research published by Peter J. Wilson.
Networking, Architecture, and Storage | 2015
Farrukh Hijaz; Brian Kahne; Peter J. Wilson; Omer Khan
Software IP forwarding routers provide flexibility, programmability, and extensibility, while enabling fast deployment. The key question is whether they can keep up with the efficiency of their special-purpose hardware counterparts. Shared memory stands out as a sine qua non for parallel programming of many commercial multicore processors, so it is the paradigm of choice for implementing software routers. For efficiency, shared memory is often implemented with hardware support for cache coherence and data consistency among the cores. Although this enables efficient data access in many common-case scenarios, communication between cores using shared-memory synchronization primitives often limits scalability. In this paper we perform a thorough characterization of a multithreaded packet processing application to quantify the opportunities for exploiting concurrency, as well as to identify scalability bottlenecks in futuristic shared-memory multicores. We propose to retain the shared-memory model but introduce a set of lightweight in-hardware explicit messaging send/receive instructions in the instruction set architecture (ISA). These instructions mitigate the overheads of multi-party communication in shared-memory protocols. Using simulations of a 64-core multicore, we find that the scalability of parallel packet processing is limited by the packet-ordering requirement, which leads to expensive implicit communication under shared memory. With explicit messaging support in the ISA, the communication bottleneck is mitigated and the application scales to a 30× speedup at 64 cores.
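The contrast between implicit shared-memory hand-offs and the proposed explicit send/receive instructions can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: msg_send and msg_recv are hypothetical stand-ins for the in-hardware ISA extensions, and the shared-memory baseline uses standard GCC/Clang atomic builtins.

// Hypothetical intrinsics standing in for the paper's in-hardware
// send/receive ISA extensions; the names and signatures are illustrative.
#include <cstdint>

extern void msg_send(int dest_core, uint64_t payload);  // assumed blocking send
extern uint64_t msg_recv(int src_core);                 // assumed blocking receive

// Shared-memory baseline: the producer publishes a packet descriptor and
// the consumer spins on a flag, paying coherence traffic on every poll.
struct Mailbox { uint64_t desc; int ready; };

void produce_shared(Mailbox* mb, uint64_t packet_desc) {
    mb->desc = packet_desc;
    __atomic_store_n(&mb->ready, 1, __ATOMIC_RELEASE);  // invalidates the consumer's cache line
}

uint64_t consume_shared(Mailbox* mb) {
    while (!__atomic_load_n(&mb->ready, __ATOMIC_ACQUIRE)) { /* spin: implicit communication */ }
    return mb->desc;
}

// Explicit-messaging version: the descriptor travels point to point in one
// instruction each way, with no coherence round trips for the hand-off.
void produce_explicit(int next_core, uint64_t packet_desc) {
    msg_send(next_core, packet_desc);
}

uint64_t consume_explicit(int prev_core) {
    return msg_recv(prev_core);
}

In an in-order packet pipeline this hand-off is exactly where the paper locates the bottleneck: ordering forces cores to communicate on every packet, so replacing the spin-and-invalidate pattern with a single message removes the implicit coherence traffic from the critical path.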
International Parallel and Distributed Processing Symposium | 2017
Halit Dogan; Farrukh Hijaz; Masab Ahmad; Brian Kahne; Peter J. Wilson; Omer Khan
Shared memory stands out as a sine qua non for parallel programming of many commercial and emerging multicore processors. It optimizes patterns of communication that benefit common programming styles. As parallel programming is now mainstream, those common programming styles are challenged by emerging applications that communicate often and involve large amounts of data. Such applications include graph analytics and machine learning, and this paper focuses on these domains. We retain the shared-memory model and introduce a set of lightweight in-hardware explicit messaging instructions in the instruction set architecture (ISA). A set of auxiliary communication models is proposed that utilizes explicit messages to accelerate synchronization primitives and to efficiently move computation toward data. Results on a 256-core simulated multicore demonstrate that the proposed communication models improve performance by an average of 4× and dynamic energy by 42% over traditional shared memory.
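One of the auxiliary communication models, moving computation toward data, can be sketched in the same hypothetical style. In a graph-analytics kernel, instead of every core atomically updating remote vertices (bouncing their cache lines across the chip), each core sends the update to the vertex's home core, which applies it locally. The msg_send/msg_recv_any intrinsics, the home-core mapping, and the message layout below are all assumptions made for illustration.

// Sketch of moving computation to data with explicit messages; the
// intrinsics are hypothetical stand-ins for the ISA extensions.
#include <cstdint>
#include <vector>

extern void msg_send(int dest_core, uint64_t payload);
extern uint64_t msg_recv_any();  // assumed: blocking receive from any sender

constexpr int kNumCores = 256;
inline int home_core(uint32_t vertex) { return vertex % kNumCores; }

// Pack (vertex, delta) into one message word; the layout is illustrative.
inline uint64_t pack(uint32_t vertex, uint32_t delta) {
    return (uint64_t(vertex) << 32) | delta;
}

// Any core: push an update to the owner instead of touching remote memory.
void push_update(uint32_t vertex, uint32_t delta) {
    msg_send(home_core(vertex), pack(vertex, delta));
}

// Home core: drain incoming updates and apply them to locally owned state,
// so the hot vertex data never leaves the local cache.
void service_updates(std::vector<uint64_t>& local_state, int count) {
    for (int i = 0; i < count; ++i) {
        uint64_t m = msg_recv_any();
        uint32_t vertex = uint32_t(m >> 32);
        local_state[vertex / kNumCores] += uint32_t(m);  // dense local index of an owned vertex
    }
}

The design choice this illustrates is the one the abstract names: synchronization and data movement shift from the coherence protocol, with its many implicit transactions, to single explicit messages.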
Microprocessor Test and Verification | 2005
Brian Kahne; Aseem Gupta; Peter J. Wilson; Nikil D. Dutt
The ability to enhance single-thread performance, such as by increasing clock frequency, is reaching a point of diminishing returns: power is becoming a dominating factor and limiting scalability. Adding cores is a scalable way to increase performance, but it requires that system designers have a method for developing multithreaded applications. Plasma (Parallel Language for System Modeling and Analysis) is a parallel language for system modeling and multi-threaded application development, implemented as a superset of C++. Its language extensions are based upon those found in Occam, which in turn is based upon C. A. R. Hoare's CSP (communicating sequential processes). The goal of the Plasma project is to investigate whether a language with the appropriate constructs can ease the task of developing highly multi-threaded software. In addition, through the inclusion of a discrete-event simulation API, we seek to simplify the task of system modeling and to increase productivity through clearer representation and increased compile-time checking of the most difficult-to-get-right aspect of system models: the concurrency. The result is a single language that allows users to develop a parallel application and then model it within the context of a system, allowing for hardware-software partitioning and various other early tradeoff analyses. We believe that this language offers a simpler and more concise syntax than other offerings and can be targeted at a large range of potential architectures, including heterogeneous systems and those without shared memory.
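Plasma's own syntax is not reproduced here, but the CSP-style channel communication it inherits from Occam can be approximated in plain C++17. The one-slot channel below is an analogue intended only to illustrate the programming model the paper describes; it is not Plasma code.

// A plain C++17 analogue of an Occam/CSP-style channel: send and recv
// block until the single slot is free/full, giving a rendezvous-like hand-off.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <optional>
#include <thread>

template <typename T>
class Channel {
    std::mutex m;
    std::condition_variable cv;
    std::optional<T> slot;  // at most one value in flight
public:
    void send(T v) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return !slot.has_value(); });  // wait for the slot to drain
        slot = std::move(v);
        cv.notify_all();
    }
    T recv() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return slot.has_value(); });   // wait for a value to arrive
        T v = std::move(*slot);
        slot.reset();
        cv.notify_all();
        return v;
    }
};

int main() {
    Channel<int> ch;
    std::thread producer([&] { for (int i = 0; i < 3; ++i) ch.send(i); });
    std::thread consumer([&] { for (int i = 0; i < 3; ++i) std::cout << ch.recv() << '\n'; });
    producer.join();
    consumer.join();
}

Building this pattern into the language, as Plasma does, is what enables the compile-time checking of concurrency the abstract mentions: the compiler, rather than a library convention, knows which data crosses a channel.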
Archive | 2008
William C. Moyer; Peter J. Wilson
Archive | 2001
Bryan D. Marietta; Peter J. Wilson
Archive | 2008
Perry H. Pelley; George P. Hoekstra; Peter J. Wilson
Archive | 2013
Joseph C. Circello; Daniel M. McCarthy; John D. Mitchell; Peter J. Wilson; John J. Vaglica
Archive | 2017
Peter J. Wilson
Archive | 2011
Peter J. Wilson
Archive | 2016
Peter J. Wilson; Brian Kahne