In today's highly interconnected digital era, performance portability has become an important issue in software development. Performance portability refers to the ability of an application to run efficiently on different hardware platforms. Developers who want applications that are both performant and portable need to support multiple platforms without compromising performance, ideally while minimizing platform-specific code.
Performance portability is considered a highly sought-after feature in the high-performance computing (HPC) community, but there is currently no universal or consistent measurement standard.
Performance is typically measured in one of two ways. The first compares an optimized version of the application with its portable version. The second compares the application's achieved performance with its theoretical peak, which is derived from the number of floating-point operations (FLOPs) performed and the amount of data moved from main memory to the processor during execution. As hardware diversity increases, it becomes increasingly critical to develop software that can run on a wide range of machines, since this affects how long an application remains usable and how easily it can be updated.
Performance portability is widely discussed in the industry and usually refers to two things: first, the ability to run the same application on multiple hardware platforms; second, achieving a given level of performance on each of those platforms. At the Performance Portability Conference held by the U.S. Department of Energy (DOE) in 2016, an expert stated that an application can be considered performance portable if it achieves a consistent level of performance on each platform relative to the best known implementation.
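One way to read the "consistent level of performance relative to the best known implementation" criterion is as a per-platform efficiency ratio. The sketch below is only an illustration under that assumption: the function name, the platform names, and all runtimes are hypothetical, and the source does not prescribe this exact calculation.

```python
def application_efficiency(portable_time, best_known_time):
    """Fraction of the best known implementation's speed reached by the portable code.

    Both arguments are runtimes in seconds for the same problem on one platform.
    """
    if portable_time <= 0 or best_known_time <= 0:
        raise ValueError("runtimes must be positive")
    return best_known_time / portable_time


# Hypothetical measurements: (portable runtime, best known runtime) in seconds.
runtimes = {
    "cpu_a": (12.0, 10.0),
    "gpu_b": (5.0, 4.0),
    "gpu_c": (8.0, 6.0),
}

efficiencies = {
    platform: application_efficiency(portable, best)
    for platform, (portable, best) in runtimes.items()
}

# A small spread between the best and worst platform suggests consistent performance.
spread = max(efficiencies.values()) - min(efficiencies.values())
worst_platform = min(efficiencies, key=efficiencies.get)

for platform, value in efficiencies.items():
    print(f"{platform}: {value:.2%} of best known implementation")
print(f"spread: {spread:.2%}, weakest platform: {worst_platform}")
```

What counts as a "consistent" spread is a judgment call for the application team; the sketch only makes the comparison explicit.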
Jeff Larkin (NVIDIA) put it more directly: performance portability means that "the same source code can run productively on multiple different architectures."
Performance portability has accordingly become a regular topic of discussion in the high-performance computing community. Partners from industry, academia, and DOE national laboratories have met regularly at the Performance, Portability, and Productivity in HPC Forum since 2016 to promote research and development in performance portability.
As computing architectures continue to evolve, performance portability remains important. It embodies the assumption that a developer's single code base will achieve acceptable performance on newer architectures, as well as on a variety of current architectures on which the code has not yet been tested. As hardware diversity increases, it will become necessary to develop software that can run across multiple platforms. This bears directly on the longevity and continued relevance of applications.
The U.S. Department of Energy's Exascale Computing Project (ECP) emphasizes that performance portability is an ongoing concern, especially in a multi-platform environment.
Since 2016, DOE has held multiple workshops to discuss the growing importance of performance portability. The 2017 conference invited participation from many well-known institutions including the National Energy Research Scientific Computing Center (NERSC) and Los Alamos National Laboratory (LANL).
To quantify whether a program has achieved performance portability, two factors need to be considered. The first is portability, which can be measured by comparing the number of lines of code shared across architectures with the number of lines written for only a single architecture. The second is performance, which can be measured in several ways. One is to compare a platform-optimized version of the application with the portable version. Another is to use the roofline performance model, which gives the theoretical peak performance the application can attain on a given machine.
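Both factors can be made concrete with a few lines of arithmetic. The sketch below assumes the standard roofline formulation, in which attainable performance is the smaller of the machine's peak compute rate and its memory bandwidth multiplied by the kernel's arithmetic intensity (FLOPs per byte moved). The function names, machine figures, and line counts are all hypothetical values chosen for illustration.

```python
def roofline_bound(peak_flops, peak_bandwidth, flops, bytes_moved):
    """Attainable FLOP/s under the roofline model.

    peak_flops:     machine peak compute rate in FLOP/s
    peak_bandwidth: machine peak main-memory bandwidth in bytes/s
    flops:          floating-point operations performed by the kernel
    bytes_moved:    bytes moved between main memory and the processor
    """
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    arithmetic_intensity = flops / bytes_moved  # FLOPs per byte
    return min(peak_flops, peak_bandwidth * arithmetic_intensity)


def shared_code_fraction(shared_lines, platform_specific_lines):
    """Share of the code base that is common to all architectures."""
    total = shared_lines + sum(platform_specific_lines.values())
    if total <= 0:
        raise ValueError("code base must contain at least one line")
    return shared_lines / total


# Hypothetical machine: 1 TFLOP/s peak, 100 GB/s memory bandwidth.
# Hypothetical kernel: 2 GFLOP of work while moving 16 GB of data.
bound = roofline_bound(1e12, 1e11, 2e9, 1.6e10)
print(f"roofline bound: {bound:.3e} FLOP/s")  # memory-bound in this example

# Hypothetical line counts: 9000 shared lines plus two platform-specific back ends.
fraction = shared_code_fraction(9000, {"cuda": 600, "openmp": 400})
print(f"shared code fraction: {fraction:.0%}")
```

Dividing a kernel's measured FLOP/s by `roofline_bound` gives an efficiency figure that can be compared across machines with very different peak rates.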
At the 2016 conference, one participant remarked that a code is performance portable when the application team says it is performance portable.
In recent years, research has pointed out that portable code written for diverse parallel computing architectures should use open-standard programming models, and that the code must be developed and improved on multiple platforms simultaneously. Combined with tuning, these strategies can help developers find parameters suitable for different platforms.
A variety of programming models and systems are designed to help developers achieve performance portability. Common frameworks include OpenCL, SYCL, Kokkos, and RAJA. These programming interfaces support parallel programming across multiple platforms and processor types. Non-framework approaches include self-tuning (autotuning) and domain-specific languages.
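Self-tuning can be illustrated with a very small search loop: run the same kernel with several candidate parameter values on the target machine and keep the fastest. The sketch below is a toy under stated assumptions, not how any of the frameworks above implement tuning; `autotune`, `blocked_sum`, `time_kernel`, and the block sizes are hypothetical names and values introduced for illustration.

```python
import time


def autotune(candidates, measure):
    """Return the candidate with the lowest measured cost, plus all measurements.

    measure is any callable mapping a candidate to a cost (for example, seconds).
    """
    candidates = list(candidates)
    if not candidates:
        raise ValueError("at least one candidate is required")
    results = {candidate: measure(candidate) for candidate in candidates}
    best = min(results, key=results.get)
    return best, results


def blocked_sum(values, block_size):
    """Toy kernel: sum a list in blocks of block_size elements."""
    total = 0.0
    for start in range(0, len(values), block_size):
        total += sum(values[start:start + block_size])
    return total


def time_kernel(block_size, values, repeats=3):
    """Best-of-repeats wall-clock time for blocked_sum with one block size."""
    timings = []
    for _ in range(repeats):
        begin = time.perf_counter()
        blocked_sum(values, block_size)
        timings.append(time.perf_counter() - begin)
    return min(timings)


data = [1.0] * 100_000
best_block, timings = autotune(
    [64, 256, 1024, 4096],
    lambda block_size: time_kernel(block_size, data),
)
print(f"fastest block size on this machine: {best_block}")
```

Because the winning value depends on the machine the search runs on, the same source code can pick different parameters on different platforms, which is the point of the approach. Passing the measurement in as a function also allows a cost model to be substituted when running the kernel is too expensive.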
As technology advances, are we ready for a new era of programming that takes the possibilities of performance portability to new heights?