Jim Holt
Freescale Semiconductor
Publication
Featured research published by Jim Holt.
design automation conference | 2012
Henry Hoffmann; Jim Holt; George Kurian; Eric Lau; Martina Maggio; Jason E. Miller; Sabrina M. Neuman; Mahmut E. Sinangil; Yildiz Sinangil; Anant Agarwal; Anantha P. Chandrakasan; Srinivas Devadas
Addressing the challenges of extreme scale computing requires holistic design of new programming models and systems that support those models. This paper discusses the Angstrom processor, which is designed to support a new Self-aware Computing (SEEC) model. In SEEC, applications explicitly state goals, while other systems components provide actions that the SEEC runtime system can use to meet those goals. Angstrom supports this model by exposing sensors and adaptations that traditionally would be managed independently by hardware. This exposure allows SEEC to coordinate hardware actions with actions specified by other parts of the system, and allows the SEEC runtime system to meet application goals while reducing costs (e.g., power consumption).
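The goal/action split described in this abstract can be illustrated with a minimal sketch. It is not the SEEC runtime or the Angstrom interface: the `Action` type, the `choose_action` function, and all numbers are hypothetical, and the selection rule (the cheapest action predicted to meet the goal) is an assumed simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical adaptation exposed by some system component."""
    name: str
    speedup: float  # predicted performance relative to the baseline
    power: float    # predicted power cost relative to the baseline


def choose_action(actions, observed_rate, goal_rate):
    """Return the lowest-power action predicted to meet the goal.

    `observed_rate` is the application's measured rate under the baseline
    configuration. Falls back to the fastest action if none is sufficient.
    """
    needed = goal_rate / observed_rate
    feasible = [a for a in actions if a.speedup >= needed]
    if feasible:
        return min(feasible, key=lambda a: a.power)
    return max(actions, key=lambda a: a.speedup)


# Illustrative values only; they are not taken from the paper.
ACTIONS = [
    Action("1 core, low frequency", 1.0, 1.0),
    Action("2 cores, low frequency", 1.8, 1.9),
    Action("2 cores, high frequency", 2.5, 3.2),
]

if __name__ == "__main__":
    # The application reports 10 units/s and asks for 15 units/s.
    print(choose_action(ACTIONS, observed_rate=10.0, goal_rate=15.0).name)
```

The application only states its goal and the components only describe their actions, so the selection logic can weigh hardware and software adaptations against each other in one place.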
international symposium on microarchitecture | 2009
Jim Holt; Anant Agarwal; Sven Brehmer; Max J. Domeika; Patrick Griffin; Frank Schirrmeister
Systems architects commonly use multiple cores to improve system performance. Unfortunately, multicore hardware is evolving faster than software technologies. New multicore software standards are necessary in light of the new challenges and capabilities that embedded multicore systems provide. The newly released multicore communications API standard targets small-footprint, highly efficient intercore and interchip communications.
formal methods in computer-aided design | 2009
Subodh Sharma; Ganesh Gopalakrishnan; Eric Mercer; Jim Holt
We present MCC, a dynamic verification tool for applications written against the Multicore Communications API (MCAPI), a new API for communication among cores. MCC systematically explores all relevant interleavings of an MCAPI application using a tailor-made dynamic partial order reduction (DPOR) algorithm. Our contributions are (i) a way to model the non-overtaking message matching relation underlying MCAPI calls, together with a high-level algorithm that effects DPOR for MCAPI and controls the lower-level details so that the intended executions happen at runtime; and (ii) a list of default safety properties that can be used during verification. To our knowledge, this is the first push-button model checker for MCAPI application writers; at present, it handles an interesting subset of MCAPI calls. Our result demonstrates that a dynamic model checker for MCAPI can directly control the non-deterministic behavior inherent in any implementation of the library at runtime, without additional API modifications or additions.
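As a rough illustration of the non-overtaking constraint mentioned above, the sketch below enumerates the orders in which one receiving endpoint could see messages from several senders, assuming that messages from the same sender may not overtake one another. It is not MCC's DPOR algorithm, which would prune equivalent interleavings rather than list them all; the function name and message labels are hypothetical.

```python
def delivery_orders(queues):
    """Yield every delivery order that keeps each sender's messages in order.

    `queues` maps a sender name to the list of messages it sent to one
    endpoint, oldest first. Messages from different senders may interleave
    freely; messages from the same sender may not overtake each other.
    """
    if all(not q for q in queues.values()):
        yield []
        return
    for sender in sorted(queues):
        pending = queues[sender]
        if pending:
            rest = dict(queues)
            rest[sender] = pending[1:]
            for tail in delivery_orders(rest):
                yield [pending[0]] + tail


if __name__ == "__main__":
    for order in delivery_orders({"A": ["a1", "a2"], "B": ["b1"]}):
        print(order)
```

With two messages from `A` and one from `B`, three orders are possible, and `a2` never arrives before `a1`. A dynamic verifier has to account for each such order that can change the program's behavior.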
microprocessor test and verification | 2007
Dam Sunwoo; Hassan Al-Sukhni; Jim Holt; Derek Chiou
Power estimation and verification have become important aspects of System-on-Chip (SoC) design flows. However, rapid and effective power modeling and estimation technologies for complex SoC designs are not widely available. As a result, many SoC design teams focus the bulk of their efforts on using detailed low-level models to verify power consumption. While such models can accurately estimate power metrics for a given design, they suffer from two significant limitations: (1) they are only available late in the design cycle, after many architectural features have already been decided, and (2) they are so detailed that they impose severe limitations on the size and number of workloads that can be evaluated. While these methods are useful for power verification, architects require information much earlier in the design cycle, and are therefore often limited to estimating power using spreadsheets in which the expected power dissipation of each module is summed to predict total power. As the model becomes more refined, the frequency with which each module is exercised may be added as an additional parameter to further increase accuracy. Current spreadsheets, however, rely on aggregate instruction counts and do not incorporate either time or input data, and thus have inherent inaccuracies. Our strategy for early power estimation relies on (i) measurements from real silicon, (ii) models built from those measurements that predict power consumption for a variety of processor micro-architectural structures, and (iii) FPGA-based implementations of those models integrated with an FPGA-based performance simulator/emulator. The models will be designed specifically to be implemented within FPGAs. The intention is to integrate the power models with FPGA-based full-system, functional and performance simulators/emulators that will provide timing and functional information, including data values.
The long term goal is to provide relative power accuracy and power trends useful to architects during the architectural phase of a project, rather than precise power numbers that would require far more information than is available at that time. By implementing the power models in an FPGA and driving those power models with a system simulator/emulator that can feed the power models real data transitions generated by real software running on top of real operating systems, we hope to both improve the quality of early stage power estimation and improve power simulation performance.
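The spreadsheet-style estimate that this abstract describes as the usual early-stage baseline can be written out in a few lines. This is a sketch of that baseline only, not of the proposed FPGA-based models; the module names, power values, and activity factors are made up for illustration.

```python
def spreadsheet_power(modules):
    """Sum per-module power, each weighted by an optional activity factor.

    Each module is a dict with `power_w` (expected dissipation in watts) and
    an optional `activity` in [0, 1], the fraction of time it is exercised.
    A missing activity factor is treated as 1.0 (always active).
    """
    return sum(m["power_w"] * m.get("activity", 1.0) for m in modules)


# Hypothetical figures, not measurements from the paper.
MODULES = [
    {"name": "core", "power_w": 2.0, "activity": 0.5},
    {"name": "l2_cache", "power_w": 1.0, "activity": 0.25},
]

if __name__ == "__main__":
    print(spreadsheet_power(MODULES))
```

The estimate has no notion of time or of the data values flowing through each module, which is the limitation the abstract attributes to this approach.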
IEEE Transactions on Computers | 2015
Etem Deniz; Alper Sen; Brian Kahne; Jim Holt
We present a novel automated multicore benchmark synthesis framework with characterization and generation components. Our framework uses parallel patterns in capturing important characteristics of multi-threaded applications and generates synthetic multicore benchmarks from those applications. The resulting synthetic benchmarks are small, fast, portable, human-readable, and they accurately reflect microarchitecture dependent and independent characteristics of the original multicore applications. Also, they can use either Pthreads or MCA libraries. We implement our techniques in the MINIME tool and generate synthetic benchmarks from PARSEC, Rodinia, and EEMBC MultiBench™ benchmarks on x86 and Power Architecture® platforms. We show that synthetic benchmarks are representative across a range of multicore machines with different architectures, while being on average 21× faster and 14× smaller than original benchmarks.
programming models and applications for multicores and manycores | 2013
Cheng Wang; Sunita Chandrasekaran; Barbara M. Chapman; Jim Holt
In recent years, the rapid evolution of Multiprocessor System-on-Chip (MPSoC) designs has posed new challenges for programming such architectures efficiently. To exploit the available hardware concurrency, software developers are still expected to handle many of the low-level details of programming, including utilizing DMA, ensuring cache coherency, and inserting synchronization primitives explicitly. Software portability is yet another issue: in the current state of the art, hardware vendors supply vendor-specific software development toolchains, which makes it harder to port applications to different architectures without restructuring the code while still ensuring efficiency. In this paper, we extend the usage of a high-level programming model, OpenMP, to multicore embedded systems. To address the architectural challenges, we propose a lightweight unified OpenMP runtime library, libEOMP, by leveraging the MCA (Multicore Association) APIs as the target of our OpenMP translation. MCA APIs support device-level communication and resource management for multicore embedded systems. We have implemented and evaluated libEOMP on an embedded platform supplied by Freescale Semiconductor. We observed that libEOMP not only performed as well as optimized vendor-specific OpenMP runtime libraries but also achieved better portability, programmability and productivity.
languages, compilers, and tools for embedded systems | 2013
Cheng Wang; Sunita Chandrasekaran; Peng Sun; Barbara M. Chapman; Jim Holt
Multicore embedded systems are widely used in telecommunication systems, robotics, medical applications and more. While they offer high performance at low power, programming them efficiently is still a challenge. To exploit the capabilities that the hardware offers, software developers are expected to handle many of the low-level details of programming, including utilizing DMA, ensuring cache coherency, and inserting synchronization primitives explicitly. State-of-the-art solutions rely on software toolchains that are too vendor-specific, tying the software to particular hardware and leaving no room for portability. In this paper we present a runtime system to explore mapping a high-level programming model, OpenMP, onto multicore embedded systems. A key feature of our scheme is that, unlike existing approaches that largely rely on POSIX threads, our approach leverages the Multicore Association (MCA) APIs as an OpenMP translation layer. The MCA APIs are a set of low-level APIs handling resource management, inter-process communication and task scheduling for multicore embedded systems. By deploying the MCA APIs, our runtime captures the characteristics of multicore embedded systems more effectively than POSIX threads. Furthermore, the MCA layer enables our runtime implementation to be portable across various architectures. Thus programmers only need to maintain a single OpenMP code base that is compatible with various compilers and portable across different types of platforms. We have evaluated our runtime system using several embedded benchmarks. The experiments demonstrate promising and competitive performance compared to the native approach for the platform.
ieee international symposium on workload characterization | 2012
Etem Deniz; Alper Sen; Jim Holt; Brian Kahne
Benchmarks capture the essence of many important real-world applications and allow performance and power analysis while developing new systems. Synthetic benchmarks are a miniaturized form of benchmarks that allow high simulation speeds and act as proxies for proprietary applications. Software architecture principles guide the development of new applications and benchmarks. We leverage software architectural patterns in developing synthetic benchmarks for embedded multicore systems. We developed an automated framework complete with characterization and synthesis components and performed experiments on PARSEC and Rodinia benchmarks. Unlike previously developed benchmarks, our benchmarks can be run on any given infrastructure, that is, SMP or message passing. This allows us to target heterogeneous embedded multicore systems. Our results show that the synthetic benchmarks and the real applications are similar with respect to various micro-architecture dependent as well as independent metrics.
ACM Transactions on Design Automation of Electronic Systems | 2012
Etem Deniz; Alper Sen; Jim Holt
We describe verification and coverage methods for multicore software that uses message passing libraries for communication. Specifically, we provide techniques to improve reliability of software using the new industry standard MCAPI by the Multicore Association. We develop dynamic predictive verification techniques that allow us to find actual and potential errors in multicore software. Some of these error types are deadlocks, race conditions, and violations of temporal assertions. We complement our verification techniques with a mutation-testing-based coverage metric. Coverage metrics enable measuring the quality of verification tests. We implemented our techniques in tools, validated them on several multicore programs that use the MCAPI standard, and experimentally show the effectiveness of our approach. We find errors that are not found using traditional dynamic verification techniques, and with our coverage tool we can potentially explore execution schedules different from those of the original program. This is the first time such predictive verification and coverage metrics have been developed for MCAPI.
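The idea behind a mutation-testing-based coverage metric can be sketched with the generic mutation score: the fraction of deliberately altered program variants (mutants) that at least one test detects. The paper's metric targets MCAPI programs and its mutation operators are not given here, so the example below uses a small sequential function and hypothetical mutants purely for illustration.

```python
def mutation_score(mutants, tests):
    """Return the fraction of mutants that at least one test detects.

    Each test takes a candidate implementation and returns True if it passes.
    A mutant counts as killed when some test returns False for it.
    """
    killed = sum(1 for m in mutants if any(not t(m) for t in tests))
    return killed / len(mutants)


# Hypothetical mutants of a two-argument max function.
MUTANTS = [
    lambda a, b: a,                    # always returns the first argument
    lambda a, b: b if a >= b else a,   # returns the minimum instead
    lambda a, b: a if a >= b else b,   # equivalent mutant: still a correct max
]

TESTS = [
    lambda f: f(2, 1) == 2,
    lambda f: f(1, 2) == 2,
]

if __name__ == "__main__":
    print(mutation_score(MUTANTS, TESTS))
```

A higher score suggests the tests are sensitive to more kinds of faults. Equivalent mutants, like the third one here, can never be killed and so cap the achievable score.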
microprocessor test and verification | 2009
Jim Holt; Jaideep Dastidar; David Lindberg; John Pape; Peng Yang
Multicore systems-on-chip (MCSoC) comprise a rich set of processor cores, specialized hardware accelerators, and I/O interfaces. Focusing only on functional verification is risky because the motivation for building such systems in the first place is to achieve high levels of system throughput: a functionally correct MCSoC that does not exhibit sufficient performance will fail in the market. Furthermore, focusing performance verification on individual system components (e.g., measuring processor core performance or hardware accelerator performance in isolation) is insufficient due to (1) the degree of resource contention that occurs in the MCSoC, and (2) the degree of configuration flexibility that is typically afforded by an MCSoC. These factors motivate system-level performance verification of MCSoC. This paper presents an important industrial case study of MCSoC performance verification, highlighting the methodology used, the lessons learned, and recommendations for improvement.