Publication
Featured research published by Marina Biberstein.
IBM Journal of Research and Development | 2003
Jaime H. Moreno; Victor Zyuban; Uzi Shvadron; Fredy D. Neeser; Jeff H. Derby; Malcolm Scott Ware; Krishnan K. Kailas; Ayal Zaks; Amir Geva; Shay Ben-David; Sameh W. Asaad; Thomas W. Fox; Daniel Littrell; Marina Biberstein; Dorit Naishlos; Hillery C. Hunter
We describe an innovative, low-power, high-performance, programmable signal processor (DSP) for digital communications. The architecture of this processor is characterized by its explicit design for low-power implementations, its innovative ability to jointly exploit instruction-level parallelism and data-level parallelism to achieve high performance, its suitability as a target for an optimizing high-level language compiler, and its explicit replacement of hardware resources by compile-time practices. We describe the methodology used in the development of the processor, highlighting the techniques deployed to enable application/architecture/compiler/implementation co-development, and the optimization approach and metric used for power-performance evaluation and tradeoff analysis. We summarize the salient features of the architecture, provide a brief description of the hardware organization, and discuss the compiler techniques used to exercise these features. We also summarize the simulation environment and associated software development tools. Coding examples from two representative kernels in the digital communications domain are also provided. The resulting methodology, architecture, and compiler represent an advance of the state of the art in the area of low-power, domain-specific microprocessors.
European Conference on Object-Oriented Programming | 2001
Marina Biberstein; Joseph Gil; Sara Porat
Both encapsulation and immutability are important mechanisms that support good software engineering practice. Encapsulation protects a variable against all kinds of access attempts from certain sections of the program. Immutability protects a variable only against write access attempts, irrespective of the program region from which these attempts are made. Taking a mostly empirical approach, we study these concepts and their interaction in Java. We propose code analysis techniques, which, using the new sealing information, can help to identify variables as encapsulated, immutable, or both.
International Symposium on Performance Analysis of Systems and Software | 2008
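The distinction drawn in this abstract can be illustrated with a small Java sketch. The `Account` class and its field names are hypothetical and are not taken from the paper; the sketch only shows one field that is encapsulated but mutable, one that is immutable but not encapsulated, and one that is both. It does not reproduce the paper's analysis techniques.

```java
// Hypothetical class for illustration only; not from the paper.
class Account {
    // Encapsulated, not immutable: only code inside Account can access it,
    // but that code may still write to it.
    private long balance;

    // Immutable, not encapsulated: any code may read it,
    // but no code may reassign it after construction.
    public final String id;

    // Both encapsulated and immutable.
    private final long openedAtMillis;

    Account(String id, long openedAtMillis) {
        this.id = id;
        this.openedAtMillis = openedAtMillis;
    }

    void deposit(long amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balance += amount; // write access is allowed only from inside the class
    }

    long balance() {
        return balance;
    }

    long openedAtMillis() {
        return openedAtMillis;
    }
}

class EncapsulationVsImmutabilityDemo {
    public static void main(String[] args) {
        Account account = new Account("acc-1", 0L);
        account.deposit(50);
        System.out.println(account.id + " " + account.balance());
        // account.balance = 10; // would not compile: balance is private
        // account.id = "x";     // would not compile: id is final
    }
}
```

Note that `final` protects only the variable itself. The sketch uses `String` and `long` so that the referenced values cannot change either; a `final` reference to a mutable object would still allow that object's state to be modified.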
Marina Biberstein; Uzi Shvadron; Javier Turek; Bilha Mendelson; Moon S. Chang
The transition to multicore architectures creates significant challenges for programming systems. Taking advantage of specialized processing cores such as those in the Cell BE processor and managing all the required data movement inside the processor cannot be done efficiently without help from the software infrastructure. Alongside new programming models and compiler support for multicores, programmers need performance evaluation and analysis tools. In this paper, we present tools that help analyze the performance of applications executing on the Cell platform. The performance debugging tool (PDT) provides a means for recording significant events during program execution, maintaining the sequential order of events, and preserving important runtime information such as core assignment and relative timing of events. The trace analyzer (TA) reads and visualizes the PDT traces. We describe the architecture of the PDT and present several important use cases demonstrating the usage of PDT and TA to understand the performance of several workloads. We also discuss the overhead of tracing and its impact on the benchmark execution and performance analysis.
International Parallel and Distributed Processing Symposium | 2004
Marina Biberstein; Eitan Farchi; Shmuel Ur
In previous work, we introduced the alternative pasts algorithm that delays the assignment of values to variables until their usage. Whenever a variable is used, the algorithm chooses one of its past values that is consistent with some possible execution. The alternative pasts algorithm can be seen as belonging to a class of algorithms that shadow the execution and choose at any point to modify the values of some of the variables. We build on this work and extend it in two directions. First, we show a more powerful shadowing algorithm that can delay not only writes but also reads and other kinds of instructions, at most until a relevant control decision is taken, which is the longest possible delay for algorithms of this class. We prove that this algorithm inherits the ability of the alternative pasts algorithm to generate significantly different interleavings, which are guaranteed to execute differently. In addition, we show a new use for the two algorithms, namely alternative replay. Unlike regular replay, where the execution of the program is reproduced, alternative replay is an execution that did not happen before but could have happened. For example, if a bug did not materialize, alternative replay can be used to show the user an alternative execution in which the impact of the bug can be observed.
IBM Journal of Research and Development | 2009
Marina Biberstein; Shiri Dori-Hacohen; Yuval Harel; Andre Heilper; Bilha Mendelson; Uzi Shvadron; Eran Treister; Javier Turek; Moon S. Chang
Optimizing performance on multicore processors is a daunting task because of the increased importance of such factors as thread communication, memory contention, and memory access latency. This paper presents two tools that programmers and performance analysts can use to understand application performance on the Cell Broadband Engine® (Cell/B.E.) processor: the Performance Debugging Tool (PDT) and the Trace Analyzer (TA). PDT traces user-space events, augmenting them with scheduling data from the operating system; those traces are then read, analyzed, and presented visually by the TA. This paper describes the implementation issues arising from the fact that a common low-overhead clock shared by all cores, essential for analysis and visualization, is not available on the Cell/B.E. processor. The TA employs an offline analysis to align the collected events to a common time based only on thread-local timestamps, event order, and context switch information. We also discuss the overhead of tracing and its impact on execution and performance analysis. We illustrate the use of the PDT and TA by analyzing several significant Cell/B.E. processor workloads, including native code and higher-level abstractions offered by the Data Communication and Synchronization services. We show how trace analysis can help identify performance issues in these workloads and how it can be used by programmers to spot performance antipatterns (common programming practices leading to suboptimal performance).
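The idea of aligning events to a common time from thread-local timestamps and known event order can be sketched as follows. This is a toy illustration under stated assumptions, not the TA's actual algorithm: it assumes two threads whose clocks differ only by a constant offset, and the class and method names (`ClockAlignmentSketch`, `offsetBounds`, `estimateOffset`) are hypothetical. Each known "happened before" pair across the two threads bounds the offset from one side, and any value inside the resulting interval yields a consistent common timeline.

```java
class ClockAlignmentSketch {
    /**
     * Thread 0 is the reference clock; common time for thread 1 is (local time + d).
     * Each row of t0BeforeT1 is {t0Local, t1Local}: the thread-0 event precedes the thread-1 event.
     * Each row of t1BeforeT0 is {t1Local, t0Local}: the thread-1 event precedes the thread-0 event.
     * Returns {lo, hi}, the tightest bounds on d implied by the ordering constraints.
     */
    static long[] offsetBounds(long[][] t0BeforeT1, long[][] t1BeforeT0) {
        if (t0BeforeT1.length == 0 || t1BeforeT0.length == 0) {
            throw new IllegalArgumentException("need at least one constraint in each direction");
        }
        long lo = Long.MIN_VALUE;
        long hi = Long.MAX_VALUE;
        for (long[] pair : t0BeforeT1) {
            // t0Local <= t1Local + d, so d >= t0Local - t1Local
            lo = Math.max(lo, pair[0] - pair[1]);
        }
        for (long[] pair : t1BeforeT0) {
            // t1Local + d <= t0Local, so d <= t0Local - t1Local
            hi = Math.min(hi, pair[1] - pair[0]);
        }
        if (lo > hi) {
            throw new IllegalArgumentException("ordering constraints are inconsistent");
        }
        return new long[] {lo, hi};
    }

    /** Picks the midpoint of the feasible interval as a single offset estimate. */
    static long estimateOffset(long[][] t0BeforeT1, long[][] t1BeforeT0) {
        long[] bounds = offsetBounds(t0BeforeT1, t1BeforeT0);
        return bounds[0] + (bounds[1] - bounds[0]) / 2;
    }

    public static void main(String[] args) {
        long[][] t0BeforeT1 = {{100, 40}}; // implies d >= 60
        long[][] t1BeforeT0 = {{50, 130}}; // implies d <= 80
        long d = estimateOffset(t0BeforeT1, t1BeforeT0);
        System.out.println("estimated offset: " + d);
    }
}
```

A real trace analyzer would also have to handle more than two threads, clock drift, and context switches, which this sketch leaves out.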
Archive | 2001
Larry Koved; Bilha Mendelson; Sara Porat; Marina Biberstein
Conference of the Centre for Advanced Studies on Collaborative Research | 2000
Sara Porat; Marina Biberstein; Larry Koved; Bilha Mendelson
Archive | 2005
Marina Biberstein; Eitan Farchi; Shmuel Ur
Archive | 2003
Marina Biberstein; Eitan Farchi; Yarden Nir; Shmuel Ur
Archive | 2008
Marina Biberstein; Yuval Harel; Andre Heilper