Pedro C. Diniz
University of Southern California
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Pedro C. Diniz.
ubiquitous computing | 2010
Davide Figo; Pedro C. Diniz; Diogo R. Ferreira; João M. P. Cardoso
The ubiquity of communication devices such as smartphones has led to the emergence of context-aware services that are able to respond to specific user activities or contexts. These services allow communication providers to develop new, added-value services for a wide range of applications such as social networking, elderly care and near-emergency early warning systems. At the core of these services is the ability to detect specific physical settings or the context a user is in, using either internal or external sensors. For example, using built-in accelerometers, it is possible to determine whether a user is walking or running at a specific time of day. By correlating this knowledge with GPS data, it is possible to provide specific information services to users with similar daily routines. This article presents a survey of the techniques for extracting this activity information from raw accelerometer data. The techniques that can be implemented in mobile devices range from classical signal processing techniques such as FFT to contemporary string-based methods. We present experimental results to compare and evaluate the accuracy of the various techniques using real data sets collected from daily activities.
ieee international conference on high performance computing data and analytics | 2014
Marc Snir; Robert W. Wisniewski; Jacob A. Abraham; Sarita V. Adve; Saurabh Bagchi; Pavan Balaji; Jim Belak; Pradip Bose; Franck Cappello; Bill Carlson; Andrew A. Chien; Paul W. Coteus; Nathan DeBardeleben; Pedro C. Diniz; Christian Engelmann; Mattan Erez; Saverio Fazzari; Al Geist; Rinku Gupta; Fred Johnson; Sriram Krishnamoorthy; Sven Leyffer; Dean A. Liberty; Subhasish Mitra; Todd S. Munson; Rob Schreiber; Jon Stearley; Eric Van Hensbergen
We present here a report produced by a workshop on ‘Addressing failures in exascale computing’ held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to establish a common taxonomy about resilience across all the levels in a computing system, discuss existing knowledge on resilience across the various hardware and software layers of an exascale system, and build on those results, examining potential solutions from both a hardware and software perspective and focusing on a combined approach. The workshop brought together participants with expertise in applications, system software, and hardware; they came from industry, government, and academia, and their interests ranged from theory to implementation. The combination allowed broad and comprehensive discussions and led to this document, which summarizes and builds on those discussions.
ACM Transactions on Programming Languages and Systems | 1997
Martin C. Rinard; Pedro C. Diniz
This article presents a new analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutativity analysis views the computation as composed of operations on objects. It then analyzes the program at this granularity to discover when operations commute (i.e., generate the same final result regardless of the order in which they execute). If all of the operations required to perform a given computation commute, the compiler can automatically generate parallel code. We have implemented a prototype compilation system that uses commutativity analysis as its primary analysis technique. We have used this system to automatically parallelize three complete scientific computations: the Barnes-Hut N-body solver, the Water liquid simulation code, and the String seismic simulation code. This article presents performance results for the generated parallel code running on the Stanford DASH machine. These results provide encouraging evidence that commutativity analysis can serve as the basis for a successful parallelizing compiler.
programming language design and implementation | 2002
Byoungro So; Mary W. Hall; Pedro C. Diniz
The current practice of mapping computations to custom hardware implementations requires programmers to assume the role of hardware designers. In tuning the performance of their hardware implementation, designers manually apply loop transformations such as loop unrolling. designers manually apply loop transformations. For example, loop unrolling is used to expose instruction-level parallelism at the expense of more hardware resources for concurrent operator evaluation. Because unrolling also increases the amount of data a computation requires, too much unrolling can lead to a memory bound implementation where resources are idle. To negotiate inherent hardware space-time trade-offs, designers must engage in an iterative refinement cycle, at each step manually applying transformations and evaluating their impact. This process is not only error-prone and tedious but also prohibitively expensive given the large search spaces and with long synthesis times. This paper describes an automated approach to hardware design space exploration, through a collaboration between parallelizing compiler technology and high-level synthesis tools. We present a compiler algorithm that automatically explores the large design spaces resulting from the application of several program transformations commonly used in application-specific hardware designs. Our approach uses synthesis estimation techniques to quantitatively evaluate alternate designs for a loop nest computation. We have implemented this design space exploration algorithm in the context of a compilation and synthesis system called DEFACTO, and present results of this implementation on five multimedia kernels. Our algorithm derives an implementation that closely matches the performance of the fastest design in the design space, and among implementations with comparable performance, selects the smallest design. We search on average only 0.3% of the design space. This technology thus significantly raises the level of abstraction for hardware design and explores a design space much larger than is feasible for a human designer.
programming language design and implementation | 1997
Pedro C. Diniz; Martin C. Rinard
This paper presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code; each version uses a different optimization policy. The generated code alternately performs sampling phases and production phases. Each sampling phase measures the overhead of each version in the current environment. Each production phase uses the version with the least overhead in the previous sampling phase. The computation periodically resamples to adjust dynamically to changes in the environment.We have implemented dynamic feedback in the context of a parallelizing compiler for object-based programs. The generated code uses dynamic feedback to automatically choose the best synchronization optimization policy. Our experimental results show that the synchronization optimization policy has a significant impact on the overall performance of the computation, that the best policy varies from program to program, that the compiler is unable to statically choose the best policy, and that dynamic feedback enables the generated code to exhibit performance that is comparable to that of code that has been manually tuned to use the best policy. We have also performed a theoretical analysis which provides, under certain assumptions, a guaranteed optimality bound for dynamic feedback relative to a hypothetical (and unrealizable) optimal algorithm that uses the best policy at every point during the execution.
field programmable custom computing machines | 2000
Pedro C. Diniz; Joonseok Park
Mapping computations written in high-level programming languages to FPGA-based computing engines requires programmers to create the datapath responsible for the core of the computation as well as the control structures to generate the appropriate signals to orchestrate its execution. This paper addresses the issue of automatic generation of data storage and control structures for FPGA-based reconfigurable computing engines using existing compiler data dependence analysis techniques. We describe a set of parameterizable data storage and control structures used as the target of our prototype compiler. We present a compiler analysis algorithm to derive the parameters of the data storage structures to minimize the required memory bandwidth of the implementation. We also describe a complete compilation scheme for mapping loops that manipulate multi-dimensional array variables to hardware. We present preliminary simulation results for complete designs generated manually using the results of the compiler analysis. These preliminary results show that it is possible to successfully integrate compiler data dependence analysis with existing commercial synthesis tools.
ACM Computing Surveys | 2010
João M. P. Cardoso; Pedro C. Diniz; Markus Weinhardt
Reconfigurable computing platforms offer the promise of substantially accelerating computations through the concurrent nature of hardware structures and the ability of these architectures for hardware customization. Effectively programming such reconfigurable architectures, however, is an extremely cumbersome and error-prone process, as it requires programmers to assume the role of hardware designers while mastering hardware description languages, thus limiting the acceptance and dissemination of this promising technology. To address this problem, researchers have developed numerous approaches at both the programming languages as well as the compilation levels, to offer high-level programming abstractions that would allow programmers to easily map applications to reconfigurable architectures. This survey describes the major research efforts on compilation techniques for reconfigurable computing architectures. The survey focuses on efforts that map computations written in imperative programming languages to reconfigurable architectures and identifies the main compilation and synthesis techniques used in this mapping.
Pervasive and Mobile Computing | 2010
André C. Santos; João M. P. Cardoso; Diogo R. Ferreira; Pedro C. Diniz; Paulo Chainho
The processing capabilities of mobile devices coupled with portable and wearable sensors provide the basis for new context-aware services and applications tailored to the user environment and daily activities. In this article, we describe the approach developed within the UPCASE project, which makes use of sensors available in the mobile device as well as sensors externally connected via Bluetooth to provide user contexts. We describe the system architecture from sensor data acquisition to feature extraction, context inference and the publication of context information in web-centered servers that support well-known social networking services. In the current prototype, context inference is based on decision trees to learn and to identify contexts dynamically at run-time, but the middleware allows the integration of different inference engines if necessary. Experimental results in a real-world setting suggest that the proposed solution is a promising approach to provide user context to local mobile applications as well as to network-level applications such as social networking services.
aspect-oriented software development | 2012
João M. P. Cardoso; Tiago Carvalho; José Gabriel F. Coutinho; Wayne Luk; Ricardo Nobre; Pedro C. Diniz; Zlatko Petrov
The development of applications for high-performance embedded systems is typically a long and error-prone process. In addition to the required functions, developers must consider various and often conflicting non-functional application requirements such as performance and energy efficiency. The complexity of this process is exacerbated by the multitude of target architectures and the associated retargetable mapping tools. This paper introduces an As-pect-Oriented Programming (AOP) approach that conveys domain knowledge and non-functional requirements to optimizers and mapping tools. We describe a novel AOP language, LARA, which allows the specification of compi-lation strategies to enable efficient generation of software code and hardware cores for alternative target architectures. We illustrate the use of LARA for code instrumentation and analysis, and for guiding the application of compiler and hardware synthesis optimizations. An important LARA feature is its capability to deal with different join points, action models, and attributes, and to generate an aspect intermediate representation. We present examples of our aspect-oriented hardware/software design flow for mapping real-life application codes to embedded platforms based on Field Programmable Gate Array (FPGA) technology.
international parallel processing symposium | 1999
Kiran Bondalapati; Pedro C. Diniz; Phillip Duncan; John J. Granacki; Mary W. Hall; Rajeev Jain; Heidi E. Ziegler
The lack of high-level design tools hampers the widespread adoption of adaptive computing systems. Application developers have to master a wide range of functions, from the high-level architecture design, to the timing of actual control and data signals. In this paper we describe DEFACTO, an end-to-end design environment aimed at bridging the gap in tools for adaptive computing by bringing together parallelizing compiler technology and synthesis techniques.