David A. Rennels
University of California, Los Angeles
Publication
Featured research published by David A. Rennels.
IEEE Transactions on Computers | 1971
Algirdas Avizienis; George Gilley; Francis P. Mathur; David A. Rennels; John A. Rohr; David K. Rubin
This paper presents the results obtained in a continuing investigation of fault-tolerant computing which is being conducted at the Jet Propulsion Laboratory. Initial studies led to the decision to design and construct an experimental computer with dynamic (standby) redundancy, including replaceable subsystems and a program rollback provision to eliminate transient errors. This system, called the STAR computer, began operation in 1969. The following aspects of the STAR system are described: architecture, reliability analysis, software, automatic maintenance of peripheral systems, and adaptation to serve as the central computer of an outer-planet exploration spacecraft.
international symposium on low power electronics and design | 1995
Ramesh Panwar; David A. Rennels
In current processors, the cache controller, which contains the cache directory and other logic such as tag comparators, is active for each instruction fetch and is responsible for 20-25% of the power consumed in the I-cache. Reducing the power consumed by the cache controller is important for low power I-cache design. We present three architectural modifications which, in concert, allow us to reduce the cache controller activity to less than 2% for most applications. The first modification involves comparing cache tags for only those instructions that result in fetches from a new cache block. The second modification involves the tagging of those branches that cause instructions to be fetched from a new cache block. The third modification involves augmenting the I-cache with a small on-chip memory called the S-cache. The most frequently executed basic blocks of code are statically allocated to the S-cache before program execution. We present empirical data to show the effect that these modifications have on the cache con-
dependable systems and networks | 2002
Henrique Madeira; Raphael R. Some; Francisco Moreira; Diamantino Costa; David A. Rennels
This paper evaluates the impact of transient errors in the operating system of a COTS-based system (CETIA board with two PowerPC 750 processors running LynxOS) and quantifies their effects at both the OS and at the application level. The study has been conducted using a Software-Implemented Fault Injection tool (Xception) and both realistic programs and synthetic workloads (to focus on specific OS features) have been used. The results provide a comprehensive picture of the impact of faults on LynxOS key features (process scheduling and the most frequent system calls), data integrity, error propagation, application termination, and correctness of application results.
ieee international symposium on fault tolerant computing | 1988
Yuval Tamir; Marc Tremblay; David A. Rennels
The authors present a technique, called micro rollback, which allows most of the performance penalty for concurrent error detection to be eliminated. Detection is performed in parallel with the transmission of information between modules, thus removing the delay for detection from the critical path. Erroneous information may thus reach its destination module several clock cycles before an error indication. Operations performed on this erroneous information are undone using a hardware mechanism for fast rollback of a few cycles. The authors discuss the implementation of a VLSI processor capable of micro rollback as well as several critical issues related to its use in a complete system.
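The rollback idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class name `RollbackRegister`, its `depth` parameter, and the use of a bounded history buffer are all hypothetical choices made for illustration.

```python
from collections import deque


class RollbackRegister:
    """Toy register that can undo its last few writes (hypothetical model)."""

    def __init__(self, depth, initial=0):
        self.depth = depth                    # maximum number of cycles that can be undone
        self.value = initial
        self._history = deque(maxlen=depth)   # values held before each recent cycle

    def clock(self, new_value):
        """Advance one cycle, saving the old value so the cycle can be undone."""
        self._history.append(self.value)
        self.value = new_value

    def rollback(self, cycles):
        """Undo the last `cycles` cycles; fails if the saved history is too short."""
        if cycles > len(self._history):
            raise ValueError("cannot roll back further than the saved history")
        for _ in range(cycles):
            self.value = self._history.pop()


reg = RollbackRegister(depth=4)
for v in (10, 20, 30):   # three cycles; suppose the value 20 is flagged late as erroneous
    reg.clock(v)
reg.rollback(2)          # undo the two cycles that depended on the bad value
```

The bounded history reflects the abstract's point that only "a few cycles" need to be undone, because the error indication trails the data by a small, fixed delay.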
dependable systems and networks | 2002
Keith Whisnant; Ravishankar K. Iyer; Phillip H. Jones; Raphael R. Some; David A. Rennels
Presents an experimental evaluation of a software-implemented fault tolerance (SIFT) environment built around a set of self-checking processes called ARMORs running on different machines that provide error detection and recovery services to themselves and to spaceborne scientific applications. The experiments are split into three groups of error injections, with each group successively stressing the SIFT error detection and recovery more than the previous group. The results show that the SIFT environment adds negligible overhead to the application during failure-free runs. Only 11 cases were observed in which either the application failed to start or the SIFT environment failed to recognize that the application had completed. Further investigations showed that assertions within the SIFT processes, coupled with object-based incremental checkpointing, were effective in preventing system failures by protecting dynamic data within the SIFT processes.
Archive | 1992
Jacob A. Abraham; Don Lee; David A. Rennels; George Gilley
This paper describes a novel approach for evaluating the reliability of large fault-tolerant systems. The design hierarchy of the system is preserved during the evaluation, allowing large systems to be analyzed. Semi-Markov models are used at each level in the hierarchy, and a numerical technique is used to combine models from a given level for use at the next level. Different values of parameters, such as coverage, can then be used appropriately at any level, resulting in a much more accurate prediction of reliability. The proposed technique has been validated through comparison with analytical calculations, results from existing tools and Monte-Carlo simulation.
ieee international symposium on fault tolerant computing | 1994
David A. Rennels; Hyeongil Kim
This paper examines architectural techniques for providing concurrent error detection in self-timed VLSI pipelines. Signal pairs from Differential Cascode Voltage Switch Logic are compared with a checker that is composed of a tree of dual-rail (morphic) comparators to detect errors and signal completion. An efficient implementation is shown that compares favorably in speed and area with conventional completion signal generators. A simple pipeline is examined with error checkers at each computation stage and hand-shaking control circuits that are modified to improve error detection. Its error-detecting properties are discussed, and preliminary error simulation results are presented. Based on these studies we have concluded that self-timed logic offers considerable fault-tolerance potential due to its built-in redundancy that can be effectively exploited for error checking.
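A minimal sketch of dual-rail checking follows. It uses the textbook two-rail checker cell, which is an assumption here: the abstract does not give the paper's circuit. The function names `two_rail_cell` and `check_tree` are hypothetical, and a left-to-right chain stands in for the comparator tree.

```python
from functools import reduce


def two_rail_cell(a, b):
    """Textbook two-rail checker cell: the output pair is complementary
    only when both input pairs are complementary."""
    (a0, a1), (b0, b1) = a, b
    return ((a0 & b0) | (a1 & b1), (a0 & b1) | (a1 & b0))


def check_tree(pairs):
    """Combine all signal pairs with checker cells (chained here instead of
    arranged as a balanced tree) and report whether the result is a valid
    complementary pair."""
    z0, z1 = reduce(two_rail_cell, pairs)
    return z0 != z1


ok = check_tree([(0, 1), (1, 0), (0, 1)])    # every pair is complementary
bad = check_tree([(0, 1), (1, 1), (1, 0)])   # the middle pair violates the dual-rail code
```

In a self-timed setting, a pair that is still (0, 0) also yields a non-complementary result, which is one way a single checker could serve for both error detection and completion signalling, as the abstract describes.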
pacific rim international symposium on fault tolerant systems | 1997
David A. Rennels; Douglas Wyche Caldwell; Riki Hwang; Malena R. Mesarina
The paper presents a design approach for implementing a fault-tolerant embedded computing node based on the use of low-cost commodity microcontrollers. A combination of software and relatively simple external logic is used to implement fault-tolerance in a redundant set of microcontrollers. A node can be protected with different amounts of redundancy (duplex, triplex, hybrid) depending upon the needs of its host subsystem, and is intended to be interconnected with other nodes into a modular distributed network. The structure of the node, and fault detection and recovery algorithms are described, along with a description of an experimental testbed that is being implemented.
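As a rough illustration of software voting across a redundant set, the sketch below takes a strict majority over the outputs of the redundant microcontrollers. The abstract does not specify the paper's algorithms, so the function `majority_vote` and its error behaviour are assumptions for illustration only.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (agreed_value, suspect_indices) for a list of redundant outputs.

    Raises ValueError when no strict majority exists, for example when the
    two members of a duplex pair disagree."""
    value, votes = Counter(outputs).most_common(1)[0]
    if votes * 2 <= len(outputs):
        raise ValueError("no majority among redundant outputs")
    suspects = [i for i, out in enumerate(outputs) if out != value]
    return value, suspects


# Triplex example: the third unit returns a corrupted value.
result, suspects = majority_vote([0x5A, 0x5A, 0x7A])
```

A duplex configuration can only detect a disagreement, whereas a triplex configuration can also identify and mask the faulty unit, which is why the vote raises an error in the duplex-mismatch case.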
ieee/aiaa digital avionics systems conference | 1997
D. W. Caldwell; David A. Rennels
Microcontrollers provide very dense functionality for embedded applications ranging from telephones to automobiles. The acceptance of these devices for space applications has been hindered by their manufacture, which often uses multiple semiconductor fabrication techniques and thereby compromises radiation tolerance. If such concerns could be mitigated, microcontrollers would provide a substantial increase in performance for builders of spacecraft electronics. This paper presents hardware considerations for using commercial microcontrollers in space applications. The motivations for starting with commercial devices, the issues associated with their use, and the advantages of software versus hardware voting schemes to mitigate single-event effects are discussed. Interprocess communications approaches and schemes for improving I/O robustness are presented.
[1991] Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium | 1991
David A. Rennels; Hyeongil Kim
A VLSI implementation of a design concept for a self-checking self-exercising (SCSE) memory system described by D. Rennels and S. Chau (see Proc. 16th Int. Symp. on Fault-Tolerant Computing p.358-63 (1986)) is presented. The design, which provides a way of detecting faults and correcting errors in RAMs within milliseconds while concurrently performing normal execution of programs, is reviewed. The approach is to add two parity bits to each row in the storage arrays of the RAM chips and to provide hardware scrubbing interleaved with normal program cycles. The RAM and MIBB (memory interface building block) chip designs, and some of the augmentations and changes required from the original conceptual design, are examined. The approach has been determined to be feasible, and the three-year design process has also demonstrated the large distance between a conceptual design and its realization. Errors and deficiencies were found in the original design and corrected, and new useful functions were identified and added.
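The combination of row parity and scrubbing can be sketched in software as follows. The abstract does not say how the two parity bits are computed, so splitting them over even and odd bit positions is purely an assumption, as are the function names `row_parity` and `scrub`. The sketch covers detection only, not the correction mechanism.

```python
def row_parity(bits):
    """Two parity bits for one row: one over even bit positions and one over
    odd bit positions (an assumed split, not taken from the paper)."""
    even = sum(bits[0::2]) % 2
    odd = sum(bits[1::2]) % 2
    return even, odd


def scrub(rows, stored_parity):
    """Sweep every row and return the indices whose recomputed parity does
    not match the parity stored when the row was written."""
    return [i for i, row in enumerate(rows) if row_parity(row) != stored_parity[i]]


rows = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
stored = [row_parity(r) for r in rows]   # parity captured at write time
rows[1][2] ^= 1                          # inject a single-bit fault into row 1
faulty = scrub(rows, stored)
```

In hardware the sweep would be interleaved with normal program cycles rather than run as one pass, which is what lets faults be found within milliseconds without stalling execution.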