Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Daniel P. Siewiorek is active.

Publication


Featured research published by Daniel P. Siewiorek.


National Computer Conference | 1977

Cm*: a modular, multi-microprocessor

Richard J. Swan; Samuel H. Fuller; Daniel P. Siewiorek

This paper describes the architecture of a new large multi-processor computer system being built at Carnegie-Mellon University. The system allows close cooperation between large numbers of inexpensive processors. All processors share access to a single virtual memory address space. There are no arbitrary limits on the number of processors, amount of memory or communication bandwidth in the system. Considerable support is provided for low level operating system primitives and inter-process communication.
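
A rough way to see what "all processors share access to a single virtual memory address space" means in practice is to model the cost of a reference that is satisfied locally, within the requester's cluster, or in a remote cluster. The sketch below is a minimal illustration of that three-level structure; the class shape and the relative latency values are assumptions for illustration, not figures from the paper.

```python
# Toy model of a Cm*-style memory reference: every processor can reach
# every word of the shared virtual address space, but the cost depends on
# whether the word is local, in the same cluster, or in a remote cluster.
# Latency values are placeholders, not measurements from the paper.
from dataclasses import dataclass

@dataclass
class CmStar:
    n_clusters: int
    modules_per_cluster: int
    t_local: float = 1.0   # assumed relative cost of a local reference
    t_intra: float = 3.0   # assumed cost via the cluster's mapping controller
    t_inter: float = 9.0   # assumed cost of crossing clusters

    def reference_cost(self, cpu: tuple, addr_home: tuple) -> float:
        """cpu and addr_home are (cluster, module) pairs."""
        if cpu == addr_home:
            return self.t_local
        if cpu[0] == addr_home[0]:
            return self.t_intra
        return self.t_inter

m = CmStar(n_clusters=5, modules_per_cluster=10)
print(m.reference_cost((0, 1), (0, 1)))  # local -> 1.0
print(m.reference_cost((0, 1), (0, 7)))  # intra-cluster -> 3.0
print(m.reference_cost((0, 1), (3, 2)))  # inter-cluster -> 9.0
```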


IEEE Transactions on Reliability | 1990

Error log analysis: statistical modeling and heuristic trend analysis

Ting-Ting Y. Lin; Daniel P. Siewiorek

Most error-log analysis studies perform a statistical fit to the data assuming a single underlying error process. The authors present the results of an analysis that demonstrates that the log is composed of at least two error processes: transient and intermittent. The mixing of data from multiple processes requires many more events to verify a hypothesis using traditional statistical analysis. Based on the shape of the interarrival time function of the intermittent errors observed from actual error logs, a failure-prediction heuristic, the dispersion frame technique (DFT), is developed. The DFT was implemented in a distributed system for the campus-wide Andrew file system at Carnegie Mellon University. Data collected from 13 file servers over a 22-month period were analyzed using both the DFT and conventional statistical methods. It is shown that the DFT can extract intermittent errors from the error log and uses only one fifth of the error-log entry points required by statistical methods for failure prediction. The DFT achieved a 93.7% success rate in predicting failures in both electromechanical and electronic devices.
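
The DFT is driven by the interarrival times of logged errors. The snippet below is a minimal sketch of that idea, assuming a simplified rule that flags a device when a window of interarrival times shrinks sharply; the paper's actual dispersion-frame rules are more elaborate, and the window size and shrink factor here are invented for illustration.

```python
# Minimal failure-prediction heuristic in the spirit of the dispersion
# frame technique: a "frame" is the interarrival time between successive
# errors, and rapidly shrinking frames suggest an intermittent fault.
# The 4-frame window and 2x-shrink rule are illustrative, not the paper's.
def predict_failure(error_times, window=4, shrink=2.0):
    frames = [t2 - t1 for t1, t2 in zip(error_times, error_times[1:])]
    for i in range(len(frames) - window + 1):
        w = frames[i:i + window]
        # Flag when every later frame is at most 1/shrink of the first.
        if all(f * shrink <= w[0] for f in w[1:]):
            return True  # warning: errors are clustering
    return False

# Errors at widening-then-tightening intervals (hours since start).
log = [0, 100, 210, 300, 302, 303.5, 304.2, 304.8]
print(predict_failure(log))  # True: the last frames shrink sharply
```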


Digest of Papers, Fault-Tolerant Computing: 20th International Symposium | 1990

Effects of transient gate-level faults on program behavior

Edward W. Czeck; Daniel P. Siewiorek

Effects of gate-level faults on program behavior are described and used as a basis for fault models at the program level. A simulation model of the IBM RT PC was developed and injected with 18900 gate-level transient faults. A comparison of the system state of good and faulted runs was made to observe internal propagation of errors, while memory traffic and program flow comparisons detected errors in program behavior. Results show several distinct classes of program-level error behavior, including program flow changes, incorrect memory bus traffic, and undetected but corrupted program state. Additionally, the dependencies of fault location, injection time, and workload on error detection coverage are reported. For the IBM RT PC, the error detection latency was shown to follow a Weibull distribution dependent on the error detection mechanism and the two selected workloads. These results aid in the understanding of the effects of gate-level faults and allow for the generation and validation of new fault models, fault injection methods, and error detection mechanisms.
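
The core of such an experiment is a compare loop between a good run and a faulted run of the same workload. The toy harness below sketches that step, assuming a drastically simplified three-register machine in place of the paper's RT PC simulation model; the outcome labels only loosely echo the behavior classes reported above.

```python
# Toy fault-injection harness: run a workload on a good and a faulted copy
# of a tiny register machine and classify the divergence. The machine and
# outcome classes are drastically simplified stand-ins for the paper's model.
import copy, random

def workload(regs):
    trace = []
    for _ in range(20):
        regs[0] = (regs[0] + regs[1]) & 0xFF   # 8-bit add
        regs[1] = (regs[1] ^ regs[2]) & 0xFF
        trace.append(tuple(regs))
    return trace

def inject_and_classify(seed=0):
    random.seed(seed)
    good = [1, 2, 3]
    bad = copy.deepcopy(good)
    bad[random.randrange(3)] ^= 1 << random.randrange(8)  # transient bit flip
    g, b = workload(good), workload(bad)
    if g == b:
        return "fault overwritten (no effect)"
    if g[-1] == b[-1]:
        return "transient divergence, final state correct"
    return "undetected corrupted state"

print(inject_and_classify())
```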


Design Automation Conference | 1983

Functional Testing of Digital Systems

Kwok-Woon Lai; Daniel P. Siewiorek

Functional testing is testing aimed at validating the correct operation of a digital system with respect to its functional specification. We have designed and implemented a practical test generation methodology that can generate tests directly from a system's high-level specification. Solutions adopted include multi-level fault models and multi-stage test generation. Tests generated from the methodology were compared against test programs supplied by a computer manufacturer and were found to detect more faults with much better efficiency. The experiment demonstrated that functional testing can be both practical and efficient. Automatic generation of design validation tests is now closer to reality.
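
To illustrate what "generating tests directly from a high-level specification" can look like, the sketch below derives vectors for an 8-bit ADD instruction from its specified result and carry behavior, using boundary values. This is a generic functional-testing idea offered as an illustration, not the paper's multi-level, multi-stage algorithm.

```python
# Toy functional test generation: derive test vectors for an 8-bit ADD
# directly from its specification (result and carry flag), using boundary
# values of the operand range.
def spec_add(a, b):
    s = a + b
    return s & 0xFF, s > 0xFF   # (result, carry)

def gen_add_tests():
    boundaries = [0x00, 0x01, 0x7F, 0x80, 0xFF]   # corners of the 8-bit range
    return [(a, b, spec_add(a, b)) for a in boundaries for b in boundaries]

def run_tests(dut_add):
    return [(a, b) for a, b, expect in gen_add_tests()
            if dut_add(a, b) != expect]

# A buggy device under test that drops the carry flag:
buggy = lambda a, b: ((a + b) & 0xFF, False)
print(run_tests(buggy))  # reports every carry-producing vector
```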


IEEE Transactions on Computers | 1981

Synchronization and voting

Stephen R. McConnel; Daniel P. Siewiorek

This is an elaboration of the paper 'Synchronization and matching in redundant systems' by Davies and Wakerly (ibid., vol. 27, p. 531-9, 1978). The design of voters for synchronization is strongly dependent on the signaling convention used. This correspondence presents voter designs for three different signaling conventions (transition, level, and pulse). The issue of improved voter performance is also addressed.
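
For the level-signaling convention, a 2-of-3 voter reduces to the classic bitwise majority function. A minimal sketch follows; it is offered as an illustration, not as one of the correspondence's voter designs.

```python
# Bitwise 2-of-3 majority voter for level-signaled redundant channels:
# each output bit takes the value agreed on by at least two of the inputs.
def majority(a: int, b: int, c: int) -> int:
    return (a & b) | (b & c) | (a & c)

# One channel delivers a corrupted word; the voter masks the error.
print(hex(majority(0xDEAD, 0xDEAD, 0xBEEF)))  # 0xdead
```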


Proceedings of the IEEE | 1981

Testing of digital systems

Daniel P. Siewiorek; Larry Kwok-Woon Lai

This paper is intended to be both a tutorial on hardware testing and a brief survey of existing techniques. Testing is discussed at all levels in the digital system hierarchy as well as at every stage of a system's life. The paper is organized into three parts. In the first part, fundamental concepts are presented. The second part reviews various testing techniques, with special emphasis on those that have gained wide acceptance. Finally, design techniques which promise to produce easily testable hardware are explored.


IEEE Transactions on Reliability | 1992

Modification of: error log analysis: statistical modeling and heuristic trend analysis

Harold E. Ascher; Ting-Ting Y. Lin; Daniel P. Siewiorek

Results useful to: failure-data (error-log) analysts and reliability analysts.

The hazard function is $z_X(x) = f_X(x)/\bar{F}_X(x)$, where $\bar{F}_X(x)$ is the survivor function (complementary CDF). For the Weibull distribution, $z_X(x) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1}$, which decreases in $x$ for $\alpha < 1$; for $\alpha \gg 1$, $z_X(x)$ increases very rapidly as $x$ increases.

Abstract: The original paper used traditional statistical analysis to demonstrate the superiority of the proposed dispersion frame technique, whose purpose was to distinguish between transient and intermittent errors and to predict the occurrence of intermittent errors. This note shows that those traditional statistical methods were too traditional, since they involved fitting a stationary model; methods for nonstationary data are briefly discussed, and reasons are proffered for the persistence of too traditional statistical methods in the reliability literature.
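
A quick numerical check of the hazard behavior stated above, with arbitrarily chosen parameters:

```python
# Weibull hazard z(x) = (alpha/beta) * (x/beta)**(alpha - 1):
# decreasing in x for alpha < 1, sharply increasing for alpha >> 1.
def weibull_hazard(x, alpha, beta=1.0):
    return (alpha / beta) * (x / beta) ** (alpha - 1)

for alpha in (0.5, 1.0, 4.0):
    print(alpha, [round(weibull_hazard(x, alpha), 3) for x in (0.5, 1.0, 2.0)])
# 0.5 -> hazard falls with x; 1.0 -> constant; 4.0 -> hazard rises steeply
```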


Reliable Computer Systems (Second Edition): Design and Evaluation | 1992

5 – Evaluation Criteria

Stephen R. McConnel; Daniel P. Siewiorek

Publisher Summary: This chapter discusses the criteria for evaluating system reliability and availability. The highest level of modeling is the system level, in which the entire system is considered as a black box. Probabilistic modeling, based on relative component failure and repair rates, is most frequently used to evaluate hardware reliability. The hazard function is easy to measure when ascertaining the operational reliability of physical systems because it can be calculated from a histogram of times between failures. The reliability function can be used to derive many of the other reliability measures. The mission time function is particularly well suited for applications with a minimum lifetime requirement, either because repair is impossible or prohibitively expensive or because of fixed intervals between maintenance. For systems that can be repaired, the availability function defines the probability that the system is operational at any given time. The mean time to repair is often used to measure the repairability of a system. It is the expected time for repair of a failed system or subsystem.
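
For the special case of a constant failure rate (exponential lifetimes), the measures named in this summary have closed forms. The snippet below computes them; the failure rate and repair time are invented for illustration.

```python
# Closed-form reliability measures under a constant failure rate:
# reliability R(t), mission time MT(r), MTTF, and steady-state availability.
import math

lam, mttr = 1e-4, 2.0          # failures/hour (assumed), hours to repair (assumed)
mttf = 1 / lam                 # mean time to failure
R = lambda t: math.exp(-lam * t)     # reliability function
MT = lambda r: -math.log(r) / lam    # mission time: largest t with R(t) >= r
avail = mttf / (mttf + mttr)         # steady-state availability

print(f"R(1000 h) = {R(1000):.4f}")
print(f"MT(0.99)  = {MT(0.99):.0f} h")
print(f"A         = {avail:.6f}")
```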


Reliable Computer Systems (Second Edition): Design and Evaluation | 1992

A Design Methodology

Daniel P. Siewiorek; David Johnson

This chapter focuses on proposing a top-down design methodology and discusses its application in a detailed example, the VAXft 310. The definition of system objectives imposes the needs of the selected set of applications onto the key fault-tolerant metrics. Error-detection techniques should be established at the various boundaries to ensure that the coverage holes from one level to the next do not align. The percentage of faults detected is the single most important factor in successful recovery. The purpose of reconfiguration/recovery is to return the system to an operational state. A fault-tolerant computer system is measured in terms of the degree to which the attributes of data integrity, computational integrity, availability, and recovery time are realized. Given the need for an application-independent, fault-tolerant platform, a basic design tenet was to implement a hardware-intensive, rather than a software-intensive, fault-tolerant system.
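
The claim that detection coverage dominates successful recovery can be made concrete with the textbook duplex-system formula R_sys = R_m^2 + 2cR_m(1 - R_m), where c is the coverage. This is a standard illustration, not the VAXft 310's actual reliability model.

```python
# Reliability of a duplex system with detection/recovery coverage c:
#   R_sys = R_m**2 + 2 * c * R_m * (1 - R_m)
# With c = 1 the spare is fully exploited; with c = 0 redundancy only hurts.
def duplex_reliability(r_module: float, coverage: float) -> float:
    return r_module**2 + 2 * coverage * r_module * (1 - r_module)

r = 0.9
for c in (1.0, 0.95, 0.5, 0.0):
    print(c, round(duplex_reliability(r, c), 4))
# Coverage holes directly erode the gain from the redundant module.
```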


IFIP | 1987

Experimental Research in Dependable Computing at Carnegie Mellon University

Daniel P. Siewiorek; Roy A. Maxion; Priya Narasimhan

In 1945 the Carnegie Plan for higher education was developed. The basic philosophy of the plan is “learning by doing”. The strong emphasis on experimental research at Carnegie Mellon University (CMU) is an example of the Carnegie Plan in operation. Research in reliable computing at Carnegie Mellon University has spanned three decades. In the early 1960s, Westinghouse Corporation in Pittsburgh had an active research program in the use of redundancy to enhance system reliability. William Mann, who had been associated with Carnegie Mellon University, was one of those researchers. In 1962, a symposium on redundancy techniques was held in Washington, DC; it led to the first comprehensive book on the topic of redundancy and reliability. Bill Mann was one of the coauthors of that book [73]. One of the papers in that volume, on adaptive voting, was written by CMU's Professor William H. Pierce [41]. Later Professor Pierce published one of the first textbooks on redundancy [42].

Collaboration


Dive into Daniel P. Siewiorek's collaboration.

Top Co-Authors

Harold E. Ascher

United States Naval Research Laboratory


Priya Narasimhan

Carnegie Mellon University


Roy A. Maxion

Carnegie Mellon University


Allen Newell

Carnegie Mellon University


C. Gordon Bell

Carnegie Mellon University


Edward F. Gehringer

North Carolina State University


Edward W. Czeck

Carnegie Mellon University


Richard J. Swan

Carnegie Mellon University
