Publication


Featured research published by Richard E. Matick.


International Solid-State Circuits Conference | 2007

A 500MHz Random Cycle 1.5ns-Latency, SOI Embedded DRAM Macro Featuring a 3T Micro Sense Amplifier

John E. Barth; William Robert Reohr; Paul C. Parries; Gregory J. Fredeman; John Golz; Stanley E. Schuster; Richard E. Matick; Hillery C. Hunter; Charles Tanner; Joseph Harig; Hoki Kim; Babar A. Khan; John A. Griesemer; R.P. Havreluk; Kenji Yanagisawa; Toshiaki Kirihata; Subramanian S. Iyer

A prototype SOI embedded DRAM macro is developed for high-performance microprocessors, introducing a performance-enhancing 3T micro sense amplifier (µSA) architecture. The macro was characterized via a test chip fabricated in a 65 nm SOI deep-trench DRAM process. Measurements confirm a 1.5 ns random access time with a 1 V supply at 85°C, as well as low-voltage operation with a 600 mV supply.


Proceedings of the IEEE | 1968

Transmission line pulse transformers—Theory and applications

Richard E. Matick

The advent of fast rise-time pulse techniques and their increasing importance brought on by high-speed microminiature circuits and the computer industry has resulted in an increased demand for pulse transformers of various types. The basic idea of constructing transmission line type transformers has been known and used for a number of years. However, such devices have not gained widespread usage, partly because their existence is not well known, but largely because of a lack of basic understanding of their operating principles in terms of elementary fundamentals as well as their capabilities and limitations. The purpose of this paper is twofold. One aim is to develop in step-by-step fashion the basic ideas of transmission line transformers from ordinary transmission line theory. The subject will be approached from the point of view of pulse response rather than ac excitation as is usually the case. Both impedance transformers and balanced-to-unbalanced (balun) transformers, including inverters, will be considered with physical insights into their operation. Several fundamental concepts will be developed and explored in detail (without mathematics), since they have a strong bearing on practical applications. The second purpose is to present new information and pulse measurements which will be useful in the design and applications of such devices, showing their capabilities and hitherto unexplored limitations, as derived from the fundamental concepts. Thus, this paper is partly supplementary to other published work and partly new work with the goal of providing a convenient fundamental understanding of these devices and their inherent potential and shortcomings. Although the intention is not to give a detailed design procedure, some approximate calculations and discussion of significant design criteria are included.
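
As a concrete illustration of the impedance-transformation idea the paper develops, the sketch below works out the matching conditions for a 1:4 Guanella-style transmission-line transformer. The topology and numbers are standard textbook assumptions, not taken from the paper itself.

```python
# Illustrative sketch (not from the paper): matching conditions for a
# 1:4 Guanella-style transmission-line impedance transformer built from
# two lines of characteristic impedance Z0, paralleled at the input and
# stacked in series at the output.

def guanella_1_to_4(z0, r_source, r_load):
    """Return port impedances and reflection coefficients, ideal case."""
    z_in = z0 / 2.0          # two lines in parallel at the input
    z_out = 2.0 * z0         # two lines in series at the output
    gamma_in = (r_source - z_in) / (r_source + z_in)
    gamma_out = (r_load - z_out) / (r_load + z_out)
    return z_in, z_out, gamma_in, gamma_out

# A 50-ohm line transforms 25 ohms up to 100 ohms; both ports match.
print(guanella_1_to_4(z0=50.0, r_source=25.0, r_load=100.0))
# -> (25.0, 100.0, 0.0, 0.0)
```

Because both ports are matched, a fast pulse edge passes through with no first-order reflection, which is the property that makes these devices attractive for pulse work.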


Transactions of the American Institute of Electrical Engineers, Part I: Communication and Electronics | 1960

Potentials in D-C corona fields

Gaylord W. Penney; Richard E. Matick

A probe method has been devised for measuring potentials in d-c corona fields which are uniform for a short distance in one direction. In unipolar corona, a probe supported by perfect insulators assumes a potential sufficiently above the space so that no more ions can reach the probe. For a cylindrical probe with its axis along an equipotential, this excess or “error potential” is proportional to the diameter of the cylinder. To minimize this error a fine wire is used as the probe. A frame is designed which both supports the wire and shields it from fields other than the field being measured. The fine wire is electrically connected to an electrostatic voltmeter, and the frame to a source of variable potential. A potential is measured by slowly increasing the potential of the frame and recording both frame and wire potential. Under proper conditions, the space potential is the point where wire and frame potentials are equal.
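
A minimal numeric sketch of the read-out procedure the abstract describes: sweep the frame potential, record the wire potential, and take the space potential as the crossing point where the two are equal. All readings below are hypothetical.

```python
# Illustrative sketch (not from the paper): locating the space potential
# as the point where the recorded wire and frame potentials cross.

import numpy as np

frame = np.array([0.0, 200.0, 400.0, 600.0, 800.0, 1000.0])  # volts
wire = np.array([620.0, 640.0, 660.0, 690.0, 790.0, 980.0])  # volts

# Find where (wire - frame) changes sign and interpolate linearly.
diff = wire - frame
i = np.flatnonzero(np.diff(np.sign(diff)))[0]
t = diff[i] / (diff[i] - diff[i + 1])
space_potential = frame[i] + t * (frame[i + 1] - frame[i])
print(f"estimated space potential ~ {space_potential:.0f} V")  # ~780 V
```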


IBM Journal of Research and Development | 2001

Analytical analysis of finite cache penalty and cycles per instruction of a multiprocessor memory hierarchy using miss rates and queuing theory

Richard E. Matick; Thomas J. Heller; Michael Ignatowski

Advances in technology have provided a continuing improvement in processor speed and capacity of attached main memory. The increasing gap between main memory and processor cycle times has required increasingly more levels of caching to prevent performance degradation. The net result is that the inherent delay of a memory hierarchy associated with any computing system is becoming the major performance-determining factor and has inspired many types of analysis methods. While an accurate performance-evaluation tool requires the use of trace-driven simulators, good approximations and significant insight can be obtained by the use of analytical models to evaluate finite cache penalties based on miss rates (or miss ratios) and queuing theory combined with empirical relations between various levels of a memory hierarchy. Such tools make it possible to readily determine trends in performance vs. changes in input parameters. This paper describes such an analysis approach, one which has been implemented in a spreadsheet and used successfully to perform early engineering tradeoffs for many uniprocessor and multiprocessor memory hierarchies.
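
A minimal sketch in the spirit of the paper's approach, assuming a simple additive finite-cache-penalty model with an M/M/1 contention term; the model form and all parameters here are illustrative, not the paper's actual spreadsheet.

```python
# Illustrative sketch: CPI = infinite-cache CPI plus a finite-cache
# penalty built from per-level miss rates, with an M/M/1 queuing term
# for main-memory contention. Numbers are hypothetical.

def cpi_estimate(cpi_inf, refs_per_instr, levels, mem_service, mem_util):
    """levels: list of (miss_rate_per_ref, access_time_cycles) below L1."""
    penalty = 0.0
    reach = refs_per_instr
    for miss_rate, access_cycles in levels:
        reach *= miss_rate               # references missing into this level
        penalty += reach * access_cycles
    # M/M/1 waiting time inflates the main-memory access under contention.
    queue_wait = mem_service * mem_util / (1.0 - mem_util)
    penalty += reach * queue_wait        # only misses that reach memory queue
    return cpi_inf + penalty

# 1 ref/instr, 3% L1 misses to a 10-cycle L2, 20% of those to an
# 80-cycle memory that is 40% utilized with a 60-cycle service time.
print(cpi_estimate(1.0, 1.0, [(0.03, 10), (0.20, 80)], 60.0, 0.40))  # ~2.02
```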


Proceedings of the IEEE | 1972

Review of current proposed technologies for mass storage systems

Richard E. Matick

Several years ago, let us say 1965, there were few serious technological challengers for large main memories and file storage systems for computers. Magnetic cores and rotating magnetic disks were the undisputed champions. While there was no lack of alternatives in the mid-1960s, cores and disks still offered potential improvements which could not easily be challenged by other technologies. The status of computing systems has advanced to a point where currently, as in the past, larger and faster access storage is needed. While there still exists room for improvement in core and disk technology, the tradeoffs between size, speed, and cost suggest that other technologies may now offer certain advantages. As a result, numerous technologies have appeared recently to offer alternatives for large storage systems. Some of the more notable proposed technologies currently receiving considerable attention in the published literature will be reviewed. After a discussion of the limitations and technical aspects of magnetic recording and a systems analysis of direct access storage, a review of the more advanced technologies of surface wave acoustics, magneto-optic beam-addressed memories, magnetic bubbles, switchable resistances, and integrated circuit memories of various types is undertaken. A discussion of the device concept with a possible system implementation for mass storage is presented along with conceivable densities, speed, advantages and disadvantages, and inherent limitations. The essential conclusion is that when all the necessary requirements for a storage system are taken into consideration, none of the more advanced technologies has a clear-cut distinct appeal over any other. Each has its own special advantages and attraction but only at the expense of some other features, such as special operating environments (e.g., low temperature), special operating modes (e.g., shift register), a limited range of applicability, or limited potential for further improvements in performance. There remains room for other inventions, particularly for storage systems with access times considerably faster than mechanical devices (i.e., one millisecond) but slower and cheaper than commonplace electronic speeds (i.e., one microsecond). However, because of the stringent requirements imposed by storage systems and cost/performance tradeoffs, it does not appear likely that these will be forthcoming in the near future.


IBM Journal of Research and Development | 1984

All points addressable raster display memory

Frederick Hayes Dill; Satish Gupta; Daniel T. Ling; Richard E. Matick

This paper discusses display designs which store the image point by point in random access memory, so that independent update of every pixel is possible. A frequent bottleneck in the design of high performance displays of this type is the available bandwidth of the memory subsystem. In this paper, we focus on this issue and present features of a customized dynamic RAM chip which can readily provide the necessary bandwidth and thus greatly simplify the design of very high performance APA raster scan displays. The customized RAM chip is quasi-two-ported. After briefly introducing APA raster displays, we discuss display memory system design and the design of the proposed custom memory chip. We describe the second port for the video refresh, which makes the primary port available for update almost continuously. We also discuss modifications to the existing primary port to make it easily usable for the parallel update required for high update performance as well as for other applications.
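
A rough back-of-the-envelope calculation (not from the paper) of why refresh bandwidth is the bottleneck the abstract describes, and hence why dedicating a second serial port to video refresh pays off.

```python
# Illustrative sketch: memory bandwidth consumed by raster refresh alone.
# Resolution, depth, and the blanking overhead factor are hypothetical.

def refresh_bandwidth_bits_per_s(width, height, bits_per_pixel, hz,
                                 overhead=1.25):
    """Video refresh bandwidth; `overhead` roughly covers blanking time."""
    return width * height * bits_per_pixel * hz * overhead

bw = refresh_bandwidth_bits_per_s(1024, 1024, 8, 60)
print(f"{bw / 8 / 1e6:.0f} MB/s for refresh alone")  # ~79 MB/s

# On a conventional single-ported DRAM this traffic competes directly
# with pixel updates; a quasi-two-ported chip streams it from a separate
# serial port, leaving the random-access port free for drawing.
```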


IBM Journal of Research and Development | 2003

Comparison of analytic performance models using closed mean-value analysis versus open-queuing theory for estimating cycles per instruction of memory hierarchies

Richard E. Matick

Analytic models provide a simple but approximate method for predicting the performance of complex processing systems early in the design cycle. Over the years, extensive use has been made of various queuing models to analyze the memory hierarchies of multiprocessor systems in order to estimate the finite cache penalty and resulting system performance measured in cycles per instruction executed. Two general modeling techniques widely used for such performance evaluation are the open-system and closed-system queuing theories. Closed-queuing models can be solved by various methods, but mean value analysis is the most common for closed systems of the type considered here. The basic differences between these two approaches have been somewhat obscure, making them difficult to compare. This work explores some fundamental issues from a practical engineering viewpoint with the intention of illuminating the essential differences in the general techniques at the very basic level. In addition, the results of a detailed study comparing the two in a moderately complex multiprocessor memory hierarchy are presented.
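
For readers unfamiliar with the closed-system technique, the sketch below implements textbook exact mean-value analysis for a closed product-form network; the two-station configuration is a hypothetical stand-in for a processor/memory pair, not the paper's model.

```python
# Illustrative sketch: exact mean-value analysis (MVA) for a closed
# queuing network, the solution method contrasted with open models.

def mva(service_times, visits, n_customers):
    """Exact MVA; returns system throughput and per-station queue lengths."""
    k = len(service_times)
    q = [0.0] * k                       # mean queue length at each station
    x = 0.0
    for n in range(1, n_customers + 1):
        # Residence time at each station as seen by an arriving customer.
        r = [service_times[i] * (1.0 + q[i]) for i in range(k)]
        x = n / sum(visits[i] * r[i] for i in range(k))  # throughput
        q = [x * visits[i] * r[i] for i in range(k)]     # Little's law
    return x, q

# Two stations (e.g., a processor and a memory), 8 circulating requests.
throughput, queues = mva(service_times=[1.0, 0.5], visits=[1.0, 2.0],
                         n_customers=8)
print(throughput, queues)
```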


IBM Systems Journal | 1984

Architecture implications in the design of microprocessors

Richard E. Matick; Daniel T. Ling

This paper examines how architecture, the definition of the instruction set and other facilities that are available to the user, can influence the implementation of a very large scale integration (VLSI) microsystem. The instruction set affects the system implementation in a number of direct ways. The instruction formats determine the complexity of instruction decoding. The addressing modes available determine not only the hardware needed (multiported register files or three-operand adders), but also the complexity of the overall machine pipeline as greater variability is introduced in the time it takes to obtain an operand. Naturally, the actual operations specified by the instructions determine the hardware needed by the execution unit. In a less direct way, the architecture also determines the memory bandwidth required. A few key parameters are introduced that characterize the architecture and can be simply obtained from a typical workload. These parameters are used to analyze the memory bandwidth required and indicate whether the system is CPU- or memory-limited at a given design point. The implications of caches and virtual memories are also briefly considered.
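
The sketch below illustrates the kind of comparison the abstract describes: estimate the memory bandwidth an instruction mix demands from a few workload parameters, then compare it with what the memory system supplies to classify the design point. Parameter names and numbers are hypothetical, not the paper's.

```python
# Illustrative sketch: is a design point CPU- or memory-limited?
# All workload parameters below are assumptions for the example.

def classify(instr_rate, instr_bytes, data_refs_per_instr, data_bytes,
             available_bw):
    """Rates in instructions/s, sizes in bytes, bandwidth in bytes/s."""
    demand = instr_rate * (instr_bytes + data_refs_per_instr * data_bytes)
    limit = "memory-limited" if demand > available_bw else "CPU-limited"
    return limit, demand

# 10 MIPS, 4-byte instructions, 0.6 data refs/instr of 4 bytes each,
# against a 64 MB/s memory port: demand lands right at the limit.
print(classify(10e6, 4, 0.6, 4, 64e6))
```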


IEEE Transactions on Reliability | 1983

Comparison of Memory Chip Organizations vs Reliability in Virtual Memories

Richard E. Matick

Random access memory organizations typically are chosen for maximum reliability, based on the operation of the memory box itself without concern for the remainder of the computing system. This has led to widespread use of the 1-bit-per-chip, or related, organization which uses error correcting codes to minimize the effects of failures occurring in some basic unit such as a word or double word (32 to 64 bits). Such memory boxes are used quite commonly in paged virtual memory systems where the unit for protection is really a page (4K bytes), or in a cache where the unit for protection is a block (32 to 128 bytes), not a double word. With typical high-density memory chips and typical ranges of failure rates, the 1-bit-per-chip organization can often maximize page failures in a virtual memory system. For typical cases, a paged virtual memory using a page-per-chip organization can substantially improve reliability, and is potentially far superior to other organizations. This paper first describes the fundamental considerations of organization for memory systems and demonstrates the underlying problems with a simplified case. Then the reliability in terms of lost pages per megabyte due to hard failures over any time period is analyzed for a paged virtual memory organized in both ways. Normalized curves give the lost pages per Mbyte as a function of failure rate and accumulated time. Assuming reasonable failure rates can be achieved, the page-per-chip organization can be 10 to 20 times more reliable than a 1-bit-per-chip scheme.
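
A heavily simplified sketch of the comparison (the failure model and every number are illustrative assumptions, not the paper's normalized curves): expected pages lost per megabyte from whole-chip hard failures under the two organizations.

```python
# Illustrative sketch: a chip failure takes out few pages when each page
# lives on one chip, but can touch many pages when every page has one
# bit on every chip. The multipliers below are assumed for the example.

def lost_pages_per_mbyte(chip_fail_rate, hours, chips_per_mbyte,
                         pages_hit_per_chip_failure):
    """Expected chip failures times pages each failure takes out."""
    expected_failures = chip_fail_rate * hours * chips_per_mbyte
    return expected_failures * pages_hit_per_chip_failure

CHIPS = 32       # e.g., 32 chips of 256 Kbit per megabyte
RATE = 1e-6      # hypothetical hard-failure rate per chip-hour
T = 50_000       # accumulated hours

# Page-per-chip: a failed chip corrupts only pages stored on it (assume 1
# uncorrectable page). 1-bit-per-chip: the failure is smeared across all
# pages sharing that chip (assume 25 end up uncorrectable).
print(lost_pages_per_mbyte(RATE, T, CHIPS, 1))   # page-per-chip
print(lost_pages_per_mbyte(RATE, T, CHIPS, 25))  # 1-bit-per-chip
```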


IBM Journal of Research and Development | 1989

Architecture, design, and operating characteristics of a 12-ns CMOS functional cache chip

Richard E. Matick; Robert Mao; Samuel T. Ray

The architecture, design, and implementation of a high-performance cache require a detailed consideration of the overall system functions, closely coupled with the proper mapping and integration of these functions into the circuits and arrays. This approach has resulted in a new cache chip which incorporates a number of unique on-chip functions as well as a unique design, providing a one-cycle cache in which translation can be overlapped with cache access. In order to achieve high average performance, a cache should give the appearance of being a two-ported array in order to provide high bandwidth both to the processor during normal execution and to the main memory.
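
The overlap of translation with cache access that the abstract mentions is, in its general form, the virtually-indexed, physically-tagged arrangement sketched below. This shows the generic technique under assumed parameters, not the specific chip design described in the paper.

```python
# Illustrative sketch: index the cache with untranslated address bits
# while the TLB translates the page number in parallel. Parameters are
# assumptions; SETS * 2**LINE_BITS == 2**PAGE_BITS, so the index lies
# entirely within the untranslated page offset.

PAGE_BITS = 12    # 4 KB pages: low 12 address bits are untranslated
LINE_BITS = 5     # 32-byte cache lines
SETS = 128        # 128 sets * 32 bytes = 4 KB, one page

def lookup(vaddr, tlb, cache):
    """Start the array access before translation finishes."""
    index = (vaddr >> LINE_BITS) & (SETS - 1)  # available immediately
    set_tags = cache[index]                    # array access starts now...
    ppn = tlb[vaddr >> PAGE_BITS]              # ...in parallel with the TLB
    return ppn in set_tags                     # compare physical tag last

tlb = {0x12345: 0x678}                # virtual page 0x12345 -> frame 0x678
cache = [set() for _ in range(SETS)]
index = (0x12345ABC >> LINE_BITS) & (SETS - 1)
cache[index].add(0x678)               # pretend the line is resident
print(lookup(0x12345ABC, tlb, cache))  # True: hit
```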
