Mika Nyström
California Institute of Technology
Publication
Featured research published by Mika Nyström.
Conference on Advanced Research in VLSI | 1997
Alain J. Martin; Andrew Lines; Rajit Manohar; Mika Nyström; Paul I. Pénzes; Robert Southworth; Uri Cummings; Tak Kwan Lee
The design of an asynchronous clone of a MIPS R3000 microprocessor is presented. In 0.6 µm CMOS, we expect performance close to 280 MIPS, for a power consumption of 7 W. The paper describes the structure of a high-performance asynchronous pipeline, in particular precise exceptions, pipelined caches, arithmetic, and registers, and the circuit techniques developed to achieve high throughput.
Proceedings of the IEEE | 2006
Alain J. Martin; Mika Nyström
SoC design will require asynchronous techniques as the large parameter variations across the chip will make it impossible to control delays in clock networks and other global signals efficiently. Initially, SoCs will be globally asynchronous and locally synchronous (GALS). But the complexity of the numerous asynchronous/synchronous interfaces required in a GALS will eventually lead to entirely asynchronous solutions. This paper introduces the main design principles, methods, and building blocks for asynchronous VLSI systems, with an emphasis on communication and synchronization. Asynchronous circuits whose only delay assumption is that of isochronic forks are called quasi-delay-insensitive (QDI). QDI is used in the paper as the basis for asynchronous logic. The paper discusses asynchronous handshake protocols for communication and the notions of validity/neutrality tests and completion trees. Basic building blocks for sequencing, storage, function evaluation, and buses are described, and two alternative methods for the implementation of an arbitrary computation are explained. Issues of arbitration and synchronization play an important role in complex distributed systems and especially in GALS. The two main asynchronous/synchronous interfaces needed in GALS (one based on a synchronizer, the other on a stoppable clock) are described and analyzed.
Symposium on Asynchronous Circuits and Systems | 2003
Alain J. Martin; Mika Nyström; Karl Papadantonakis; Paul I. Pénzes; Piyush Prakash; Catherine G. Wong; Jonathan Chang; Kevin S. Ko; Benjamin N. Lee; Elaine Ou; Jim Pugh; Eino-Ville Talvala; James T. Tong; Ahmet Tura
We describe the Lutonium, an asynchronous 8051 microcontroller designed for low Et^2. In 0.18 µm CMOS, at nominal 1.8 V, we expect a performance of 0.5 nJ per instruction at 200 MIPS. At 0.5 V, we expect 4 MIPS and 40 pJ/instruction, corresponding to 25,000 MIPS/Watt. We describe the structure of a fine-grain pipeline optimized for Et^2 efficiency, the implementation of some of the peripherals, and the advantages of an asynchronous implementation of a deep-sleep mechanism.
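The efficiency figure quoted in this abstract can be checked with simple arithmetic. The sketch below is not from the paper; it assumes that "MIPS/Watt" means millions of instructions per joule, so it is just the reciprocal of the energy per instruction. The function names are illustrative only, and the average-power values it prints are derived from the quoted numbers rather than stated in the abstract.

```python
# Arithmetic check of the quoted Lutonium figures (illustrative helper names).
# Assumption: MIPS/Watt = (1 / energy per instruction) / 1e6.

def mips_per_watt(energy_per_instruction_j: float) -> float:
    """Millions of instructions executed per joule of energy."""
    return 1.0 / energy_per_instruction_j / 1e6


def average_power_w(mips: float, energy_per_instruction_j: float) -> float:
    """Average power implied by an instruction rate and an energy per instruction."""
    return mips * 1e6 * energy_per_instruction_j


# Operating points quoted in the abstract: (supply voltage, MIPS, joules/instruction).
operating_points = [
    (1.8, 200.0, 0.5e-9),   # nominal supply: 0.5 nJ per instruction
    (0.5, 4.0, 40e-12),     # low supply: 40 pJ per instruction
]

for volts, mips, energy in operating_points:
    print(
        f"{volts} V: {mips_per_watt(energy):,.0f} MIPS/W, "
        f"{average_power_w(mips, energy) * 1e3:.2f} mW average power"
    )
```

Under this reading, 40 pJ per instruction gives the 25,000 MIPS/Watt stated in the abstract.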
Power Aware Computing | 2002
Alain J. Martin; Mika Nyström; Paul I. Pénzes
We investigate an efficiency metric for VLSI computation that includes energy, E, and time, t, in the form Et^2. We apply the metric to CMOS circuits operating outside velocity saturation when energy and delay can be exchanged by adjusting the supply voltage; we prove that under these assumptions, optimal Et^2 implies optimal energy and delay. We give experimental and simulation evidence of the range and limits of the assumptions. We derive several results about sequential, parallel, and pipelined computations optimized for Et^2, including a result about the optimal length of a pipeline. We discuss transistor sizing for optimal Et^2 and show that, for fixed, nonzero execution rates, the optimum is achieved when the sum of the transistor-gate capacitances is twice the sum of the parasitic capacitances, not for minimum transistor sizes. We derive an approximation for Et^n (for arbitrary n) of an optimally sized system that can be computed without actually sizing the transistors; we show that this approximation is accurate. We prove that when multiple, adjustable supply voltages are allowed, the optimal Et^2 for the sequential composition of components is achieved when the supply voltages are adjusted so that the components consume equal power. Finally, we give rules for computing the Et^2 of the sequential and parallel compositions of systems, when the Et^2 of the components are known.
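A small numerical sketch can illustrate why a metric of the form E times t squared is attractive when supply voltage can be traded for speed. It is not taken from the paper. It assumes a deliberately simple first-order model in which switching energy scales as C·V^2 and delay scales as k/V, ignoring threshold voltage and leakage; the constants `c` and `k` and all function names are hypothetical.

```python
import math

# First-order model (an assumption for illustration, not the paper's derivation):
#   energy E(V) = c * V**2    (switched capacitance times voltage squared)
#   delay  t(V) = k / V       (delay inversely proportional to supply voltage)


def energy(c: float, v: float) -> float:
    return c * v ** 2


def delay(k: float, v: float) -> float:
    return k / v


def et2(c: float, k: float, v: float) -> float:
    """The E * t**2 product at supply voltage v under the model above."""
    return energy(c, v) * delay(k, v) ** 2


c, k = 1e-12, 1e-9  # hypothetical constants; any positive values work
for v in (0.9, 1.2, 1.8):
    print(f"V={v}: E={energy(c, v):.3e} J, t={delay(k, v):.3e} s, "
          f"E*t^2={et2(c, k, v):.3e}")

# Under this model the V**2 in the energy cancels the 1/V**2 in the squared
# delay, so E * t**2 equals c * k**2 at every supply voltage.
assert math.isclose(et2(c, k, 0.9), et2(c, k, 1.8))
```

In this simplified model, energy and delay each change with the supply voltage while their E·t^2 product does not, so the product characterizes the design rather than the chosen operating point. Real circuits deviate from the model near the threshold voltage.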
IEEE Design & Test of Computers | 2003
Alain J. Martin; Mika Nyström; Catherine G. Wong
We trace the evolution of Caltech asynchronous processors from a simple proof of concept, to a high-performance MIPS-like processor using a different buffer circuit for better performance, to the latest 8051 clone targeting low-energy operation. We describe the control aspects of the evolving circuit styles. We describe these three generations of asynchronous microprocessors (Caltech asynchronous processors, MiniMIPS and Lutonium) and the corresponding circuit families and design methods. The asynchronous circuits we use are called quasi-delay-insensitive (QDI) circuits. A QDI circuit involves no assumption about, or knowledge of, delays in operators and wires, except for isochronic forks, which the designer assumes have similar delays on the different branches. QDI circuits are the most conservative asynchronous circuits in terms of delays.
Conference on Advanced Research in VLSI | 2001
Rajit Manohar; Mika Nyström; Alain J. Martin
The presence of precise exceptions in a processor leads to complications in its design. Some recent processor architectures have sacrificed this requirement for performance reasons at the cost of software complexity. We present an implementation strategy for precise exceptions in asynchronous processors that does not block the instruction fetch when exceptions do not occur; the cost of the exception handling mechanism is only encountered when an exception occurs during execution, which is an infrequent event.
Proceedings of the 8th ACM/IEEE International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems | 2002
Paul I. Pénzes; Mika Nyström; Alain J. Martin
This paper studies the problem of transistor sizing of CMOS circuits optimized for energy-delay efficiency, i.e., for optimal Et^n, where E is the energy consumption and t is the delay of the circuit, while n is a fixed positive optimization index that reflects the chosen trade-off between energy and delay. We propose a set of analytical formulas that closely approximate the optimal transistor sizes. We then study an efficient iteration procedure that can further improve the original analytical solution. Based on these results, we introduce a novel transistor sizing algorithm for energy-delay efficiency.
Symposium on Asynchronous Circuits and Systems | 2004
Mika Nyström; Elaine Ou; Alain J. Martin
Asynchronous pulse logic (APL) is an adaptation of quasi-delay-insensitive (QDI) techniques using easily controllable timing assumptions that speed up the handshakes without changing the high-level dataflow model. We review the basic properties of APL circuits and techniques for describing them in and compiling them from a higher-level representation. We describe a reasonably complex test chip consisting of an 8-bit integer divider. Finally, we describe performance results from low-level SPICE simulations of the test chip. The results show that it is possible to design, with a high degree of automation, complex systems with a throughput of 10 CMOS transitions (less than 15 FO4 delays) per cycle.
Archive | 1998
Alain J. Martin; Andrew Lines; Rajit Manohar; Uri Cummings; Mika Nyström
Archive | 2002
Mika Nyström; Alain J. Martin