Lauri Koskinen | Researchain

Publication

Featured research published by Lauri Koskinen.

Microelectronics Journal | 2014

A cellular computing architecture for parallel memristive stateful logic

Eero Lehtonen; Jari Tissari; Jussi H. Poikonen; Mika Laiho; Lauri Koskinen

We present a cellular memristive stateful logic computing architecture and demonstrate its operation with computational examples such as vectorized XOR, circular shift, and content-addressable memory. The considered architecture can perform parallel elementary memristor programming and stateful logic operations, namely implication and converse nonimplication. The topology of the crossbar structure used for computing can be dynamically reconfigured, enabling combinations of local and global operations with varying granularity. In the CMOS cells used for controlling the memristors, we apply a new type of capacitive keeper circuit, which allows for energy-efficient implementation of logic operations. The correct operation of this architecture is verified by detailed HSPICE simulations for a structure containing eight memristive crossbars. This work presents a hardware platform which enables future work on parallel stateful computing.
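The two stateful operations named in the abstract can be illustrated in software. The following is a minimal sketch, assuming memristor states are modelled as Python booleans and that working copies of the inputs are available; the function names are hypothetical, and the operation sequence is one textbook way to obtain XOR, not necessarily the sequence used in the paper.

```python
def imp(p, q):
    """Material implication, element-wise: the target becomes (NOT p) OR q."""
    return [(not a) or b for a, b in zip(p, q)]


def cnimp(p, q):
    """Converse nonimplication, element-wise: the target becomes (NOT p) AND q."""
    return [(not a) and b for a, b in zip(p, q)]


def vector_xor(p, q):
    """Vectorized XOR built only from imp, cnimp, and a cleared (FALSE) vector.

    Assumption: in hardware each operation overwrites its target memristor;
    here every call returns a new list, which stands in for working copies.
    """
    zeros = [False] * len(p)   # work vector programmed to logic 0
    a = cnimp(p, q)            # (NOT p) AND q
    b = cnimp(q, p)            # (NOT q) AND p
    not_a = imp(a, zeros)      # (NOT a) OR 0  ==  NOT a
    return imp(not_a, b)       # (NOT NOT a) OR b  ==  a OR b


result = vector_xor([False, False, True, True], [False, True, False, True])
```

Each list element plays the role of one memristor in a row, so a single call corresponds to one parallel step across the whole vector.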

Symposium on VLSI Circuits | 2015

Fully integrated DC-DC converter and a 0.4V 32-bit CPU with timing-error prevention supplied from a prototype 1.55V Li-ion battery

Matthew Turnquist; Markus Hiienkari; Jani Mäkipää; Ruzica Jevtic; Elina Pohjalainen; Tanja Kallio; Lauri Koskinen

We introduce an ultra-low-energy system comprising a prototype 1.55V Li-ion battery, a fully integrated switched-capacitor (SC) 3:1 DC-DC converter, and a 32-bit RISC CPU with timing-error prevention (TEP). The DC-DC converter and CPU are manufactured in 28nm UTBB FD-SOI. The DC-DC converter uses the battery's flat discharge curve and low nominal voltage to achieve a peak efficiency of 85%. The CPU operates from 0.3V to 0.5V with energy as low as 4.9pJ/cyc. The combined battery, DC-DC converter, and CPU system operates with an average energy of 8pJ/cyc over 95% of the battery's discharge curve in the temperature range of -20°C to 70°C.
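The link between the low battery voltage and the 3:1 ratio can be checked with first-order arithmetic. This is a rough sketch assuming an ideal switched-capacitor model, in which the no-load output is the input divided by the conversion ratio and the efficiency at a lower regulated output cannot exceed the ratio of the two; the function names are hypothetical, and switching and parasitic losses are ignored.

```python
def ideal_sc_output(v_in, ratio=3):
    """No-load output voltage of an ideal ratio:1 step-down SC converter."""
    return v_in / ratio


def linear_efficiency_bound(v_in, v_out, ratio=3):
    """First-order upper bound on efficiency when regulating below the ideal output.

    Assumption: ideal model only; real converters lose additional energy
    to switching, bottom-plate parasitics, and control circuitry.
    """
    v_ideal = ideal_sc_output(v_in, ratio)
    if not 0 < v_out <= v_ideal:
        raise ValueError("v_out must lie in (0, v_in / ratio]")
    return v_out / v_ideal


v_ideal = ideal_sc_output(1.55)                # about 0.517 V from the 1.55 V battery
bound_at_0v4 = linear_efficiency_bound(1.55, 0.4)
```

Under this model a 1.55V input divided by three lands just above the 0.3V to 0.5V range quoted for the CPU, which is why a flat, low-voltage discharge curve suits a fixed-ratio converter.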

IEEE Journal on Emerging and Selected Topics in Circuits and Systems | 2015

Recursive Algorithms in Memristive Logic Arrays

Eero Lehtonen; Jussi H. Poikonen; Jari Tissari; Mika Laiho; Lauri Koskinen

In memristive stateful logic, memristors store logic values as their memristance states and perform logical operations on them. This form of logic has been studied intensively since it was first empirically demonstrated in the work of Borghetti et al., 2010. It has been previously noted that substantial parallelism in stateful computation is required to make this form of logic competitive with conventional logic computing paradigms. In this work we show how a certain class of vectorized recursive algorithms can be computed in a semiconductor/memristor hybrid array structure. This class of algorithms allows efficient computation of many practically important vector operations; examples considered in this paper include the binary sum of vectors, the parity of a vector, and the Hamming weight of a vector. We present theoretical analysis of the time and space complexity of this class of operations, and show examples of this computing method using circuit-level simulations. We also discuss possible applications of these operations in massively parallel memristive array computing.
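Parity and Hamming weight are both reductions that can be computed by repeatedly folding a vector in half. The sketch below assumes this generic recursive-halving scheme as an illustration of why such operations map well to parallel arrays; it is not taken from the paper, and the function names are hypothetical.

```python
def fold_reduce(values, combine):
    """Reduce a vector by repeatedly combining its two halves element-wise.

    Returns (result, steps). Each step is one fully parallel vector
    operation, so a length-n vector needs about log2(n) steps.
    """
    v = list(values) or [0]
    steps = 0
    while len(v) > 1:
        if len(v) % 2:
            v.append(0)            # pad with the identity element of XOR and +
        half = len(v) // 2
        v = [combine(v[i], v[i + half]) for i in range(half)]
        steps += 1
    return v[0], steps


def parity(bits):
    """Parity of a bit vector via pairwise XOR."""
    return fold_reduce(bits, lambda a, b: a ^ b)


def hamming_weight(bits):
    """Number of ones in a bit vector via pairwise addition."""
    return fold_reduce(bits, lambda a, b: a + b)


p, p_steps = parity([1, 0, 1, 1])          # three ones, so parity is 1
w, w_steps = hamming_weight([1, 0, 1, 1])  # weight is 3
```

The list comprehension runs sequentially in Python; in an array processor the element-wise combine of one step would be executed on all positions at once.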

Custom Integrated Circuits Conference | 2014

A 3.15pJ/cyc 32-bit RISC CPU with timing-error prevention and adaptive clocking in 28nm CMOS

Markus Hiienkari; Jukka Teittinen; Lauri Koskinen; Matthew Turnquist; Mikko Kaltiokallio

The increased performance from technology scaling makes it feasible to operate digital circuits at ultra-low voltages without the significant performance limitation of earlier process generations. The theoretical minimum energy point resides in near-threshold voltages in current processes, but device and environment variations make it a challenge to operate the circuits reliably. This paper presents an ASIC implementation of a 32-bit RISC CPU in 28nm CMOS employing timing-error prevention with clock stretching to enable it to operate with minimal safety margins while maximizing energy efficiency. Measurements show 3.15pJ/cyc energy consumption at 400mV/2.4MHz, which corresponds to 39% energy savings and 83% EDP reduction compared to operation based on static signoff timing.
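The reported figures can be related to each other with simple arithmetic. The sketch below assumes that both percentages are measured against the same static-signoff baseline and that EDP is energy per cycle multiplied by cycle time; the derived baseline values are implied by those assumptions, not quoted from the paper, and the function names are hypothetical.

```python
def average_power_uw(energy_pj_per_cycle, freq_mhz):
    """Average power in microwatts: pJ/cycle times MHz gives uW."""
    return energy_pj_per_cycle * freq_mhz


def implied_baseline(energy_pj, energy_saving, edp_reduction):
    """Back out the baseline energy and the delay ratio implied by the savings.

    Assumption: EDP = energy * delay, so
    (EDP ratio) = (energy ratio) * (delay ratio).
    """
    energy_ratio = 1.0 - energy_saving
    edp_ratio = 1.0 - edp_reduction
    baseline_energy_pj = energy_pj / energy_ratio
    delay_ratio = edp_ratio / energy_ratio
    return baseline_energy_pj, delay_ratio


power = average_power_uw(3.15, 2.4)                   # about 7.56 uW
baseline_pj, delay_ratio = implied_baseline(3.15, 0.39, 0.83)
```

Under these assumptions the baseline would consume roughly 5.2pJ/cyc, and the cycle time with timing-error prevention would be a little under 0.28 of the baseline cycle time.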

IEEE Transactions on Very Large Scale Integration Systems | 2016

Implementing Minimum-Energy-Point Systems With Adaptive Logic

Lauri Koskinen; Markus Hiienkari; Jani Mäkipää; Matthew Turnquist

Timing-error-detection (TED)-based systems have been shown to reduce power consumption or increase yield due to reduced margins. This paper shows that the increased adaptability can be a great advantage in the system design in addition to the well-known mitigated susceptibility to ambient and internal variations. Specifically, the design tolerances of the power management are relaxed to enable even greater system-level energy savings than what can be achieved in the logic alone. In addition, the system is simultaneously able to operate near the minimum error point. Here, the power management is a simplified dc-dc converter and the TED is based on time borrowing. The target application is a single-chip system on chip without external discrete components; thus, switched capacitors are used for the dc-dc. The system achieves 7.9% energy reduction at the minimum energy point simultaneously with a 36.4% energy-delay product decrease and a 15% increase in dc-dc efficiency. In addition, the effect of local variations on average system performance is reduced by 12%.

International Conference on Management of Data | 2015

Towards Hardware-driven Design of Low-energy Algorithms for Data Analysis

Indre Zliobaite; Jaakko Hollmén; Lauri Koskinen; Jukka Teittinen

In the era of big data, data analysis algorithms need to be efficient. Traditionally, researchers would tackle this problem by considering small data algorithms, and investigating how to make them computationally more efficient for big data applications. The main means to achieve computational efficiency would be to revise the necessity and order of subroutines, or to approximate calculations. This paper presents a viewpoint that in order to be able to cope with the new challenges of the growing digital universe, research needs to take a combined view towards data analysis algorithm design and hardware design, and discusses a potential research direction in taking an integrated approach in terms of algorithm design and hardware design. Analyzing how data mining algorithms operate at the elementary operations level can help to design more specialized and dedicated hardware that, for instance, would be more energy efficient. In turn, understanding hardware design can help to develop more effective algorithms.

IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference | 2014

Effects of back-gate bias on switched-capacitor DC-DC converters in UTBB FD-SOI

Matthew Turnquist; Guerric de Streel; David Bol; Markus Hiienkari; Lauri Koskinen

This paper explores the effects of back-gate bias on switched-capacitor (SC) DC-DC converters in 28 nm UTBB FD-SOI. By using back-gate bias to optimize the control circuitry and switches, the SC converter can operate with a peak efficiency of 72% in sleep mode (100 nW load) and 83% in active mode (100 μW load).

2014 14th International Workshop on Cellular Nanoscale Networks and their Applications (CNNA) | 2014

A cellular architecture for memristive stateful logic

Jari Tissari; Eero Lehtonen; Mika Laiho; Lauri Koskinen; Jussi H. Poikonen

In this paper, we present a CMOS/memristor hybrid architecture for massively parallel logic computations in a CMOL-type memristive memory. The considered architecture enables bit-parallel stateful logic operations, which can be used to efficiently implement vector computations. As examples of computing schemes that benefit from the considered processing architecture, we consider the implementation of a content-addressable memory and binary cellular automata. We verify the correct operation of the considered processing architecture and algorithms using HSPICE simulations.
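The two example applications reduce to bitwise vector operations. The following is a small software sketch, assuming stored words are represented as Python integers and using the elementary cellular automaton rule 90 as a stand-in for "binary cellular automata"; neither the function names nor the choice of rule comes from the paper.

```python
def cam_search(stored_words, key, width):
    """Return the indices of stored words whose low `width` bits equal the key.

    A word matches when XOR with the key leaves no set bit. The loop is
    sequential here; a content-addressable memory compares all rows at once.
    """
    mask = (1 << width) - 1
    return [i for i, w in enumerate(stored_words) if ((w ^ key) & mask) == 0]


def rule90_step(cells):
    """One update of a circular binary cellular automaton (rule 90).

    Each new cell is the XOR of its left and right neighbours, which is
    equivalent to XOR-ing two circularly shifted copies of the vector.
    """
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]


matches = cam_search([0b1010, 0b0110, 0b1010], 0b1010, width=4)
next_cells = rule90_step([0, 0, 1, 0, 0])
```

Both functions use only XOR, masking, and shifting over whole vectors, which are the kinds of bit-parallel operations the abstract describes.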

Asian Solid-State Circuits Conference | 2015

A fully integrated self-oscillating switched-capacitor DC-DC converter for near-threshold loads

Matthew Turnquist; Markus Hiienkari; Jani Mäkipää; Lauri Koskinen

We introduce a fully integrated step-down self-oscillating switched-capacitor DC-DC converter that delivers near-threshold (NT) output voltages. The converter is built in 28 nm UTBB FD-SOI and occupies 0.0104 mm². Back-gate biasing is utilized to increase the load power range. Measurements show a peak efficiency of 87%, self start-up capability, and a minimum efficiency of 75% for 79 nW to 200 μW (ideal) loads. Measurements with an off-chip NT processor load also show high efficiency. The converter's large load power range and high efficiency are a good fit for energy-constrained NT processors.

Microelectronics Journal | 2017

A 5.3 pJ/op approximate TTA VLIW tailored for machine learning

Jukka Teittinen; Markus Hiienkari; Indrė Žliobaitė; Jaakko Hollmén; Heikki Berg; Juha Heiskala; Timo Viitanen; Jesse Simonsson; Lauri Koskinen

To achieve energy efficiency in the Internet-of-Things (IoT), more intelligence is required in the wireless IoT nodes. Otherwise, the energy required by the wireless communication of raw sensor data will severely limit battery lifetime, the backbone of IoT. One option to achieve this intelligence is to implement a variety of machine learning algorithms on the IoT sensor instead of the cloud. Shown here is a sub-milliwatt machine learning accelerator operating at the Ultra-Low Voltage Minimum-Energy Point. The accelerator is a Transport Triggered Architecture (TTA) Application-Specific Instruction-Set Processor (ASIP) targeted for running various Machine Learning algorithms. The ASIP is implemented in a 28nm FDSOI (Fully Depleted Silicon On Insulator) CMOS process, with an operating voltage of 0.35V, and is capable of 5.3pJ/cycle and 1.8nJ/iteration when performing conventional machine learning algorithms. The ASIP also includes hardware and compiler support for approximate computing. With the machine learning algorithms, computing approximately brings a maximum of 4.7% energy savings.
