Hubert Kaeslin | Researchain

Publication

Featured researches published by Hubert Kaeslin.

custom integrated circuits conference | 1994

A 177 Mb/s VLSI implementation of the International Data Encryption Algorithm

Reto Zimmermann; Andreas Curiger; H. Bonnenberg; Hubert Kaeslin; Norbert Felber; Wolfgang Fichtner

A VLSI implementation of the International Data Encryption Algorithm is presented. Security considerations led to novel system concepts in chip design including protection of sensitive information and on-line failure detection capabilities. BIST was instrumental for reconciling contradicting requirements of VLSI testability and cryptographic security. The VLSI chip implements data encryption and decryption in a single hardware unit. All important standardized modes of operation of block ciphers, such as ECB, CBC, CFB, OFB, and MAC, are supported. In addition, new modes are proposed and implemented to fully exploit the algorithm's inherent parallelism. With a system clock frequency of 25 MHz the device permits a data conversion rate of more than 177 Mb/s. Therefore, the chip can be applied to on-line encryption in high-speed networking protocols like ATM or FDDI.

international conference on asic | 1999

Globally-asynchronous locally-synchronous architectures to simplify the design of on-chip systems

J. Muttersbach; Thomas Villiger; Hubert Kaeslin; Norbert Felber; Wolfgang Fichtner

A novel methodology for realizing Globally-Asynchronous Locally-Synchronous (GALS) architectures is reported. We developed a library of predesigned modules that facilitate the assembly of independently clocked modules to on-chip systems. The components of this library establish high-performance data exchange channels which are instrumental in constructing flexible architectures. The validity of our concept is proven by applying it to an ASIC design with real-world complexity.

asia pacific conference on circuits and systems | 2008

Gram-Schmidt-based QR decomposition for MIMO detection: VLSI implementation and comparison

Peter Luethi; Christoph Studer; Sebastian Duetsch; Eugen Zgraggen; Hubert Kaeslin; Norbert Felber; Wolfgang Fichtner

The QR decomposition (QRD) is an important prerequisite for many different detection algorithms in multiple-input multiple-output (MIMO) wireless communication systems. This paper presents an optimized fixed-point VLSI implementation of the modified Gram-Schmidt (MGS) QRD algorithm that incorporates regularization and additional sorting of the MIMO channel matrix. Integrated in 0.18 µm CMOS technology, the proposed VLSI architecture processes up to 1.56 million complex-valued 4×4-dimensional matrices per second. The implementation results of this work are extensively compared to the Givens rotation (GR)-based QRD implementation of Luethi et al., ISCAS 2007. In order to ensure a fair comparison, both QRD circuits have been integrated in the same IC manufacturing technology, with equal functionality, and the same numeric precision. The comparison of the implementation results clearly showed superiority of the GR-based VLSI solution in terms of area, processing cycles, and throughput.
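For readers unfamiliar with the algorithm, the following is a minimal floating-point sketch of modified Gram-Schmidt QR decomposition for a complex matrix. It is not the fixed-point architecture described in the paper: regularization and sorting are omitted, and the function name `mgs_qr` is a hypothetical choice for illustration.

```python
from math import sqrt

def mgs_qr(a):
    """Factor an m x n matrix (list of rows) as A = Q R via modified Gram-Schmidt.

    Assumes full column rank; no regularization or column sorting is applied.
    """
    m, n = len(a), len(a[0])
    # v[j] holds column j of A and is orthogonalized step by step.
    v = [[complex(a[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []
    r = [[0j] * n for _ in range(n)]
    for k in range(n):
        # Normalize the current column to obtain q_k.
        norm = sqrt(sum(abs(x) ** 2 for x in v[k]))
        r[k][k] = complex(norm)
        qk = [x / norm for x in v[k]]
        q_cols.append(qk)
        # Immediately remove the q_k component from all remaining columns.
        # Updating v[j] right away is what distinguishes MGS from classical GS.
        for j in range(k + 1, n):
            r[k][j] = sum(qc.conjugate() * x for qc, x in zip(qk, v[j]))
            v[j] = [x - r[k][j] * qc for x, qc in zip(v[j], qk)]
    # Return Q as a list of rows to match the input layout.
    q = [[q_cols[k][i] for k in range(n)] for i in range(m)]
    return q, r
```

Calling `mgs_qr(H)` on a 4×4 channel matrix `H` returns `Q` with orthonormal columns and an upper-triangular `R` with a real, positive diagonal.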

midwest symposium on circuits and systems | 2003

Efficient ASIC implementation of a real-time depth mapping stereo vision system

Michael Kuhn; Stephan Moser; Oliver Isler; Frank K. Gürkaynak; Andreas Burg; Norbert Felber; Hubert Kaeslin; Wolfgang Fichtner

This paper presents a fast and area-efficient implementation of a real-time stereo vision algorithm for spatial depth mapping. The design combines two well-known area-based approaches to stereo matching and includes an occlusion detection method. Hardware efficiency is achieved by storing only partial images on-chip, avoiding full-sized frame buffers. A low-latency dataflow-oriented structure makes it possible to process 256×192 pixel input streams with a rate in excess of 50 frames per second, amounting to more than 54 million pixel × disparity measurements per second (PDS) (for a 25-pixel disparity range), or roughly 18 GOPS. The design has been integrated in a 0.25 µm standard CMOS technology and occupies an area of less than 3 mm².

IEEE Journal of Solid-state Circuits | 1991

Regular VLSI architectures for multiplication modulo (2^n + 1)

Andreas Curiger; H. Bonnenberg; Hubert Kaeslin

The authors describe VLSI architectures for multiplication modulo p, where p is a Fermat prime. With increasing p, ROM-based table lookup methods become unattractive for integration due to excessive memory requirements. Three novel methods are discussed and compared to ROM implementations with regard to their speed and complexity characteristics. The first method is based on an (n+1)×(n+1)-bit array multiplier, the second on modulo p carry-save addition, and the third on modulo (p-1) carry-save addition using a bit-pair recoding scheme. All allow very high throughputs in pipelined implementations. While the first is very convenient for CAD (computer-aided design) environments providing a pipelined multiplier macrocell, the latter two are well-suited to full-custom implementation.
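To illustrate why multiplication modulo 2^n + 1 can avoid a general division, here is a small software sketch. It uses the identity 2^n ≡ -1 (mod 2^n + 1), so the product's upper part can simply be subtracted from its lower part. This is a well-known reduction trick shown for orientation only; it is not claimed to be any of the three architectures from the paper, and the function name is hypothetical.

```python
def mul_mod_fermat(a, b, n):
    """Return a*b mod (2**n + 1) for operands in 1..2**n, without a division.

    Writes the product as hi * 2**n + lo; since 2**n is congruent to -1,
    the product is congruent to lo - hi.
    """
    mask = (1 << n) - 1
    product = a * b
    lo = product & mask   # product mod 2**n
    hi = product >> n     # product div 2**n
    diff = lo - hi
    # A negative difference is corrected by adding the modulus once.
    return diff if diff >= 0 else diff + (1 << n) + 1
```

For example, with n = 4 the modulus is the Fermat prime 17, and the function agrees with `(a * b) % 17` for all operands from 1 to 16.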

IEEE Journal on Emerging and Selected Topics in Circuits and Systems | 2012

VLSI Design of Approximate Message Passing for Signal Restoration and Compressive Sensing

Patrick Maechler; Christoph Studer; David E. Bellasi; Arian Maleki; Andreas Burg; Norbert Felber; Hubert Kaeslin; Richard G. Baraniuk

Sparse signal recovery finds use in a variety of practical applications, such as signal and image restoration and the recovery of signals acquired by compressive sensing. In this paper, we present two generic very-large-scale integration (VLSI) architectures that implement the approximate message passing (AMP) algorithm for sparse signal recovery. The first architecture, referred to as AMP-M, employs parallel multiply-accumulate units and is suitable for recovery problems based on unstructured (e.g., random) matrices. The second architecture, referred to as AMP-T, takes advantage of fast linear transforms, which arise in many real-world applications. To demonstrate the effectiveness of both architectures, we present corresponding VLSI and field-programmable gate array implementation results for an audio restoration application. We show that AMP-T is superior to AMP-M with respect to silicon area, throughput, and power consumption, whereas AMP-M offers more flexibility.
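As a rough orientation, one iteration of AMP with a soft-thresholding nonlinearity might be sketched as follows. This is a plain software illustration under simplifying assumptions: the threshold `theta` is passed in as a fixed parameter (practical versions usually adapt it to the residual), dense matrix-vector products are used, and the function names are hypothetical. It does not model either hardware architecture.

```python
def soft_threshold(u, theta):
    """Shrink u toward zero by theta; values within [-theta, theta] become 0."""
    if u > theta:
        return u - theta
    if u < -theta:
        return u + theta
    return 0.0

def amp_iteration(A, y, x, z, theta):
    """One AMP step for y = A x with A given as a list of M rows of length N.

    x is the current estimate, z the current residual. Returns (x_new, z_new).
    """
    m, n = len(A), len(A[0])
    # Matched-filter step: r = x + A^T z.
    r = [x[j] + sum(A[i][j] * z[i] for i in range(m)) for j in range(n)]
    # Element-wise thresholding enforces sparsity.
    x_new = [soft_threshold(rj, theta) for rj in r]
    # Onsager correction: previous residual scaled by (number of nonzeros) / M.
    nnz = sum(1 for v in x_new if v != 0.0)
    z_new = [
        y[i] - sum(A[i][j] * x_new[j] for j in range(n)) + z[i] * nnz / m
        for i in range(m)
    ]
    return x_new, z_new
```

A typical run would start from `x = [0.0] * N` and `z = list(y)` and repeat `amp_iteration` for a fixed number of iterations.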

european solid-state circuits conference | 2004

Towards an AES crypto-chip resistant to differential power analysis

N. Pramstaller; Frank K. Gürkaynak; Simon Haene; Hubert Kaeslin; Norbert Felber; Wolfgang Fichtner

Differential power analysis (DPA) implies measuring the supply current of a cipher-circuit in an attempt to uncover part of a cipher-key. Cryptographic security gets compromised if the current waveforms so obtained correlate with those from a hypothetical power model of the circuit. Such correlations can be minimized by masking datapath operations with random bits in a reversible way. We analyze such countermeasures and discuss how they perform and how well they lend themselves to being incorporated into dedicated hardware implementations of the advanced encryption standard (AES) block cipher. Our favorite masking scheme entails a performance penalty of some 40-50%. We also present a VLSI design that can serve for practical experiments with DPA.
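The idea of reversible masking can be illustrated on the one AES step where it is trivial: the XOR-based round-key addition. The sketch below is a software illustration only and is not the masking scheme evaluated in the paper; the helper names are hypothetical, and the nonlinear S-box, which needs considerably more care, is deliberately left out.

```python
import secrets

def xor_bytes(a, b):
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def masked_key_addition(state, round_key):
    """Add the round key to a state that is hidden behind a fresh random mask.

    Returns (masked_result, mask). The intermediate values depend on the
    random mask, not on the plain state alone.
    """
    mask = secrets.token_bytes(len(state))      # fresh random bits per call
    masked_state = xor_bytes(state, mask)       # hide the state
    masked_result = xor_bytes(masked_state, round_key)
    return masked_result, mask

def unmask(masked_result, mask):
    """Remove the mask; XOR is its own inverse, so masking is reversible."""
    return xor_bytes(masked_result, mask)
```

Because XOR is linear, `unmask(*masked_key_addition(state, key))` equals the unmasked `state XOR key`, whatever mask was drawn.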

ieee international symposium on asynchronous circuits and systems | 2006

GALS at ETH Zurich: success or failure?

Frank K. Gürkaynak; Stephan Oetiker; Hubert Kaeslin; Norbert Felber; Wolfgang Fichtner

The Integrated Systems Laboratory (IIS) of ETH Zurich (Swiss Federal Institute of Technology) has been active in globally-asynchronous locally-synchronous (GALS) research since 1998. During this time, a number of GALS circuits have been fabricated and tested successfully on silicon. From a hardware designer's point of view, this article summarizes the evolution from proof-of-concept designs over multi-point interconnects to applications that specifically take advantage of GALS operation to improve cryptographic security. In spite of the fact that they fail to address numerous idiosyncrasies of GALS (such as good partitioning into synchronous islands, port controller design, pausable clock generators, design for test, etc.), hierarchical design flows have been found to form a workable basis. What prevents GALS from gaining a wider acceptance is mainly the initial effort required to come up with a design flow that is efficient and dependable.

IEEE Journal of Solid-state Circuits | 1989

In-place updating of path metrics in Viterbi decoders

M. Biver; Hubert Kaeslin; C. Tommasini

An area-efficient in-place computation scheme for updating path metrics in solid-state Viterbi decoders is proposed. The permutation of items in memory, resulting as a by-product from in-place updating, is formally shown to be cyclic address rotation, which can be compensated for with almost no extra hardware.
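The cyclic address rotation can be observed in a small software model. The sketch below assumes a shift-register trellis with 2^m states in which input bit b takes state s to (s >> 1) | (b << (m-1)), so states 2j and 2j+1 both feed states j and j + 2^(m-1). The branch-metric callback `bm` and all function names are hypothetical, and the paper's actual hardware organization is not modeled.

```python
def rotl(x, k, m):
    """Rotate the m-bit address x cyclically left by k positions."""
    k %= m
    return ((x << k) | (x >> (m - k))) & ((1 << m) - 1)

def step_reference(old, bm, m):
    """Conventional update that writes into a second metric array."""
    half = 1 << (m - 1)
    new = [0] * (1 << m)
    for j in range(half):
        for nxt in (j, j + half):
            new[nxt] = min(old[2 * j] + bm(2 * j, nxt),
                           old[2 * j + 1] + bm(2 * j + 1, nxt))
    return new

def step_in_place(mem, bm, m, t):
    """In-place update number t (t = 0, 1, 2, ...).

    Before the call, the metric of state s sits at address rotl(s, t, m);
    afterwards it sits at rotl(s, t + 1, m). Each butterfly overwrites
    exactly the two locations it has just read.
    """
    half = 1 << (m - 1)
    for j in range(half):
        a0, a1 = rotl(2 * j, t, m), rotl(2 * j + 1, t, m)
        p0, p1 = mem[a0], mem[a1]
        mem[a0] = min(p0 + bm(2 * j, j), p1 + bm(2 * j + 1, j))
        mem[a1] = min(p0 + bm(2 * j, j + half), p1 + bm(2 * j + 1, j + half))
```

After m steps the rotation has completed a full cycle, so every metric is back at its natural address; in between, a reader only needs to rotate the state number by the step count to find it.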

international conference on electronics, circuits, and systems | 2012

High-speed compressed sensing reconstruction on FPGA using OMP and AMP

Lin Bai; Patrick Maechler; Michael Muehlberghuber; Hubert Kaeslin

Compressed sensing allows sparse signals sampled at sub-Nyquist rates to be reconstructed. However, reconstruction of the original signal requires high computational effort, even for problems of moderate size. Especially for applications with real-time requirements, software realizations are not fast enough. We therefore present generic high-speed FPGA implementations of two fast reconstruction algorithms: orthogonal matching pursuit (OMP) and approximate message passing (AMP). Our implementations also support less sparse signals, which makes them suitable for, e.g., image reconstruction. The two implementations are optimized for highly parallel processing on FPGAs and have similar hardware structures, which allows comparisons in terms of resource usage and performance.
