Tobias G. Noll
RWTH Aachen University
Publication
Featured research published by Tobias G. Noll.
european solid-state circuits conference | 2007
Alexander Flocke; Tobias G. Noll
As a possible successor for CMOS memory, hysteretic materials organized in crossbar structures are currently being investigated. Here, passive materials are of special importance as they maintain their functionality even when scaled down to the nanometer domain. With their regularity and inherent device density, so-called nano-scaled crossbars appear very promising for future components beyond the present scope of the ITRS CMOS roadmap. However, due to their passive behavior, they cannot operate on their own without active devices that restore signal levels. This work investigates the limitations that resistive hysteretic crossbars face due to their very nature, and what performance CMOS read circuits will have to offer for hybrid circuits to result in a functional new technology.
application-specific systems, architectures, and processors | 2002
Holger Blume; H. Hübert; H. T. Feldkämper; Tobias G. Noll
The exploration of the design space for heterogeneous Systems on Chip (SoC) becomes more and more important. As modern SoCs include a variety of different architecture blocks ensuring flexibility as well as highest performance, it is mandatory to prune the design space in an early stage of the design process in order to achieve short innovation cycles for new products. Thus, the goal of this work is to provide estimations of implementation specific parameters like throughput rate, power dissipation and silicon area by means of cost functions featuring reasonable accuracy at low modeling effort. A model based exploration strategy supporting the design flow for heterogeneous SoCs is presented. In order to demonstrate the feasibility of this exploration strategy, in a first step implementation cost parameters are provided for a variety of basic operations frequently required in digital signal processing which were implemented on discrete components like DSPs, FPGAs or dedicated ASICs. These implementation parameters serve as a basis for deriving cost models for the design space exploration concept.
IEEE Journal of Solid-state Circuits | 2002
Tobias Gemmeke; Michael Gansen; Tobias G. Noll
Today's data reconstruction in digital communication systems requires designs of highest throughput rate at low power. The Viterbi algorithm is a key element in such digital signal processing applications. The nonlinear and recursive nature of the Viterbi decoder makes its high-speed implementation challenging. Several promising approaches to achieve either high throughput or low power have been proposed in the past. A combination of these is developed in this paper. Additional new concepts allow building a signal-flow graph suitable for the design of high-speed Viterbi decoders with low power. Using a flexible datapath generator facilitates the essential quantitative optimization from architectural down to physical level to fully exploit the low-power and high-speed potential of a given technology. With parameterizable design entry, this datapath generator establishes the basis of a scalable platform-based design library. Altogether, this allows coverage of the range of today's industrial interest in high throughput rates, from 150 Msymbols/s up to 1.2 Gsymbols/s using conventional CMOS logic. The features of two exemplary Viterbi decoder implementations prove the benefit of this physically oriented design methodology in terms of speed and low power, when compared to other leading-edge implementations.
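The recursion that the abstract calls nonlinear and recursive is the add-compare-select (ACS) step: each new path metric depends on the previously selected metrics. The following is a minimal software sketch of that step, not the decoder architecture from the paper. The rate-1/2, constraint-length-3 code with generators 7 and 5 (octal), the hard-decision Hamming metric, and all function names are assumptions chosen for illustration.

```python
# Minimal hard-decision Viterbi decoder (illustrative sketch only).
# Assumed example code: rate 1/2, constraint length 3, generators 7 and 5 (octal).

G = (0b111, 0b101)        # generator polynomials (assumption)
K = 3                     # constraint length (assumption)
N_STATES = 1 << (K - 1)   # number of trellis states


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits):
    """Convolutionally encode bits and append K-1 zero tail bits."""
    state = 0
    out = []
    for b in list(bits) + [0] * (K - 1):
        reg = (b << (K - 1)) | state      # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1                  # drop the oldest bit
    return out


def viterbi_decode(received):
    """Return the most likely input bits for a zero-terminated sequence."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)    # encoder starts in state 0
    history = []                              # survivor decisions per step
    for i in range(0, len(received), len(G)):
        symbol = received[i:i + len(G)]
        new_metrics = [inf] * N_STATES
        decisions = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [parity(reg & g) for g in G]
                # add: old path metric plus Hamming branch metric
                cand = metrics[state] + sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                # compare-select: keep the better of the competing paths
                if cand < new_metrics[nxt]:
                    new_metrics[nxt] = cand
                    decisions[nxt] = (state, b)
        metrics = new_metrics
        history.append(decisions)
    # Trace back from state 0, which the tail bits force at the end.
    state = 0
    bits = []
    for decisions in reversed(history):
        state, b = decisions[state]
        bits.append(b)
    bits.reverse()
    return bits[:-(K - 1)]                    # strip the tail bits


message = [1, 0, 1, 1, 0, 0, 1]
noisy = encode(message)
noisy[3] ^= 1                                 # inject a single bit error
assert viterbi_decode(noisy) == message
```

Because `new_metrics` cannot be computed before `metrics` is final, the ACS loop cannot simply be pipelined, which is the implementation difficulty the abstract refers to.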
application-specific systems, architectures, and processors | 2000
V. S. Gierenz; Oliver Weiss; Tobias G. Noll; I. Carew; Jonathan J. Ashley; Razmik Karabed
In today's high-speed disk drive read channel ICs, maximum likelihood detection using the Viterbi algorithm is a key component in reconstructing digital data sequences. The presented Viterbi decoder was realized in a 0.25 µm CMOS technology. Using the proposed comparison approach, it achieves a throughput rate of 550 Mb/s.
international conference on acoustics, speech, and signal processing | 2005
Gordian Prescher; Tobias Gemmeke; Tobias G. Noll
The paper presents a high-performance turbo decoder. Its major building blocks, the maximum-a-posteriori decoder and the interleaver, are optimized from architecture to layout level to achieve high throughput at low power. This includes a novel architecture for parallel interleaving that sustains any interleaving scheme. Moreover, the key features of the major building blocks are analyzed and modeled for quick design space exploration, e.g. achieving 760 Mb/s at 570 mW in a 0.13 µm CMOS technology. Finally, the characterized implementations are benchmarked.
IEEE Journal of Solid-state Circuits | 2004
Tobias Gemmeke; Michael Gansen; Heinrich J. Stockmanns; Tobias G. Noll
In recent years, power dissipation along with silicon area has become the key figure in chip design. The increasing demands on system performance require high-performance digital signal processing (DSP) systems to include dedicated number-crunching units as individually optimized building blocks. The various design methodologies in use stress one of the following figures: power dissipation, throughput, or silicon area. This paper presents a design methodology reducing any combination of cost drivers subject to a specified throughput. As a basic principle, the underlying optimization regards the existing interactions within the design space of a building block. Crucial in such optimization is the proper dimensioning of device sizes in contrast to the common use of minimal dimensions in low-power implementations. Taking the design space of an FIR filter as an example, the different steps of the design process are highlighted resulting in a low-power high-throughput filter implementation. It is part of an industrial read-write channel chip for hard disks with a worst case throughput of 1.6 GSamples/s at 23 mW in a 0.13-µm CMOS technology. This filter requires less silicon area than other state-of-the-art filter implementations, and it disrupts the average trend of power dissipation by a factor of 6.
application-specific systems, architectures, and processors | 1997
Michael Gansen; Frank Richter; Oliver Weiss; Tobias G. Noll
A new flexible datapath generator which allows the automated design of full-custom macros covering dedicated filter structures as well as programmable DSP cores is presented. The underlying concept combines the advantages of full-custom designs concerning power dissipation, silicon area, and throughput rate with a moderate design effort. In addition, the datapath generator can be easily included in existing semi-custom design flows. This enables highly efficient VLSI implementations of optimized full-custom macros (datapaths) embedded into synthesized standard cell designs covering uncritical structures in terms of area, power, and throughput (e.g. control paths) using common design flows. In order to demonstrate the datapath generator assisted design flow, the implementation of a time-shared correlator is presented as an example.
Journal of Systems Architecture | 2007
Holger Blume; Daniel Becker; Lisa Rotenberg; Martin Botteck; Jörg Brakensiek; Tobias G. Noll
In this contribution, the concept of functional-level power analysis (FLPA) for power estimation of programmable processors is extended in order to model embedded as well as heterogeneous processor architectures featuring different embedded processor cores. The basic FLPA approach is based on the separation of the processor architecture into functional blocks such as the processing unit, clock network, internal memory, etc. The power consumption of these blocks is described by parameterized arithmetic models. By applying a parser-based automated analysis of assembler codes, the input parameters of the arithmetic functions, e.g. the achieved degree of parallelism or the kind and number of memory accesses, can be computed. For modeling an embedded general-purpose processor (here, an ARM940T), the basic FLPA modeling concept had to be extended to a so-called hybrid functional-level and instruction-level (FLPA/ILPA) model in order to achieve good modeling accuracy. In order to show the applicability of this approach, a heterogeneous processor architecture (OMAP5912) featuring an ARM926EJ-S core and a C55x DSP core has also been modeled using the hybrid FLPA/ILPA technique. The approach is demonstrated and evaluated by applying a variety of digital signal processing tasks ranging from basic filters to complete audio decoders and classical benchmark suites. Estimated power figures for the inspected tasks are compared to physically measured values for both inspected processor architectures. A resulting maximum estimation error of 9% for the ARM940T and less than 4% for the OMAP5912 is achieved.
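The structure of such a functional-level model can be sketched as a sum of per-block arithmetic functions evaluated on parameters extracted from the code. The sketch below only illustrates that structure. The block list, the parameter names `parallelism` and `mem_access_rate`, and every coefficient are hypothetical placeholders, not values from the paper or from any real processor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BlockModel:
    """One functional block with its parameterized power model in mW."""
    name: str
    power_mw: Callable[[Dict[str, float]], float]


def estimate_power(blocks: List[BlockModel], params: Dict[str, float]) -> float:
    """Total estimated power is the sum of all block contributions."""
    return sum(block.power_mw(params) for block in blocks)


# Hypothetical blocks and coefficients, for illustration only.
blocks = [
    BlockModel("clock network", lambda p: 10.0),
    BlockModel("processing unit", lambda p: 5.0 + 20.0 * p["parallelism"]),
    BlockModel("internal memory", lambda p: 2.0 + 15.0 * p["mem_access_rate"]),
]

# In an FLPA flow these parameters would come from the assembler-code parser;
# here they are simply made-up inputs.
params = {"parallelism": 0.5, "mem_access_rate": 0.25}
total = estimate_power(blocks, params)
assert abs(total - 30.75) < 1e-9   # 10.0 + 15.0 + 5.75
```

Keeping each block as a separate function mirrors the separation into functional blocks described above and makes it easy to swap in an instruction-level term for a single block, as a hybrid model would.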
application-specific systems, architectures, and processors | 2006
T. von Sydow; B. Neumann; Holger Blume; Tobias G. Noll
Embedding FPGAs (eFPGAs) in modern SoCs provides a high degree of flexibility while high-throughput digital signal processing algorithms can be realised efficiently. An analysis of eFPGA architectures and corresponding structural elements is presented to determine the optimisation potential for eFPGAs tailored to an arithmetic-oriented application domain. The applied design flow incorporating an automated layout generation approach and the utilised simulation environment is discussed. An eFPGA macro designed and realised for arithmetic-oriented applications is quantitatively compared to an actual commercial FPGA in terms of area, power consumption and delay time. It can be shown that this optimised eFPGA macro outperforms a state-of-the-art commercial device for several arithmetic operators which are commonly applied in arithmetic datapaths.
international conference on nanotechnology | 2008
Alexander Flocke; Tobias G. Noll; Carsten Kügeler; C. Nauenheim; Rainer Waser
While materials with a linear IV-characteristic result in a practically unusable voltage swing when used in crossbar arrays, materials with a nonlinear IV-curve were expected to yield better results. With a fundamental analytical approach, we can show that the gain is theoretically limited to the square root of the number of word lines used in the crossbar. Furthermore, the degree of nonlinearity must not exceed a certain value, otherwise the voltage swing decreases. TiO2 with its nonlinear IV-characteristic outperforms any possible material with a linear IV-characteristic, but the voltage swing for large crossbars is still below 10% of VDD and would demand costly sense amplifiers.
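Why a linear IV-characteristic leaves so little voltage swing can be illustrated with a simple voltage-divider model. This is not the analysis from the paper. It assumes one particular read scheme (selected word line at VDD, all unselected word lines grounded, bit line terminated by a load resistor), purely linear cells, and a pessimistic data pattern in which every unselected cell on the bit line is in the low-resistance state. All names and resistance values are illustrative assumptions.

```python
def bitline_voltage(g_sel, g_others, g_load, vdd=1.0):
    """Voltage divider: selected cell to VDD, everything else to ground."""
    return vdd * g_sel / (g_sel + g_others + g_load)


def worst_case_swing(n_wordlines, r_on, r_off, r_load, vdd=1.0):
    """Bit-line voltage difference between reading an ON and an OFF cell.

    Assumes all n_wordlines - 1 unselected cells on the bit line are ON
    and tied to grounded word lines (assumed pessimistic pattern).
    """
    g_on, g_off, g_load = 1.0 / r_on, 1.0 / r_off, 1.0 / r_load
    g_others = (n_wordlines - 1) * g_on
    v_on = bitline_voltage(g_on, g_others, g_load, vdd)
    v_off = bitline_voltage(g_off, g_others, g_load, vdd)
    return v_on - v_off


# Illustrative resistances only: 1 kOhm ON, 1 MOhm OFF, 1 kOhm load.
for n in (2, 8, 64):
    swing = worst_case_swing(n, r_on=1e3, r_off=1e6, r_load=1e3)
    print(f"{n:3d} word lines: swing = {swing:.4f} * VDD")
```

Under these assumptions the swing falls roughly in proportion to 1/N as the number of word lines N grows, because the unselected linear cells load the bit line just as strongly as the selected one drives it.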