Stefan Bitterlich | Researchain

Publication

Featured research published by Stefan Bitterlich.

international conference on asic | 2000

ICORE: a low-power application specific instruction set processor for DVB-T acquisition and tracking

Tilman Glökler; Stefan Bitterlich; Heinrich Meyr

A design methodology is presented to optimize application specific instruction set processors (ASIPs) with respect to performance and power. The methodology uses semi-custom design with incremental datapath and instruction set enhancements of a conventional, unoptimized architecture. ICORE, a low-power ASIP for DVB-T acquisition and tracking algorithms, demonstrates the large power-saving potential of these optimizations.

international conference on acoustics, speech, and signal processing | 2001

Power efficient semi-automatic instruction encoding for application specific instruction set processors

Tilman Glökler; Stefan Bitterlich

A novel design methodology for the implementation of control units for application specific instruction set processors (ASIPs) is described. This methodology uses automatic instruction encoding and semi-automatic generation of the hardware instruction decoder to speed up the ASIP design. Significant power savings due to optimized instruction encoding are achieved. Results for ICORE (ISS-Core), an ASIP from Infineon Technologies for digital video broadcasting algorithms, demonstrate the efficiency and applicability of this approach.

signal processing systems | 2000

Increasing the power efficiency of application specific instruction set processors using datapath optimization

Tilman Glökler; Stefan Bitterlich; Heinrich Meyr

Application specific instruction set processors (ASIPs) can be optimized for both speed and power by taking advantage of the flexibility of a synthesized semi-custom implementation. The current case study evaluates the effect of datapath and instruction set optimization using two examples from terrestrial digital video broadcasting (DVB-T) acquisition and tracking algorithms. Starting from a conventional, unoptimized instruction set architecture, which uses simple instructions like any commercially available DSP, incremental application specific optimizations are performed. Results in terms of cycle count and energy per task are used to evaluate the feasibility and the power efficiency of each implementation. The results show a large potential for power savings (>6x) using this design methodology.

international conference on application specific array processors | 1993

Efficient scalable architectures for Viterbi decoders

Stefan Bitterlich; H. Meyr

Viterbi decoders (VDs) are widely used today for the decoding of convolutional codes in forward error correction schemes. Efficient deeply pipelined VLSI architectures, the generalized cascade VD (GCVD) and the trellis pipeline-interleaving (TPI) VD, are adaptable to a given data rate only to a limited extent. The authors propose a novel unified class of deeply pipelined architectures, the scalable parallel Viterbi decoders (SPVDs), that allows for a smoother adaptation to a given data rate. The designer is therefore able to choose an architecture that almost exactly fulfills the throughput demands of the application without wasting silicon area on a badly adapted architecture. This class of SPVDs contains the GCVD, TPI, node-serial and node-parallel architectures as important subclasses. Thus, it also provides a framework for a unified description of the existing architectures. Furthermore, architectures can be derived that allow for 100% utilization, making the complicated rate synchronization superfluous or trivial.

international symposium on circuits and systems | 1992

Trellis pipeline-interleaving: a novel method for efficient Viterbi decoder implementation

Herbert Dawid; Stefan Bitterlich; Heinrich Meyr

The authors derive the novel trellis pipeline-interleaving (TPI) technique for introducing pipeline-interleaving into the nonlinear data-dependent add-compare-select (ACS) recursion in Viterbi decoders. It is shown that the overall recursion can be split into loosely coupled parts which can be computed in an interleaved way. A formal method is given to introduce various degrees of interleaving into the recursion, making use of the topological equivalence of different trellis representations. Conventional high-speed Viterbi decoder architectures are coarse-grain pipelined at the ACS level. Using TPI, far more efficient solutions are obtained by using fewer processing elements than states. The additional concurrency available through TPI is exploited by fine-grain pipelined architectures. The results agree with the general concept of first introducing pipelining to the maximum possible extent before introducing parallelism to achieve the most efficient solutions.
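
For readers unfamiliar with the ACS recursion mentioned in this abstract, the following is a minimal sketch of one plain add-compare-select step over all trellis states. It is not the TPI technique of the paper. The function name `acs_step`, the shift-register state numbering (predecessors of state n are n >> 1 and (n >> 1) with the top bit set), and the "smaller metric wins" convention are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def acs_step(
    path_metrics: Sequence[float],
    branch_metrics: Sequence[Tuple[float, float]],
) -> Tuple[List[float], List[int]]:
    """One add-compare-select update for every state of a binary trellis.

    Assumed (hypothetical) conventions:
      - the number of states is a power of two,
      - state n has predecessors p0 = n >> 1 and p1 = p0 | (n_states >> 1),
      - branch_metrics[n] = (metric of branch p0 -> n, metric of branch p1 -> n),
      - the path with the smaller accumulated metric survives.
    """
    n_states = len(path_metrics)
    new_metrics: List[float] = []
    decisions: List[int] = []
    for n in range(n_states):
        p0 = n >> 1
        p1 = p0 | (n_states >> 1)
        # Add: extend both candidate paths into state n.
        c0 = path_metrics[p0] + branch_metrics[n][0]
        c1 = path_metrics[p1] + branch_metrics[n][1]
        # Compare: ties are resolved in favour of predecessor p0.
        d = 1 if c1 < c0 else 0
        # Select: keep the survivor metric and record the decision bit.
        new_metrics.append(c1 if d else c0)
        decisions.append(d)
    return new_metrics, decisions


if __name__ == "__main__":
    metrics, bits = acs_step([0, 3, 1, 5], [(1, 2), (2, 0), (0, 4), (3, 1)])
    print(metrics, bits)
```

Each new metric depends on the metrics of the previous step and on a data-dependent comparison, which is the nonlinear feedback loop the abstract refers to.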

international conference on asic | 1999

A universal coprocessor and its application for an ADSL modem

Tilman Glökler; Stefan Bitterlich; Heinrich Meyr

Hardware implementations of digital signal processing tasks are typically considered inflexible and difficult to reuse. The current case study focuses on the design of a universal reusable coprocessor with application specific configurations to accelerate computationally intensive tasks. The design methodology and the architecture are described for essential parts of an ADSL transceiver.

global communications conference | 1992

Boosting the implementation efficiency of Viterbi decoders by novel scheduling schemes

Stefan Bitterlich; Herbert Dawid; Heinrich Meyr

The scheduling scheme presented allows one to dramatically cut down the area needed to implement the ACS (add-compare-select) unit, which is the principal area consumer in high-speed Viterbi decoders (VDs). Since computation speed is only slightly compromised, a boost in implementation efficiency over existing methods can be achieved. Since a systematic derivation for a very efficient scheduling scheme is presented, the proposed scheme can be used to construct good schedules for traditional node-serial processors as well. Therefore, the scheme is also well suited for VDs with a very high number of states. With the proposed scheduling scheme, the same hardware structure can be used to efficiently decode a set of different codes that may even have different constraint lengths. Therefore, the scheme also allows for easy and efficient implementation of programmable, high-speed VDs. It is shown that the novel scheduling scheme applied to a real-world Viterbi decoder, the 64-state industry standard rate 1/2, k=7 VD (CCSDS 101.0-B-2), leads to an increase of implementation efficiency of up to 600%.

international conference on acoustics, speech, and signal processing | 2000

DSP core verification using automatic test case generation

Tilman Glökler; Stefan Bitterlich; Heinrich Meyr

The verification methodology for a TMS320C25 compatible embedded DSP core is described. The DSP core has been implemented in synthesizable VHDL and has been cosimulated with the original DSP to verify correct behavior. Automatic test case generation together with hand-crafted code has been used as a means of providing stimuli to achieve increased RTL-simulation coverage. The cosimulation environment for this verification and the process of automatic test case generation are described in detail. Experimental results in terms of simulation coverage are discussed. Finally, a classification of all identified design flaws in the implementation is given and error-prone parts of the HDL design are identified.

Design Automation for Embedded Systems | 1998

Design Methodology for a DVB Satellite Receiver ASIC

Martin Vaupel; Uwe Lambrette; Herbert Dawid; Olaf J. Joeressen; Stefan Bitterlich; Heinrich Meyr; Focko Frieling; Karsten Müller; Götz Kluge

This contribution describes design methodology and implementation of a single-chip timing and carrier synchronizer and channel decoder for digital video broadcasting over satellite (DVB-S). The device consists of an A/D converter with AGC, timing and carrier synchronizer with matched filter, Viterbi decoder including node synchronization, byte and frame synchronizer, convolutional de-interleaver, Reed Solomon decoder, and a descrambler. The system was designed in accordance with the DVB specifications. It is able to perform Viterbi decoding at data rates up to 56 Mbit/s and to sample the analog input values with up to 88 MHz. The chip allows automatic acquisition of the convolutional code rate and the position of the puncturing mask. The symbol synchronization is performed fully digitally by means of interpolation and controlled decimation. Hence, no external analog clock recovery circuit is needed. For algorithm design, system performance evaluation, co-verification of the building blocks, and functional hardware verification an advanced design methodology and the corresponding tool framework are presented which guarantee both short design time and highly reliable results. The chip has been fabricated in a 0.5 µm CMOS technology with three metal layers. A die photograph is included.

Archive | 1997

An All-Digital Single-Chip Symbol Synchronizer and Channel Decoder for DVB

Martin Vaupel; Uwe Lambrette; Herbert Dawid; Olaf J. Joeressen; Stefan Bitterlich; Heinrich Meyr; Focko Frieling; K. Müller

In this contribution, the design process and implementation of a single-chip timing and carrier synchronizer and channel decoder for digital video broadcasting over satellite (DVB-S) are described. The device consists of an A-to-D converter with AGC, timing and carrier synchronizer including matched filter, Viterbi decoder including node synchronization, byte and frame synchronizer, convolutional de-interleaver, Reed Solomon decoder, and a descrambler. The system was designed in accordance with the DVB specifications. It is able to perform Viterbi decoding at data rates up to 56 Mbit/s and to sample the analog input values with up to 88 MHz. The chip allows automatic acquisition of the convolutional code rate and the position of the puncturing mask. The synchronization to the variable sample rates is performed fully digitally by means of interpolation and controlled decimation. Hence, no external analog clock recovery circuit is needed. For algorithm design, system performance evaluation, and co-verification of the building blocks an advanced design methodology was used. This guarantees both short design time and high reliability. The chip has been fabricated in a 0.5 µm CMOS technology with three metal layers. A die photograph is presented.
