Ernst Martin Witte | Researchain

Publication

Featured research published by Ernst Martin Witte.

IEEE Transactions on Circuits and Systems II: Express Briefs | 2010

A Scalable VLSI Architecture for Soft-Input Soft-Output Single Tree-Search Sphere Decoding

Ernst Martin Witte; Filippo Borlenghi; Gerd Ascheid; Rainer Leupers; Heinrich Meyr

Multiple-input multiple-output (MIMO) wireless transmission imposes huge challenges on the design of efficient hardware architectures for iterative receivers. A major challenge is soft-input soft-output (SISO) MIMO demapping, often approached by sphere decoding (SD). In this brief, we introduce what is, to the best of our knowledge, the first VLSI architecture for SISO SD applying a single tree-search approach. Compared with a soft-output-only base architecture similar to the one proposed by Studer in IEEE J-SAC 2008, the architectural modifications for soft input still allow one-node-per-cycle execution. For a 4×4 antenna system using quadrature amplitude modulation (QAM) of order 16, the area increases by 57%, and the operating frequency degrades by only 34%.

IEEE Transactions on Circuits and Systems II: Express Briefs | 2007

Application-Specific Instruction-Set Processor for Retinex-Like Image and Video Processing

Sergio Saponara; Luca Fanucci; Stefano Marsi; Giovanni Ramponi; David Kammler; Ernst Martin Witte

This brief presents an application-specific instruction-set processor (ASIP) for real-time Retinex image and video filtering. Design optimizations are addressed at the algorithmic and architectural levels, the latter including a dedicated memory structure, an adapted pipeline, bypasses, a custom address generator, and special looping structures. Synthesized in CMOS technology, the ASIP stands out for its better energy-flexibility tradeoff versus reference ASIC and digital signal processing Retinex implementations.

Asian Solid-State Circuits Conference | 2011

A 772Mbit/s 8.81bit/nJ 90nm CMOS soft-input soft-output sphere decoder

Filippo Borlenghi; Ernst Martin Witte; Gerd Ascheid; Heinrich Meyr; Andreas Burg

Multiple-input multiple-output (MIMO) wireless transmission can approach its full potential in terms of spectral efficiency only with iterative decoding, i.e., by exchanging soft information between the MIMO detector and the channel decoder. Solving the soft-input soft-output (SISO) MIMO detection problem entails a very high complexity, which can typically be reduced only at the cost of a communication-performance penalty. The single tree-search (STS) sphere-decoding (SD) algorithm covers a wide range of this complexity-performance tradeoff. In this paper, we describe the silicon implementation of SISO STS SD. The 90 nm CMOS ASIC operates at a lower signal-to-noise ratio than other MIMO detectors. The maximum throughput is 772 Mbit/s at an energy efficiency of 8.81 bit/nJ.

Rapid System Prototyping | 2005

Optimization techniques for ADL-driven RTL processor synthesis

Oliver Schliebusch; Anupam Chattopadhyay; Ernst Martin Witte; David Kammler; Gerd Ascheid; Rainer Leupers; Heinrich Meyr

Architecture description languages (ADLs) are becoming popular for speeding up the development of complex SoC designs by performing design space exploration at a higher level of abstraction. This increase in the abstraction level traditionally comes at the cost of low performance of the final application-specific instruction-set processor (ASIP) implementation, which is generated automatically from the ADL. There is a pressing need for novel optimization techniques for high-level synthesis from ADLs to compensate for this loss of performance. Two important aspects of these optimizations are the efficient usage of available structural information in the high-level architecture descriptions and prudent pruning of the overhead introduced by mapping from ADL to register transfer level (RTL). In this paper, we present two high-level optimization techniques, path sharing and decision minimization. These optimization techniques are shown to be of lower complexity, by at least two orders of magnitude, compared to similar optimization during gate-level synthesis. The optimizations are tested for a RISC architecture, a VLIW architecture, and two industrial embedded processors, Motorola M68HC11 and Infineon ICORE. The results indicate a significant improvement in overall performance.

Military Communications Conference | 2009

Efficient and portable SDR waveform development: The Nucleus concept

Venkatesh Ramakrishnan; Ernst Martin Witte; Torsten Kempf; David Kammler; Gerd Ascheid; Rainer Leupers; Heinrich Meyr; Marc Adrat; Markus Antweiler

Future wireless communication systems should be flexible enough to support different waveforms (WFs) and cognitive enough to sense the environment and tune themselves. This has led to tremendous interest in software-defined radios (SDRs). Constraints like throughput, latency, and low energy demand high implementation efficiency. The tradeoff of going for a highly efficient WF implementation is the increased effort of porting to a new HW platform. In this paper, we propose a novel concept for WF development, the Nucleus concept, that exploits the common structure in various wireless signal processing algorithms and provides a way for efficient and portable implementation. Tool-assisted WF mapping and exploration is done efficiently by propagating the implementation and interface properties of Nuclei. The Nucleus concept aims at providing software flexibility with high-level programmability, while at the same time limiting HW flexibility to maximize area and energy efficiency.

Design, Automation, and Test in Europe | 2006

Automatic ADL-based Operand Isolation for Embedded Processors

Anupam Chattopadhyay; B. Geukes; David Kammler; Ernst Martin Witte; Oliver Schliebusch; Harold Ishebabi; Rainer Leupers; Gerd Ascheid; Heinrich Meyr

Cutting-edge applications of future embedded systems demand the highest processor performance with low power consumption to achieve acceptable battery life. Therefore, low-power optimization techniques are widely applied during the development of modern application-specific instruction-set processors (ASIPs). Electronic system level design tools based on architecture description languages (ADLs) offer a significant reduction in design time and effort by automatically generating the software tool suite as well as the register transfer level (RTL) description of the processor. In this paper, the automation of power optimization in ADL-based RTL generation is addressed. Operand isolation is a well-known power optimization technique applicable at all stages of processor development. With increasing design complexity, several efforts have been undertaken to automate operand isolation. In pipelined datapaths, where isolating signals are often implicitly available, the traditional RTL-based approach introduces unnecessary overhead. We propose an approach which extracts high-level structural information from the ADL representation and systematically uses the available control signals. Our experiments with state-of-the-art embedded processors show a significant power reduction (improvement in power efficiency).

International Conference on Computer Design | 2005

Applying resource sharing algorithms to ADL-driven automatic ASIP implementation

Ernst Martin Witte; Anupam Chattopadhyay; Oliver Schliebusch; David Kammler; Rainer Leupers; Gerd Ascheid; Heinrich Meyr

Presently, architecture description languages (ADLs) are widely used to raise the abstraction level of the design space exploration of application-specific instruction-set processors (ASIPs), benefiting from an automatically generated software tool suite and RTL implementation. The increase in abstraction level and automated implementation traditionally comes at the cost of low area, delay, or power efficiency. The standard synthesis flow starting at RTL abstraction fails to compensate for this loss of performance. Thus, high-level optimizations during RTL synthesis from ADLs are obligatory. Currently, ADL-based optimization schemes do not perform resource sharing. In this paper, we present an iterative algorithm for performing resource sharing on the basis of global dataflow graph matching criteria. This ADL-based resource sharing optimization is applied to a RISC and a VLIW architecture and two industrial embedded processors. The results indicate a significant improvement in overall performance. A comparative study with manually written RTL code is also presented.

International Conference on VLSI Design | 2007

Power-efficient Instruction Encoding Optimization for Embedded Processors

Anupam Chattopadhyay; Diandian Zhang; David Kammler; Ernst Martin Witte; Rainer Leupers; Gerd Ascheid; Heinrich Meyr

The increasing complexity of applications, together with a shortening time-to-market window, has created strong research interest in high-performance and flexible processors. A huge application domain, chiefly consisting of wireless and handheld devices, strongly requires this class of processors to be power-efficient, too. Within this domain, a demanding problem is to determine the instruction encoding of the processor that achieves minimum power consumption in the instruction bus and in the instruction memory. In this paper, a framework for determining power-efficient instruction encoding is presented. The authors have integrated existing and novel techniques in this framework and have proposed novel heuristic approaches. The framework accepts an existing processor instruction set and a group of applications. The output, which is an optimized instruction encoding under the constraints of a well-defined cost model, minimizes the power consumption of the instruction bus and the instruction memory. This results in a strong reduction of the overall power consumption. Case studies with commercial embedded processors show the effectiveness of this framework.

International Symposium on VLSI Design, Automation and Test | 2006

Automatic Low Power Optimizations during ADL-driven ASIP Design

Anupam Chattopadhyay; David Kammler; Ernst Martin Witte; Oliver Schliebusch; Harold Ishebabi; B. Geukes; Rainer Leupers; Gerd Ascheid; Heinrich Meyr

The increasing complexity of cutting-edge applications for future embedded systems demands ever higher processor performance with strong consideration for battery life. Low-power optimization techniques are, therefore, widely applied in the development of modern application-specific instruction-set processors (ASIPs). Architecture description languages (ADLs) offer ASIP designers a quick and optimal design convergence by automatically generating the software tool suite as well as the register transfer level (RTL) description of the processor. The automatically generated processor description is then subjected to the traditional RTL-based synthesis flow. Power-specific optimizations, often found in RTL-based commercial tools, cannot take full advantage of the architectural knowledge embedded in the ADL description, resulting in sub-optimal power efficiency. In this paper, we address this issue by describing an efficient and universal technique for automatic insertion of gated clocks during the ADL-based ASIP design flow. Experiments with ASIP benchmarks show the dramatic impact of our approach, which reduces power consumption by up to 41% compared to naive RTL synthesis from the ADL description, without any incurred overhead for area and speed.

Design, Automation, and Test in Europe | 2006

ASIP Design and Synthesis for Non Linear Filtering in Image Processing

Luca Fanucci; Michele Cassiano; Sergio Saponara; David Kammler; Ernst Martin Witte; Oliver Schliebusch; Gerd Ascheid; Rainer Leupers; Heinrich Meyr

This paper presents an application-specific instruction-set processor (ASIP) design for the implementation of a class of nonlinear image processing algorithms, the Retinex-like filters. Starting from high-level descriptions, algorithmic optimization is accomplished first. Then a processor architecture and an instruction set are customized with special respect to the algorithmic computations in order to achieve the specified timing at reasonable complexity. Taking advantage of the programmability of processor architectures, the flexibility of the system is increased, involving, e.g., dynamic parameter adjustment and color treatment. ASIP implementation results in 0.13 μm CMOS technology are presented.