Kristof Denolf | Researchain

Publication

Featured research published by Kristof Denolf.

EURASIP Journal on Advances in Signal Processing | 2007

SPRINT: a tool to generate concurrent transaction-level models from sequential code

Johan Cockx; Kristof Denolf; Bart Vanhoof; Richard Stahl

A high-level concurrent model such as a SystemC transaction-level model can provide early feedback during the exploration of implementation alternatives for state-of-the-art signal processing applications like video codecs on a multiprocessor platform. However, the creation of such a model starting from sequential code is a time-consuming and error-prone task. It is typically done only once, if at all, for a given design. This lack of exploration of the design space often leads to a suboptimal implementation. To support our systematic C-based design flow, we have developed a tool to generate a concurrent SystemC transaction-level model for user-selected task boundaries. Using this tool, different parallelization alternatives have been evaluated during the design of an MPEG-4 simple profile encoder and an embedded zero-tree coder. Generation plus evaluation of an alternative was possible in less than six minutes. This is fast enough to allow extensive exploration of the design space.

Signal Processing Systems | 2002

Initial memory complexity analysis of the AVC codec

Kristof Denolf; Carolina Blanch; Gauthier Lafruit; A. Bormans

The Advanced Video Codec (AVC), currently being defined in a joint standardisation effort of ISO/IEC MPEG and ITU-T VCEG, aims at enhanced compression efficiency and network friendliness. To achieve these goals, a motion compensated hybrid DCT algorithm is introduced using advanced and complicated compression tools. As video coding is typically a data dominated process, we quantify the complexity cost in a memory centric way. The AVC codec is characterised by a large memory footprint and increased data transfer rate (an order of magnitude for the encoder) compared to previous video coding standards. Motion estimation and compensation are the initial implementation bottlenecks.

IEEE Transactions on Circuits and Systems for Video Technology | 2002

Algorithmic and architectural co-design of a motion-estimation engine for low-power video devices

C. De Vleeschouwer; T. Nilsson; Kristof Denolf; J. Bormans

Due to the large amount of data transfers it involves, the motion estimation (ME) engine is one of the most power-consuming components of any predictive video codec. As a consequence, power-optimized video coding primarily relies on a carefully designed motion estimator. This paper first presents a block ME algorithm that meets high-quality inter-frame prediction and low computational complexity requirements. It relies on a set of rules common to all recent fast and adaptive ME algorithms, but is designed in order to allow for easy and prolific data reuse. The adjacent order of the candidate positions during the search increases the locality and maintains a near-regular data flow, which results in a decrease of the data transfers and a low control complexity. Together with the computational complexity reduction, it enables cost-efficient very large scale integration realizations. A pipelined parallel architecture is then proposed and discussed. It is generic in the sense that it is suited both to the full-pel and half-pel ME. It is efficient because it allows for close to 100% hardware utilization and a sharp decrease of the peak memory bandwidth. It is suited to low-power implementation, as it enables larger data reuse factors for the most probable stages of the adaptive algorithm, which reduces the average memory bandwidth and power consumption.
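As a point of reference for the abstract above, block motion estimation can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). This is a plain baseline under stated assumptions, not the fast adaptive algorithm or the pipelined architecture of the paper; the function names `sad` and `full_search` and the frame layout (lists of rows of integer pixel values) are hypothetical choices made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive block matching (illustrative baseline only).

    Returns (dx, dy, cost) for the candidate with the lowest SAD, or None
    if no candidate lies inside the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or bx + dx + n > width:
                continue
            if by + dy < 0 or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Every candidate here reloads its full reference block, which is the kind of data transfer cost that the abstract's adjacent candidate ordering and data reuse aim to reduce.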

International Symposium on Circuits and Systems | 2000

3D computational graceful degradation

Gauthier Lafruit; Lode Nachtergaele; Kristof Denolf; J. Bormans

With new multimedia standards, such as MPEG-4, media can progressively be transmitted at different levels of detail, which allows dynamic adaptation to the available network bandwidth. We present a new technique, 3D Computational Graceful Degradation (CGD), that exploits this incremental coding/decoding process to constrain the terminal's processing requirements to a predefined level, independently of the degree of complexity of the incoming data. Our attention is directed at 3D scenes, for which the variability of content complexity can range over several orders of magnitude. We provide evidence that the load of the 3D decoding and rendering modules can predictively be estimated and controlled, using a limited amount of statistical measures of the incoming 3D data.

IEEE Transactions on Circuits and Systems for Video Technology | 2005

Memory centric design of an MPEG-4 video encoder

Kristof Denolf; C. De Vleeschouwer; R. Turney; Gauthier Lafruit; J. Bormans

The cost-efficient implementation of video codecs requires a set of methodologies and decision taking at different levels in the design flow. We combine upfront algorithmic tuning with memory centric optimizations to transform the video application into a system consisting of functional blocks with localized data processing and a tailored memory hierarchy. This memory optimized functional description is the leverage for the cost-efficient mapping of the system on integrated multimedia platforms. It closely reflects the real implementation constraints and consequently allows for steering the architecture selection in a correct way. The proposed approach is demonstrated on an MPEG-4 video encoder and leads to its implementation as a pipelined system. Hardware development of the motion estimation validates that the high-level memory centric concepts are applicable and realizable at the lowest level. The motion estimation kernel supports up to 30 CIF f/s with minimized processing element requirements and data input rates.

EURASIP Journal on Advances in Signal Processing | 2007

Exploiting the Expressiveness of Cyclo-Static Dataflow to Model Multimedia Implementations

Kristof Denolf; Marco Jan Gerrit Bekooij; Johan Cockx; Diederik Verkest; Henk Corporaal

The design of increasingly complex and concurrent multimedia systems requires a description at a higher abstraction level. Using an appropriate model of computation helps to reason about the system and enables design time analysis methods. In many cases the nature of multimedia processing matches well with cyclo-static dataflow (CSDF), making it a suitable model. However, for cost reasons, channels in an implementation often use a kind of shared buffer that cannot be directly described in CSDF. This paper shows how such implementation specific aspects can be expressed in CSDF without the need for extensions. Consequently, the CSDF graph remains completely analyzable and allows reasoning about its temporal behavior. The obtained relation between model and implementation enables a buffer capacity analysis on the model while assuring the throughput of the final implementation. The capabilities of the approach are demonstrated by analyzing the temporal behavior of an MPEG-4 video encoder with a CSDF graph.
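To make the cyclo-static dataflow idea concrete, the sketch below simulates a single channel between one producer and one consumer whose token rates cycle through fixed phase sequences, and reports the peak number of tokens held in the channel. It is a toy illustration under simple assumptions (a demand-driven schedule, no timing, no shared buffers), not the buffer capacity analysis of the paper; the function name `peak_occupancy` and its parameters are hypothetical.

```python
def peak_occupancy(prod_rates, cons_rates, steps):
    """Simulate one CSDF channel for `steps` actor firings.

    prod_rates / cons_rates are the cyclic per-phase token rates of the
    producer and the consumer. The consumer fires whenever the channel
    holds enough tokens for its current phase; otherwise the producer
    fires. Returns the highest token count seen on the channel."""
    tokens = 0
    peak = 0
    p = 0  # number of producer firings so far (selects its phase)
    c = 0  # number of consumer firings so far (selects its phase)
    for _ in range(steps):
        need = cons_rates[c % len(cons_rates)]
        if tokens >= need:
            tokens -= need
            c += 1
        else:
            tokens += prod_rates[p % len(prod_rates)]
            p += 1
        peak = max(peak, tokens)
    return peak
```

For example, `peak_occupancy([1, 2], [3], 12)` models a producer that alternately emits 1 and 2 tokens feeding a consumer that takes 3 per firing. The returned peak is the buffer space this particular schedule needs, which is only a bound for that schedule and says nothing about throughput.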

Power and Timing Modeling, Optimization and Simulation | 2000

Cost-Efficient C-Level Design of an MPEG-4 Video Decoder

Kristof Denolf; Peter Vos; Jan Bormans; Ivo Bolsens

Advanced multimedia systems intrinsically have a high memory cost, making the design of high performance, low power solutions a real challenge. Rather than spending most effort on implementation platform dependent optimization steps, we advocate a methodology and tool that involve C-level platform independent optimizations. This approach is applied to an MPEG-4 video decoder, leading to high performance, reusable C code. When mapped on (embedded) processors, this allows for lower clock rates, enabling low power realizations.

Field-Programmable Logic and Applications | 2009

Using C-to-gates to program streaming image processing kernels efficiently on FPGAs

Kristof Denolf; Stephen Neuendorffer; Kees A. Vissers

Effectively exploiting the variety of computational and storage resources available in common FPGA architectures for complex applications, such as the real-time implementation of vision algorithms, is often difficult in standard HDL design methodologies. Higher-level design tools can enable a design to more quickly explore a range of different architectures. In this paper we apply algorithmic C-to-FPGA synthesis technology in a structured design approach and demonstrate its added value on two relevant vision processing kernels: optical flow and debayering. The impact of the proposed approach on the design time, the FPGA resource consumption and the throughput is measured.

Asia and South Pacific Design Automation Conference | 2002

Systematic Address and Control Code Transformations for Performance Optimisation of a MPEG-4 Video Decoder

Martin Palkovic; Miguel Miranda; Kristof Denolf; Peter Vos; Francky Catthoor

A cost-efficient realisation of an advanced multimedia system requires high-level memory optimisations to deal with the dominant memory cost. This typically results in more efficient code for both power and system bus load. However, significant performance improvement can also be achieved when carefully optimising the address functionality. This paper shows how the nature of this addressing code and the related control flow allows transformation of the complex index, iterator and condition expressions into efficient arithmetic. We apply our address optimisation (ADOPT) design technology to a low power memory optimised MPEG-4 decoder. When mapped on popular programmable multimedia processor architectures, we obtain a factor of 2 in performance gain.

Design, Automation, and Test in Europe | 2004

A power optimized display memory organization for handheld user terminals

Lieven Hollevoet; Andy Dewilde; Kristof Denolf; Francky Catthoor; Filip Louagie

Today's handheld devices are becoming more and more multimedia capable. One subsystem of a multimedia terminal that accounts for a considerable amount of the total power consumption is the display unit. The backlight is the major culprit there. As new display units without backlights emerge, the data transfers required to put data on the screen start using up an increasingly important part of the platform's power. We have examined a novel system view that allows for power savings by decreasing the required number of memory accesses to put a frame on the screen. A two-step optimization method for existing platforms is presented. Measurements on a multimedia application show that, on average, power savings of 72% can be obtained on the display related memory accesses. For the proposed optimization methods to work, it is important that both hardware and software designers become aware of the impact their design-time decisions have on the final power consumption of a system.
