Mark Christiaens | Researchain

Publication

Featured research published by Mark Christiaens.

Communications of The ACM | 2003

Record/replay for nondeterministic program executions

Michiel Ronsse; Koenraad De Bosschere; Mark Christiaens; Jacques Chassin de Kergommeaux; Dieter Kranzlmüller

Controlling the nondeterministic features within multithreaded and highly responsive applications enables the continued use of all traditional software development techniques.

Software - Practice and Experience | 2004

JaRec: a portable record/replay environment for multi-threaded Java applications

Andy Georges; Mark Christiaens; Michiel Ronsse; K. De Bosschere

This paper describes JaRec, a portable record/replay system for Java. It correctly replays multi-threaded, data-race free Java applications, by recording the order of synchronization operations, and by executing them in the same order during replay. The record/replay infrastructure is developed in Java, and does not require a modification of the Java Virtual Machine (JVM) if it provides the JVM Profiler Interface (JVMPI). If the JVM does not support JVMPI, which is used for intercepting the loaded classes, only a minor modification to the JVM is required in order to run the system. On systems with limited memory resources, JaRec can be executed in a distributed fashion. This also makes it suitable to aid debugging of multi-threaded applications on embedded systems.
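The core idea in this abstract, recording the order of synchronization operations and enforcing that order on replay, can be illustrated with a small sketch. This is not JaRec's implementation (JaRec is written in Java and instruments loaded classes); it is a minimal Python illustration, and the names `ReplayLock` and `run` are hypothetical.

```python
import threading


class ReplayLock:
    """Hypothetical lock that records acquisition order or enforces a recorded one."""

    def __init__(self, schedule=None):
        self._lock = threading.Lock()
        self._cond = threading.Condition()
        self.trace = []            # record mode: thread names in acquisition order
        self._schedule = schedule  # replay mode: the order that must be reproduced
        self._next = 0             # index of the next scheduled acquisition

    def acquire(self, name):
        if self._schedule is not None:
            # Replay: block until the recorded schedule says it is this thread's turn.
            with self._cond:
                self._cond.wait_for(
                    lambda: self._next < len(self._schedule)
                    and self._schedule[self._next] == name
                )
        self._lock.acquire()
        self.trace.append(name)

    def release(self):
        self._lock.release()
        if self._schedule is not None:
            # Advance the schedule and wake the threads waiting for their turn.
            with self._cond:
                self._next += 1
                self._cond.notify_all()


def run(lock, names, rounds=3):
    """Let each named thread append to a shared list `rounds` times under `lock`."""
    shared = []

    def worker(name):
        for _ in range(rounds):
            lock.acquire(name)
            shared.append(name)
            lock.release()

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    recorded = run(ReplayLock(), ["A", "B", "C"])
    replayed = run(ReplayLock(schedule=recorded), ["A", "B", "C"])
    print(recorded == replayed)
```

Because the sketch only orders lock acquisitions, it reproduces an execution only when all shared accesses are protected by the lock, which mirrors the abstract's restriction to data-race free programs.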

field-programmable technology | 2006

Optimizing the critical loop in the H.264/AVC CABAC decoder

Hendrik Eeckhaut; Mark Christiaens; Dirk Stroobandt; Vincent Nollet

This paper presents an innovative hardware implementation of the H.264/AVC CABAC binary arithmetic decoder and context modeler capable of decoding one symbol per clock cycle at high clock frequencies while maintaining a slim hardware footprint. This was achieved by substantially decreasing the latency of the central feedback loop through extensive use of speculative prefetching and aggressive pipelining. Actual synthesis results targeted at state-of-the-art FPGA families show that our approach results in a fast and compact IP core, ideal for a SoC H.264/AVC implementation.

high performance embedded architectures and compilers | 2007

Finding and Applying Loop Transformations for Generating Optimized FPGA Implementations

Harald Devos; Kristof Beyls; Mark Christiaens; Jan Van Campenhout; Erik H. D'Hollander; Dirk Stroobandt

When implementing multimedia applications, solutions in dedicated hardware are chosen only when the required performance or energy-efficiency cannot be met with a software solution. The performance of a hardware design critically depends upon having high levels of parallelism and data locality. Often a long sequence of high-level transformations is needed to sufficiently increase the locality and parallelism. The effect of the transformations is known only after translating the high-level code into a specific design at the circuit level. When the constraints are not met, hardware designers need to redo the high-level loop transformations, and repeat all subsequent translation steps, which leads to long design times. We propose a method to reduce design time through the synergistic combination of techniques (a) to quickly pinpoint the loop transformations that increase locality; (b) to refactor loops in a polyhedral model and check whether a sequence of refactorings is legal; (c) to generate efficient structural VHDL from the optimized refactored algorithm. The implementation of these techniques in a tool suite results in a far shorter design time of hours instead of days or weeks. A 2D-inverse discrete wavelet transform was taken as a case study. The results outperform those of a commercial C-to-VHDL compiler, and compare favorably with existing published approaches.

Journal of Systems Architecture | 1999

A fast, cache-aware algorithm for the calculation of radiological paths exploiting subword parallelism

Mark Christiaens; Bjorn De Sutter; Koen De Bosschere; Jan Van Campenhout; Ignace Lemahieu

The calculation of radiological paths is the most important part in statistical positron emission tomography image reconstruction algorithms. We present a new, faster algorithm which replaces Siddon's. Further code transformations on this algorithm prove to be beneficial in a Maximum Likelihood–Expectation Maximization reconstruction algorithm and the result is perfectly suitable for an implementation that exploits the VISual instruction set from Sun or other modern architectural extensions providing subword parallelism. The final speed-up achieved with this new algorithm and its subword parallel implementation is 13. Though smaller data formats are used in subword parallelism, the resulting images are as good as the original ones.
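Subword parallelism, as mentioned in this abstract, means operating on several narrow values packed into one wide machine word. The sketch below is not the paper's algorithm and does not use the VIS instruction set; it only emulates the idea on plain Python integers, assuming four 16-bit lanes in a 64-bit word. The names `pack`, `unpack` and `packed_add` are hypothetical.

```python
LANES = 4
LANE_BITS = 16
HIGH = 0x8000800080008000  # top bit of every 16-bit lane
LOW = 0x7FFF7FFF7FFF7FFF   # the remaining 15 bits of every lane


def pack(values):
    """Pack four 16-bit values into one 64-bit word (lane 0 in the low bits)."""
    word = 0
    for k, v in enumerate(values):
        word |= (v & 0xFFFF) << (k * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit word back into its four 16-bit lanes."""
    return [(word >> (k * LANE_BITS)) & 0xFFFF for k in range(LANES)]


def packed_add(x, y):
    """Add all four lanes at once, wrapping modulo 2**16 within each lane."""
    # Adding only the low 15 bits keeps every carry inside its own lane.
    low_sum = (x & LOW) + (y & LOW)
    # The top bit of each lane is then fixed up with an XOR (addition without carry-out).
    return low_sum ^ ((x ^ y) & HIGH)


if __name__ == "__main__":
    a = pack([1, 0xFFFF, 0x8000, 1234])
    b = pack([2, 1, 0x8000, 4321])
    print(unpack(packed_add(a, b)))
```

One wide addition replaces four narrow ones, at the cost of a smaller value range per lane, which is the trade-off the abstract refers to when it mentions smaller data formats.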

scalable information systems | 2006

Scalable hardware accelerator for comparing DNA and protein sequences

Philippe Faes; Bram Minnaert; Mark Christiaens; Eric Bonnet; Yvan Saeys; Dirk Stroobandt; Yves Van de Peer

Comparing genetic sequences is a well-known problem in bioinformatics. Newly determined sequences are being compared to known sequences stored in databases in order to investigate biological functions. In recent years the number of available sequences has increased exponentially. Because of this explosion, a speedup of the comparison process is badly needed. To meet this demand we implemented a dynamic programming algorithm for sequence alignment on reconfigurable hardware. The algorithm we implemented, Smith-Waterman-Gotoh (SWG), has not been implemented in hardware before. We show a speedup factor of 40 in a design that scales well with the size of the available hardware. We also demonstrate the limits of larger hardware for small problems, and project our design on the largest Field Programmable Gate Array (FPGA) available today.
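For readers unfamiliar with the dynamic programming algorithm named in this abstract, the following is a minimal software sketch of Smith-Waterman local alignment with Gotoh's affine gap penalties. It is not the paper's hardware design. The function name `swg_score` and the scoring values are assumptions chosen for illustration, with a gap of length k costing `gap_open + (k - 1) * gap_extend`.

```python
def swg_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of strings a and b with affine gap penalties."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap in a (b[j-1] is unpaired)
    # F: best score ending at (i, j) with a gap in b (a[i-1] is unpaired)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            pair = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 lets a local alignment restart anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + pair, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(swg_score("ACGTACGT", "ACGTTACGT"))
```

Each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal can be computed independently of each other. This sketch fills the tables sequentially and stores them in full for clarity.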

IEEE Transactions on Multimedia | 2007

Scalable, Wavelet-Based Video: From Server to Hardware-Accelerated Client

Hendrik Eeckhaut; Harald Devos; Peter Lambert; Davy De Schrijver; W. Van Lancker; Vincent Nollet; Prabhat Avasare; Tom Clerckx; Fabio Verdicchio; Mark Christiaens; Peter Schelkens; R. Van de Walle; Dirk Stroobandt

Video source, carrier and client diversification have led the video coding community to develop scalable video codecs supporting efficient decoding at varying resolution, frame rate and quality. Scalable video has several advantages over a nonscalable approach, but a large scale deployment is far from trivial and a lot of open questions remain. To resolve these, we developed a complete video delivery chain for scalable wavelet-based video. This includes a video server, a negotiation framework, a video scaling infrastructure and two scalable video clients, one pure software client and one real-time, hardware accelerated client. This paper describes the complete chain and identifies and quantifies the impact of using scalable video in every link of this chain.

field-programmable logic and applications | 2005

FPGA-aware garbage collection in Java

Philippe Faes; Mark Christiaens; Dries Buytaert; D. Stroobandt

During codesign of a system, one still runs into the impedance mismatch between the software and hardware worlds. This paper identifies the different levels of abstraction of hardware and software as a major culprit of this mismatch. For example, when programming in high-level object-oriented languages like Java, one has objects, methods, and memory management at one's disposal, which facilitate development, but these have to be largely abandoned when moving the same functionality into hardware. As a solution, this paper presents a virtual machine, based on the Jikes Research Virtual Machine, that is able to bridge the gap by providing the same capabilities to hardware components as to software components. This seamless integration is achieved by introducing an architecture and protocol that allow reconfigurable hardware and software to communicate with each other in a transparent manner, i.e., no component of the design needs to be aware whether other components are implemented in hardware or in software. Further, in this paper we present a novel technique that allows reconfigurable hardware to manage dynamically allocated memory. This is achieved by allowing the hardware to hold references to objects and by modifying the garbage collector of the virtual machine to be aware of these references in hardware. We present benchmark results that show, for four different, well-known garbage collectors and for a wide range of applications, that a hardware-aware garbage collector results in a marginal overhead and is therefore a worthwhile addition to the developer's toolbox.

design, automation, and test in europe | 2005

A Hardware-Friendly Wavelet Entropy Codec for Scalable Video

Hendrik Eeckhaut; Harald Devos; Benjamin Schrauwen; Mark Christiaens; Dirk Stroobandt

A scalable video codec provides the ability to produce a smaller video stream with reduced frame rate, resolution or image quality starting from the original encoded video stream with almost no additional computation. This is important for portable devices that have different quality of service (QoS) requirements and power restrictions. Conventional video codecs do not possess this property; reduced quality is obtained through the arduous process of decoding the encoded video stream and recoding it at a lower quality. Producing such a smaller stream therefore has a very high computational cost. In this article, we present the results of our investigation into the hardware implementation of such a scalable video codec. In particular, we found that the implementation of the entropy codec is a significant bottleneck. We present an alternative, hardware-friendly algorithm for entropy coding with superior data locality (both temporal and spatial), with a smaller memory footprint and superior compression while maintaining all required scalability properties.

international conference / workshop on embedded computer systems: architectures, modeling and simulation | 2004

Reconfigurable Hardware for a Scalable Wavelet Video Decoder and Its Performance Requirements

Dirk Stroobandt; Hendrik Eeckhaut; Harald Devos; Mark Christiaens; Fabio Verdicchio; Peter Schelkens

Multimedia applications emerge on portable devices everywhere. These applications typically have a number of stringent requirements: (i) a high amount of computational power together with real-time performance and (ii) the flexibility to modify the application or the characteristics of the application at will. The performance requirements often drive the design towards a hardware implementation while the flexibility requirement is better served by a software implementation. In this paper we try to reconcile these two requirements by using an FPGA to implement the performance critical parts of a scalable wavelet video decoder. Through analytical means we first explore the performance and resource requirements. We find that modern FPGAs offer enough computational power to obtain real-time performance of the decoder, but that reaching the necessary memory bandwidth will be a challenge during this design.
