Jan Kuper | Researchain

Publication

Featured researches published by Jan Kuper.

worst case execution time analysis | 2010

A mathematical approach towards hardware design

Gerardus Johannes Maria Smit; Jan Kuper; Christiaan Baaij

Modelling of real-time systems requires accurate and tight estimates of the Worst-Case Execution Time (WCET) of each task scheduled to run. In the past two decades, two main paradigms have emerged within the field of WCET analysis: static analysis and hybrid measurement-based analysis. These techniques have been successfully implemented in prototype and commercial toolsets. Yet, comparison among the WCET estimates derived by such tools remains somewhat elusive as it requires a common set of benchmarks which serve a multitude of needs. The Mälardalen WCET research group maintains a large number of WCET benchmark programs for this purpose. This paper describes properties of the existing benchmarks, including their relative strengths and weaknesses. We propose extensions to the benchmarks which will allow any type of WCET tool to evaluate its results against other state-of-the-art tools, thus setting a high standard for future research and development. We also propose an organization supporting the future work with the benchmarks. We suggest forming a committee responsible for the benchmarks and transforming the benchmark web site into an open wiki, so that the WCET community can easily update the benchmarks.

design, automation, and test in europe | 2008

Run-time spatial mapping of streaming applications to a heterogeneous multi-processor system-on-chip (MPSoC)

P.K.F. Holzenspies; Johann L. Hurink; Jan Kuper; Gerardus Johannes Maria Smit

In this paper, we present an algorithm for run-time allocation of hardware resources to software applications. We define the sub-problem of run-time spatial mapping and demonstrate our concept for streaming applications on heterogeneous MPSoCs. The underlying algorithm and the methods used therein are implemented and their use is demonstrated with an illustrative example.

digital systems design | 2010

CλaSH: Structural Descriptions of Synchronous Hardware Using Haskell

Christiaan Baaij; Matthijs Kooijman; Jan Kuper; Arjan Boeijink; Marco Egbertus Theodorus Gerards

CλaSH is a functional hardware description language that borrows both its syntax and semantics from the functional programming language Haskell. Polymorphism and higher-order functions provide a level of abstraction and generality that allows a circuit designer to describe circuits in a more natural way than is possible with the language elements found in the traditional hardware description languages. Circuit descriptions can be translated to synthesizable VHDL using the prototype CλaSH compiler. As the circuit descriptions, simulation code, and test input are also valid Haskell, complete simulations can be done by a Haskell compiler or interpreter, allowing high-speed simulation and analysis.

emerging technologies and factory automation | 2007

SensorScheme: Supply chain management automation using Wireless Sensor Networks

Leon Evers; Paul J.M. Havinga; Jan Kuper; M.E.M. Lijding

The supply chain management business can benefit greatly from automation, as recent developments with RFID technology show. The use of Wireless Sensor Network technology promises to bring the next leap in efficiency and quality of service. However, current WSN system software does not yet provide the required functionality, flexibility and safety. This paper discusses a scenario showing how WSN technology can benefit supply chain management, and presents SensorScheme, a platform for realizing the scenario. SensorScheme is a general purpose WSN platform, providing a safe execution environment for dynamically loaded programs. It uses high level programming primitives like marshalled communication, automatic memory management, and multiprocessing facilities. SensorScheme makes efficient use of the limited memory available in WSN nodes, to allow larger and more complex programs than the state of the art. We present a SensorScheme implementation and provide experimental results to show its compactness, speed of operation and energy efficiency.

high performance embedded architectures and compilers | 2013

Optimal DPM and DVFS for frame-based real-time systems

Marco Egbertus Theodorus Gerards; Jan Kuper

Dynamic Power Management (DPM) and Dynamic Voltage and Frequency Scaling (DVFS) are popular techniques for reducing energy consumption. Algorithms for optimal DVFS exist, but optimal DPM and the optimal combination of DVFS and DPM are not yet solved. In this article we use well-established models of DPM and DVFS for frame-based systems. We show that it is not sufficient—as some authors argue—to consider only individual invocations of a task. We define a schedule that also takes interactions between invocations into account and prove—in a theoretical fashion—that this schedule is optimal.

software engineering and advanced applications | 2012

Max-Plus Algebraic Throughput Analysis of Synchronous Dataflow Graphs

Robert de Groote; Jan Kuper; Haitze J. Broersma; Gerardus Johannes Maria Smit

In this paper we present a novel approach to throughput analysis of synchronous dataflow (SDF) graphs. Our approach is based on describing the evolution of actor firing times as a linear time-invariant system in max-plus algebra. Experimental results indicate that our approach is faster than state-of-the-art approaches to throughput analysis of SDF graphs. The efficiency of our approach is due to an exploitation of the regular structure of the max-plus system's graphical representation, the properties of which we thoroughly prove.
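
To make the max-plus formulation concrete, the following is a minimal sketch, not the authors' algorithm. It assumes firing times evolve as x(k+1) = A ⊗ x(k), where ⊗ is the max-plus matrix-vector product, and it estimates the cycle time (the inverse of throughput) by plain iteration. The matrix `A` and the function names are hypothetical, chosen only for illustration.

```python
NEG_INF = float("-inf")  # max-plus "zero": marks the absence of a dependency


def maxplus_matvec(A, x):
    """Max-plus product: result[i] = max over j of (A[i][j] + x[j])."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]


def estimate_cycle_time(A, x0, warmup, span):
    """Estimate the average growth per iteration of the first firing time.

    Assumption: `warmup` is long enough to pass the transient and `span`
    is a multiple of the period of the system; otherwise the value is
    only an approximation.
    """
    x = list(x0)
    for _ in range(warmup):
        x = maxplus_matvec(A, x)
    start = x[0]
    for _ in range(span):
        x = maxplus_matvec(A, x)
    return (x[0] - start) / span


# Hypothetical two-actor system; entries are delays between firings.
A = [[3, 7],
     [2, 4]]

cycle_time = estimate_cycle_time(A, [0, 0], warmup=3, span=2)
throughput = 1.0 / cycle_time
```

For this example matrix the critical cycle passes through both actors (delays 7 and 2), so the estimated cycle time equals its mean, 4.5. Plain iteration is the simplest way to observe this; it is not a substitute for the structural analysis the abstract describes.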

IEEE Transactions on Computers | 2014

On the interplay between global DVFS and scheduling tasks with precedence constraints

Marco Egbertus Theodorus Gerards; Johann L. Hurink; Jan Kuper

Many multicore processors are capable of decreasing the voltage and clock frequency to save energy at the cost of an increased delay. While a large part of the theory-oriented literature focuses on local dynamic voltage and frequency scaling (local DVFS), where every core's voltage and clock frequency can be set separately, this article presents an in-depth theoretical study of the more commonly available global DVFS that makes such changes for the entire chip. This article shows how to choose the optimal clock frequencies that minimize the energy for global DVFS, and it discusses the relationship between scheduling and optimal global DVFS. Formulas are given to find this optimum under time constraints, including proofs thereof. The problem of simultaneously choosing clock frequencies and a schedule that together minimize the energy consumption is discussed, and based on this a scheduling criterion is derived that implicitly assigns frequencies and minimizes energy consumption. Furthermore, this article studies the effectiveness of a large class of scheduling algorithms with regard to the derived criterion, and a bound on the maximal relative deviation is given. Simulations show that with our techniques an energy reduction of 30% can be achieved with respect to state-of-the-art research.

design, automation, and test in europe | 2010

Run-time spatial resource management for real-time applications on heterogeneous MPSoCs

T.D. ter Braak; P.K.F. Holzenspies; Jan Kuper; Johann L. Hurink; Gerardus Johannes Maria Smit

Design-time application mapping is limited to a predefined set of applications and a static platform. Resource management at run-time is required to handle future changes in the application set, and to provide some degree of fault tolerance, due to imperfect production processes and wear of materials. This paper concerns resource allocation at run-time, allowing multiple real-time applications to run simultaneously on a heterogeneous MPSoC. Low-complexity algorithms are required, in order to respond fast enough to unpredictable execution requests. We present a decomposition of this problem into four phases. The allocation of tasks to specific locations in the platform is the main contribution of this work. Experiments on a real platform show the feasibility of this approach, with execution times in tens of milliseconds for a single allocation attempt.

digital systems design | 2009

Streaming Reduction Circuit

Marco Egbertus Theodorus Gerards; Jan Kuper; Andre B.J. Kokkeler; Bert Molenkamp

Reduction circuits are used to reduce rows of floating point values to single values. Binary floating point operators often have deep pipelines, which may cause hazards when many consecutive rows have to be reduced. We present an algorithm by which any number of consecutive rows of arbitrary lengths can be reduced by a pipelined commutative and associative binary operator in an efficient manner. The algorithm is simple to implement, has a low latency, produces results in-order, and requires only small buffers. Besides, it uses only a single pipeline for the involved operation. The complexity of the algorithm depends on the depth of the pipeline, not on the length of the input rows. In this paper we discuss an implementation of this algorithm and we prove its correctness.
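
The hazard mentioned in this abstract can be illustrated with a small software model. This is only a sketch of the problem, not the algorithm from the paper: it assumes a single operator with a pipeline of `depth` stages, so a result is available only `depth` cycles after its inputs enter. A naive accumulator therefore ends up with `depth` interleaved partial results that still need to be combined, which is valid only because the operator is commutative and associative. All names here are hypothetical.

```python
def reduce_row_pipelined(row, depth, op=lambda a, b: a + b, identity=0):
    """Model a naive accumulation through a pipelined binary operator.

    The value entering at cycle c can only be combined with the result
    that entered the pipeline `depth` cycles earlier, so the row splits
    into `depth` independent partial reductions.
    """
    partials = [identity] * depth
    for cycle, value in enumerate(row):
        slot = cycle % depth  # the partial result leaving the pipeline now
        partials[slot] = op(partials[slot], value)

    # Final merge of the interleaved partial results.
    result = identity
    for partial in partials:
        result = op(result, partial)
    return result, partials


total, partials = reduce_row_pipelined(list(range(1, 11)), depth=4)
```

With a pipeline depth of 4 and the row 1..10, the model leaves four partial sums that must be merged before the row's result is complete. Handling that merge efficiently while further rows keep arriving is the problem the paper addresses.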

parallel, distributed and network-based processing | 2014

Analytic Clock Frequency Selection for Global DVFS

Marco Egbertus Theodorus Gerards; Johann L. Hurink; P.K.F. Holzenspies; Jan Kuper; Gerardus Johannes Maria Smit

Computers can reduce their power consumption by decreasing their speed using Dynamic Voltage and Frequency Scaling (DVFS). A form of DVFS for multicore processors is global DVFS, where the voltage and clock frequency are shared among all processor cores. Because global DVFS is efficient and cheap to implement, it is used in modern multicore processors like the IBM Power 7, ARM Cortex A9 and NVIDIA Tegra 2. This theory-oriented paper discusses energy-optimal DVFS algorithms for such processors. There are no known provably optimal algorithms that minimize the energy consumption of nontrivial real-time applications on a global DVFS system. Such algorithms only exist for single-core systems, or for simpler application models. While many DVFS algorithms focus on tasks, this theoretical study is conceptually different and focuses on the amount of parallelism. We provide a transformation from a multicore problem to a single-core problem, by using the amount of parallelism of an application. Then existing single-core algorithms can be used to find the optimal solution. Furthermore, we extend an existing single-core algorithm such that it takes static power into account.
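
The role of parallelism in global DVFS can be sketched under a simple textbook model; the model is an assumption made here, not one stated in the abstract. Each active core is assumed to draw dynamic power f^alpha at clock frequency f, static power is ignored, and the application is given as segments of (work in cycles, number of active cores). Minimizing total energy subject to a deadline with a Lagrange multiplier then gives a frequency proportional to cores^(-1/alpha) for each segment. The function names and the example numbers are hypothetical.

```python
def global_dvfs_frequencies(segments, deadline, alpha=3.0):
    """Per-segment shared clock frequency under the assumed model.

    segments: list of (work_cycles, active_cores).
    Minimizes sum(cores * work * f**(alpha - 1)) subject to
    sum(work / f) == deadline, which yields f proportional to
    cores**(-1/alpha).
    """
    weights = [work * cores ** (1.0 / alpha) for work, cores in segments]
    scale = sum(weights) / deadline
    return [scale * cores ** (-1.0 / alpha) for _, cores in segments]


def energy(segments, freqs, alpha=3.0):
    """Dynamic energy: power (cores * f**alpha) times duration (work / f)."""
    return sum(cores * work * f ** (alpha - 1.0)
               for (work, cores), f in zip(segments, freqs))


def duration(segments, freqs):
    """Total execution time of all segments run back to back."""
    return sum(work / f for (work, _), f in zip(segments, freqs))


# Hypothetical application: a sequential phase, then an 8-core parallel phase.
segments = [(100, 1), (100, 8)]
deadline = 30.0

optimal = global_dvfs_frequencies(segments, deadline)
constant = [sum(w for w, _ in segments) / deadline] * len(segments)
```

In this example the sequential phase runs faster than the parallel phase, and the resulting schedule meets the same deadline with less energy than a single constant frequency. The sketch omits static power, which the paper explicitly takes into account.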
