Filip Claeys
Ghent University
Publication
Featured research published by Filip Claeys.
IEEE Transactions on Parallel and Distributed Systems | 2009
Maria Chtepen; Filip Claeys; Bart Dhoedt; F. De Turck; Piet Demeester; Peter Vanrolleghem
A grid is a distributed computational and storage environment often composed of heterogeneous, autonomously managed subsystems. As a result, varying resource availability becomes commonplace, often resulting in loss and delay of executing jobs. To ensure good grid performance, fault tolerance should be taken into account. Commonly utilized techniques for providing fault tolerance in distributed systems are periodic job checkpointing and replication. While very robust, both techniques can delay job execution if inappropriate checkpointing intervals and replica numbers are chosen. This paper introduces several heuristics that dynamically adapt the above-mentioned parameters based on information about grid status, providing high job throughput in the presence of failure while reducing system overhead. Furthermore, a novel fault-tolerant algorithm combining checkpointing and replication is presented. The proposed methods are evaluated in a newly developed grid simulation environment, Dynamic Scheduling in Distributed Environments (DSiDE), which allows for easy modeling of dynamic system and job behavior. Simulations are run employing workload and system parameters derived from logs collected from several large-scale parallel production systems. Experiments have shown that adaptive approaches can considerably improve system performance, while the preference for one of the solutions depends on particular system characteristics, such as load, job submission patterns, and failure frequency.
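The core adaptive idea can be illustrated with a minimal sketch (hypothetical function and parameter names, not the paper's actual heuristics): shorten the checkpointing interval after a failure is observed, and lengthen it after a stable period, within fixed bounds.

```python
def adapt_interval(interval, failed, min_i=60.0, max_i=3600.0,
                   shrink=0.5, grow=1.5):
    """Adjust a checkpointing interval (seconds) after each period.

    After a failure, checkpoint more often (shrink the interval);
    after a stable period, checkpoint less often (grow it).
    The result is clamped to [min_i, max_i].
    """
    new = interval * (shrink if failed else grow)
    return max(min_i, min(max_i, new))
```

A scheduler would call this once per checkpointing period, feeding in whether the resource failed during that period.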
Environmental Modelling and Software | 2008
Lorenzo Benedetti; Davide Bixio; Filip Claeys; Peter Vanrolleghem
This paper presents a set of tools developed to support an innovative methodology to design and upgrade wastewater treatment systems in a probabilistic way. For the first step, data reconstruction, two different tools were developed: one for situations where data are available and another for situations where no data are available. The second step, modelling and simulation, implied the development of a new simulation platform and of distributed computation software to deal with the simulation load generated by the third step, uncertainty analysis, which involves Monte Carlo simulations of the system over one year, important dynamics and stiff behaviour. For the fourth step, evaluation of alternatives, the evaluator tool processes the results of the simulations and plots the relevant information regarding the robustness of the process against input and parameter uncertainties, as well as concentration-duration curves for the risk of non-compliance with effluent and receiving water quality limits. This paper illustrates the merits of these tools in making the innovative methodology of practical interest. Design practice should move from conventional procedures suited to the relatively fixed context of emission limits towards more advanced, transparent and cost-effective procedures appropriate for coping with the flexibility and complexity introduced by integrated water management approaches.
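The uncertainty-analysis step can be sketched in miniature (hypothetical names; the actual tools simulate a full WWTP model over a year): sample uncertain parameters, run the model per sample, and report the fraction of runs that violate an effluent limit.

```python
import random

def risk_of_noncompliance(simulate, param_dist, limit, n=1000, seed=1):
    """Monte Carlo estimate of the probability of exceeding an effluent limit.

    simulate: callable mapping a parameter dict to a simulated effluent value
    param_dist: {name: (mean, std_dev)} for Gaussian parameter sampling
    limit: effluent quality limit; returns the fraction of runs above it
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n):
        params = {k: rng.gauss(mu, sd) for k, (mu, sd) in param_dist.items()}
        if simulate(params) > limit:
            exceed += 1
    return exceed / n
```

In practice `simulate` would be a dynamic plant simulation rather than a cheap function, which is exactly why the paper pairs Monte Carlo analysis with distributed computation.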
International Conference on Conceptual Structures | 2007
Maria Chtepen; Filip Claeys; Bart Dhoedt; Filip De Turck; Peter Vanrolleghem; Piet Demeester
As grids typically consist of autonomously managed subsystems with strongly varying resources, fault-tolerance forms an important aspect of the scheduling process of applications. Two well-known techniques for providing fault-tolerance in grids are periodic task checkpointing and replication. Both techniques mitigate the amount of work lost due to changing system availability but can introduce significant run-time overhead. The latter largely depends on the length of checkpointing interval and the chosen number of replicas, respectively. This paper presents a dynamic scheduling algorithm that switches between periodic checkpointing and replication to exploit the advantages of both techniques and to reduce the overhead. Furthermore, several novel heuristics are discussed that perform on-line adaptive tuning of the checkpointing period based on historical information on resource behavior. Simulation-based comparison of the proposed combined algorithm versus traditional strategies based on checkpointing and replication only, suggests significant reduction of average task makespan for systems with varying load.
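The switching decision can be illustrated with a toy rule (hypothetical names and thresholds, not the paper's algorithm): replicate when failures are frequent and idle resources are available to host replicas cheaply, otherwise fall back to periodic checkpointing.

```python
def choose_strategy(failure_rate, free_resources,
                    rate_threshold=0.2, min_free=2):
    """Pick a fault-tolerance strategy for the next scheduling period.

    failure_rate: observed fraction of periods in which the resource failed
    free_resources: number of currently idle resources
    Replication only pays off when spare capacity exists; otherwise
    checkpointing avoids wasting resources on replicas.
    """
    if failure_rate > rate_threshold and free_resources >= min_free:
        return "replicate"
    return "checkpoint"
```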
Information Technology Interfaces | 2009
Maria Chtepen; Bart Dhoedt; Filip De Turck; Piet Demeester; Filip Claeys; Peter Vanrolleghem
Adaptive checkpointing is a relatively new approach that is particularly suitable for providing fault-tolerance in dynamic and unstable grid environments. The approach allows for periodic modification of checkpointing intervals at run-time, when additional information becomes available. In this paper an adaptive algorithm, named MeanFailureCP+, is introduced that deals with checkpointing of grid applications with execution times that are unknown a priori. The algorithm modifies its parameters, based on dynamically collected feedback on its performance. Simulation results show that the new algorithm performs even better than adaptive approaches that make use of exact information on job execution times.
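One way to ground such an algorithm, sketched here with hypothetical names (the abstract does not give MeanFailureCP+'s actual formulas), is to estimate the mean time between failures from observed failure timestamps and checkpoint at some fraction of it.

```python
def mean_failure_interval(failure_times):
    """Mean gap between consecutive observed failures (an MTBF estimate)."""
    gaps = [b - a for a, b in zip(failure_times, failure_times[1:])]
    return sum(gaps) / len(gaps) if gaps else float("inf")

def checkpoint_interval(failure_times, fraction=0.5, floor=60.0):
    """Checkpoint at a fraction of the estimated MTBF, with a lower bound.

    With no failure history the MTBF is unknown, so fall back to the floor.
    """
    mtbf = mean_failure_interval(failure_times)
    if mtbf == float("inf"):
        return floor
    return max(floor, fraction * mtbf)
```

An adaptive variant would then adjust `fraction` at run-time based on feedback about observed overhead, which is the kind of self-tuning the paper describes.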
The Journal of Supercomputing | 2012
Maria Chtepen; Filip Claeys; Bart Dhoedt; Filip De Turck; Jan Fostier; Piet Demeester; Peter Vanrolleghem
The effectiveness of distributed execution of computationally intensive applications (jobs) largely depends on the quality of the applied scheduling approach. However, most of the existing non-trivial scheduling algorithms rely on prior knowledge or on prediction of application parameters, such as execution time, size of input and output, dependencies, etc., to assign applications to the available computational resources. A major issue is that these parameters are hard to determine in advance, especially if the end user does not possess an extensive history of previous application runs. In this work we propose an online method for execution time prediction of applications for which execution progress can be collected at run-time. Using dynamic progress information, the total job execution time can be predicted using extrapolation. However, the predictions achieved by extrapolation are far from precise and often vary over time as a result of changing application dynamics and varying resource load. Therefore, to estimate the actual job execution time, we match a number of predefined prediction evolution models against the consecutive extrapolations by adopting nonlinear curve-fitting. The "best-fit" coefficients allow for more accurate execution time prediction. The predictions made are used to enhance a dynamic scheduling algorithm for workflows introduced in our earlier work. The scheduling algorithm is run with and without curve-fitting, showing a performance improvement of up to 15% in the former case.
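The extrapolation step can be sketched as follows (a simplification with hypothetical names: the paper fits nonlinear prediction evolution models, whereas this sketch fits a straight line to progress samples and extrapolates to 100% completion).

```python
import numpy as np

def predict_total_time(times, progress):
    """Predict total execution time from run-time progress samples.

    times: elapsed seconds at each sample
    progress: fraction complete in [0, 1] at each sample
    Fits progress(t) = a*t + b by least squares and solves for
    the time at which progress reaches 1.0.
    """
    a, b = np.polyfit(times, progress, 1)
    return (1.0 - b) / a
```

Repeating this at every new sample yields the sequence of (noisy) predictions that the paper then stabilizes with nonlinear curve-fitting.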
Water Science and Technology | 2013
M. Maiza; A. Bengoechea; P. Grau; W. De Keyser; Ingmar Nopens; Doris Brockmann; J.P. Steyer; Filip Claeys; Gorka Urchegui; O. Fernández; E. Ayesa
This paper summarizes part of the research work carried out in the Add Control project, which proposes an extension of the wastewater treatment plant (WWTP) models and modelling architectures used in traditional WWTP simulation tools. In addition to the classical mass transformations (transport, physico-chemical phenomena, biological reactions), the extended models address all the instrumentation, actuation and automation & control components (sensors, actuators, controllers), considering their real behaviour (signal delays, noise, failures and power consumption of actuators). The ultimate objective is to allow a rapid transition from the simulation of a control strategy to its implementation at full-scale plants. Thus, this paper presents the application of the Add Control simulation platform to the design and implementation of new control strategies at the WWTP of Mekolalde.
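The kind of "real behaviour" the extended models capture can be illustrated with a toy sensor model (a hypothetical sketch; the Add Control platform's actual sensor models are not detailed in the abstract): the reading is delayed by a fixed number of steps and corrupted by Gaussian noise.

```python
import random
from collections import deque

class NoisySensor:
    """Toy sensor model: additive Gaussian noise plus a fixed reading delay."""

    def __init__(self, delay_steps=2, noise_sd=0.05, seed=0):
        # Pre-fill the buffer so the first `delay_steps` readings return 0.0.
        self.buffer = deque([0.0] * delay_steps, maxlen=delay_steps)
        self.noise_sd = noise_sd
        self.rng = random.Random(seed)

    def read(self, true_value):
        # Return the value measured `delay_steps` calls ago, then enqueue
        # the current measurement corrupted by Gaussian noise.
        delayed = self.buffer[0]
        self.buffer.append(true_value + self.rng.gauss(0.0, self.noise_sd))
        return delayed
```

A control strategy tested against such imperfect signals, rather than against the true plant state, gives a far more realistic estimate of full-scale performance.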
International Conference on Computational Science | 2005
Maria Chtepen; Filip Claeys; Bart Dhoedt; Peter Vanrolleghem; Piet Demeester
Tourist beaches on the southern coast of Turkey are surveyed in order to facilitate a standardised fuzzy approach to be used in litter prediction and to assess the aesthetic state of the coastal environment for monitoring programs. During these surveys the number of litter items on beaches were counted and recorded in different categories. The main source of litter on beaches was determined as “beach users”. A fuzzy system was developed to predict the classification of the beaches, since uncertainty was generally inherent in beach work due to the high variability of beach characteristics and the sources of litter categories. This resulted in effective utilization of “the judgment and knowledge of beach users” in the evaluation of beach gradings.
Journal of Hydroinformatics | 2003
Henk Vanhooren; Jurgen Meirlaen; Youri Amerlinck; Filip Claeys; Hans Vangheluwe; Peter Vanrolleghem
Water Science and Technology | 2006
Filip Claeys; Maria Chtepen; Lorenzo Benedetti; Bart Dhoedt; Peter Vanrolleghem
Water Science and Technology | 2011
Lorenzo Benedetti; Filip Claeys; Ingmar Nopens; Peter Vanrolleghem