Sébastien Lafond
Åbo Akademi University
Publication
Featured research published by Sébastien Lafond.
parallel, distributed and network-based processing | 2013
Fareed Jokhio; Adnan Ashraf; Sébastien Lafond; Ivan Porres; Johan Lilius
This paper presents prediction-based dynamic resource allocation algorithms to scale a video transcoding service on a given Infrastructure as a Service (IaaS) cloud. The proposed algorithms provide mechanisms for allocating and deallocating virtual machines (VMs) to a cluster of video transcoding servers in a horizontal fashion. We use a two-step load prediction method, which allows proactive resource allocation with high prediction accuracy under real-time constraints. For cost-efficiency, our work supports transcoding of multiple on-demand video streams concurrently on a single VM, resulting in a reduced number of required VMs. We use video segmentation at the group of pictures (GOP) level, which splits video streams into smaller segments that can be transcoded independently of one another. The approach is demonstrated in a discrete-event simulation and an experimental evaluation involving two different load patterns.
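As a rough illustration of proactive horizontal scaling, the sketch below forecasts the next load value and derives how many VMs to add or remove. Everything here is an assumption made for illustration: the forecast is a naive trend extrapolation, not the paper's two-step prediction method, and the names `predict_next_load`, `vms_needed`, `scaling_decision`, and `per_vm_capacity` are hypothetical.

```python
import math


def predict_next_load(history, window=3):
    """Naive forecast: last value plus the mean of the most recent changes.

    Illustrative stand-in only; NOT the two-step method from the paper.
    """
    if not history:
        return 0.0
    recent = history[-(window + 1):]
    if len(recent) < 2:
        return float(recent[-1])
    deltas = [b - a for a, b in zip(recent, recent[1:])]
    return max(0.0, recent[-1] + sum(deltas) / len(deltas))


def vms_needed(predicted_load, per_vm_capacity):
    """Smallest number of VMs whose combined capacity covers the forecast."""
    if per_vm_capacity <= 0:
        raise ValueError("per_vm_capacity must be positive")
    return math.ceil(predicted_load / per_vm_capacity)


def scaling_decision(history, running_vms, per_vm_capacity):
    """Positive result: VMs to allocate. Negative result: VMs to deallocate."""
    target = vms_needed(predict_next_load(history), per_vm_capacity)
    return target - running_vms
```

Because one VM can serve several streams concurrently, `per_vm_capacity` (a hypothetical unit such as frames per second) is what keeps the required VM count low in this toy model.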
signal processing systems | 2011
Jani Boutellier; Christophe Lucarz; Sébastien Lafond; Victor Martin Gomez; Marco Mattavelli
The upcoming Reconfigurable Video Coding (RVC) standard from MPEG (ISO/IEC SC29 WG11) defines a library of coding tools to specify existing or new compressed video formats and decoders. The coding tool library has been written in a dataflow/actor-oriented language named CAL. Each coding tool (actor) can be represented with an extended finite state machine, and the data communication between the tools is described as dataflow graphs. This paper proposes an approach to model the CAL actor network with Parameterized Synchronous Data Flow and to derive a quasi-static multiprocessor execution schedule for the system. In addition to proposing a scheduling approach for RVC, the paper introduces an extension to the well-known permutation flow shop scheduling problem that enables rapid run-time scheduling of RVC tasks.
ieee/acm international symposium cluster, cloud and grid computing | 2013
Adnan Ashraf; Fareed Jokhio; Tewodros Deneke; Sébastien Lafond; Ivan Porres; Johan Lilius
This paper presents a novel approach for stream-based admission control and job scheduling for video transcoding called SBACS (Stream-Based Admission Control and Scheduling). SBACS uses the queue waiting time of transcoding servers to make admission control decisions for incoming video streams. It implements stream-based admission control with per-stream admission. To ensure efficient utilization of the transcoding servers, video streams are segmented at the Group of Pictures level. In addition to the traditional rejection policy, SBACS also provides a stream deferment policy, which exploits cloud elasticity to allow temporary deferment of the incoming video streams. In other words, the admission controller can decide to admit, defer, or reject an incoming stream and hence reduce the rejection rate. To prevent transcoding jitter in the admitted streams, we introduce a job scheduling mechanism, which drops a small proportion of video frames from a video segment to ensure continued delivery of video content to the user. The approach is demonstrated in a discrete-event simulation with a series of experiments involving different load patterns and stream arrival rates.
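The three-way admit, defer, or reject decision can be sketched as a simple threshold rule on queue waiting time. This is a minimal sketch under stated assumptions: the two thresholds, the `can_scale_out` flag, and all names are hypothetical, and the abstract does not specify how SBACS actually derives its decision.

```python
from enum import Enum


class Decision(Enum):
    ADMIT = "admit"
    DEFER = "defer"
    REJECT = "reject"


def admission_decision(queue_wait_s, admit_below_s, defer_below_s, can_scale_out):
    """Choose an action for one incoming stream.

    queue_wait_s   current queue waiting time of the transcoding servers
    admit_below_s  hypothetical threshold under which streams are admitted
    defer_below_s  hypothetical threshold under which deferment is still viable
    can_scale_out  whether the cloud can add capacity while the stream waits
    """
    if queue_wait_s < admit_below_s:
        return Decision.ADMIT
    # Deferment only makes sense if elasticity can relieve the queue soon.
    if can_scale_out and queue_wait_s < defer_below_s:
        return Decision.DEFER
    return Decision.REJECT
```

In this toy rule, deferment is what lowers the rejection rate: a stream that would otherwise be rejected waits while new capacity is provisioned.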
parallel, distributed and network-based processing | 2012
Fareed Jokhio; Tewodros Deneke; Sébastien Lafond; Johan Lilius
This paper presents an approach to perform bit rate reduction transcoding by video segmentation. The paper shows how a high performance distributed video transcoder can be built using multiple processing units and a Message Passing Interface based parallel programming model. The parallelization of computation and the distribution of data among computing units are discussed. For data distribution, a coarse-grained approach is used, which yields a significant execution speedup. To achieve high performance, the video stream is segmented into parts of (1) equal size with unequal numbers of intra frames and (2) unequal size with equal numbers of intra frames. The results show that the proposed distributed video transcoder provides very short startup times.
Journal of Systems Architecture | 2007
Sébastien Lafond; Johan Lilius
In this paper we present a general framework for estimating the energy consumption of an embedded Java virtual machine (JVM). We have designed a number of experiments to find the constant overhead and establish an energy consumption cost for individual Java opcodes for two JVMs. The results show that there is a basic constant overhead for every Java program, and that a subset of Java opcodes have an almost constant energy cost. We also show that memory access is a crucial energy consumption component.
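The two findings above suggest an additive estimate: a constant overhead plus a per-opcode cost multiplied by how often each opcode executes. The sketch below is a minimal illustration of that idea, not the framework from the paper. The function name, the units, and any cost values passed in are hypothetical, and memory access, which the abstract identifies as a crucial component, is not modeled.

```python
def estimate_energy(opcode_counts, opcode_costs, base_overhead):
    """Estimate program energy as base overhead + sum(count * per-opcode cost).

    opcode_counts  mapping opcode name -> number of executions
    opcode_costs   mapping opcode name -> energy per execution (hypothetical unit)
    base_overhead  constant energy paid by every program (same unit)
    """
    total = float(base_overhead)
    for opcode, count in opcode_counts.items():
        if opcode not in opcode_costs:
            # Opcodes without a near-constant cost cannot be handled additively.
            raise KeyError(f"no energy cost known for opcode {opcode!r}")
        total += count * opcode_costs[opcode]
    return total
```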
parallel, distributed and network-based processing | 2013
Simon Holmbacka; Wictor Lund; Sébastien Lafond; Johan Lilius
Spatial locality of task execution will become more important on future hardware platforms since the number of cores is steadily increasing. The large number of cores requires more intelligent power management due to the notion of spatial locality, and the high chip density requires an increased thermal awareness in order to avoid thermal hotspots on the chip. At the same time, high performance of the CPU is only achieved by parallelizing tasks over the chip in order to fully utilize the hardware. This paper presents a task migration mechanism for distributed operating systems running on many-core platforms. In this work, we evaluate the performance and energy efficiency of an implemented task migration mechanism. This is shown by parallelizing tasks when the performance of a single core is not sufficient, and by collecting tasks onto as few cores as possible when the CPU load is low. The task migration mechanism is implemented as a library for FreeRTOS using 1300 lines of code, and introduces a total task migration overhead of 100 ms on a shared memory platform. With the presented task migration mechanism, we intend to improve the dynamism of power and performance characteristics in distributed many-core operating systems.
international symposium on intelligent signal processing and communication systems | 2011
Fareed Jokhio; Tewodros Deneke; Sébastien Lafond; Johan Lilius
In this paper, different methods of video segmentation are analyzed for performing spatial resolution reduction video transcoding with multiple processing units. A distributed video transcoder is built in which different processing units perform the transcoding operation. To fully utilize the computational power of the processing units, the computational load should be distributed equally. In video transcoding, different frames require different amounts of computation; hence, inefficient video segmentation leads to lower performance. We have analyzed three possible methods of video segmentation: (1) each segment has equal size, (2) each segment has an equal number of frames, and (3) each segment has an equal number of groups of pictures. The performance of the system and the relationship between the number of processing units used and the computation speed are measured in terms of the standard deviation of the transcoding time of the different processing units.
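Two of the three segmentation methods, and the standard deviation used to judge load balance, can be sketched as follows. This is an illustrative toy model under stated assumptions: each frame carries a hypothetical scalar transcoding cost, a segment's transcoding time is the sum of its frame costs, and splitting by frame count ignores that a real segment must start at a decodable frame. All function names are hypothetical.

```python
from statistics import pstdev


def split_evenly(items, parts):
    """Split a list into `parts` contiguous chunks whose lengths differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def segment_by_frames(gops, units):
    """Method (2): every segment gets an equal number of frames."""
    frames = [cost for gop in gops for cost in gop]
    return split_evenly(frames, units)


def segment_by_gops(gops, units):
    """Method (3): every segment gets an equal number of groups of pictures."""
    return [[cost for gop in chunk for cost in gop]
            for chunk in split_evenly(gops, units)]


def load_imbalance(segments):
    """Standard deviation of the per-unit transcoding time (lower is better)."""
    return pstdev([sum(segment) for segment in segments])


# Hypothetical input: three GOPs, each starting with a costly intra frame.
gops = [[5, 1, 1, 1], [5, 1], [5, 1]]
print(load_imbalance(segment_by_frames(gops, 2)))
print(load_imbalance(segment_by_gops(gops, 2)))
```

With GOPs of unequal length, the two methods assign different total costs to each processing unit, which is the kind of difference the standard deviation metric captures.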
conference on design and architectures for signal and image processing | 2014
Simon Holmbacka; Erwan Nogues; Maxime Pelcat; Sébastien Lafond; Johan Lilius
Parallelizing software is a popular way of achieving high energy efficiency since parallel applications can be mapped on many cores and the clock frequency can be lowered. Perfect parallelism is, however, not often reached, and different program phases usually contain different levels of parallelism due to data dependencies. Applications currently have no means of expressing the level of parallelism, and power management is mostly done based only on the workload. In this work, we provide means of expressing QoS and levels of parallelism in applications for tighter integration with the power management to obtain optimal energy efficiency in multi-core systems. We utilize the dataflow framework PREESM to create and analyze program structures and expose the parallelism in the program phases to the power management. We use the derived parameters in an NLP (nonlinear programming) solver to determine the minimum power for allocating resources to the applications.
software engineering and advanced applications | 2013
Fareed Jokhio; Adnan Ashraf; Sébastien Lafond; Johan Lilius
Video transcoding refers to the process of converting a compressed digital video from one format to another. Since it is a compute-intensive operation, transcoding of a large number of on-demand videos requires a large scale cluster of transcoding servers. Moreover, storage of multiple transcoded versions of each source video requires a large amount of disk space. Infrastructure as a Service (IaaS) clouds provide virtual machines (VMs) for creating a dynamically scalable cluster of servers. Likewise, a cloud storage service may be used to store a large number of transcoded videos. Moreover, it may be possible to reduce the total IaaS cost by trading storage for computation, or vice versa. In this paper, we present a computation and storage trade-off strategy for cost-efficient video transcoding in the cloud, called the cost and popularity score based strategy. The proposed strategy estimates the computation cost, storage cost, and video popularity of individual transcoded videos and then uses this information to decide how long a video should be stored or how frequently it should be re-transcoded from a given source video. It is demonstrated in a discrete-event simulation and is evaluated in a series of experiments involving semi-synthetic and realistic load patterns.
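The storage-versus-computation trade-off can be illustrated with a simple break-even rule: keep a transcoded video only while storing it is no more expensive than re-transcoding it for the requests expected in the same period. This is a minimal sketch, not the paper's cost and popularity score. The linear cost model, the use of expected request count as a proxy for popularity, and all names and prices are assumptions.

```python
def storage_cost(size_gb, price_per_gb_month, months):
    """Cost of keeping one transcoded video in cloud storage."""
    return size_gb * price_per_gb_month * months


def retranscoding_cost(expected_requests, cost_per_transcode):
    """Cost of transcoding on demand for every expected request instead."""
    return expected_requests * cost_per_transcode


def should_keep_stored(size_gb, price_per_gb_month, months,
                       expected_requests, cost_per_transcode):
    """True if storing is no more expensive than re-transcoding on demand."""
    return (storage_cost(size_gb, price_per_gb_month, months)
            <= retranscoding_cost(expected_requests, cost_per_transcode))
```

Under this rule a popular video stays in storage, while a rarely requested one is deleted and re-transcoded from its source when it is next needed.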
2013 24th Tyrrhenian International Workshop on Digital Communications - Green ICT (TIWDC) | 2013
Fredric Hallis; Simon Holmbacka; Wictor Lund; Robert Slotte; Sébastien Lafond; Johan Lilius
Webserver farms and datacenters currently use workload consolidation to match the dynamic workload with the available resources since switching off unused machines has been shown to save energy. The workload is placed on the active servers until the servers are saturated. The idea of workload consolidation can also be brought to the chip level, where the OS scheduler packs as much workload onto as few cores as possible in a many-core system. In this case, all idle cores in the system are placed in a sleep state and are woken up on demand. Due to the relationship between static power dissipation and temperature, this paper investigates the thermal influence on the energy efficiency of chip level workload consolidation and its potential impact on the scheduling decisions. This work lays the foundation for the development of a model for energy efficient OS scheduling for many-core processors that takes into account external factors such as ambient and core level temperatures.
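Chip level consolidation can be pictured as a bin packing problem: tasks are packed onto as few cores as possible and the remaining cores sleep. The sketch below uses a first-fit decreasing heuristic purely for illustration. It is not the scheduler studied in the paper, it ignores the thermal effects the paper investigates, and the names and the normalized `core_capacity` are hypothetical.

```python
def consolidate(task_loads, num_cores, core_capacity=1.0):
    """Pack task loads onto as few cores as possible (first-fit decreasing).

    Returns one list of task loads per core; an empty list means the core is idle.
    """
    cores = [[] for _ in range(num_cores)]
    used = [0.0] * num_cores
    for load in sorted(task_loads, reverse=True):
        for i in range(num_cores):
            # Small tolerance guards against floating-point rounding.
            if used[i] + load <= core_capacity + 1e-9:
                cores[i].append(load)
                used[i] += load
                break
        else:
            raise ValueError("tasks do not fit on the available cores")
    return cores


def sleeping_cores(cores):
    """Indices of cores with no tasks, which could be put into a sleep state."""
    return [i for i, tasks in enumerate(cores) if not tasks]
```

A temperature-aware scheduler might deliberately spread load that this heuristic would pack together, since concentrating work raises core temperature and with it static power dissipation.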