Malgorzata Michalska
École Polytechnique Fédérale de Lausanne
Publication
Featured research published by Malgorzata Michalska.
2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC) | 2016
Malgorzata Michalska; Simone Casale-Brunet; Endri Bezati; Marco Mattavelli
An important challenge for a dataflow designer is to efficiently explore the design space in order to find a set of configurations that satisfy the defined objective function. The exploration directions may involve the partitioning, scheduling and buffer dimensioning, and all together should drive the designer to maximally benefit from the potential parallelism of an application. Successful exploration can be strongly facilitated by means of performance estimation. This paper presents a tool that allows a high-precision estimation of a program execution on a given platform, when various sets of configurations can be applied. It demonstrates which information related to the multi-core program execution can be extracted and successfully used to drive the optimization procedures. The experimental results are confirmed by an actual execution on different types of platforms.
Power and Timing Modeling, Optimization and Simulation | 2016
Malgorzata Michalska; Junaid Jameel Ahmad; Endri Bezati; Simone Casale-Brunet; Marco Mattavelli
The exploration of different design configurations of dynamic dataflow programs executed on many-core or multi-core platforms is, in general, a very difficult task. Determining a close-to-optimal partitioning, scheduling and buffer dimensioning configuration, when associated with a performance optimization function, belongs to the class of NP-complete problems. In order to explore the space of feasible solutions with efficient heuristics that look for good-quality solutions, it is important to be able to evaluate the design points in terms of the performance optimization function with sufficient precision, without having to physically execute the program on the platform. This paper presents a performance estimation approach and an associated software tool capable of exploring, with a high level of accuracy, the space of feasible solutions by using only a limited set of measurements from the physical processing platform. Moreover, the estimation model allows the identification of possible improvements that can be applied to different configurations. The reported results validate the accuracy of the methodology using examples of dataflow implementations of dynamic video codec designs for two different classes of platforms: Transport Triggered Architecture and Intel platforms.
International Conference on Conceptual Structures | 2016
Malgorzata Michalska; Endri Bezati; Simone Casale-Brunet; Marco Mattavelli
The definition of an efficient scheduling policy is an important, difficult and open design problem for the implementation of applications based on dynamic dataflow programs, for which optimal closed-form solutions do not exist. This paper describes an approach based on the study of the execution of a dynamic dataflow program on a target architecture with different scheduling policies. The method is based on a representation of the execution of a dataflow program with the associated dependencies, and on the cost of using a scheduling policy, expressed as the number of conditions that need to be verified to have a successful execution within each partition. The relation between the potential gain of the overall execution satisfying intrinsic data dependencies and the runtime cost of finding an admissible schedule is a key issue in finding close-to-optimal solutions for the scheduling problem of dynamic dataflow applications.
International Conference on Conceptual Structures | 2016
Malgorzata Michalska; Nicolas Zufferey; Marco Mattavelli
An important challenge of dataflow programming is the problem of partitioning dataflow components onto a target architecture. A common objective associated with this problem is to maximize the data processing throughput. This NP-complete problem is very difficult to solve with high-quality, close-to-optimal solutions because of the very large size of the design space and the possibly large variability of input data. This paper introduces four variants of the tabu search metaheuristic expressly developed for partitioning components of a dataflow program. The approach relies on the use of a simulation tool capable of estimating the performance of any partitioning configuration by exploiting a model of the target architecture and the profiling results. The partitioning solutions generated with tabu search are validated for consistency and high accuracy against executions on experimental platforms.
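As a rough illustration of how tabu search can be applied to a partitioning problem of this kind, the following minimal sketch assigns actors to cores. It is not one of the four variants from the paper: the cost model (`makespan`, the load of the busiest core), the single-actor move neighbourhood, and the parameters `tenure` and `iterations` are all assumptions chosen for brevity, standing in for the simulation-based performance estimate the abstract describes.

```python
import random


def makespan(assignment, weights, n_cores):
    """Toy cost model (assumption): the busiest core determines execution time."""
    loads = [0.0] * n_cores
    for actor, core in enumerate(assignment):
        loads[core] += weights[actor]
    return max(loads)


def tabu_partition(weights, n_cores, iterations=200, tenure=5, seed=0):
    """Search for an actor-to-core assignment with a small makespan."""
    rng = random.Random(seed)
    current = [rng.randrange(n_cores) for _ in weights]
    best, best_cost = list(current), makespan(current, weights, n_cores)
    tabu_until = {}  # actor -> first iteration at which it may be moved again

    for it in range(iterations):
        candidate = None  # (cost, actor, core) of the best admissible move
        for actor in range(len(weights)):
            for core in range(n_cores):
                if core == current[actor]:
                    continue
                neighbour = list(current)
                neighbour[actor] = core
                cost = makespan(neighbour, weights, n_cores)
                is_tabu = tabu_until.get(actor, 0) > it
                # Aspiration: a tabu move is accepted only if it beats the best so far.
                if is_tabu and cost >= best_cost:
                    continue
                if candidate is None or cost < candidate[0]:
                    candidate = (cost, actor, core)
        if candidate is None:
            break  # no admissible move (e.g. a single core)
        cost, actor, core = candidate
        # Take the best admissible move even if it worsens the current solution.
        current[actor] = core
        tabu_until[actor] = it + 1 + tenure
        if cost < best_cost:
            best, best_cost = list(current), cost
    return best, best_cost
```

Calling `tabu_partition([4.0, 3.0, 3.0, 2.0, 2.0, 2.0], n_cores=2)` returns an assignment list and its estimated cost. Accepting non-improving moves while forbidding recently moved actors is what lets the search leave local optima; in a realistic setting the `makespan` function would be replaced by a trace-based performance estimator.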
IEEE Transactions on Multi-Scale Computing Systems | 2018
Malgorzata Michalska; Simone Casale-Brunet; Endri Bezati; Marco Mattavelli
The implementation and optimization of dynamic dataflow programs on multi/many-core platforms require solving a very difficult problem: how to partition and schedule the processing elements and dimension their interconnecting buffers according to given optimization functions in terms of throughput, memory usage, and energy consumption. This problem is NP-hard even for two cores. Thus, finding a close-to-optimal solution consists of exploring the design space with appropriate heuristics that identify the design points maximizing or minimizing the desired (multiple) objective functions subject to a set of constraints. In general, exploring the design space efficiently is a challenging task due to the massive number of admissible design points. Efficient estimation methodologies are necessary to support an effective search of the design space by reducing to a minimum the cost and the number of measurements on the physical platform. This paper presents a new methodology that provides high-precision estimations of the performance of dynamic dataflow programs on multi/many-core platforms for any set of design configurations. The estimations rely on post-processing the execution trace obtained from a single profiled execution of the program. Furthermore, the paper describes the estimation methodology, the implementation tools, and the type of information that is obtained from many/multi-core dataflow executions and used to drive the optimization heuristics. The results confirm the high level of accuracy achieved on different types of platforms and the effectiveness of the illustrated design space exploration methodology.
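The idea of estimating performance by post-processing a single execution trace can be sketched as follows. This is a simplified illustration, not the tool described in the paper: the trace format (a topologically ordered list of firings with a duration and a list of predecessor firings), the function name `estimate_makespan`, and the rule that each core runs its firings in trace order are all assumptions. Communication costs and buffer sizes are ignored.

```python
def estimate_makespan(trace, partition):
    """Replay a profiled trace under a given actor-to-core partition.

    trace: list of (firing_id, actor, duration, predecessor_ids),
           assumed to be in a valid topological order.
    partition: dict mapping each actor to a core identifier.
    """
    core_free = {}  # core -> time at which the core becomes idle
    finish = {}     # firing_id -> finish time
    for firing_id, actor, duration, preds in trace:
        core = partition[actor]
        # A firing starts once its dependencies are done and its core is idle.
        ready = max((finish[p] for p in preds), default=0.0)
        start = max(ready, core_free.get(core, 0.0))
        finish[firing_id] = start + duration
        core_free[core] = finish[firing_id]
    return max(finish.values(), default=0.0)


# Hypothetical two-actor trace: A feeds B, two firings each.
trace = [
    ("a0", "A", 2.0, []),
    ("b0", "B", 3.0, ["a0"]),
    ("a1", "A", 2.0, ["a0"]),
    ("b1", "B", 3.0, ["a1", "b0"]),
]
print(estimate_makespan(trace, {"A": 0, "B": 0}))  # 10.0 (fully sequential)
print(estimate_makespan(trace, {"A": 0, "B": 1}))  # 8.0 (A and B overlap)
```

Because the same trace is replayed for every candidate partition, many configurations can be compared without running the program on the platform again, which is the property the abstract highlights.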
European Signal Processing Conference | 2017
Malgorzata Michalska; Endri Bezati; Simone Casale-Brunet; Marco Mattavelli
Executing a dataflow program on a parallel platform requires assigning to each buffer a given size so that correct program executions take place without introducing any deadlock. Furthermore, in the case of dynamic dataflow programs, specific buffer size assignments lead to significant differences in the throughput, hence a more appropriate optimization problem is to specify the buffer sizes so that the throughput is maximized and the used resources are minimized. This paper introduces a new heuristic methodology for the buffer dimensioning of dynamic dataflow programs, which is considered as a stage of a more general design space exploration process.
Asilomar Conference on Signals, Systems and Computers | 2016
Malgorzata Michalska; Simone Casale-Brunet; Endri Bezati; Marco Mattavelli; Jorn W. Janneck
Application performance on processor array platforms is highly sensitive to how functionality is physically placed on the device, as this choice crucially determines communication latencies and congestion patterns of the on-chip inter-core communication. The problem of identifying the best, or just a good enough, partitioning and placement does not, in general, admit an analytic solution, and its combinatorial nature makes solving it by pure experimentation impractical. This paper presents an approach that maps stream programs onto processor arrays using trace analysis as a technique for evaluating candidate solutions and for suggesting alternatives.
Pomiary Automatyka Robotyka | 2016
Malgorzata Michalska
This paper describes an application for the automatic detection and correction of detuning in singing. It presents the observations that became the core of the work, the principles of the application, its limitations and perspectives, and the algorithms used. It explains in detail the experiments performed and the results obtained. Finally, it discusses some opportunities revealed during the research and points to improvements and extensions possible in future work.
Journal of Electrical and Computer Engineering | 2016
Malgorzata Michalska; Nicolas Zufferey; Marco Mattavelli
The problem of partitioning a dataflow program onto a target architecture is a difficult challenge for any application design. In general, since the problem is NP-complete, it consists of looking for high-quality solutions in terms of maximizing the achievable data throughput. The difficulty lies in exploring the design space, which is extremely large for parallel platforms. The paper describes a heuristic partitioning methodology applicable to dynamic dataflow programs. The methodology is based on two elements: an execution model of the dynamic dataflow program, which is used to estimate performance during the exploration of the large design space, and several partitioning algorithms that compete to produce specific high-quality solutions. Experimental results are validated with executions on a virtual platform.
2016 IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC) | 2016
Malgorzata Michalska; Nicolas Zufferey; Endri Bezati; Marco Mattavelli
The implementation of signal processing systems on emerging many-core or multi-core processing platforms requires solving a very difficult problem: how to partition and schedule the processing tasks according to given optimization functions such as data throughput, memory usage, and energy consumption. Implementations based on dataflow programming approaches are recognized to be particularly interesting for this challenge, because dataflow network components can be partitioned onto the processing units while always yielding correct system behavior. Moreover, the space of feasible configurations can be explored using heuristics, which is not the case for other implementation approaches where entire parts of the application programs must be rewritten for each configuration. This paper investigates the features of the design space exploration problem by considering a new formal formulation of the partitioning, scheduling and buffer dimensioning problem for the case of dynamic dataflow programs. Furthermore, it demonstrates which heuristics, with the associated optimization functions relying on this formulation, can be identified to provide high-quality solutions.