Anca Mariana Molnos
Delft University of Technology
Publication
Featured research published by Anca Mariana Molnos.
IEEE Transactions on Computers | 2014
Radu Stefan; Anca Mariana Molnos; Kees Goossens
Networks-on-Chip (NoC) are seen as promising interconnect solutions, offering the advantages of scalability and high-frequency operation which the traditional bus interconnects lack. Several NoC implementations have been presented in the literature, some of them having mature tool-flows. The main differentiating factor between the various implementations is the set of services and communication patterns they offer to the end-user. In this paper we present dAElite, a TDM Network-on-Chip that offers a unique combination of features, namely, guaranteed bandwidth and latency per connection, built-in support for multicast, and a short connection set-up time. While our NoC was designed from the ground up, we leverage existing tools for network dimensioning, analysis, and instantiation. We have implemented and tested our proposal in hardware and we compared it to Æthereal, a state-of-the-art NoC with similar features, but no multicast. We find that the connection set-up time is reduced by a factor of 10 and the network traversal latency is decreased by 33 percent. Moreover, considering realistic values of the network parameters, dAElite has a lower hardware area when synthesized in 65 nm technology.
ACM SIGBED Review | 2013
Kees Goossens; Arnaldo Azevedo; Karthik Chandrasekar; Manil Dev Gomony; Sven Goossens; Martijn Koedam; Yonghui Li; Davit Mirzoyan; Anca Mariana Molnos; Ashkan Beyranvand Nejad; Andrew Nelson; Shubhendu Sinha
Systems on chip (SOC) contain multiple concurrent applications with different time criticality (firm, soft, non real-time). As a result, they are often developed by different teams or companies, with different models of computation (MOC) such as dataflow, Kahn process networks (KPN), or time-triggered (TT). SOC functionality and (real-time) performance are verified after all applications have been integrated. In this paper we propose the CompSOC platform and design flow, which offer a virtual execution platform per application, to allow independent design, verification, and execution. We introduce the composability and predictability concepts, why they help, and how they are implemented in the different resources of the CompSOC architecture. We define a design flow that allows real-time cyclo-static dataflow (CSDF) applications to be automatically mapped, verified, and executed. Mapping and analysis of KPN and TT applications are not automated, but they do run composably in their allocated virtual platforms. Although most of the techniques used here have been published in isolation, this paper is the first comprehensive overview of the CompSOC approach. Moreover, three new case studies illustrate all claimed benefits: 1) An example firm-real-time CSDF H.263 decoder is automatically mapped and verified. 2) Applications with different models of computation (CSDF and TT) run composably. 3) Adaptive soft-real-time applications execute composably and can hence be verified independently by simulation.
Circuits and Systems | 2011
Benny Akesson; Anca Mariana Molnos; Andreas Hansson; Jude Angelo Ambrose; Kees Goossens
System-on-chip (SoC) design gets increasingly complex, as a growing number of applications are integrated in modern systems. Some of these applications have real-time requirements, such as a minimum throughput or a maximum latency. To reduce cost, system resources are shared between applications, making their timing behavior inter-dependent. Real-time requirements must hence be verified for all possible combinations of concurrently executing applications, which is not feasible with commonly used simulation-based techniques. This chapter addresses this problem using two complexity-reducing concepts: composability and predictability. Applications in a composable system are completely isolated and cannot affect each other's behaviors, enabling them to be independently verified. Predictable systems, on the other hand, provide lower bounds on performance, allowing applications to be verified using formal performance analysis. Five techniques to achieve composability and/or predictability in SoC resources are presented and we explain their implementation for processors, interconnect, and memories in our platform.
Microprocessors and Microsystems | 2011
Andreas Hansson; Marcus Ekerhult; Anca Mariana Molnos; Aleksandar Milutinovic; Andrew Nelson; Jude Angelo Ambrose; Kees Goossens
Multi-Processor Systems on Chip (MPSoC) run multiple independent applications, often developed by different parties. The applications share the hardware resources, e.g. processors, memories and interconnect. The sharing typically causes interference between the applications, which severely complicates system integration and verification. Even if the applications are verified in isolation, the system designer must verify the combined behaviour, leading to an explosion in design complexity. Composable MPSoCs have no interference between applications, thus allowing independent design and verification. For an MPSoC to be composable, all the hardware resources must offer composability. A particularly challenging resource is the processors, often purchased as off-the-shelf intellectual property. In this work we present the design and implementation of CompOSe, a light-weight (only 1500 lines of code) composable operating system for MPSoCs. CompOSe uses fixed-size time slices, coupled with a composable scheduler, to enable composable processor sharing. Using instances of ARM7, ARM11 and the Xilinx MicroBlaze we experimentally demonstrate the ability to provide temporal composability, even in the presence of dynamic application behaviour and multiple use cases. We do so using a diverse set of processor architectures, without requiring any hardware modifications. We also show how CompOSe allows slack to be distributed within and between applications through a novel two-level scheduler and slack-distribution system.
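To illustrate why fixed-size time slices help composability, here is a minimal sketch, assuming a static time-division table in which each slot is owned by one application. It is not CompOSe's actual scheduler; the function name, the table layout, and the slice length are hypothetical. The point it shows is that the start times of an application's slices depend only on its own slots in the table, not on which other applications occupy the remaining slots.

```python
def slice_start_times(slot_table, app, slice_len, n_rounds):
    """Start times of `app`'s slices under a fixed, repeating slot table.

    Assumption for illustration: every slot lasts exactly `slice_len`
    time units, regardless of what the slot's owner does in it.
    """
    period = len(slot_table) * slice_len
    return [
        r * period + i * slice_len
        for r in range(n_rounds)
        for i, owner in enumerate(slot_table)
        if owner == app
    ]


# Application "A" keeps the same slots while its neighbours change.
alone = slice_start_times(["A", "idle", "A", "idle"], "A", slice_len=10, n_rounds=2)
shared = slice_start_times(["A", "B", "A", "C"], "A", slice_len=10, n_rounds=2)
print(alone)   # [0, 20, 40, 60]
print(shared)  # [0, 20, 40, 60]
```

Because both calls return the same list, the timing seen by "A" is unchanged by the presence of "B" and "C", which is the property the abstract calls temporal composability.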
Design, Automation, and Test in Europe | 2012
Radu Stefan; Anca Mariana Molnos; A Ambrose; Kees Goossens
Networks-on-Chip are seen as promising interconnect solutions, offering the advantages of scalability and high-frequency operation which the traditional bus interconnects lack. Several NoC implementations have been presented in the literature, some of them having mature tool-flows and ecosystems. The main differentiating factor between the various implementations is the set of services and communication patterns they offer to the end-user. In this paper we present dAElite, a TDM Network-on-Chip that offers a unique combination of features, namely guaranteed bandwidth and latency per connection, built-in support for multicast, and a short connection set-up time. While our NoC was designed from the ground up, we leverage existing tools for network dimensioning, analysis and instantiation. We have implemented and tested our proposal in hardware and we found it to compare favorably to the other NoCs in terms of hardware area. Compared with aelite, which is closest in terms of offered services, our network offers connection set-up times faster by a factor of 10, network traversal latencies decreased by 33%, and improved bandwidth.
Digital Systems Design | 2009
Anca Mariana Molnos; Kees Goossens
Voltage-frequency scaling (VFS) trades a linear processor slowdown for a potentially quadratic reduction in energy consumption. Complex dependencies may exist between different tasks of an application. The impact of VFS on the end-to-end application performance is difficult to predict, especially when these tasks are mapped on multiple processors that are scaled independently. This is a problem for real-time (RT) applications that require guaranteed end-to-end performance. In this paper we first classify the slack existing in RT applications consisting of multiple dependent tasks mapped on multiple processors independently using VFS, resulting in static, work, and share slack. Then we concentrate on work and share slack as they can only be detected at run time, thus their conservative use is challenging. We propose SlackOS, a dynamic, dependency-aware task-scheduling approach that conservatively scales the voltage and frequency of each processor, to respect RT deadlines. When applied to an H.264 application, our method delivers 22% to 33% energy reduction, compared to dynamic RT scheduling that is not energy aware.
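The trade-off named in the first sentence of this abstract can be made concrete with a small sketch. It assumes a textbook dynamic-energy model that is not taken from the paper: supply voltage scales proportionally with frequency, and energy per cycle is proportional to the square of the voltage. The function names and numbers are hypothetical and only illustrate how slack before a deadline could be converted into a lower operating point.

```python
def scaled_time_and_energy(cycles, f_nominal, scale):
    """Return (execution_time, relative_energy) at frequency f_nominal * scale.

    Assumed model (illustrative only): V scales with f, and dynamic
    energy per cycle is proportional to V**2.
    """
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    exec_time = cycles / (f_nominal * scale)  # linear slowdown
    relative_energy = scale ** 2              # quadratic energy reduction
    return exec_time, relative_energy


def min_scale_for_deadline(time_at_nominal, deadline):
    """Smallest frequency scale that still meets the deadline (capped at 1)."""
    if deadline <= 0:
        raise ValueError("deadline must be positive")
    return min(1.0, time_at_nominal / deadline)


# A task needing 10 ms at nominal speed with a 20 ms deadline has 10 ms of slack.
scale = min_scale_for_deadline(time_at_nominal=0.010, deadline=0.020)
t, e = scaled_time_and_energy(cycles=1_000_000, f_nominal=100e6, scale=scale)
print(scale, t, e)
```

Under these assumptions, halving the frequency doubles the execution time to the full 20 ms while the energy drops to a quarter. Real savings depend on static power and on the available voltage levels, which this sketch ignores.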
International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2011
Andrew Nelson; Anca Mariana Molnos; Kees Goossens
Embedded Multiprocessor Systems-on-Chip (MPSoCs) commonly run multiple applications at once. These applications may have different time criticalities, i.e. non real-time, soft real-time, and firm or hard real-time. Application-level composability is used to provide each application with its own virtual platform, such that each application may be developed, verified, and executed independently, given its virtual platform specification. Composability of functional and temporal properties has been demonstrated in previous work. In this paper, we extend composability to include power management, where each application can manage its energy usage independently. Each application receives an independent energy and/or power budget, which it can manage as it sees fit, with its own application-specific power-management policy. Time, energy, and power budgets allocated to each application ensure that its power-management policy cannot cause any interference to the functional, timing, and power behaviours of other applications. We implement our technique on an existing composable and predictable hardware platform (CompSoC), and extend its Real-Time Operating System (OS) with a power-management infrastructure. Applications use a power-management API to communicate with the OS that implements time, energy, and power budgets. We demonstrate the applicability of our techniques by running several concurrent applications with their own power managers on an FPGA prototype.
International Conference on Optimization of Electrical and Electronic Equipment | 2010
Anca Mariana Molnos; Jude Angelo Ambrose; Andrew Nelson; Radu Stefan; Sorin Cotofana; Kees Goossens
Multi-processor systems on chip (MPSOC) platforms emerged in embedded systems as hardware solutions to support the continuously increasing functionality and performance demands in this domain. Such a platform has to execute a mix of applications with diverse performance and timing constraints, i.e., real-time or non-real-time, thus different application schedulers should co-exist on an MPSOC. Moreover, applications share many MPSOC resources, thus their timing depends on the arbitration at these resources. Arbitration may create inter-application dependencies, e.g., the timing of a low priority application depends on the timing of all higher priority ones. Application inter-dependencies make the functional and timing verification and the integration process harder. This is especially problematic for real-time applications, for which fulfilling the time-related constraints should be guaranteed by construction. Moreover, energy and power management, commonly employed in embedded systems, make this verification even more difficult. Typically, energy and power management involves scaling the resource's operating point, which has a direct impact on the resource performance and thus influences the application's timing behaviour. Finally, a small change in one application leads to the need to re-verify all other applications, incurring a large effort. Composability is a property meant to ease the verification and integration process. A system is composable if the functionality and the timing behaviour of each application is independent of other applications mapped on the same platform. Composability is achieved by utilising arbiters that ensure application independence. In this paper we present the concepts behind a composable, scalable, energy-managed MPSOC platform, able to support different real-time and non-real-time schedulers concurrently, and discuss its advantages and limitations.
IEEE Journal of Solid-state Circuits | 2014
Ivan Miro-Panades; Edith Beigne; Yvain Thonnart; Laurent Alacoque; Pascal Vivet; Suzanne Lesecq; Diego Puschini; Anca Mariana Molnos; Farhat Thabet; Benoit Tain; Karim Ben Chehida; Sylvain Engels; Robin Wilson; Didier Fuin
In order to optimize global energy efficiency in the context of dynamic process, voltage and temperature variations in advanced nodes, a fine-grain adaptive voltage and frequency scaling architecture is proposed for multiprocessor systems-on-chip (MPSoC), where each processing element is an independent voltage-frequency island. This architecture has been implemented on a 32 nm globally asynchronous locally-synchronous MPSoC. It shows up to 18.2% energy gains thanks to local adaptability compared with a global dynamic voltage and frequency scaling approach using 25% timing margins between slow and nominal process, by reducing margins to 60 ps of the real process. These gains are obtained for a total area overhead of 10% including local frequency/voltage actuators, sensors, and digital controller.
Design, Automation, and Test in Europe | 2014
Andrew Nelson; Ashkan Beyranvand Nejad; Anca Mariana Molnos; Martijn Koedam; Kees Goossens
The functionality of embedded systems is ever increasing. This has led to mixed time-criticality systems, where applications with a variety of real-time requirements co-exist on the same platform and share resources. Due to inter-application interference, verifying the real-time requirements of such systems is generally nontrivial. In this paper, we present the CoMik microkernel that provides temporally predictable and composable processor virtualisation. CoMik's virtual processors are cycle-accurately composable, i.e. their timing cannot affect the timing of co-existing virtual processors by even a single cycle. Real-time applications executing on dedicated virtual processors can therefore be verified and executed in isolation, simplifying the verification of mixed time-criticality systems. We demonstrate these properties through experimentation on an FPGA-prototyped hardware platform.