Evangelia Kasapaki
Technical University of Denmark
Publication
Featured research published by Evangelia Kasapaki.
networks on chips | 2012
Martin Schoeberl; Florian Brandner; Jens Sparsø; Evangelia Kasapaki
This paper explores the design of a circuit-switched network-on-chip (NoC) based on time-division-multiplexing (TDM) for use in hard real-time systems. Previous work has primarily considered application-specific systems. The work presented here targets general-purpose hardware platforms. We consider a system with IP-cores, where the TDM-NoC must provide directed virtual circuits - all with the same bandwidth - between all nodes. This may not be a frequent scenario, but a general platform should provide this capability, and it is an interesting point in the design space to study. The paper presents an FPGA-friendly hardware design, which is simple, fast, and consumes minimal resources. Furthermore, an algorithm to find minimum-period schedules for all-to-all virtual circuits on top of typical physical NoC topologies like 2D-mesh, torus, bidirectional torus, tree, and fat-tree is presented. The static schedule makes the NoC time-predictable and enables worst-case execution time analysis of communicating real-time tasks.
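The idea of a static all-to-all TDM schedule can be illustrated with a minimal sketch. It is not the minimum-period algorithm from the paper: it is a simple greedy heuristic, and it assumes a 2D mesh with dimension-ordered (X then Y) routing and a model in which a packet advances one link per TDM slot. All function names and link labels (`xy_route`, `greedy_all_to_all`, `"inject"`, `"eject"`) are hypothetical and chosen only for illustration.

```python
from itertools import product


def xy_route(src, dst):
    """Directed links of an X-then-Y mesh route, including local links (assumed model)."""
    (x, y), (dx, dy) = src, dst
    links = [("inject", src)]              # local link: core -> router
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    links.append(("eject", dst))           # local link: router -> core
    return links


def try_schedule(routes, period):
    """Greedily assign each circuit a start slot; return None if the period is too short."""
    busy = set()                           # occupied (link, slot) pairs
    starts = {}
    for circuit, links in routes.items():
        for start in range(period):
            needed = [(link, (start + hop) % period)
                      for hop, link in enumerate(links)]
            if not any(entry in busy for entry in needed):
                busy.update(needed)
                starts[circuit] = start
                break
        else:
            return None                    # no free start slot for this circuit
    return starts


def greedy_all_to_all(width, height):
    """Grow the period until the greedy pass schedules every src -> dst circuit."""
    nodes = list(product(range(width), range(height)))
    routes = {(s, d): xy_route(s, d) for s in nodes for d in nodes if s != d}
    period = len(nodes) - 1                # each core injects to N-1 destinations
    while True:
        starts = try_schedule(routes, period)
        if starts is not None:
            return period, starts, routes
        period += 1


def is_conflict_free(starts, routes, period):
    """Check that no link is used by two circuits in the same slot."""
    seen = set()
    for circuit, start in starts.items():
        for hop, link in enumerate(routes[circuit]):
            key = (link, (start + hop) % period)
            if key in seen:
                return False
            seen.add(key)
    return True


if __name__ == "__main__":
    period, starts, routes = greedy_all_to_all(2, 2)
    print("period:", period, "circuits:", len(starts))
    print("conflict-free:", is_conflict_free(starts, routes, period))
```

Because every (link, slot) pair is reserved ahead of time, no arbitration is needed at run time, which is what makes the latency of each virtual circuit predictable. The greedy pass only yields some valid period; it makes no claim of minimality.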
embedded software | 2015
Martin Schoeberl; Sahar Abbaspour; Benny Akesson; Neil C. Audsley; Raffaele Capasso; Jamie Garside; Kees Goossens; Sven Goossens; Scott Hansen; Reinhold Heckmann; Stefan Hepp; Benedikt Huber; Alexander Jordan; Evangelia Kasapaki; Jens Knoop; Yonghui Li; Daniel Prokesch; Wolfgang Puffitsch; Peter P. Puschner; André Rocha; Cláudio Silva; Jens Sparsø; Alessandro Tocchi
Real-time systems need time-predictable platforms to allow static analysis of the worst-case execution time (WCET). Standard multi-core processors are optimized for the average case and are hardly analyzable. Within the T-CREST project we propose novel solutions for time-predictable multi-core architectures that are optimized for the WCET instead of the average-case execution time. The resulting time-predictable resources (processors, interconnect, memory arbiter, and memory controller) and tools (compiler, WCET analysis) are designed to ease WCET analysis and to optimize WCET performance. Compared to other processors the WCET performance is outstanding. The T-CREST platform is evaluated with two industrial use cases. An application from the avionic domain demonstrates that tasks executing on different cores do not interfere with respect to their WCET. A signal processing application from the railway domain shows that the WCET can be reduced for computation-intensive tasks when distributing the tasks on several cores and using the network-on-chip for communication. With three cores the WCET is improved by a factor of 1.8 and with 15 cores by a factor of 5.7. The T-CREST project is the result of a collaborative research and development project executed by eight partners from academia and industry. The European Commission funded T-CREST.
IEEE Transactions on Very Large Scale Integration Systems | 2016
Evangelia Kasapaki; Martin Schoeberl; Rasmus Bo Sørensen; Christoph Thomas Muller; Kees Goossens; Jens Sparsø
In this paper, we present an area-efficient, globally asynchronous, locally synchronous network-on-chip (NoC) architecture for a hard real-time multiprocessor platform. The NoC implements message-passing communication between processor cores. It uses statically scheduled time-division multiplexing (TDM) to control the communication over a structure of routers, links, and network interfaces (NIs) to offer real-time guarantees. The area-efficient design is a result of two contributions: 1) asynchronous routers combined with TDM scheduling and 2) a novel NI microarchitecture. Together they result in a design in which data are transferred in a pipelined fashion, from the local memory of the sending core to the local memory of the receiving core, without any dynamic arbitration, buffering, or clock synchronization. The routers use two-phase bundled-data handshake latches based on the Mousetrap latch controller and are extended with a clock gating mechanism to reduce the energy consumption. The NIs integrate the direct memory access functionality and the TDM schedule, and use dual-ported local memories to avoid buffering, flow control, and synchronization. To verify the design, we have implemented a 4 × 4 bitorus NoC in 65-nm CMOS technology and we present results on area, speed, and energy consumption for the router, NI, NoC, and postlayout.
design, automation, and test in europe | 2013
Jens Sparsø; Evangelia Kasapaki; Martin Schoeberl
Network interfaces (NIs) are used in multi-core systems where they connect processors, memories, and other IP-cores to a packet switched Network-on-Chip (NOC). The functionality of a NI is to bridge between the read/write transaction interfaces used by the cores and the packet-streaming interface used by the routers and links in the NOC. The paper addresses the design of a NI for a NOC that uses time division multiplexing (TDM). By keeping the essence of TDM in mind, we have developed a new area-efficient NI micro-architecture. The new design completely eliminates the need for FIFO buffers and credit based flow control - resources which are reported to account for 50–85% of the area in existing NI designs. The paper discusses the design considerations, presents the new NI micro-architecture, and reports area figures for a range of implementations.
ieee international symposium on asynchronous circuits and systems | 2014
Evangelia Kasapaki; Jens Sparsø
In this paper we explore the use of asynchronous routers in a time-division-multiplexed (TDM) network-on-chip (NOC), Argo, that is being developed for a multi-processor platform for hard real-time systems. TDM inherently requires a common time reference, and existing TDM-based NOC designs are either synchronous or mesochronous. We use asynchronous routers to achieve a simpler, smaller and more robust, self-timed design. Our design exploits the fact that pipelined asynchronous circuits also behave as ripple FIFOs. Thus, it avoids the need for explicit synchronization FIFOs between the routers. Argo has interesting elastic timing properties that allow it to tolerate skew between the network interfaces (NIs). The paper presents the Argo NOC architecture and provides a quantitative analysis of its ability to absorb skew between the NIs. Using a signal transition graph model and realistic component delays derived from a 65 nm CMOS implementation, a worst-case analysis shows that a typical design can tolerate a skew of 1-5 cycles (depending on FIFO depths and NI clock frequency). Simulation results of a 2 × 2 NOC confirm this.
european conference on circuit theory and design | 2015
Evangelia Kasapaki; Jens Sparsø
Argo is a network-on-chip developed for use in a multi-core platform designed specifically for hard real-time applications and it supports message passing across virtual end-to-end channels. Argo implements these channels using time-division-multiplexing (TDM) of the resources in the NOC following a static schedule. This requires some form of global synchrony across the platform. At the same time it is generally accepted that a large chip should employ some form of globally-asynchronous locally-synchronous (GALS) organization. By using asynchronous routers and by rethinking the microarchitecture of the network interfaces we have managed to combine TDM and GALS and obtain a very hardware-efficient implementation of the NOC. The paper gives a brief overview of the Argo NOC and focuses on two important issues: how to safely bring the NOC out of reset and timing analysis of the network of asynchronous routers.
norchip | 2014
Christoph Thomas Muller; Evangelia Kasapaki; Rasmus Bo Sørensen; Jens Sparsø
Asynchronous circuit design is well understood but design tools supporting asynchronous design are largely lacking, and designers are limited to using conventional EDA tools. These tools have a built-in synchronous mind-set and this complicates their use for asynchronous implementation. One example is the key role that clock signals play in specifying timing constraints for synthesis. In this paper we explain how we handled the synthesis and layout of an asynchronous network-on-chip for a multi-core platform. The focus is on the design process, while the actual NOC design and its performance are presented elsewhere.
networks on chips | 2014
I. Kotleas; D. Humphreys; Rasmus Bo Sørensen; Evangelia Kasapaki; Florian Brandner; Jens Sparsø
This paper presents an asynchronous router design for use in time-division-multiplexed (TDM) networks-on-chip. Unlike existing synchronous, mesochronous and asynchronous router designs with similar functionality, the router is able to silently skip over cycles/TDM-slots where no traffic is scheduled and hence avoid all switching activity in the idle links and router ports. In this way switching activity is reduced to the minimum possible amount. The fact that this relaxed synchronization is sufficient to implement TDM scheduling represents a contribution at the conceptual level. The idea can only be implemented using asynchronous circuit techniques. To this end, the paper explores the use of “click-element” templates. Click-element templates use only flip-flops and conventional gates, and this greatly simplifies the design process when using conventional EDA tools and standard cell libraries. Few papers, if any, have explored this.
digital systems design | 2013
Evangelia Kasapaki; Jens Sparsø; Rasmus Bo Sørensen; Kees Goossens
Archive | 2015
Evangelia Kasapaki; Jens Sparsø; Martin Schoeberl