Kees Goossens
Eindhoven University of Technology
Publication
Featured research published by Kees Goossens.
design, automation, and test in europe | 2005
Kees Goossens; John Dielissen; Om Prakash Gangwal; Santiago González Pestana; Andrei Radulescu; Edwin Rijpkema
Systems on chip (SOC) are composed of intellectual property blocks (IP) and interconnect. While mature tooling exists to design the former, tooling for interconnect design is still a research area. In this paper we describe an operational design flow that generates and configures application-specific network on chip (NOC) instances, given application communication requirements. The NOC can be simulated in SystemC and RTL VHDL. An independent performance verification tool verifies analytically that the NOC instance (hardware) and its configuration (software) together meet the application performance requirements. The Æthereal NOC's guaranteed performance is essential to replace time-consuming simulation by fast analytical performance validation. As a result, application-specific NOCs that are guaranteed to meet the application's communication requirements are generated and verified in minutes, reducing the number of design iterations. A realistic MPEG SOC example substantiates our claims.
international conference on hardware/software codesign and system synthesis | 2005
Andreas Hansson; Kees Goossens; Andrei Rǎdulescu
One of the key steps in Network-on-Chip (NoC) based design is spatial mapping of cores and routing of the communication between those cores. Known solutions to the mapping and routing problem first map cores onto a topology and then route communication, using separated and possibly conflicting objective functions. In this paper we present a unified single-objective algorithm, called Unified MApping, Routing and Slot allocation (UMARS). As the main contribution we show how to couple path selection, mapping of cores and TDMA time-slot allocation such that the network required to meet the constraints of the application is minimized. The time-complexity of UMARS is low and experimental results indicate a run-time only 20% higher than that of path selection alone. We apply the algorithm to an MPEG decoder System-on-Chip (SoC), reducing area by 33%, power by 35% and worst-case latency by a factor four over a traditional multi-step approach.
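The coupling between path selection and TDMA slot allocation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a pipelined slot scheme in which a flit that uses slot s on the first link of a path uses slot s + 1 on the next link, and so on, modulo the slot-table size. That scheme, the function name `find_free_start_slots`, and the link names are assumptions made for illustration; this is not the UMARS algorithm itself.

```python
def find_free_start_slots(path, occupied, table_size):
    """Return every start slot s that is usable along the whole path.

    path: ordered list of link names (hypothetical identifiers).
    occupied: dict mapping link name -> set of slots already reserved.
    table_size: number of slots in each link's TDMA slot table.

    Assumption: a reservation starting in slot s on the first link
    occupies slot (s + hop) % table_size on the hop-th link.
    """
    free = []
    for s in range(table_size):
        if all(
            (s + hop) % table_size not in occupied.get(link, set())
            for hop, link in enumerate(path)
        ):
            free.append(s)
    return free


# Illustrative input: two links, four slots, one reservation on each link.
slots = find_free_start_slots(["A", "B"], {"A": {0}, "B": {2}}, 4)
print(slots)  # [2, 3]
```

Because the feasible start slots depend on which links a path crosses, choosing the path and the slots together, rather than one after the other, can avoid paths that look short but have no compatible free slots.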
Design Automation for Embedded Systems | 2002
Andre K. Nieuwland; Jeffrey Kang; Om Prakash Gangwal; Ramanathan Sethuraman; Natalino G. Busá; Kees Goossens; Rafael Peset Llopis; Paul E. R. Lippens
The key issue in the design of Systems-on-a-Chip (SoC) is to trade-off efficiency against flexibility, and time to market versus cost. Current deep submicron processing technologies enable integration of multiple software programmable processors (e.g., CPUs, DSPs) and dedicated hardware components into a single cost-efficient IC. Our top-down design methodology with various abstraction levels helps designing these ICs in a reasonable amount of time. This methodology starts with a high-level executable specification, and converges towards a silicon implementation. A major task in the design process is to ensure that all components (hardware and software) communicate with each other correctly. In this article, we tackle this problem in the context of the signal processing domain in two ways: we propose a modular, flexible, and scalable heterogeneous multi-processor architecture template based on distributed shared memory, and we present an efficient and transparent protocol for communication and (re)configuration. The protocol implementations have been incorporated in libraries, which allows quick traversal of the various abstraction levels, so enabling incremental design. The design decisions to be taken at each abstraction level are evaluated by means of (co-)simulation. Prototyping is used too, to verify the system's functional correctness. The effectiveness of our approach is illustrated by a design case of a multi-standard video and image codec.
design, automation, and test in europe | 2004
Santiago González Pestana; Edwin Rijpkema; Andrei Radulescu; Kees Goossens; Om Prakash Gangwal
A challenge facing designers of systems on chip (SoC) containing networks on chip (NoC) is to find NoC instances that balance the cost (e.g. area) and performance (e.g. latency and throughput). In this paper we present a simulation-based approach to address this problem. We use XML to instantiate network components (routers, network interfaces) and their composition. NoCs are evaluated in terms of cost and performance by sweeping over different parameters (e.g. network topology, network interface queue depth). We then show how we can obtain trade-off plots by using the results obtained with our simulation environment. Finally, by means of two examples we illustrate how trade-off plots can help NoC designers select the right network based on a set of different constraints.
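One common way to read a cost/performance trade-off plot like the ones described above is to keep only the design points that no other point beats on both axes (the Pareto front). The sketch below assumes each swept NoC instance is summarised by an area and a latency value where lower is better. The function name, the instance names, and all numbers are made up for illustration and are not results from the paper.

```python
def pareto_front(points):
    """Return names of points not dominated on (area, latency).

    points: list of (name, area, latency) tuples; lower is better for both.
    A point is dominated if another point is no worse on both metrics
    and strictly better on at least one.
    """
    front = []
    for name, area, latency in points:
        dominated = any(
            (a <= area and l <= latency) and (a < area or l < latency)
            for _, a, l in points
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical sweep results: (instance, area in mm^2, latency in cycles).
sweep = [
    ("mesh_2x2", 1.0, 40),
    ("mesh_3x3", 2.0, 25),
    ("ring", 1.5, 45),
    ("mesh_3x3_deep_queues", 3.0, 25),
]
print(pareto_front(sweep))  # ['mesh_2x2', 'mesh_3x3']
```

A designer with an area budget or a latency bound would then pick among the remaining points.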
design, automation, and test in europe | 2004
Andrei Radulescu; John Dielissen; Kees Goossens; Edwin Rijpkema; Paul Wielage
In this paper we present a network interface for an on-chip network. Our network interface decouples computation from communication by offering a shared-memory abstraction, which is independent of the network implementation. We use a transaction-based protocol to achieve backward compatibility with existing bus protocols such as AXI, OCP and DTL. Our network interface has a modular architecture, which allows flexible instantiation. It provides both guaranteed and best-effort services via connections. These are configured via network interface ports using the network itself, instead of a separate control interconnect. An example instance of this network interface with 4 ports has an area of 0.143 mm² in a 0.13 µm technology, and runs at 500 MHz.
embedded and real-time computing systems and applications | 2008
Benny Akesson; Liesbeth Steffens; Eelke Strooisma; Kees Goossens
The convergence of application domains in new systems-on-chip (SoC) results in systems with many applications with a mix of soft and hard real-time requirements. To reduce cost, resources, such as memories and interconnect, are shared between applications. However, resource sharing introduces interference between the sharing applications, making it difficult to satisfy their real-time requirements. Existing arbiters do not efficiently satisfy the requirements of applications in SoCs, as they either couple rate or allocation granularity to latency, or cannot run at high speeds in hardware with a low-cost implementation. The contribution of this paper is an arbiter called credit-controlled static-priority (CCSP), consisting of a rate regulator and a static-priority scheduler. The rate regulator isolates applications by regulating the amount of provided service in a way that decouples allocation granularity and latency. The static-priority scheduler decouples latency and rate, such that low latency can be provided to any application, regardless of the allocated rate. We show that CCSP belongs to the class of latency-rate servers and guarantees the allocated rate within a maximum latency, as required by hard real-time applications. We present a hardware implementation of the arbiter in the context of a DDR2 SDRAM controller. An instance with six ports running at 200 MHz requires an area of 0.0223 mm² in a 90 nm CMOS process.
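The latency-rate guarantee referred to above has a compact general form: during a busy period, a requestor with allocated rate ρ and latency Θ receives at least max(0, ρ · (t − Θ)) units of service after t time units. The sketch below evaluates that generic bound. The function name and the numeric values are illustrative assumptions, and the sketch does not reproduce how CCSP derives Θ for a given requestor.

```python
def lr_min_service(rho, theta, elapsed):
    """Lower bound on service provided by a latency-rate server.

    rho: allocated rate (service units per cycle).
    theta: service latency of the requestor (cycles).
    elapsed: cycles since the start of the busy period.
    """
    return max(0.0, rho * (elapsed - theta))


# Illustrative values only: rate 0.25 units/cycle, latency 8 cycles.
print(lr_min_service(0.25, 8, 4))   # 0.0 (still within the latency)
print(lr_min_service(0.25, 8, 20))  # 3.0 (12 cycles at the allocated rate)
```

The bound depends only on the requestor's own rate and latency, not on how other requestors behave, which is what makes such a guarantee usable for hard real-time analysis.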
Vlsi Design | 2007
Andreas Hansson; Kees Goossens; Andrei Rădulescu
One of the key steps in Network-on-Chip-based design is spatial mapping of cores and routing of the communication between those cores. Known solutions to the mapping and routing problems first map cores onto a topology and then route communication, using separate and possibly conflicting objective functions. In this paper, we present a unified single-objective algorithm, called Unified MApping, Routing, and Slot allocation (UMARS+). As the main contribution, we show how to couple path selection, mapping of cores, and channel time-slot allocation to minimize the network required to meet the constraints of the application. The time-complexity of UMARS+ is low and experimental results indicate a run-time only 20% higher than that of path selection alone. We apply the algorithm to an MPEG decoder System-on-Chip, reducing area by 33%, power dissipation by 35%, and worst-case latency by a factor four over a traditional waterfall approach.
networks-on-chips | 2003
Kees Goossens; John Dielissen; J Jef van Meerbergen; Peter Poplavko; Andrei Rădulescu; Edwin Rijpkema; Erwin Waterlander; Paul Wielage
Users expect a predictable quality of service (QOS) of embedded systems, even for future, more dynamic, applications. System-on-chip designers use networks on chip (NOC) to solve deep submicron problems, and to divide global problems into local, decoupled problems. NOCs provide services through protocol stacks, and introducing guaranteed services enables IP re-use and platform-based design. It also provides globally predictable behaviour, as required by the user, when combining local, decoupled solutions. There are several levels of QOS commitment (correctness, completion, completion bounds), with increasing cost. A combination of guaranteed and best-effort (no commitment) services combines their respective attractive features: predictable behaviour, and good average resource utilisation. The AETHEREAL NOC is an example of this approach, and forms the basis of a QOS-based design style, as advocated in this chapter.
digital systems design | 2002
Paul Wielage; Kees Goossens
Continuing VLSI technology scaling raises several deep submicron (DSM) problems like relatively slow interconnect, power dissipation and distribution, and signal integrity. Those problems are encountered particularly on long wires for global interconnect. As clock frequencies increase, scaled wires become relatively slower and on-chip communication will be the limiting performance factor of future chips. We explain why efficient sharing of the wires for long distance communication is the solution to this problem. We introduce networks on silicon (NoS), that route packets over shared (semi)-global wires. NoS performance is expected to be high, but comes at a cost. Balancing the performance and cost of a NoS is a major challenge, and we believe busses still have a role to play.
IEEE Transactions on Computers | 2014
Radu Stefan; Anca Mariana Molnos; Kees Goossens
Networks-on-Chip (NoC) are seen as promising interconnect solutions, offering the advantages of scalability and high-frequency operation which the traditional bus interconnects lack. Several NoC implementations have been presented in the literature, some of them having mature tool-flows. The main differentiating factor between the various implementations is the set of services and communication patterns they offer to the end-user. In this paper we present dAElite, a TDM Network-on-Chip that offers a unique combination of features, namely, guaranteed bandwidth and latency per connection, built-in support for multicast, and a short connection set-up time. While our NoC was designed from the ground up, we leverage existing tools for network dimensioning, analysis, and instantiation. We have implemented and tested our proposal in hardware and we compared it to Æthereal, a state-of-the-art NoC with similar features, but no multicast. We find that the connection set-up time is reduced by a factor of 10 and the network traversal latency is decreased by 33 percent. Moreover, considering realistic values of the network parameters, dAElite has a lower hardware area when synthesized in 65 nm technology.