Jens Sparsø

Technical University of Denmark

Publication

Featured research published by Jens Sparsø.

Design, Automation, and Test in Europe | 2005

A Router Architecture for Connection-Oriented Service Guarantees in the MANGO Clockless Network-on-Chip

Tobias Bjerregaard; Jens Sparsø

On-chip networks for future system-on-chip designs need simple, high performance implementations. In order to promote system-level integrity, guaranteed services (GS) need to be provided. We propose a network-on-chip (NoC) router architecture to support this, and demonstrate with a CMOS standard cell design. Our implementation is based on clockless circuit techniques, and thus inherently supports a modular GALS-oriented design flow. Our router exploits virtual channels to provide connection-oriented GS, as well as connection-less best-effort (BE) routing. The architecture is highly flexible, in that support for different types of BE routing and GS arbitration can be easily plugged into the router.

IEEE Transactions on Very Large Scale Integration Systems | 1994

Low-power operation using self-timed circuits and adaptive scaling of the supply voltage

Lars Skovby Nielsen; Cornelis Niessen; Jens Sparsø; C. H. van Berkel

Recent research has demonstrated that for certain types of applications, such as sampled audio systems, self-timed circuits can achieve very low power consumption because unused circuit parts automatically turn into a stand-by mode. Additional savings may be obtained by combining the self-timed circuits with a mechanism that adaptively adjusts the supply voltage to the smallest possible value while maintaining the performance requirements. This paper describes such a mechanism, analyzes the possible power savings, and presents a demonstrator chip that has been fabricated and tested. The idea of voltage scaling has been used previously in synchronous circuits, and the contributions of the present paper are: 1) the combination of supply scaling and self-timed circuitry, which has some unique advantages, and 2) the thorough analysis of the power savings that are possible using this technique.
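
As a rough illustration of why lowering the supply voltage saves power, the sketch below uses the standard first-order CMOS switching-power model, P = C_eff * Vdd^2 * f. It is not the analysis from the paper: the function names and the voltage values are hypothetical, and leakage, short-circuit current, and the losses of the voltage regulator are ignored.

```python
def dynamic_power(c_eff, vdd, ops_per_second):
    """First-order CMOS switching power: P = C_eff * Vdd^2 * f."""
    return c_eff * vdd ** 2 * ops_per_second


def relative_saving(v_nominal, v_scaled):
    """Fraction of switching power saved when the same operation rate
    is sustained at a lower supply voltage (first-order model only)."""
    return 1.0 - (v_scaled / v_nominal) ** 2


# Hypothetical supply voltages, chosen only for illustration.
saving = relative_saving(v_nominal=5.0, v_scaled=3.0)
print(f"{saving:.0%}")  # 64%
```

Under this model the saving depends only on the ratio of the two supply voltages, which is why a scheme that tracks the lowest voltage still meeting the throughput requirement can pay off whenever the workload is light.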

Networks on Chips | 2008

ReNoC: A Network-on-Chip Architecture with Reconfigurable Topology

Mikkel Bystrup Stensgaard; Jens Sparsø

This paper presents a network-on-chip (NoC) architecture that enables the network topology to be reconfigured. The architecture thus enables a generalized System-on-Chip (SoC) platform in which the topology can be customized for the application that is currently running on the chip, including long links and direct links between IP-blocks. The configurability is inserted as a layer between routers and links, and the architecture can therefore be used in combination with existing NoC routers, making it a general architecture. The topology is configured using energy-efficient topology switches based on physical circuit-switching as found in FPGAs. The paper presents the ReNoC (Reconfigurable NoC) architecture and evaluates its potential. The evaluation design shows a 56% decrease in power consumption compared to a static 2D mesh topology.

Proceedings of the IEEE | 1999

Designing asynchronous circuits for low power: an IFIR filter bank for a digital hearing aid

Lars Skovby Nielsen; Jens Sparsø

This paper addresses the design of asynchronous circuits for low power through an example: a filter bank for a digital hearing aid. The asynchronous design re-implements an existing synchronous circuit which is used in a commercial product. For comparison, both designs have been fabricated in the same 0.7 μm CMOS technology. When processing typical data (less than 50 dB sound pressure), the asynchronous control and data-path logic, an improved RAM design, and by a mechanism that adapts the number range to the actual need (exploiting the fact that typical audio signals are characterized by numerically small samples). Apart from the improved RAM design, these measures are only viable in an asynchronous design. The principles and techniques explained in this paper are of a general nature, and they apply to the design of asynchronous low-power digital signal-processing circuits in a broader perspective. In fact, this understanding is one of the contributions of the paper. Finally, the paper can be read as an example-driven introduction to asynchronous low-power design.

IEEE International Symposium on Asynchronous Circuits and Systems | 2005

Scheduling discipline for latency and bandwidth guarantees in asynchronous network-on-chip

Tobias Bjerregaard; Jens Sparsø

Guaranteed services (GS) are important in that they provide predictability in the complex dynamics of shared communication structures. This paper discusses the implementation of GS in an asynchronous network-on-chip. We present a novel scheduling discipline called asynchronous latency guarantee (ALG) scheduling, which provides latency and bandwidth guarantees in accessing a shared medium, e.g. a physical link shared between a number of virtual channels. ALG overcomes the drawbacks of existing scheduling disciplines, in particular, the coupling between latency and bandwidth guarantees. A 0.12 μm CMOS standard cell implementation of an ALG link has been simulated. The operation speed of the design was 702 MDI/s.

Integration | 1993

Delay-insensitive multi-ring structures

Jens Sparsø; Jørgen Staunstrup

This paper describes a set of simple design and performance analysis techniques that have been successfully used to design a number of non-trivial delay-insensitive circuits. Examples are building blocks for digital filters and a vector multiplier using a serial-parallel multiply and accumulate algorithm. The vector multiplier has been laid out, submitted for fabrication, and successfully tested. This design is described in detail to illustrate the design and the performance analysis techniques. The design technique is based on a data flow approach using pipelines and rings that are composed into larger multi-ring structures. For this restricted class of structures, it becomes possible, even for circuits of realistic size and complexity, to analyze the performance and establish an understanding of the bottlenecks. The paper combines a number of previously published results and techniques, and the main contribution of the paper is the comprehensive, integrated presentation of the material, including a thorough description of the vector multiplier design example.

European Design Automation Conference | 1992

Design of delay insensitive circuits using multi-ring structures

Jens Sparsø; Jørgen Staunstrup; Michael Dantzer-Sørensen

The design and VLSI implementation of a delay insensitive circuit that computes the inner product of two vectors is described. The circuit is based on an iterative serial-parallel multiplication algorithm. The design is based on a data flow approach using pipelines and rings that are combined into larger multi-ring structures by the joining and forking of signals. The implementation is based on a small set of building blocks (latches, combinational circuits and switches) that are composed of C-elements and simple gates. By following this approach, delay insensitive circuits with nontrivial functionality and reasonable performance are readily designed.

Networks on Chips | 2012

A Statically Scheduled Time-Division-Multiplexed Network-on-Chip for Real-Time Systems

Martin Schoeberl; Florian Brandner; Jens Sparsø; Evangelia Kasapaki

This paper explores the design of a circuit-switched network-on-chip (NoC) based on time-division-multiplexing (TDM) for use in hard real-time systems. Previous work has primarily considered application-specific systems. The work presented here targets general-purpose hardware platforms. We consider a system with IP-cores, where the TDM-NoC must provide directed virtual circuits - all with the same bandwidth - between all nodes. This may not be a frequent scenario, but a general platform should provide this capability, and it is an interesting point in the design space to study. The paper presents an FPGA-friendly hardware design, which is simple, fast, and consumes minimal resources. Furthermore, an algorithm to find minimum-period schedules for all-to-all virtual circuits on top of typical physical NoC topologies like 2D-mesh, torus, bidirectional torus, tree, and fat-tree is presented. The static schedule makes the NoC time-predictable and enables worst-case execution time analysis of communicating real-time tasks.

IEEE Journal of Solid-state Circuits | 1991

An area-efficient topology for VLSI implementation of Viterbi decoders and other shuffle-exchange type structures

Jens Sparsø; Henrik Jørgensen; Erik Paaske; Steen Pedersen; Thomas Rübner-Petersen

A topology for single-chip implementation of computing structures based on shuffle-exchange (SE)-type interconnection networks is presented. The topology is suited for structures with a small number of processing elements (i.e. 32-128) whose area cannot be neglected compared to the area required for interconnection. The processing elements are implemented in pairs that are connected to form a ring. In this way three-quarters of the interconnections are between neighbors. The ring structure is laid out in two columns and the interconnection of nonneighbors is routed in the channel between the columns. The topology has been used in a VLSI implementation of the add-compare-select (ACS) module of a fully parallel K=7, R=1/2 Viterbi decoder. Both the floor-planning issues and some of the important algorithm and circuit-level aspects of this design are discussed. The chip has been designed and fabricated in a 2 μm CMOS process using MOSIS-like simplified design rules. The chip operates at speeds up to 19 MHz under worst-case conditions (V_DD = 4.75 V and T_A = 70 °C). The core of the chip (excluding pad cells) is 7.8 × 5.1 mm² and contains approximately 50000 transistors. The interconnection network occupies 32% of the area.

Design, Automation, and Test in Europe | 2005

A Network Traffic Generator Model for Fast Network-on-Chip Simulation

Shankar Mahadevan; Federico Angiolini; Michael Storgaard; Rasmus Grøndahl Olsen; Jens Sparsø; Jan Madsen

For systems-on-chip (SoC) development, a predominant part of the design time is the simulation time. Performance evaluation and design space exploration of such systems in bit- and cycle-true fashion is becoming prohibitive. We propose a traffic generation (TG) model that provides a fast and effective network-on-chip (NoC) development and debugging environment. By capturing the type and the timestamp of communication events at the boundary of an IP core in a reference environment, the TG can subsequently emulate the core's communication behavior in different environments. Access patterns and resource contention in a system are dependent on the interconnect architecture, and our TG is designed to capture the resulting reactiveness. The regenerated traffic, which represents a realistic workload, can thus be used to undertake faster architectural exploration of interconnection alternatives, effectively decoupling simulation of IP cores and of interconnect fabrics. The results with the TG on an AMBA interconnect show a simulation time speedup above a factor of 2 over a complete system simulation, with close to 100% accuracy.
