Jari Nurmi

Tampere University of Technology

Publication

Featured research published by Jari Nurmi.

IEEE Transactions on Very Large Scale Integration Systems | 2007

Applying CDMA Technique to Network-on-Chip

Xin Wang; Tapani Ahonen; Jari Nurmi

The issues of applying the code-division multiple access (CDMA) technique to an on-chip packet switched communication network are discussed in this paper. A packet switched network-on-chip (NoC) that applies the CDMA technique is realized in register-transfer level (RTL) using VHDL. The realized CDMA NoC supports the globally-asynchronous locally-synchronous (GALS) communication scheme by applying both synchronous and asynchronous designs. In a packet switched NoC, which applies a point-to-point connection scheme, e.g., a ring topology NoC, data transfer latency varies largely if the packets are transferred to different destinations or to the same destination through different routes in the network. The CDMA NoC can eliminate the data transfer latency variations by sharing the data communication media among multiple users concurrently. A six-node GALS CDMA on-chip network is modeled and simulated. The characteristics of the CDMA NoC are examined by comparing them with the characteristics of an on-chip bidirectional ring topology network. The simulation results reveal that the data transfer latency in the CDMA NoC is a constant value for a certain length of packet and is equivalent to the best case data transfer latency in the bidirectional ring network when data path width is set to 32 bits.
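
The idea of several senders sharing one medium concurrently can be illustrated with a minimal software sketch. It assumes Walsh (Hadamard) spreading codes and one bit per sender; the function names and the choice of code family are illustrative assumptions and are not taken from the paper, whose design is an RTL implementation in VHDL.

```python
def walsh_codes(n):
    """Build Walsh codes by Hadamard doubling (n is assumed to be a power of two)."""
    codes = [[1]]
    while len(codes) < n:
        codes = ([row + row for row in codes]
                 + [row + [-chip for chip in row] for row in codes])
    return codes


def spread(bit, code):
    """Map the bit to +1/-1 and multiply it by every chip of the sender's code."""
    symbol = 1 if bit else -1
    return [symbol * chip for chip in code]


def combine(signals):
    """Model the shared medium: chip-wise sum of all concurrent senders."""
    return [sum(chips) for chips in zip(*signals)]


def despread(channel, code):
    """Correlate the shared signal with one code; the sign recovers that sender's bit."""
    correlation = sum(s * c for s, c in zip(channel, code))
    return 1 if correlation > 0 else 0


# Hypothetical example: three senders (code indices 1, 2, 5) transmit at once.
codes = walsh_codes(8)
bits = {1: 1, 2: 0, 5: 1}
channel = combine([spread(bit, codes[i]) for i, bit in bits.items()])
recovered = {i: despread(channel, codes[i]) for i in bits}
```

Because the codes are mutually orthogonal, each receiver sees only its own sender's contribution after correlation, and every transfer takes the same number of chip periods regardless of which sender and receiver are involved.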

system-level interconnect prediction | 2004

Topology optimization for application-specific networks-on-chip

Tapani Ahonen; David A. Sigüenza-Tortosa; Hong Bin; Jari Nurmi

Compared to the well understood macro networks, networks-on-chip introduce novel design challenges. The characteristics of the system data flows and the knowledge of the required wire lengths can be exploited to optimize for speed and power consumption. A component library for flexible construction of interconnection architectures is being developed at the Tampere University of Technology to enable the creation of application development platforms. The overall design flow of these development platforms is reviewed in this paper. Network-on-chip topology optimization is addressed by describing the methodologies used by an effective design automation tool. The detailed cost functions of the tool capture the factors contributing to the speed and power consumption of asynchronous interconnections, while different abstraction level input information is supported. A case study into the application domain of industrial process control and monitoring is presented in order to evaluate the result quality.

symposium/workshop on electronic design, test and applications | 2002

Interconnect IP node for future system-on-chip designs

Ilkka Saastamoinen; David A. Sigüenza-Tortosa; Jari Nurmi

An interconnect IP (Intellectual Property) node architecture for flexible on-chip communication is introduced. This architecture is targeted for communication in future gigatransistor SoC (System-on-Chip) designs. The interconnect IP will be used as a testing platform when the efficiency of network topologies and routing schemes is investigated for the on-chip environment. The interconnect node uses packet-based communication and is itself a reusable component. The node is constructed from a collection of parameterized and reusable hardware blocks, which include components such as FIFO buffers, routing controllers and standardized interface wrappers. A node can be tuned to fulfill the desired communication characteristics by selecting the internal architecture properly.

Integration | 2004

Issues in the development of a practical NoC: the Proteo concept

David A. Sigüenza-Tortosa; Tapani Ahonen; Jari Nurmi

Network-on-Chip will be one of the cornerstones of future electronics. At Tampere University of Technology we have been working on the development of our own proposal for a flexible on-chip communication network, called Proteo. Proteo introduces the concept of an open library of communication components that can be selected and configured to build highly-customized networks-on-chip. The designer of a new System-on-Chip platform starts with a description of the hardware components of the system and an abstract model of the problem application, and with the help of the Proteo software tools, obtains a synthesizable instance of a packet-switching network that, ideally, meets his requirements. The constraints placed on the type of designs that may use Proteo are minimal and an important part of the process should be automated. In this article we introduce the philosophy behind the project in relation to fundamental deep submicron technology problems, and some of our initial results.

international symposium on circuits and systems | 2003

Buffer implementation for Proteo network-on-chip

Ilkka Saastamoinen; M. Alho; Jari Nurmi

Proteo is a synthesizable packet switched NoC (Network-on-Chip) architecture which is built from a library of interconnect IP (Intellectual Property) blocks. The library includes two types of blocks: interfaces to the network and routing nodes, which are the building blocks of the actual communication structure. When it is necessary to store packets, they are placed in FIFO buffers in the interconnect IPs. Compared to the control logic the buffers are functionally simple, but in networks they consume most of the silicon area. However, the smaller the buffers are, the greater the possibility that some traffic is lost. In this paper the properties of buffers are studied with a test network. Gate-level estimates of the area of the networks were generated using 0.18 μm technology. Performance of the networks and utilization of buffers in the networks were studied by simulation. Simulation and synthesis show that there exists an optimal point where the product of the required silicon area and the required clock cycles of the simulation is minimized. Since buffers consume most of the silicon area in the networks, the results show that it is necessary to adjust packet and buffer sizes when an optimal cost/performance ratio of the network is desired.

international symposium on system-on-chip | 2005

Network-on-Chip: A New Paradigm for System-on-Chip Design

Jari Nurmi

Network-on-chip is a novel category for on-chip communication where the abstraction of layered protocols is utilized to modularize the communication design. This also implies that the computation and communication are separated from each other. In this invited paper, the motivation for such an approach is explained, terminology within the network-on-chip area is clarified, some of the proposed networks are analyzed, and the network-on-chip design aspects are discussed.

Journal of Real-time Image Processing | 2008

A coarse-grain reconfigurable architecture for multimedia applications featuring subword computation capabilities

Claudio Brunelli; Fabio Garzia; Jari Nurmi

This paper presents the design and the implementation of a coarse-grain reconfigurable machine used as an accelerator for a programmable RISC core, to speed up the execution of computationally demanding tasks like multimedia applications. We created a VHDL model of the proposed architecture and implemented it on an FPGA board for prototyping purposes; then we mapped some DSP and image processing algorithms onto our architecture as a benchmark. In particular, we provided the proposed architecture with subword computation capabilities, which turn out to be extremely effective especially when dealing with image processing algorithms, achieving significant benefits in terms of speed and efficiency in resource usage. To create the configuration bitstream (configware) we created a tool based on a graphical user interface (GUI) which provides a first step towards the automation of the programming flow of our design: the tool is meant to ease the life of the programmer, relieving him from the burden of calculating the configuration bits by hand. Synthesis results indicate that the area occupation and the operating frequency of our design are reasonable also when compared to other similar designs. In addition to this, the number of clock cycles taken by our machine to perform a given algorithm is orders of magnitude smaller than that required by a corresponding software implementation on a RISC microprocessor.
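
As a rough illustration of what subword computation means, the sketch below adds four independent 8-bit values packed into one 32-bit word, without letting carries cross lane boundaries. It is a generic software emulation of the idea; the lane width, the function name, and the masking trick are assumptions chosen for the example and do not describe the hardware presented in the paper.

```python
LANE_MSB = 0x80808080   # top bit of each 8-bit lane in a 32-bit word
WORD_MASK = 0xFFFFFFFF  # keep results within 32 bits


def add_subwords_8(a, b):
    """Add four packed 8-bit lanes of a and b, each lane wrapping modulo 256."""
    # Add the low 7 bits of every lane; the sum cannot carry out of a lane.
    low = (a & ~LANE_MSB & WORD_MASK) + (b & ~LANE_MSB & WORD_MASK)
    # Restore each lane's top bit as the XOR of both inputs and the carry-in.
    return (low ^ ((a ^ b) & LANE_MSB)) & WORD_MASK


# Hypothetical example: lanes 0x01, 0xFF, 0x7F, 0x10 each incremented by one.
result = add_subwords_8(0x01FF7F10, 0x01010101)
print(hex(result))  # prints 0x2008011
```

A single word-wide operation therefore processes four pixel-sized operands at once, which is why this style of computation suits image processing workloads.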

signal processing systems | 2002

Flexible implementation of a WCDMA Rake receiver

Lasse Harju; Mika Kuulusa; Jari Nurmi

This paper presents an ASIC implementation of a WCDMA Rake receiver targeted for mobile terminals. The implementation is based on a FlexRake architecture that shares hardware resources between multipath components and uses data-level parallelism for despreading multiple code channels. This approach facilitates the flexibility of multipath operation and improves the receiver hardware efficiency. The architecture was implemented using register-transfer-level VHDL description and logic synthesis with standard cells. Synthesis for 0.18 μm CMOS technology resulted in 0.238 mm² area and 45.5 μW power consumption at 1.6 V.

international symposium on system-on-chip | 2003

COFFEE - a core for free

J. Kylliainen; Jari Nurmi; M. Kuulusa

This paper presents the design and implementation of an open source processor core developed at Tampere University of Technology, Finland. The design guidelines of a RISC core are introduced and some of the typical design tradeoffs are presented. The architecture of the developed processor engine, the COFFEE RISC core, is explained.

field-programmable logic and applications | 2009

CREMA: A coarse-grain reconfigurable array with mapping adaptiveness

Fabio Garzia; Waqar Hussain; Jari Nurmi

This paper presents CREMA, a coarse-grain reconfigurable array with mapping adaptiveness. Mapping adaptiveness consists of tailoring the array to the requirements of a specific application. Run-time reconfigurability allows the reuse of the same PE with different functionality and interconnections among the ones supported. This approach proved very efficient compared with a standard CGRA. In our test cases CREMA achieves a performance speed-up of 1.5X to 4X, while at the same time reducing the area occupation by 80%–90% in comparison with the Butter CGRA.
