Is this you? Create Your Porfile

Erno Salminen

Tampere University of Technology

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Erno Salminen is active.

Explore More

Publication

Featured researches published by Erno Salminen.

ACM Transactions in Embedded Computing Systems | 2006

UML-based multiprocessor SoC design framework

Tero Kangas; Petri Kukkala; Heikki Orsila; Erno Salminen; Marko Hännikäinen; Timo D. Hämäläinen; Jouni Riihimäki; Kimmo Kuusilinna

This paper describes a complete design flow for multiprocessor systems-on-chips (SoCs) covering the design phases from system-level modeling to FPGA prototyping. The design of complex heterogeneous systems is enabled by raising the abstraction level and providing several system-level design automation tools. The system is modeled in a UML design environment following a new UML profile that specifies the practices for orthogonal application and architecture modeling. The design flow tools are governed in a single framework that combines the subtools into a seamless flow and visualizes the design process. Novel features also include an automated architecture exploration based on the system models in UML, as well as the automatic back and forward annotation of information in the design flow. The architecture exploration is based on the global optimization of systems that are composed of subsystems, which are then locally optimized for their particular purposes. As a result, the design flow produces an optimized component allocation, task mapping, and scheduling for the described application. In addition, it implements the entire system for FPGA prototyping board. As a case study, the design flow is utilized in the integration of state-of-the-art technology approaches, including a wireless terminal architecture, a network-on-chip, and multiprocessing utilizing RTOS in a SoC. In this study, a central part of a WLAN terminal is modeled, verified, optimized, and prototyped with the presented framework.

international symposium on circuits and systems | 2002

Overview of bus-based system-on-chip interconnections

Erno Salminen; Vesa Lahtinen; Kimmo Kuusilinna; Timo D. Hämäläinen

This paper introduces the basic properties, such as structure, transfer properties and arbitration of bus-based interconnections for System-on-Chip (SoC) designs. The overview shows that contemporary SoC buses differ only in minor details. As a result, practically every studied interconnection method could rather easily conform to a common interface. Such an interface would enhance design re-use and make system design easier. However, due to their similarity, the choice between buses is not a straightforward task.

Journal of Systems Architecture | 2007

Automated memory-aware application distribution for Multi-processor System-on-Chips

Heikki Orsila; Tero Kangas; Erno Salminen; Timo D. Hämäläinen; Marko Hännikäinen

Mapping of applications on a Multi-processor System-on-Chip (MP-SoC) is a crucial step to optimize performance, energy and memory constraints at the same time. The problem is formulated as finding solutions to a cost function of the algorithm performing mapping and scheduling under strict constraints. Our solution is based on simultaneous optimization of execution time and memory consumption whereas traditional methods only concentrate on execution time. Applications are modeled as static acyclic task graphs that are mapped on MP-SoC with customized simulated annealing. The automated mapping in this paper is especially purposed for MP-SoC architecture exploration, which typically requires a large number of trials without human interaction. For this reason, a new parameter selection scheme for simulated annealing is proposed that sets task mapping specific optimization parameters automatically. The scheme bounds optimization iterations to a reasonable limit and defines an annealing schedule that scales up with application and architecture complexity. The presented parameter selection scheme compared to extensive optimization achieves 90% goodness in results with only 5% optimization time, which helps large-scale architecture exploration where optimization time is important. The optimization procedure is analyzed with simulated annealing, group migration and random mapping algorithms using test graphs from the Standard Task Graph Set. Simulated annealing is found better than other algorithms in terms of both optimization time and the result. Simultaneous time and memory optimization method with simulated annealing is shown to speed up execution by 63% without memory buffer size increase. As a comparison, optimizing only execution time yields 112% speedup, but also increases memory buffers by 49%.

networks on chips | 2007

Towards Open Network-on-Chip Benchmarks

Cristian Grecu; André Ivanov; R. Pande; Axel Jantsch; Erno Salminen; Umit Y. Ogras; Radu Marculescu

Measuring and comparing performance, cost, and other features of advanced communication architectures for complex multi core/multiprocessor systems on chip is a significant challenge which has hardly been addressed so far. This document outlines the top-level view on a system of benchmarks for networks on chip (NoC), which intends to cover a wide spectrum of NoC design aspects, from application modeling to performance evaluation and post-manufacturing test and reliability. For performance benchmarking, requirements and features are described for application programs, synthetic micro-benchmarks, and abstract benchmark applications. Then, it proposes ways to measure and benchmark reliability, fault tolerance and testability of the on-chip communication fabric. This paper introduces the main concepts and ideas for benchmarking NoCs in a systematic and comparable way. It will be followed up by a report that will define a benchmark framework and the syntax of interfaces for benchmark programs that will allow the community to build-up a benchmark suite

digital systems design | 2007

On network-on-chip comparison

Erno Salminen; Ari Kulmala; Timo D. Hämäläinen

This paper presents the state-of-the-art in the field of network-on-chip (NoC) benchmarking and comparison. The study identifies the mainstream approaches, how NoCs are currently evaluated, and shows which aspects have been covered and those needing more research effort. No single article can cover all the aspects, and therefore, possibility to compare results from various sources must be ensured by proper scientific reporting. Basic guidelines for achieving that are given.

signal processing systems | 2006

HIBI Communication Network for System-on-Chip

Erno Salminen; Tero Kangas; Timo D. Hämäläinen; Jouni Riihimäki; Vesa Lahtinen; Kimmo Kuusilinna

This paper presents a communication network targeted for complex system-on-chip (SoC) and network-on-chip (NoC) designs. The Heterogeneous IP Block Interconnection (HIBI) aims at maximum efficiency and minimum energy per transmitted bit combined with quality-of-service (QoS) in transfers. Other features include support for hierarchical topologies with several clock domains, flexible scalability, and runtime reconfiguration of network parameters. HIBI is intended for integrating coarse-grain components such as intellectual property (IP) blocks that have size of thousands of gates.HIBI has been implemented in VHDL and SystemC and synthesized on several CMOS technologies and on FPGA. A 32-bit wrapper requires 5400 gates and runs with 315 MHz on 0.18 μ m technology which shows that only minimal area overhead is paid for the advanced features. The area and frequency results are well comparable to other NoC proposals.Furthermore, data transfers are shown to approach the maximum theoretical performance for protocol efficiency. HIBI network is accompanied with a design framework with tools for optimizing the system through automated design space exploration.

international symposium on circuits and systems | 2005

HIBI-based multiprocessor SoC on FPGA

Erno Salminen; Ari Kulmala; Timo D. Hämäläinen

An FPGA offers an excellent platform for a system-on-chip consisting of intellectual property (IP) blocks. The problem is that IP blocks and their interconnections are often FPGA vendor dependent. Our HIBI (heterogeneous IP block interconnection) network-on-chip (NoC) scheme solves the problem by providing a flexible interconnection network and IP block integration with an open core protocol (OCP) interface. Therefore, IP components can be of any type: processors; hardware accelerators; communication interfaces; memories. As a proof of concept, a multiprocessor system with eight soft processor cores and HIBI is prototyped on FPGA. The whole system uses 36,402 logic elements, 2.9 Mbits of RAM, and operates at 78 MHz frequency on the Altera Stratix 1S40, which is comparable to other FPGA multiprocessors. The most important benefit is significant reduction of the design effort compared to system specific interconnection networks. HIBI also presents the first OCP compliant IP-block integration in FPGA.

Journal of Systems Architecture | 2012

MARTE profile extension for modeling dynamic power management of embedded systems

Tero Arpinen; Erno Salminen; Timo D. Hämäläinen; Marko Hännikäinen

The profile for Modeling and Analysis of Real-time and Embedded systems (MARTE) is a standard UML profile promoted by the Object Management Group (OMG). MARTE defines a framework for annotating non-functional properties of embedded systems to UML models as well as a generic package for modeling power consumption and heat dissipation of HW components. However, for modeling and analysing systems that adopt complex dynamic power management (DPM) policies and techniques additional expression power is needed. This article presents a way of modeling system-wide dynamic power management aspects of embedded systems with a UML2 profile extension. The proposed profile is compatible with the MARTE profile and can be used as its extension. The main idea of our proposal is that each HW component is associated with a state machine description that defines its time-variant power characteristics. Based on these, the system-wide power configurations are identified and modeled. Finally, application use cases or operational modes are bound to execute on certain power configurations. The models can be analysed to estimate the total energy dissipation. The MARTE and proposed DPM profile are used to model two case study platforms with different kind of DPM strategies.

field-programmable logic and applications | 2005

A parallel MPEG-4 encoder for FPGA based multiprocessor SoC

Olli Lehtoranta; Erno Salminen; Ari Kulmala; Marko Hännikäinen; Timo D. Hämäläinen

A parallel MPEG-4 simple profile encoder for FPGA based multiprocessor system-on-chip (SoC) is presented. The goal is a computationally scalable framework independent of platform. The scalability is achieved by spatial parallelization where images are divided to horizontal slices. Slice coding tasks are mapped to the multiprocessor consisting of four soft-cores arranged into master-slave configuration. Also, the shared memory model is adopted where large images are stored in shared external memory while small on-chip buffers are used for processing. The interconnections between memories and processors are realized with our HIBI network. Our main contributions are the scalable encoder framework as well as methods for coping with limited memory of FPGA. The current software only implementation processes 6 QCIF frames/s with three encoding slaves. In practice, speed-ups of 1.7 and 2.3 have been measured with two and three slaves, respectively. FPGA utilization of current implementation is 59% requiring 24 207 logic elements on Altera Stratix EP1S40.

international symposium on system-on-chip | 2006

Parameterizing Simulated Annealing for Distributing Task Graphs on Multiprocessor SoCs

Heikki Orsila; Tero Kangas; Erno Salminen; Timo D. Hämäläinen

Mapping an application on multiprocessor system-on-chip (MPSoC) is a crucial step in architecture exploration. The problem is to minimize optimization effort and application execution time. Simulated annealing is a versatile algorithm for hard optimization problems, such as task distribution on MPSoCs. We propose a new method of automatically selecting parameters for a modified simulated annealing algorithm to save optimization effort. The method determines a proper annealing schedule and transition probabilities for simulated annealing, which makes the algorithm scalable with respect to application and platform size. Applications are modeled as static acyclic task graphs which are mapped to an MPSoC. The parameter selection method is validated by extensive simulations with 50 and 300 node graphs from the standard graph set

Explore More