Juha Plosila

Information Technology University

Publication

Featured research published by Juha Plosila.

VLSI Design | 2007

Online Reconfigurable Self-Timed Links for Fault Tolerant NoC

Teijo Lehtonen; Pasi Liljeberg; Juha Plosila

We propose link structures for NoC that efficiently tolerate transient, intermittent, and permanent errors. This is a necessary step toward implementing reliable systems in future nanoscale technologies. Protection against transient errors is realized using Hamming coding and interleaving for error detection, with retransmission as the recovery method. We introduce two approaches for tackling intermittent and permanent errors. In the first approach, spare wires are introduced together with reconfiguration circuitry. The other approach uses time redundancy: the transmission is split into two parts, where the data is doubled. In both structures, the presence of permanent or intermittent errors is monitored by analyzing previous error syndromes. The links are based on self-timed signaling in which the handshake signals are protected using triple modular redundancy. We present the structures, operation, and designs for the different components of the links. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in fault tolerance at the cost of performance and area, and with only a slight increase in power consumption.
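
The detection and monitoring ideas in this abstract can be illustrated with a minimal Python sketch. It assumes a Hamming(7,4) code, a repeat threshold of three, and the helper names `hamming74_encode`, `hamming74_syndrome`, `suspect_wire`, and `tmr_vote`; none of these details come from the paper, which does not state its code parameters here.

```python
from collections import Counter


def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Bit positions 1..7 hold p1 p2 d1 p3 d2 d3 d4 (parity bits at powers of two).
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_syndrome(c):
    """Return 0 if no error is seen, else the 1-based position of a single flipped bit."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    return s1 + 2 * s2 + 4 * s3


def suspect_wire(syndromes, threshold=3):
    """Flag a bit position when the same nonzero syndrome repeats `threshold` times.

    Assumption: a repeated identical syndrome hints at an intermittent or
    permanent fault on one wire rather than a transient upset.
    """
    counts = Counter(s for s in syndromes if s != 0)
    for position, seen in counts.most_common(1):
        if seen >= threshold:
            return position
    return None


def tmr_vote(a, b, c):
    """Majority vote over three copies of a one-bit handshake signal."""
    return (a & b) | (a & c) | (b & c)
```

In a detection-plus-retransmission scheme, a receiver would request a resend whenever `hamming74_syndrome` returns a nonzero value and would append that value to the history passed to `suspect_wire`.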

IEEE Transactions on Services Computing | 2015

Using Ant Colony System to Consolidate VMs for Green Cloud Computing

Fahimeh Farahnakian; Adnan Ashraf; Tapio Pahikkala; Pasi Liljeberg; Juha Plosila; Ivan Porres; Hannu Tenhunen

The high energy consumption of cloud data centers is a matter of great concern. Dynamic consolidation of Virtual Machines (VMs) presents a significant opportunity to save energy in data centers. A VM consolidation approach uses live migration of VMs so that some of the under-loaded Physical Machines (PMs) can be switched off or put into a low-power mode. On the other hand, achieving the desired level of Quality of Service (QoS) between cloud providers and their users is critical. Therefore, the main challenge is to reduce the energy consumption of data centers while satisfying QoS requirements. In this paper, we present a distributed system architecture to perform dynamic VM consolidation to reduce energy consumption of cloud data centers while maintaining the desired QoS. Since the VM consolidation problem is strictly NP-hard, we use an online optimization metaheuristic algorithm called Ant Colony System (ACS). The proposed ACS-based VM Consolidation (ACS-VMC) approach finds a near-optimal solution based on a specified objective function. Experimental results on real workload traces show that ACS-VMC reduces energy consumption while maintaining the required performance levels in a cloud data center. It outperforms existing VM consolidation approaches in terms of energy consumption, number of VM migrations, and QoS requirements concerning performance.

Design Automation Conference | 2013

Smart hill climbing for agile dynamic mapping in many-core systems

Mohammad Fattah; Masoud Daneshtalab; Pasi Liljeberg; Juha Plosila

A stochastic hill climbing algorithm is adapted to rapidly find an appropriate start node for application mapping in network-based many-core systems. Because the workload of such systems is highly dynamic and unpredictable, an agile run-time task allocation scheme is required. The scheme should map the tasks of an incoming application at run-time onto an optimum contiguous area of the available nodes. Contiguous and unfragmented area mapping places the communicating tasks in close proximity. Hence, the power dissipation, the congestion between different applications, and the latency of the system are significantly reduced. To find an optimum region, we first propose an approximate model that quickly estimates the available area around a given node. Then the stochastic hill climbing algorithm is used as a search heuristic to find a node that has the required number of available nodes around it. The presented agile climber takes its steps using an adapted version of the hill climbing algorithm named Smart Hill Climbing (SHiC), which takes the run-time status of the system into account. Finally, the application mapping is performed starting from the selected first node. Experiments show a significant gain in mapping contiguousness, which results in better network latency and power dissipation compared to state-of-the-art works.
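
The search described in this abstract can be sketched in Python as plain stochastic hill climbing over a 2D mesh. This is not the SHiC algorithm itself: the square-window area estimate, the 0/1 `free` grid, and the function names `free_area` and `hill_climb_start_node` are assumptions made for illustration.

```python
import random


def free_area(free, x, y, radius=1):
    """Count free nodes in the square window of the given radius centred on (x, y)."""
    rows, cols = len(free), len(free[0])
    return sum(
        free[j][i]
        for j in range(max(0, y - radius), min(rows, y + radius + 1))
        for i in range(max(0, x - radius), min(cols, x + radius + 1))
    )


def hill_climb_start_node(free, needed, start, max_steps=50, rng=random):
    """Move to a randomly chosen, strictly better neighbour until the
    estimated free area around the current node reaches `needed`."""
    x, y = start
    rows, cols = len(free), len(free[0])
    for _ in range(max_steps):
        score = free_area(free, x, y)
        if score >= needed:
            return (x, y)
        neighbours = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < cols and 0 <= y + dy < rows
        ]
        better = [n for n in neighbours if free_area(free, *n) > score]
        if not better:
            return None  # local optimum; a caller could restart from another node
        x, y = rng.choice(better)
    return None
```

On a 4x4 mesh whose two left columns are occupied, a search that starts in the occupied corner climbs toward the free right-hand side and stops at the first node whose window holds enough free nodes.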

Networks on Chips | 2012

HARAQ: Congestion-Aware Learning Model for Highly Adaptive Routing Algorithm in On-Chip Networks

Masoumeh Ebrahimi; Masoud Daneshtalab; Fahimeh Farahnakian; Juha Plosila; Pasi Liljeberg; Maurizio Palesi; Hannu Tenhunen

Congestion in on-chip networks can severely degrade performance because of increased message latency. In a mesh topology, minimal methods can propagate messages over two directions at each switch. When the shortest paths are congested, sending more messages through them can worsen the congestion considerably. In this paper, we present an adaptive routing algorithm for on-chip networks that provides a wide range of alternative paths between each pair of source and destination switches. Initially, the algorithm determines all permitted turns in the network, including 180-degree turns on a single channel, without creating cycles. The implementation of the algorithm provides the best usage of all allowable turns to route messages more adaptively in the network. On top of that, an optimized and scalable learning method is used to select a less congested path. The learning method is based on local and global congestion information and can estimate the latency from each output channel to the destination region.

Software Engineering and Advanced Applications | 2013

LiRCUP: Linear Regression Based CPU Usage Prediction Algorithm for Live Migration of Virtual Machines in Data Centers

Fahimeh Farahnakian; Pasi Liljeberg; Juha Plosila

Virtualization is a vital technology of cloud computing that enables a physical host to be partitioned into several Virtual Machines (VMs). With this technology, the number of active hosts can be reduced according to the resource requirements by using live migration, in order to minimize power consumption. However, the Service Level Agreement (SLA) is essential for maintaining reliable quality of service between data centers and their users in the cloud environment. Therefore, reducing the SLA violation level and reducing power costs are the two objectives of this paper. We present a CPU usage prediction method based on the linear regression technique. The proposed approach approximates the short-term future CPU utilization based on the history of usage in each host. It is employed in the live migration process to predict over-loaded and under-loaded hosts. When a host becomes over-loaded, some VMs migrate to other hosts to avoid SLA violation. Moreover, when a host becomes under-loaded, all of its VMs are first migrated away, and the host then switches to sleep mode to reduce power consumption. Experimental results on real workload traces from more than a thousand PlanetLab VMs show that the proposed technique can significantly reduce energy consumption and SLA violation rates.
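
The prediction step in this abstract can be sketched as an ordinary least-squares line fitted to recent utilization samples and extrapolated one step ahead. The sketch below is in Python; the thresholds `upper=0.9` and `lower=0.2`, the clamping to the range 0 to 1, and the function names are illustrative assumptions, not values taken from the paper.

```python
def predict_next_utilization(history):
    """Fit a least-squares line through (0, u0), (1, u1), ... and
    extrapolate it to the next time step, clamped to [0, 1]."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two utilization samples")
    mean_t = (n - 1) / 2
    mean_u = sum(history) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (u - mean_u) for t, u in enumerate(history))
    slope = sxy / sxx
    intercept = mean_u - slope * mean_t
    prediction = intercept + slope * n
    return min(1.0, max(0.0, prediction))


def classify_host(history, upper=0.9, lower=0.2):
    """Label a host from its predicted utilization (thresholds are hypothetical)."""
    predicted = predict_next_utilization(history)
    if predicted >= upper:
        return "over-loaded"
    if predicted <= lower:
        return "under-loaded"
    return "normal"
```

A consolidation loop could call `classify_host` once per host per monitoring interval, migrating some VMs away from hosts labelled over-loaded and draining hosts labelled under-loaded before putting them to sleep.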

Networks on Chips | 2011

Congestion aware, fault tolerant, and thermally efficient inter-layer communication scheme for hybrid NoC-bus 3D architectures

Amir-Mohammad Rahmani; Khalid Latif; Kameswar Rao Vaddina; Pasi Liljeberg; Juha Plosila; Hannu Tenhunen

Three-dimensional IC technology offers greater device integration and shorter interlayer interconnects. To take advantage of these attributes, the 3D stacked mesh architecture was proposed, which is a hybrid between a packet-switched network and a bus. Stacked mesh is a feasible architecture that provides both performance and area benefits, while suffering from inefficient intermediate buffers. In this paper, an efficient architecture to optimize system performance, power consumption, and reliability of stacked mesh 3D NoC is proposed. The mechanism benefits from a congestion-aware and bus-failure-tolerant routing algorithm called AdaptiveZ for vertical communication. In addition, we hybridize the proposed adaptive routing with available algorithms to mitigate thermal issues by herding most of the switching activity closer to the heat sink. Our extensive simulations with synthetic and real benchmarks, including one with an integrated video-conference application, demonstrate significant power, performance, and peak temperature improvements compared to a typical stacked mesh 3D NoC.

Parallel, Distributed and Network-Based Processing | 2014

Energy-Efficient Virtual Machines Consolidation in Cloud Data Centers Using Reinforcement Learning

Fahimeh Farahnakian; Pasi Liljeberg; Juha Plosila

Dynamic consolidation techniques optimize resource utilization and reduce energy consumption in Cloud data centers. They should consider the variability of the workload to decide when idle or underutilized hosts switch to sleep mode in order to minimize energy consumption. In this paper, we propose a Reinforcement Learning-based Dynamic Consolidation method (RL-DC) to minimize the number of active hosts according to the current resource requirements. RL-DC uses an agent to learn the optimal policy for determining the host power mode by using a popular reinforcement learning method. The agent learns from past knowledge to decide when a host should be switched to the sleep or active mode and improves itself as the workload changes. Therefore, RL-DC does not require any prior information about the workload, and it dynamically adapts to the environment to achieve online energy and performance management. Experimental results on real workload traces from more than a thousand PlanetLab virtual machines show that RL-DC minimizes energy consumption and maintains required performance levels.
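
The abstract does not name the reinforcement learning method, so the Python sketch below assumes tabular Q-learning with an epsilon-greedy policy as one common choice. The state labels, the reward signal, the hyperparameter values, and the class name `PowerModeAgent` are all hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ("active", "sleep")


class PowerModeAgent:
    """Tabular Q-learning agent that picks a power mode for a host state."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, rng=None):
        self.q = defaultdict(float)  # maps (state, action) to an estimated value
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability
        self.rng = rng or random.Random()

    def choose(self, state):
        """Explore with probability epsilon, otherwise take the best-known action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward plus discounted best next value."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In such a setup the state could be a discretized utilization level, and the reward could combine an energy saving with a penalty for performance violations, so the agent needs no prior workload model.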

IEEE Transactions on Computers | 2014

Path-Based Partitioning Methods for 3D Networks-on-Chip with Minimal Adaptive Routing

Masoumeh Ebrahimi; Masoud Daneshtalab; Pasi Liljeberg; Juha Plosila; Jose Flich; Hannu Tenhunen

Combining the benefits of 3D ICs and Networks-on-Chip (NoCs) provides a significant performance gain in Chip Multiprocessor (CMP) architectures. As multicast communication is commonly used in cache coherence protocols for CMPs and in various parallel applications, the performance of these systems can be significantly improved if multicast operations are supported at the hardware level. In this paper, we present several partitioning methods for the path-based multicast approach in 3D mesh-based NoCs, each with different levels of efficiency. In addition, we develop novel analytical models for unicast and multicast traffic to explore the efficiency of each approach. In order to distribute the unicast and multicast traffic more efficiently over the network, we propose the Minimal and Adaptive Routing (MAR) algorithm for the presented partitioning methods. The analytical and experimental results show that a method named Recursive Partitioning (RP) outperforms the other approaches. RP recursively partitions the network until all partitions contain a comparable number of switches, so the multicast traffic is equally distributed among several subsets and the network latency is considerably decreased. The simulation results reveal that the RP method achieves performance improvement across all workloads, and that performance can be further improved by utilizing the MAR algorithm. An average latency reduction of 19 percent and a maximum latency reduction of 42 percent are obtained on SPLASH-2 and PARSEC benchmarks running on a 64-core CMP.

International Conference on Computer Design | 2012

CoNA: Dynamic application mapping for congestion reduction in many-core systems

Mohammad Fattah; Marco Ramirez; Masoud Daneshtalab; Pasi Liljeberg; Juha Plosila

Increasing the number of processors on a single chip toward network-based many-core systems requires a run-time task allocation algorithm. We propose an efficient mapping algorithm that assigns communicating tasks of incoming applications onto resources of a many-core system using the Network-on-Chip paradigm. Our contiguous neighborhood allocation (CoNA) algorithm targets the reduction of both internal and external congestion, because congestion has a detrimental impact on network performance. We approach the goal by keeping the mapped region contiguous and placing the communicating tasks in a close neighborhood. A completely synthesizable simulation environment is provided, in which none of the system objects are assumed to be ideal. Experiments show at least a 40% gain in different mapping cost functions, as well as a 16% reduction in average network latency compared to existing algorithms.

Asia and South Pacific Design Automation Conference | 2013

MD: Minimal path-based fault-tolerant routing in on-chip networks

Masoumeh Ebrahimi; Masoud Daneshtalab; Juha Plosila; Farhad Mehdipour

The communication requirements of many-core embedded systems are met by the emerging Network-on-Chip (NoC) paradigm. As on-chip communication reliability is a crucial factor in many-core systems, the NoC paradigm should address reliability issues. Using fault-tolerant routing algorithms to reroute packets around faulty regions increases the packet latency and creates congestion around the faulty region. On the other hand, the performance of an NoC is highly affected by network congestion. Congestion in the network can increase the delay of packets routed from a source to a destination, so it should be avoided. In this paper, a minimal and defect-resilient (MD) routing algorithm is proposed in order to route packets adaptively through the shortest paths in the presence of a faulty link, as long as a path exists. To avoid congestion, output channels can be adaptively chosen whenever the distance from the current node to the destination node is greater than one hop along both directions. In addition, an analytical model is presented to evaluate MD for two-faulty cases.
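
The general idea of choosing among shortest-path output channels while avoiding a faulty link can be sketched in Python as below. This is a generic minimal adaptive selection for a 2D mesh, not the MD algorithm from the paper; the direction labels, the `faulty_links` set of (node, direction) pairs, the `congestion` dictionary, and the function names are assumptions for illustration.

```python
def minimal_directions(cur, dst):
    """Return the output directions that move one hop closer to dst on a 2D mesh."""
    (cx, cy), (dx, dy) = cur, dst
    dirs = []
    if dx > cx:
        dirs.append("E")
    elif dx < cx:
        dirs.append("W")
    if dy > cy:
        dirs.append("N")
    elif dy < cy:
        dirs.append("S")
    return dirs


def select_output(cur, dst, faulty_links, congestion):
    """Pick the least congested minimal direction whose outgoing link is healthy.

    Returns None when the packet has arrived or when no healthy minimal
    direction exists at this node.
    """
    usable = [
        d for d in minimal_directions(cur, dst)
        if (cur, d) not in faulty_links
    ]
    if not usable:
        return None
    return min(usable, key=lambda d: congestion.get(d, 0))
```

When both minimal directions are healthy, the choice falls to the congestion value, which mirrors the abstract's point that adaptivity is available while more than one hop remains along both dimensions.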
