Lasse Natvig

Norwegian University of Science and Technology

Publication

Featured research published by Lasse Natvig.

IEEE Transactions on Education | 2009

Experimental Validation of the Learning Effect for a Pedagogical Game on Computer Fundamentals

Guttorm Sindre; Lasse Natvig; Magnus Jahre

The question/answer-based computer game Age of Computers was introduced to replace traditional weekly paper exercises in a course in computer fundamentals in 2003. Questionnaire evaluations and observation of student behavior have indicated that the students found the game more motivating than paper exercises and that a majority of the students also perceived the game to have a higher learning effect than paper exercises or textbook reading. This paper reports on a controlled experiment to compare the learning effectiveness of game play with traditional paper exercises, as well as with textbook reading. The results indicated that with equal time being spent on the various learning activities, the effect of game play was only equal to that of the other activities, not better. Yet this result is promising enough, as the increased motivation means that students work harder in the course. Also, the results indicate that the game has potential for improvement, in particular with respect to its feedback on the more complicated questions.

European Conference on Parallel Processing | 2009

Towards an Intelligent Environment for Programming Multi-core Computing Systems

Sabri Pllana; Siegfried Benkner; Eduard Mehofer; Lasse Natvig; Fatos Xhafa

In this position paper we argue that an intelligent program development environment that proactively supports the user helps a mainstream programmer to overcome the difficulties of programming multi-core computing systems. We propose a programming environment based on intelligent software agents that enables users to work at a high level of abstraction while automating low-level implementation activities. The programming environment supports program composition in a model-driven development fashion using parallel building blocks and proactively assists the user during major phases of program development and performance tuning. We highlight the potential benefits of using such a programming environment with usage-scenarios. An experiment with a parallel building block on a Sun UltraSPARC T2 Plus processor shows how the system may assist the programmer in achieving performance improvements.

ACM SIGARCH Computer Architecture News | 2006

An LRU-based replacement algorithm augmented with frequency of access in shared chip-multiprocessor caches

Haakon Dybdahl; Per Stenström; Lasse Natvig

This paper proposes a new replacement algorithm to protect cache lines with potential future reuse from being evicted. In contrast to the recency based approaches used in the past (LRU for example), our algorithm also uses the notion of frequency of access. Instead of evicting the least recently used block, our algorithm identifies among a set of LRU blocks the one that is also least-frequently-used (according to a heuristic) and chooses that as a victim. We have implemented this replacement algorithm in a detailed simulation model of a chip multiprocessor system driven by SPEC2000 benchmarks. We have found that the new scheme improves performance for memory intensive applications. Moreover, as compared to other attempts, our replacement algorithm provides robust improvements across all benchmarks. We have also extended an earlier scheme proposed by Wong and Baer so it is switched off when performance is not improved. Our results show that this makes the scheme much more suitable for CMP configurations.
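
The victim selection described above can be illustrated with a minimal sketch. It assumes a single cache set, a plain per-line access counter as the frequency heuristic, and a hypothetical `candidate_window` parameter for the size of the LRU candidate set; the abstract does not specify any of these, so this is not the authors' implementation.

```python
from collections import OrderedDict


class LruFreqSet:
    """One cache set with LRU ordering plus a per-line access counter.

    Assumptions (not from the paper): the frequency heuristic is a plain
    access count, and `candidate_window` is a hypothetical parameter.
    """

    def __init__(self, ways, candidate_window):
        self.ways = ways
        self.candidate_window = candidate_window
        # tag -> access count; iteration order is LRU first, MRU last
        self.lines = OrderedDict()

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            self.lines[tag] += 1
            self.lines.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self._evict()
        self.lines[tag] = 1
        return False

    def _evict(self):
        # Consider only the least recently used lines as candidates.
        candidates = list(self.lines.items())[: self.candidate_window]
        # Among them, pick the least frequently used; min() keeps the first
        # minimum, so ties fall back to the least recently used line.
        victim, _ = min(candidates, key=lambda item: item[1])
        del self.lines[victim]
        return victim
```

With four ways and a window of two, a line that was touched three times early on survives an eviction that plain LRU would have applied to it, because a less frequently used line sits in the same candidate window.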

Technical Symposium on Computer Science Education | 2004

Age of computers: game-based teaching of computer fundamentals

Lasse Natvig; Steinar Line

Age of Computers (AoC) is a new approach to the learning activities that supplement the auditorium lectures in a computer fundamentals course with 250 students. It is a computer game that presents the students with a diverse set of problems from the course topics linked to computer history. It is implemented as a set of dynamic web pages retrieved from a database. A prototype was used in 2003, and the feedback is positive and a strong motivation for continuing the project. The paper describes AoC, its use and implementation.

Image and Vision Computing | 1997

High-level architectural simulation of the Torus Routing Chip

Lasse Natvig

This paper presents a simulation model of the Torus Routing Chip (TRC) written in Verilog. The model represents the functional behaviour of the routing chip down to the flit (byte) level. The TRCs are self-timed and interconnected in a 4 by 4 torus (mesh with wrap-around) having unidirectional channels along the x and y-dimension. To avoid deadlock situations, the TRC implements two virtual channels on every physical channel. The model is presented in a top down manner with emphasis on the modelling of the packet routing algorithm, asynchronous channels, controlled access to shared resources and the increased complexity caused by virtual channels. The testing of the model as well as experience from using Verilog to develop a high-level architectural simulation is discussed.
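
As a rough illustration of routing on such a torus, the sketch below computes a hop-by-hop path in Python rather than Verilog. It assumes dimension-order routing (x first, then y) over the unidirectional channels and a textbook virtual-channel rule (use the high channel while the destination coordinate is greater than the current one, otherwise the low channel). Neither the rule nor the `route` function is taken from the paper's model.

```python
SIZE = 4  # 4 by 4 torus, as in the abstract


def route(src, dst, size=SIZE):
    """Return the hops from src to dst as (dimension, virtual_channel, node).

    Assumptions (not confirmed by the abstract): x is routed before y, each
    hop moves +1 with wrap-around, and the virtual channel is 1 while the
    destination coordinate is greater than the current one, else 0.
    """
    x, y = src
    dest_x, dest_y = dst
    hops = []
    while x != dest_x:
        vc = 1 if dest_x > x else 0
        x = (x + 1) % size  # unidirectional channel with wrap-around
        hops.append(("x", vc, (x, y)))
    while y != dest_y:
        vc = 1 if dest_y > y else 0
        y = (y + 1) % size
        hops.append(("y", vc, (x, y)))
    return hops
```

Under this rule a packet that must cross the wrap-around link starts on the low channel and switches to the high channel after wrapping, which is what breaks the cyclic channel dependency in each ring.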

IEEE International Conference on High Performance Computing, Data, and Analytics | 2012

Improving Energy Efficiency through Parallelization and Vectorization on Intel Core i5 and i7 Processors

Juan M. Cebrian; Lasse Natvig; Jan Christian Meyer

Driven by the utilization wall and the Dark Silicon effect, energy efficiency has become a key research area in microprocessor design. Vectorization, parallelization, specialization and heterogeneity are the key design points to deal with the utilization wall. Heterogeneous architectures are enhanced with architectural optimizations, such as vectorization, to further increase the energy efficiency of the processor, reducing the number of instructions that go through the pipeline and leveraging the usage of the memory hierarchy. AMD® Fusion™ or Intel Core i5 and i7 are commercial examples of this new generation of microprocessors. Still, there is a question to be answered: How can software developers maximize energy efficiency of these architectures? In this paper, we evaluate the energy efficiency of different processors from the Intel Core i5 and i7 family, using selected benchmarks from the PARSEC suite with variable core counts and vectorization techniques to quantify energy efficiency under the Thermal Design Power (TDP). Results show that software developers should prioritize vectorization over parallelization whenever possible, as it is much better in terms of energy efficiency. When using vectorization and parallelization simultaneously, scalability of the application can be reduced drastically, and may require different development strategies to maximize resource utilization in order to increase energy efficiency. This is especially true in the server market, where we can find more than one processor per board. Finally, when comparing on-chip and “at the wall” energy savings, we can see variations from 5 to 20%, depending on the benchmark and system. This high variability shows the need to develop a more detailed model to predict system power based on on-chip power information.

Computing Frontiers | 2009

A light-weight fairness mechanism for chip multiprocessor memory systems

Magnus Jahre; Lasse Natvig

Chip Multiprocessor (CMP) memory systems suffer from the effects of destructive thread interference. This interference reduces performance predictability because it depends heavily on the memory access pattern and intensity of the co-scheduled threads. In this work, we confirm that all shared units must be thread-aware in order to provide memory system fairness. However, the current proposals for fair memory systems are complex as they require an interference measurement mechanism and a fairness enforcement policy for all hardware-controlled shared units. Furthermore, they often sacrifice system throughput to reach their fairness goals which is not desirable in all systems. In this work, we show that our novel fairness mechanism, called the Dynamic Miss Handling Architecture (DMHA), is able to reduce implementation complexity by using a single fairness enforcement policy for the complete hardware-managed shared memory system. Specifically, it controls the total miss bandwidth available to each thread by dynamically manipulating the number of Miss Status Holding Registers (MSHRs) available in each private data cache. When fairness is chosen as the metric of interest and we compare to a state-of-the-art fairness-aware memory system, DMHA improves fairness by 26% on average with the single program baseline. With a different configuration, DMHA improves throughput by 13% on average compared to a conventional memory system.
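
The throttling idea, limiting how many misses a private cache may have in flight, can be sketched as follows. The `MshrFile` class and its method names are hypothetical, and the policy that decides each thread's limit is left out because the abstract does not describe it; this only shows how lowering the usable MSHR count caps a thread's miss bandwidth.

```python
class MshrFile:
    """Miss Status Holding Registers of one private cache with a usable limit.

    Hypothetical sketch: names and structure are assumptions, and the policy
    that picks `limit` for each thread is not modelled here.
    """

    def __init__(self, physical_mshrs):
        self.physical_mshrs = physical_mshrs
        self.limit = physical_mshrs  # all registers usable by default
        self.outstanding = set()     # block addresses with a miss in flight

    def set_limit(self, limit):
        # Keep at least one register usable so the thread can make progress.
        self.limit = max(1, min(limit, self.physical_mshrs))

    def try_issue_miss(self, block_addr):
        """Return True if the miss can proceed, False if it must stall."""
        if block_addr in self.outstanding:
            return True  # secondary miss merges with the in-flight request
        if len(self.outstanding) >= self.limit:
            return False  # no usable register left: the miss is throttled
        self.outstanding.add(block_addr)
        return True

    def complete(self, block_addr):
        # The memory system returned the block; free its register.
        self.outstanding.discard(block_addr)
```

A thread whose limit is lowered from four to two can keep at most two distinct misses outstanding, so it places less pressure on the shared interconnect, cache and memory bus.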

2015 Sustainable Internet and ICT for Sustainability (SustainIT) | 2015

Cost-comfort balancing in a smart residential building with bidirectional energy trading

Abdullah Al Hasib; Nikita Nikitin; Lasse Natvig

The increasing integration of new technologies for power generation into the smart grid systems calls for novel demand response (DR) algorithms, which schedule the appliances to minimize cost and maximize comfort for the users. Traditionally, the formulations of the DR problem consider unidirectional energy flow from the power grid (energy supplier) to the user in a residential building (energy consumer). In this paper, we argue for an extended model of a smart residential building with bidirectional energy trading. This model allows the user to sell the surplus energy, obtained from local renewable energy sources, so as to partially recover the electricity cost. We further study an efficient linear model for appliance scheduling, under the assumption of bidirectional energy trading. To balance the user comfort and electricity cost, we introduce a comfort demand function based on declining block rates (DBR) and discuss the microeconomic meaning of this function. We evaluate our method with several case studies and analyze how energy selling and comfort demand affect the total cost and the schedule. We also show that our scheduler is fast enough to allow for nearly realtime scheduling adjustments ahead of each period, to minimize the impact of forecast deviations.

2013 International Green Computing Conference Proceedings | 2013

Temperature effects on on-chip energy measurements

Juan M. Cebrian; Lasse Natvig

The latest generations of microprocessors aim to maximize energy efficiency over other constraints such as area or performance. This shift does not only affect industry; academia is also focusing its efforts on proposing new solutions to maximize the energy efficiency of computations at all levels, from hardware to software. While voltage, frequency and activity factors are usually considered by researchers in their works, temperature is usually left aside in their calculations or models. In this work we present an evaluation of the effects of temperature in real hardware, taking advantage of the relatively new hardware energy counters available since the second generation of Intel® Core™ processors. Results show an average 5% energy increase with every 8-10 degrees Celsius increment. After a certain temperature (80-100 °C), the processor starts throttling its frequency to prevent permanent damage to the die. Based on these results we will discuss some “good practices” when evaluating power and energy based on on-chip energy measurements.

High Performance Computing and Communications | 2009

A Quantitative Study of Memory System Interference in Chip Multiprocessor Architectures

Magnus Jahre; Marius Grannæs; Lasse Natvig

The potential for destructive interference between running processes is increased as Chip Multiprocessors (CMPs) share more on-chip resources. We believe that understanding the nature of memory system interference is vital to achieve good fairness/complexity/performance trade-offs in CMPs. Our goal in this work is to quantify the latency penalties due to interference in all hardware-controlled, shared units (i.e. the on-chip interconnect, shared cache and memory bus). To achieve this, we simulate a wide variety of realistic CMP architectures. In particular, we vary the number of cores, interconnect topology, shared cache size and off-chip memory bandwidth. We observe that interference in the off-chip memory bus accounts for between 63% and 87% of the total interference impact while the impact of cache capacity interference can be lower than indicated by previous studies (between 5% and 32% of the total impact). In addition, as much as 11% of the total impact can be due to uncontrolled allocation of shared cache Miss Status Holding Registers (MSHRs).
