Theo Ungerer
University of Augsburg
Publication
Featured research published by Theo Ungerer.
ACM Computing Surveys | 2003
Theo Ungerer; Borut Robič; Jurij Šilc
Hardware multithreading is becoming a generally applied technique in the next generation of microprocessors. Several multithreaded processors are announced by industry or already in production in the areas of high-performance microprocessors, media, and network processors. A multithreaded processor is able to pursue two or more threads of control in parallel within the processor pipeline. The contexts of two or more threads of control are often stored in separate on-chip register sets. Unused instruction slots, which arise from latencies during the pipelined execution of single-threaded programs by a contemporary microprocessor, are filled by instructions of other threads within a multithreaded processor. The execution units are multiplexed between the thread contexts that are loaded in the register sets. Underutilization of a superscalar processor due to missing instruction-level parallelism can be overcome by simultaneous multithreading, where a processor can issue multiple instructions from multiple threads each cycle. Simultaneous multithreaded processors combine the multithreading technique with a wide-issue superscalar processor to utilize a larger part of the issue bandwidth by issuing instructions from different threads simultaneously. Explicit multithreaded processors are multithreaded processors that apply processes or operating system threads in their hardware thread slots. These processors optimize the throughput of multiprogramming workloads rather than single-thread performance. We distinguish these processors from implicit multithreaded processors that utilize thread-level speculation by speculatively executing compiler- or machine-generated threads of control that are part of a single sequential program. This survey paper explains and classifies the explicit multithreading techniques in research and in commercial microprocessors.
international symposium on microarchitecture | 2010
Theo Ungerer; Francisco J. Cazorla; Pascal Sainrat; Guillem Bernat; Zlatko Petrov; Christine Rochange; Eduardo Quiñones; Mike Gerdes; Marco Paolieri; Julian Wolf; Hugues Cassé; Sascha Uhrig; Irakli Guliashvili; Michael Houston; Florian Kluge; Stefan Metzlaff; Jörg Mische
The Merasa project aims to achieve a breakthrough in hardware design, hard real-time support in system software, and worst-case execution time analysis tools for embedded multicore processors. The project focuses on developing multicore processor designs for hard real-time embedded systems and techniques to guarantee the analyzability and timing predictability of every feature provided by the processor.
Microprocessors and Microsystems | 2003
Jochen Kreuzinger; Uwe Brinkschulte; Matthias Pfeffer; Sascha Uhrig; Theo Ungerer
Our aim is to investigate the suitability of hardware multithreading for real-time event handling in combination with appropriate real-time scheduling techniques. We designed and evaluated a multithreaded microcontroller based on a Java processor core. Java threads are used as Interrupt Service Threads (ISTs) instead of the Interrupt Service Routines (ISRs) of conventional processors. Our proposed Komodo microcontroller supports multiple ISTs with zero-cycle context switching overhead. A so-called priority manager implements several real-time scheduling algorithms in hardware. We show the feasibility of a hardware real-time scheduler integrated deeply into the processor pipeline with a VHDL design and its synthesis. Evaluations with a software simulator and real-time applications as benchmarks show that hardware multithreading reaches a 1.2–1.4 performance increase for hard real-time applications (multithreading without latency utilization) and a 2.0–2.6 speedup by latency utilization for programs without hard real-time requirements. With respect to real-time scheduling on a multithreaded microcontroller, the Least Laxity First (LLF) scheme outperforms the Fixed Priority Preemptive (FPP), Earliest Deadline First (EDF), and Guaranteed Percentage (GP) schemes, but suffers from the highest implementation costs.
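The selection rules named in this abstract can be illustrated with a minimal software sketch. It is not the Komodo priority manager, which is implemented in hardware; the `Thread` fields, the convention that a larger `priority` value wins, and the example thread set are all assumptions made for illustration. Laxity is taken in its usual sense: time left until the deadline minus remaining execution time.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    # Hypothetical thread descriptor; field names are illustrative only.
    name: str
    priority: int   # assumption: larger value = more important
    deadline: int   # absolute deadline, in cycles
    remaining: int  # remaining execution time, in cycles


def laxity(t: Thread, now: int) -> int:
    """Slack before the thread must run continuously to meet its deadline."""
    return t.deadline - now - t.remaining


def pick_fpp(threads):
    """Fixed Priority Preemptive: highest static priority wins."""
    return max(threads, key=lambda t: t.priority)


def pick_edf(threads):
    """Earliest Deadline First: nearest absolute deadline wins."""
    return min(threads, key=lambda t: t.deadline)


def pick_llf(threads, now):
    """Least Laxity First: smallest slack wins."""
    return min(threads, key=lambda t: laxity(t, now))


threads = [
    Thread("sensor", priority=3, deadline=50, remaining=10),
    Thread("motor", priority=2, deadline=40, remaining=5),
    Thread("logger", priority=1, deadline=60, remaining=45),
]

print(pick_fpp(threads).name)       # sensor
print(pick_edf(threads).name)       # motor
print(pick_llf(threads, now=0).name)  # logger
```

With this thread set the three policies each pick a different thread. LLF has to recompute laxity as time advances, which hints at why it is the most expensive of the three to implement.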
international conference on sensor technologies and applications | 2009
Faruk Bagci; Theo Ungerer; Nader Bagherzadeh
In recent years, the potential range of applications for sensor networks has been expanding. Their use has been considered for safety-critical areas such as hospitals or power plants, so security is coming more to the fore. This paper presents SecSens, an architecture that provides basic security components for wireless sensor networks. Since robust and strong security features require powerful nodes, SecSens uses a heterogeneous sensor network. In addition to a large number of simple (cheap) sensor nodes providing the actual sensor tasks, there are a few powerful nodes (cluster nodes) that implement the required security features. The basic component of SecSens offers authenticated broadcasts to allow recipients to authenticate the sender of a message. To protect the sensor network against routing attacks, SecSens includes a probabilistic multi-path routing protocol, which supports the key management and the authenticated broadcasts. SecSens also provides functions to detect forged sensor data by verifying data reports en-route. SecSens is successfully evaluated in a real test environment with two different kinds of sensor boards.
location and context awareness | 2005
Jan Petzold; Andreas Pietzowski; Faruk Bagci; Wolfgang Trumler; Theo Ungerer
This paper investigates the efficiency of in-door next location prediction by comparing several prediction methods. The scenario concerns people in an office building visiting offices in a regular fashion over some period of time. We model the scenario by a dynamic Bayesian network and evaluate accuracy of next room prediction and of duration of stay, training and retraining performance, as well as memory and performance requirements of a Bayesian network predictor. The results are compared with further context predictor approaches – a state predictor and a multi-layer perceptron predictor using exactly the same evaluation set-up and benchmarks. The publicly available Augsburg Indoor Location Tracking Benchmarks are applied as predictor loads. Our results show that the Bayesian network predictor reaches a next location prediction accuracy of up to 90% and a duration prediction accuracy of up to 87% with variations depending on the person and specific predictor set-up. The Bayesian network predictor performs in the same accuracy range as the neural network and the state predictor.
Microprocessors and Microsystems | 2014
Roberto Giorgi; Rosa M. Badia; François Bodin; Albert Cohen; Paraskevas Evripidou; Paolo Faraboschi; Bernhard Fechner; Guang R. Gao; Arne Garbade; Rahulkumar Gayatri; Sylvain Girbal; Daniel Goodman; Behram Khan; Souad Koliai; Joshua Landwehr; Nhat Minh Lê; Feng Li; Mikel Luján; Avi Mendelson; Laurent Morin; Nacho Navarro; Tomasz Patejko; Antoniu Pop; Pedro Trancoso; Theo Ungerer; Ian Watson; Sebastian Weis; Stéphane Zuckerman; Mateo Valero
The improvements in semiconductor technologies are gradually enabling extreme-scale systems such as teradevices (i.e., chips composed of 1000 billion transistors), most likely by 2020. Three major challenges have been identified: programmability, manageable architecture design, and reliability. TERAFLUX is a Future and Emerging Technology (FET) large-scale project funded by the European Union, which addresses these challenges at once by leveraging dataflow principles. This paper presents an overview of the research carried out by the TERAFLUX partners and some preliminary results. Our platform comprises 1000+ general purpose cores per chip in order to properly explore the above challenges. An architectural template has been proposed and applications have been ported to the platform. Programming models, compilation tools, and reliability techniques have been developed. The evaluation is carried out by leveraging modifications of the HP-Labs COTSon simulator.
embedded and real-time computing systems and applications | 2000
Jochen Kreuzinger; A. Schulz; Matthias Pfeffer; Theo Ungerer; Uwe Brinkschulte; C. Krakowski
This paper investigates real-time scheduling algorithms on upcoming multithreaded processors. As an evaluation testbed, we introduce a multithreaded processor kernel which is specifically designed as the core processor of a micro-controller or system-on-a-chip. Handling of external real-time events is performed through multithreading. Real-time threads are used as interrupt service threads (ISTs) instead of interrupt service routines (ISRs). Our proposed micro-controller supports multiple ISTs with zero-cycle context switching overhead. We investigate the behavior of fixed priority preemptive (FPP), earliest deadline first (EDF), least laxity first (LLF) and guaranteed percentage (GP) scheduling with respect to multithreaded processors. Our finding is that the strategies GP and LLF result in a good blending of instructions of different threads, thus enabling a multithreaded processor to utilize latencies best. Assuming a zero-cycle context switch, LLF performs best; however, its implementation costs are prohibitive.
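The abstract does not spell out how guaranteed percentage scheduling works, so the following is only a sketch under a stated assumption: each thread is promised a fixed percentage of the cycles in every scheduling interval, and threads with budget left are served in round-robin order. The function name `gp_schedule`, the 100-cycle interval, and the example shares are hypothetical.

```python
def gp_schedule(shares, interval=100):
    """Return which thread issues in each cycle of one interval.

    shares maps thread name -> guaranteed percentage of the interval
    (assumption: percentages sum to at most 100).
    Cycles with no budgeted thread are recorded as None (idle).
    """
    if sum(shares.values()) > 100:
        raise ValueError("guaranteed percentages exceed 100")

    # Per-interval cycle budget for every thread.
    budget = {name: pct * interval // 100 for name, pct in shares.items()}
    order = list(shares)
    timeline = []
    start = 0  # round-robin position

    for _ in range(interval):
        chosen = None
        # Scan once around the ring for a thread with budget left.
        for k in range(len(order)):
            name = order[(start + k) % len(order)]
            if budget[name] > 0:
                chosen = name
                start = (start + k + 1) % len(order)
                break
        if chosen is not None:
            budget[chosen] -= 1
        timeline.append(chosen)
    return timeline


timeline = gp_schedule({"a": 50, "b": 30, "c": 10})
print(timeline[:6])  # ['a', 'b', 'c', 'a', 'b', 'c']
```

Rotating among budgeted threads cycle by cycle interleaves their instructions, which is one way to picture the "good blending" the abstract attributes to GP.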
international conference on parallel processing | 2006
Jan Petzold; Faruk Bagci; Wolfgang Trumler; Theo Ungerer
Next location prediction anticipates a person's movement based on the history of previous sojourns. It is useful for proactive actions taken to assist the person in a ubiquitous environment. This paper evaluates next location prediction methods: dynamic Bayesian network, multi-layer perceptron, Elman net, Markov predictor, and state predictor. For the Markov and state predictors we additionally use an optimization, the confidence counter. The criteria for the comparison are the prediction accuracy, the quantity of useful predictions, the stability, the learning, the relearning, the memory and computing costs, the modelling costs, the expandability, and the ability to predict the time of entering the next location. For evaluation we use the same benchmarks containing movement sequences of real persons within an office building.
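As a rough illustration of one of the compared methods, the sketch below shows a first-order Markov predictor with a confidence counter. It is not the authors' implementation: the class name, the saturating-counter range, and the threshold are assumptions. The predictor proposes the most frequent successor of the current room, and the counter suppresses the prediction until recent guesses for that room have been correct often enough.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """First-order Markov predictor with a per-location confidence counter."""

    def __init__(self, max_conf=3, threshold=2):
        self.transitions = defaultdict(Counter)  # room -> successor counts
        self.confidence = defaultdict(int)       # room -> saturating counter
        self.max_conf = max_conf                 # assumed upper bound
        self.threshold = threshold               # assumed confidence cutoff

    def _raw_guess(self, current):
        followers = self.transitions.get(current)
        if not followers:
            return None
        return followers.most_common(1)[0][0]

    def predict(self, current):
        """Return the next room, or None if unknown or not confident."""
        guess = self._raw_guess(current)
        if guess is None or self.confidence[current] < self.threshold:
            return None
        return guess

    def update(self, current, actual_next):
        """Score the previous guess, then learn the observed transition."""
        guess = self._raw_guess(current)
        if guess is not None:
            step = 1 if guess == actual_next else -1
            value = self.confidence[current] + step
            self.confidence[current] = max(0, min(self.max_conf, value))
        self.transitions[current][actual_next] += 1


predictor = MarkovPredictor()
path = ["office", "kitchen", "office", "kitchen", "office", "kitchen", "office"]
for here, there in zip(path, path[1:]):
    predictor.update(here, there)

print(predictor.predict("office"))  # kitchen
print(predictor.predict("lab"))     # None
```

Withholding low-confidence predictions trades the quantity of predictions for accuracy, which matches two of the comparison criteria listed in the abstract.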
international conference on autonomic computing | 2004
Wolfgang Trumler; Jan Petzold; Faruk Bagci; Theo Ungerer
We envision future office buildings that partly or fully implement a flexible office organization. These organizational principles save office space, but require a sophisticated software system that is highly dynamic, scalable, context-aware, self-configuring, self-optimizing and self-healing. We therefore propose an autonomic middleware approach for ubiquitous in-door environments.
digital systems design | 2013
Theo Ungerer; Christian Bradatsch; Mike Gerdes; Florian Kluge; Ralf Jahr; Jörg Mische; Joao Fernandes; Pavel G. Zaykov; Zlatko Petrov; Bert Böddeker; Sebastian Kehr; Hans Regler; Andreas Hugl; Christine Rochange; Haluk Ozaktas; Hugues Cassé; Armelle Bonenfant; Pascal Sainrat; Ian Broster; Nick Lay; David George; Eduardo Quiñones; Miloš Panić; Jaume Abella; Francisco J. Cazorla; Sascha Uhrig; Mathias Rohde; Arthur Pyka
Engineers who design hard real-time embedded systems express a need for several times the performance available today while keeping safety as a major criterion. A breakthrough in performance is expected by parallelizing hard real-time applications and running them on an embedded multi-core processor, which enables combining the requirements for high performance with timing-predictable execution. parMERASA will provide a timing-analyzable system of parallel hard real-time applications running on a scalable multicore processor. parMERASA goes one step beyond mixed criticality demands: It targets future complex control algorithms by parallelizing hard real-time programs to run on predictable multi-/many-core processors. We aim to achieve a breakthrough in techniques for parallelization of industrial hard real-time programs, provide hard real-time support in system software, WCET analysis and verification tools for multi-cores, and techniques for predictable multi-core designs with up to 64 cores.