Jan Moritz Joseph
Otto-von-Guericke University Magdeburg
Publication
Featured research published by Jan Moritz Joseph.
international symposium on system-on-chip | 2014
Jan Moritz Joseph; Thilo Pionteck
This paper presents the design of a Network-on-Chip (NoC) simulator for design space exploration of router architectures. The simulator supports cycle-accurate router models and, in addition, allows the simulation of router architectures that can adjust their processing according to the traffic type. Realistic traffic patterns are derived from task graph models of real-world applications that are simulated in parallel to the NoC at transaction level. Combining cycle-accurate router simulation and abstract task graph simulation circumvents the limitations of most NoC simulators, which use either synthetic traffic patterns or unrealistic and fixed router models. The proposed simulator architecture is presented in detail and its suitability is shown by means of a case study.
reconfigurable computing and fpgas | 2015
Christopher Blochwitz; Jan Moritz Joseph; Rico Backasch; Thilo Pionteck; Stefan Werner; Dennis Heinrich; Sven Groppe
In this paper, a data structure and a hardware acceleration for dictionary generation for Semantic Web databases are presented. Current hardware accelerators for databases are based on co-processor designs supporting software-centric applications: only single, selected operations of query processing are offloaded to the FPGA with time-consuming data transfers. In contrast, we propose a novel FPGA-centric design, which creates and manages specialized database structures. As part of the design, a scalable and parallel architecture for dictionary generation is introduced. We propose optimizations for Radix-Trees, which are designed to exploit characteristics of FPGA structures. Furthermore, the tree is parameterizable, which enables the adaptation of properties to the specific characteristics of the generated data structure. The configuration influences memory and logic utilization. Optimal parameters are determined by simulative evaluation using existing Semantic Web input data sets. The proposed hardware design is integrated into an existing Semantic Web database system and the results are analyzed with a focus on utilization and throughput. The required memory of the optimized Radix-Tree is reduced by 94% and a speed-up of 70% is achieved.
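As a rough software illustration of what dictionary generation means here, the sketch below maps each distinct string to an integer ID by walking a prefix tree. It is not the authors' FPGA design: it uses an uncompressed character-level trie rather than the optimized, parameterizable Radix-Tree, and the class name `TrieDictionary` and its methods are hypothetical names chosen for this example.

```python
class TrieDictionary:
    """Maps strings to integer IDs using an uncompressed prefix tree.

    Hypothetical illustration only; the paper's Radix-Tree compresses
    paths and is tuned for FPGA memory structures.
    """

    def __init__(self):
        self.root = {}          # node: dict from character to child node
        self.id_to_string = []  # reverse mapping, index = ID

    def encode(self, term):
        """Return the ID of `term`, assigning a new one on first sight."""
        node = self.root
        for ch in term:
            node = node.setdefault(ch, {})
        # The reserved key None holds the ID of a term ending at this node.
        if None not in node:
            node[None] = len(self.id_to_string)
            self.id_to_string.append(term)
        return node[None]

    def decode(self, term_id):
        """Return the string that was assigned `term_id`."""
        return self.id_to_string[term_id]


dictionary = TrieDictionary()
ids = [dictionary.encode(t) for t in ["ex:alice", "ex:knows", "ex:bob", "ex:alice"]]
```

Strings with a common prefix, which are frequent in Semantic Web identifiers, share the nodes for that prefix, and a repeated term is resolved to its existing ID by a single walk down the tree.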
reconfigurable communication centric systems on chip | 2016
Jan Moritz Joseph; Sven Wrieden; Christopher Blochwitz; Alberto Garcia-Ortiz; Thilo Pionteck
We present a comprehensive simulation environment for design space exploration in Asymmetric 3D-Networks-on-Chip (A-3D-NoCs) covering the heterogeneity in 3D-Systems-on-Chip (3D-SoCs). A challenging aspect of A-3D-NoC design is the consideration of interwoven parameters of the communication infrastructure and characteristics of the manufacturing technologies. Thus, simultaneous evaluation of multiple design metrics is mandatory. Our simulation environment consists of three parts. First, it comprises a NoC simulator that supports a multitude of different manufacturing technologies, router architectures, and network topologies within a single design. As a key feature, the NoC and technology parameters per chip layer are fully configurable during simulation runtime, permitting flexible and fast evaluation. Second, a central reporting tool facilitates system analysis on different abstraction levels. Third, the evaluation tool provides various synthetic and real-world based benchmarks. Thus, our tool allows for an incremental approach to systematically explore the A-3D-NoC design space.
reconfigurable communication centric systems on chip | 2017
Jan Moritz Joseph; Lennart Bamberg; Sven Wrieden; Dominik Ermel; Alberto Garcia-Ortiz; Thilo Pionteck
New 3D production methods enable heterogeneous integration of dies manufactured in different technology nodes. Asymmetric 3D interconnect architectures (A-3D-IAs) are the communication infrastructure targeting these heterogeneous 3D systems on chip (3D SoCs), for which design methodologies and design tools are still missing. Here, a design method is proposed following an incremental approach enabled by high-level models. To this end, we present the first simulator and design framework covering the diverse requirements of A-3D-IAs. This includes an abstract model to estimate the application-specific energy consumption of 2D metal wires and 3D through-silicon vias (TSVs) in an A-3D-IA. It is validated by circuit simulations in combination with an electromagnetic field solver, which is used for the extraction of the TSV array equivalent circuit. The model operates at a high abstraction level for fast simulations. Nonetheless, for real data stream scenarios it still shows a small maximum error of less than 8%. Additionally, a mathematical description is presented which enables a fast evaluation of low-power coding schemes for A-3D-IAs at a high level of abstraction.
automation, robotics and control systems | 2017
Christopher Blochwitz; Julian Wolff; Jan Moritz Joseph; Stefan Werner; Dennis Heinrich; Sven Groppe; Thilo Pionteck
In this paper, a scalable hardware architecture for string sorting in the application field of Big Data is presented. Current hardware architectures focus on the acceleration of sorting small sets of data with a maximum string length. In contrast, we propose an FPGA-accelerated architecture based on Radix-Trees, which has the ability to sort large sets of strings without practical limitation of the string length. The Radix-Tree is parameterizable and so is the design, which enables the adaptation for application-specific properties, such as diversity of strings and size of the used alphabet. The scalable design has a hierarchical processing and memory architecture, which operate in parallel. Optimal parameters and configurations are evaluated by using a dataset of the Semantic Web, as an example of Big Data applications. The results are analyzed with a focus on throughput, memory requirement, and utilization. The hardware design is faster for all values of the radix parameter and achieves a maximum speed-up factor of 2.78 compared to a software system.
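The reason a prefix tree can sort strings is that inserting every string and then visiting the children of each node in alphabet order yields the strings in lexicographic order. The following is a minimal software sketch of that idea, not the FPGA architecture from the paper: it uses an uncompressed trie instead of a parameterizable Radix-Tree, and the function name `trie_sort` is a hypothetical name for this example.

```python
def trie_sort(strings):
    """Sort strings lexicographically via insertion into a prefix tree.

    Hypothetical illustration; duplicates are kept by counting how many
    strings end at each node.
    """
    root = {}
    for s in strings:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
        # The reserved key None counts the strings ending at this node.
        node[None] = node.get(None, 0) + 1

    result = []
    stack = [(root, "")]  # explicit stack, so long strings cannot exhaust recursion
    while stack:
        node, prefix = stack.pop()
        # A string ending here precedes every longer string with this prefix.
        result.extend([prefix] * node.get(None, 0))
        # Push children in reverse order so the smallest character is popped first.
        for ch in sorted((k for k in node if k is not None), reverse=True):
            stack.append((node[ch], prefix + ch))
    return result


ordered = trie_sort(["pear", "apple", "app", "pear", "apricot"])
```

The traversal uses an explicit stack instead of recursion so that the depth of the tree, and therefore the string length, is bounded only by available memory.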
Microprocessors and Microsystems | 2017
Jan Moritz Joseph; Christopher Blochwitz; Alberto Garcia-Ortiz; Thilo Pionteck
In this paper we investigate the effects of asymmetric organization and depths of Network-on-Chip (NoC) router buffers among dies in heterogeneous 3D-System-on-Chips (SoCs). In our novel approach the properties of the routers are aligned with the characteristics of the technology nodes per layer. We call these designs Asymmetric 3D-NoCs (A-3D-NoCs). In this work we demonstrate the potential of A-3D-NoCs in comparison to a conventional, symmetric 3D-NoC: applying asymmetric buffer reorganization, we achieve area savings of 8.3% and power savings of 5.4% for link buffers while accepting a minor average system performance loss of 2.1%. With additional asymmetry in buffer depth, up to 28% cost savings and 15% power reduction are achieved in combination with a 4.6% performance decline. Thus, the proposed buffer organization scheme is applicable for cost- and power-critical applications of NoCs in heterogeneous 3D-SoCs.
reconfigurable computing and fpgas | 2016
Jan Moritz Joseph; Tobias Winker; Kristian Ehlers; Christopher Blochwitz; Thilo Pionteck
The focus of this work is to facilitate pose estimation and, thus, gesture recognition for embedded systems, although these are tasks with high computational performance requirements. Therefore, an existing pose estimation algorithm is optimized for Xilinx High Level Synthesis (HLS). The resulting hardware acceleration cores are compared for different optimizations and, finally, we propose a hardware/software system design for a Xilinx Zynq Zedboard. Using this method, we achieve a speedup of 1.6 in comparison to a software solution on the ARM processor and, thus, facilitate hand tracking for embedded systems with low power consumption.
international conference on high performance computing and simulation | 2016
Jan Moritz Joseph; Christopher Blochwitz; Thilo Pionteck
We present a novel prioritization technique to reduce latencies in Networks-on-Chip. For individual routers, we adaptively allocate default paths assuming that subsequent packets are part of a data stream and, thus, routing decisions are identical. Since proactive routing to an output port is performed, the conventional router pipeline is partly bypassed. The method is deterministic and non-speculative, relies on local and autonomous decisions, retains the standard network load, and does not penalize non-prioritized links. Virtual point-to-point connections emerge, which span multiple hops and accelerate interleaved streams. We achieve an average packet latency reduction of 4.8% to 12.2% in simulations for PARSEC benchmarks.
reconfigurable communication centric systems on chip | 2018
Jan Moritz Joseph; Lennart Bamberg; Gerald Krell; Imad Hajjar; Alberto Garcia-Ortiz; Thilo Pionteck
power and timing modeling, optimization and simulation | 2018
Lennart Bamberg; Jan Moritz Joseph; Robert Schmidt; Thilo Pionteck; Alberto Garcia-Ortiz