Publication


Featured research published by Christopher Jung.


2014 IEEE/ACM International Symposium on Big Data Computing | 2014

Monitoring Data Streams at Process Level in Scientific Big Data Batch Clusters

Eileen Kuehn; Max Fischer; Christopher Jung; Andreas Petzold; Achim Streit

The operation of scientific big data centres requires overall monitoring and awareness of system components. Insight into internal and external network traffic is of high importance for understanding specific data flows regarding storage accesses, firewall configurations, and the scheduling of batch jobs on clusters for computing/analysis of data. However, the wide adoption of federated storage, the handling of numerous jobs on many-core nodes, and the execution of job pilots inside the batch system complicate current data stream monitoring attempts. Therefore, the rising complexity requires new approaches to extend available solutions. As existing batch system monitoring and related system monitoring tools do not support measurements at batch job level, a new tool has been developed and put into operation at the GridKa data and computing centre at KIT for monitoring continuous data streams. The results obtained can, for example, be used to optimise LAN/WAN setups based on measured data flows so that they adapt to actual needs. This paper describes the current approach being implemented at the GridKa batch cluster and presents first analysis results showing the significance of measurements. The described approach is subsequently applied to the context of computing for high-energy physics.


Journal of Physics: Conference Series | 2015

Analyzing data flows of WLCG jobs at batch job level

Eileen Kuehn; Max Fischer; Manuel Giffels; Christopher Jung; Andreas Petzold

With the introduction of federated data access to the workflows of WLCG, it is becoming increasingly important for data centers to understand specific data flows regarding storage element accesses, firewall configurations, as well as the scheduling of batch jobs themselves. As existing batch system monitoring and related system monitoring tools do not support measurements at batch job level, a new tool has been developed and put into operation at the GridKa Tier 1 center for monitoring continuous data streams and characteristics of WLCG jobs and pilots. Long-term measurements and data collection are in progress. These measurements have already proven useful for analyzing misbehavior and various issues. Therefore, we aim for an automated, real-time approach for anomaly detection. As a requirement, prototypes for standard workflows have to be examined. Based on measurements of several months, different features of HEP jobs are evaluated regarding their effectiveness for data mining approaches to identify these common workflows. The paper will introduce the actual measurement approach and statistics as well as the general concept and first results classifying different HEP job workflows derived from the measurements at GridKa.


Journal of Physics: Conference Series | 2015

Tier 3 batch system data locality via managed caches

Max Fischer; Manuel Giffels; Christopher Jung; Eileen Kühn; Gunter Quast

Modern data processing increasingly relies on data locality for performance and scalability, whereas the common HEP approaches aim for uniform resource pools with minimal locality, recently even across site boundaries. To combine advantages of both, the High-Performance Data Analysis (HPDA) Tier 3 concept opportunistically establishes data locality via coordinated caches. In accordance with HEP Tier 3 activities, the design incorporates two major assumptions: First, only a fraction of data is accessed regularly and is thus the deciding factor for overall throughput. Second, data access may fall back to non-local sources, making permanent local data availability an inefficient resource usage strategy. Based on this, the HPDA design generically extends available storage hierarchies into the batch system. Using the batch system itself for scheduling file locality, an array of independent caches on the worker nodes is dynamically populated with high-profile data. Cache state information is exposed to the batch system both for managing caches and scheduling jobs. As a result, users directly work with a regular, adequately sized storage system. However, their automated batch processes are presented with local replicas of data whenever possible.


Global Engineering Education Conference | 2015

GridKa school - Teaching information technologies since 2003

Melanie Ernst; Thomas Hartmann; Andreas Heiss; Christopher Jung; Joerg Meyer; Dimitri Nilsen; Andreas Petzold; Christoph-Erdmann Pfeiler; Ingrid Schaeffner; Jie Tao; Pavel Weber

GridKa School is an annual international computing school hosted by one of the largest scientific data centers in Europe. Using its computational resources and expert knowledge for educational purposes, GridKa School provides excellent opportunities to gain skills in the deployment and application of advanced software tools and techniques. This paper describes the unique concept of the school, its educational model, the continuous development of its curriculum and its successful organizational structure. It discusses the benefits and challenges of the chosen educational approach, which has emerged from many years of teaching experience at GridKa School.


Journal of Physics: Conference Series | 2015

Active Job Monitoring in Pilots

Eileen Kuehn; Max Fischer; Manuel Giffels; Christopher Jung; Andreas Petzold

Recent developments in high energy physics (HEP) including multi-core jobs and multi-core pilots require data centres to gain a deep understanding of the system to monitor, design, and upgrade computing clusters. Networking is a critical component. In particular, the increased usage of data federations, for example in diskless computing centres or as a fallback solution, relies on WAN connectivity and availability. The specific demands of different experiments and communities, as well as the need to identify misbehaving batch jobs, require active monitoring. Existing monitoring tools are not capable of measuring fine-grained information at batch job level. This complicates network-aware scheduling and optimisations. In addition, pilots add another layer of abstraction. They behave like batch systems themselves by managing and executing payloads of jobs internally. The number of real jobs being executed is unknown, as the original batch system has no access to internal information about the scheduling process inside the pilots. Therefore, the comparability of jobs and pilots for predicting run-time behaviour or network performance cannot be ensured. Hence, identifying the actual payload is important. At the GridKa Tier 1 centre a specific tool is in use that allows the monitoring of network traffic information at batch job level. This contribution presents the current monitoring approach and discusses recent efforts towards, and the importance of, identifying pilots and their substructures inside the batch system. It will also show how to determine monitoring data of specific jobs from identified pilots. Finally, the approach is evaluated.


(EGICF12-EMITC2) 032, EGI Community Forum 2012 / EMI Second Technical Conference, 26-30 March 2012, Munich, Germany | 2012

Experiment representation at the WLCG Tier-1 center GridKa

Christopher Jung; Andreas Petzold; Marian Zvada

The GridKa Computing Center at the Karlsruhe Institute of Technology is one of the biggest Tier-1 centers for the Worldwide LHC Computing Grid (WLCG) and one of the major resource providers in the National Grid Initiative of Germany (NGI-DE). In 2010, three positions of local Virtual Organization (VO) representatives for the ALICE, ATLAS, and CMS experiments were established. The representatives’ duties are to represent both their respective LHC experiment at GridKa and GridKa within their respective LHC experiment. This presentation will focus on the representatives’ technical and communication tasks and the experience they have gained.


Journal of Physics: Conference Series | 2014

Optimization of data life cycles

Christopher Jung; M Gasthuber; André Giesler; Marcus Hardt; Jörg Meyer; F Rigoll; K Schwarz; Rainer Stotzka; Achim Streit


arXiv: Digital Libraries | 2012

Data Life Cycle Labs, A New Concept to Support Data-Intensive Science

Jos van Wezel; Achim Streit; Christopher Jung; Rainer Stotzka; Silke Halstenberg; Fabian Rigoll; Ariel Garcia; Andreas Heiss; Kilian Schwarz; Martin Gasthuber; André Giesler


21st International Conference on Computing in High Energy and Nuclear Physics | 2015

Progress in Multi-Disciplinary Data Life Cycle Management

Christopher Jung; M Gasthuber; André Giesler; Marcus Hardt; Jörg Meyer; A Prabhune; F Rigoll; K Schwarz; Achim Streit


Archive | 2014

Large-Scale Data Management and Analysis (LSDMA) - Big Data in Science

Christopher Jung; Achim Streit

Collaboration


Dive into Christopher Jung's collaborations.

Top Co-Authors

Andreas Petzold

Karlsruhe Institute of Technology

Achim Streit

Karlsruhe Institute of Technology

Max Fischer

Karlsruhe Institute of Technology

André Giesler

Forschungszentrum Jülich

Eileen Kuehn

Karlsruhe Institute of Technology

Andreas Heiss

Karlsruhe Institute of Technology

F Rigoll

Karlsruhe Institute of Technology

Jörg Meyer

Karlsruhe Institute of Technology

K Schwarz

GSI Helmholtz Centre for Heavy Ion Research

Marcus Hardt

Karlsruhe Institute of Technology
