Publication


Featured research published by Tomislav Janjusic.


ACM Sigarch Computer Architecture News | 2013

Gleipnir: a memory profiling and tracing tool

Tomislav Janjusic; Krishna M. Kavi

In this article we describe a memory tracing and profiling tool called Gleipnir. Gleipnir is a plug-in tool for Valgrind, a widely used binary instrumentation framework. Gleipnir's ability to collect fine-grained memory traces and associate each access with source-level data structures and elements of these structures makes it a good candidate tool for advanced memory analysis and for studying complex memory access patterns. The data provided by Gleipnir may be used by cache simulators to analyze accesses to data structure elements and understand the dynamic memory behavior of programs. The goal of Gleipnir is to aid the programmer in refactoring data and code. In addition to Gleipnir we introduce a cache simulation tool, Gl cSim. Gl cSim is an extension to DineroIV (a uni-processor simulator) that tracks Gleipnir-provided trace and debug information.


international conference on conceptual structures | 2014

Toward Better Understanding of the Community Land Model within the Earth System Modeling Framework

Dali Wang; Joseph Schuchart; Tomislav Janjusic; Frank Winkler; Yang Xu; Christos Kartsaklis

One key factor in the improved understanding of earth system science is the development and improvement of high fidelity models. Along with the deeper understanding of biogeophysical and biogeochemical processes, the software complexity of those earth system models becomes a barrier for further rapid model improvements and validation. In this paper, we present our experience on better understanding the Community Land Model (CLM) within an earth system modelling framework. First, we give an overview of the software system of the global offline CLM simulation. Second, we present our approach to better understand the CLM software structure and data structure using advanced software tools. After that, we focus on the practical issues related to CLM computational performance and individual ecosystem function. Since better software engineering practices are much needed for general scientific software systems, we hope those considerations can be beneficial to many other modeling research programs involving multiscale system dynamics.


ieee international conference on high performance computing data and analytics | 2015

A scientific function test framework for modular environmental model development: application to the community land model

Dali Wang; Tomislav Janjusic; Colleen M. Iversen; Peter E. Thornton; Misha Karssovski; Wei Wu; Yang Xu

As environmental models have become more complicated, we need new tools to analyze and validate these models and to facilitate collaboration among field scientists, observation dataset providers, environmental system modelers, and computer scientists. Modular design and function test of environmental models have gained attention recently within the Biological and Environmental Research Program of the U.S. Department of Energy. In this paper, we will present our methods and software tools 1) to analyze environmental software and 2) to generate modules for scientific function testing of environmental models. We have applied these methods to the Community Land Model with three typical scenarios: 1) benchmark case function validation, 2) observation-constraint function validation, and 3) a virtual root module generation for root function investigation and evaluation. We believe that our strategies and experience in scientific function test framework can be beneficial to many other research programs that adapt integrated environmental modeling methodology.


international conference on conceptual structures | 2017

A Web-based Visual Analytic Framework for Understanding Large-scale Environmental Models: A Use Case for The Community Land Model

Yang Xu; Dali Wang; Tomislav Janjusic; Wei Wu; Yu Pei; Zhuo Yao

This study introduces a web-based visual analytic framework to better understand the software structures of large-scale environmental models. The framework integrates data management, software structure analysis, and web-based visualizations. A system for the Community Land Model (CLM) is developed to demonstrate the capability of the proposed framework. It consists of three major components: (1) a Fortran-syntax analysis tool that decomposes CLM source code into simpler forms; (2) an application tier that further analyzes and converts the preprocessed data into meaningful software structural information; (3) a web-based front end that is developed using state-of-the-art web technologies and visualization toolkits (e.g., D3.js). The framework provides users with easy access to the internal structures of complex environmental models. Currently, the prototype system is being used by CLM modelers and field scientists to tackle different environmental research problems.


international conference on conceptual structures | 2015

Glprof: A Gprof Inspired, Callgraph-oriented Per-object Disseminating Memory Access Multi-cache Profiler

Tomislav Janjusic; Christos Kartsaklis

Application analysis is facilitated through a number of program profiling tools. The tools vary in their complexity, ease of deployment, design, and profiling detail. Specifically, understanding, analyzing, and optimizing is of particular importance for scientific applications, where minor changes in code paths and data-structure layout can have profound effects. Understanding how intricate data structures are accessed and how a given memory system responds is a complex task. In this paper we describe a trace profiling tool, Glprof, specifically aimed at lessening the programmer's burden of pinpointing heavily involved data structures during an application's run time and of understanding data-structure run-time usage. Moreover, we showcase the tool's modularity using additional cache simulation components. We elaborate on the tool's design and features. Finally, we demonstrate the application of our tool in the context of SPEC benchmarks using the Glprof profiler and two concurrently running cache simulators, PPC440 and AMD Interlagos.


Proceedings of the Second Workshop on Optimizing Stencil Computations | 2014

Trace-Driven Memory Access Pattern Recognition in Computational Kernels

Eunjung Park; Christos Kartsaklis; Tomislav Janjusic; John Cavazos

Classifying memory access patterns is paramount to the selection of the right set of optimizations and determination of the parallelization strategy. Static analyses suffer from ambiguities present in source code, which modern compilation techniques, such as profile-guided optimization, alleviate by observing runtime behavior and feeding back into the compilation flow. This paper discusses a dynamic analysis technique for recognizing memory access patterns, with application to the stencils domain, and presents our design and C++ implementation using the memory-tracing tool Gleipnir. Finally, we evaluate and discuss the performance and matching capability of our classifiers in the context of the Polybench scientific benchmark suite, which includes both stencil and matrix computations.


Advances in Computers | 2014

Hardware and Application Profiling Tools

Tomislav Janjusic; Krishna M. Kavi

This chapter describes hardware and application profiling tools used by researchers and application developers. With over 30 years of research, numerous tools have been developed and used, and it would be too difficult to include all of them here. Therefore, in this chapter, we describe various areas with a selection of widely accepted and recent tools. This chapter is intended for the beginning reader interested in exploring more about these topics. Numerous references are provided to help jump-start the interested reader into the area of hardware simulation and application profiling. We make an effort to clarify and correctly classify application profiling tools based on their scope, interdependence, and operation mechanisms. To visualize these features, we provide diagrams that explain various development relationships between interdependent tools. Hardware simulation tools are grouped into categories that elaborate on their scope. We cover the areas of single-component to full-system simulation, power modeling, and network processors.


symposium on computer architecture and high performance computing | 2014

RACB: Resource Aware Cache Bypass on GPUs

Hongwen Dai; Christos Kartsaklis; Chao Li; Tomislav Janjusic; Huiyang Zhou

Caches are universally used in computing systems to hide long off-chip memory access latencies. Unlike on CPUs, the massive number of threads running simultaneously on GPUs puts tremendous pressure on the memory hierarchy. As a result, the limitation of cache resources becomes a bottleneck for a GPU to exploit thread-level parallelism (TLP) and memory-level parallelism (MLP) and achieve high performance. In this paper, we propose a mechanism to bypass the L1D and L2 caches based on the availability of cache resources. Our proposed mechanism is based on the observation that a huge number of stalls coming from limited cache resources prevents GPUs from providing higher throughput. We therefore propose Resource Aware Cache Bypass (RACB), which eliminates such stalls with minor hardware changes to improve performance. We examine the effectiveness of this approach when applied to the L1D and L2 caches separately as well as together. Evaluation results with the NVIDIA Computing SDK show that RACB generally improves performance the most when applied to both the L1D and L2 caches, by up to 88.05% and by 16.73% on average. Additionally, energy is reduced by up to 22.35% and by 5.88% on average, with minor hardware overheads.


Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models | 2014

OpenSHMEM Reference Implementation using UCCS-uGNI Transport Layer

Tomislav Janjusic; Pavel Shamis; Manjunath Gorentla Venkata; Stephen W. Poole

OpenSHMEM is a library interface specification and implementation that enables the Partitioned Global Address Space (PGAS) programming model. It exports modern RDMA network functionality and communication semantics to applications very efficiently. There are many closed source implementations of OpenSHMEM for modern RDMA interconnects such as InfiniBand and Cray's Gemini and Aries. Given the important role that Cray systems play in HPC, in this paper we present an open source implementation of OpenSHMEM for Cray XE/XK/XC systems. To implement OpenSHMEM, we use the uGNI interface. uGNI is a generic interface that is designed for multiple programming models. The interface fits the goal of UCCS well. Having OpenSHMEM with UCCS-uGNI allows usage of the same implementation over multiple interconnects. This also translates into many advantages that come with common code, such as resource sharing and increased productivity because of less code maintenance. Preliminary results show that OpenSHMEM-UCCS performs comparably to state-of-the-art Cray SHMEM for Put, Get, and AMO operations.


Archive | 2015

The 2015 International Workshop on Software Engineering for High Performance Computing in Science (SE4HPCS 2015)

Claudio Bonati; Enrico Calore; Simone Coscetti; Michele Mesiti; Francesco Negro; Sebastiano Fabio Schifano; R. Tripiccione; Dali Wang; Tomislav Janjusic; Colleen M. Iversen; Peter E. Thornton; Misha Karssovski; Wei Wu; Yang Xu; Kapil Agrawal; Audris Mockus; Hassan Reza; Michael Aguilar; Sara Faraji Jalal; Valeria Cardellini; Salvatore Filippone; Gregory Butler

Collaboration


Top Co-Authors

Christos Kartsaklis, Oak Ridge National Laboratory

Dali Wang, Oak Ridge National Laboratory

Yang Xu, University of Tennessee

Wei Wu, University of Tennessee

Colleen M. Iversen, Oak Ridge National Laboratory

Krishna M. Kavi, University of North Texas at Dallas

Misha Karssovski, Oak Ridge National Laboratory

Peter E. Thornton, Oak Ridge National Laboratory

Chao Li, North Carolina State University