David Jaeger
Hasso Plattner Institute
Publication
Featured research published by David Jaeger.
IEEE International Conference on Dependable, Autonomic and Secure Computing | 2013
Amir Azodi; David Jaeger; Feng Cheng; Christoph Meinel
Looking at current IDS and SIEM systems, we observe heavy processing power dedicated solely to answering a simple question: what is the format of the log line that the IDS (or SIEM) system should process next? Due to the apparent difficulty of uniquely identifying a log line at run-time, most systems today do little or no normalisation of the events they receive. Indeed, these systems often rely on popular search engine applications for processing and analysing the event information they receive, which results in slower and far less accurate event correlations. In this process, a large list of tokenisers is usually created in order to answer the question posed above. The tokenisers are run against the log lines until a match is found; the matching log line can then be passed on to the correct extraction module for further processing. This is currently the standard procedure of most IDS and SIEM systems. To optimise and improve this process, this paper describes a method for detecting the exact type and format of an incoming log line in the first place. The presented method performs efficiently while consuming fewer resources. The proposed detection system is described and implemented, and its pros and cons are analysed and weighed against the methods currently implemented by popular IDS and SIEM systems for solving this task.
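To make the criticised linear matching concrete, here is a minimal Python sketch of such a tokeniser loop; the pattern names and regular expressions are illustrative assumptions, not rules from the paper.

```python
import re

# Naive matching loop: every registered pattern is tried in turn
# until one fits. Patterns below are illustrative only.
TOKENISERS = {
    "sshd_failed_login": re.compile(
        r"Failed password for (?P<user>\S+) from (?P<ip>\S+) port (?P<port>\d+)"
    ),
    "apache_access": re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]+)" '
        r"(?P<status>\d{3}) (?P<size>\S+)"
    ),
}

def detect_format(line: str):
    """Try each tokeniser until one matches (linear scan over all patterns)."""
    for name, pattern in TOKENISERS.items():
        match = pattern.search(line)
        if match:
            return name, match.groupdict()
    return None, {}

name, fields = detect_format(
    "Failed password for root from 203.0.113.7 port 52944 ssh2"
)
print(name, fields)  # sshd_failed_login {'user': 'root', 'ip': ..., 'port': ...}
```

The cost of this scan grows linearly with the number of known formats, which is why detecting the format up front, as the paper proposes, pays off at scale.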
International Conference on Advanced Cloud and Big Data | 2013
Amir Azodi; David Jaeger; Feng Cheng; Christoph Meinel
The current state of affairs regarding the way events are logged by IT systems is the source of many problems for the developers of Intrusion Detection Systems (IDS) and Security Information and Event Management (SIEM) systems. These problems stand in the way of more accurate security solutions that draw their results from the data included in the logs they process, mainly because of a lack of standards that can encapsulate all events in a coherent way. As a result, correlating logs produced by different systems that use different log formats has been difficult and, in many cases, infeasible. In order to solve the challenges faced by correlation-based intrusion detection systems, we provide a platform for normalising events into a unified super event, loosely based on the Common Event Expression (CEE) standard developed by the MITRE Corporation. We show how our solution is able to normalise seemingly unrelated events into a unified format. Additionally, we demonstrate queries that can detect attacks on collections of normalised logs from different sources.
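As a rough illustration of the normalisation idea, the following hedged Python sketch maps two differently formatted authentication events onto one CEE-inspired schema; all field names are assumptions rather than the paper's actual schema.

```python
# Mapping two heterogeneous authentication events into one unified
# "super event" shape. Field names are illustrative assumptions.
def normalise_sshd(fields):
    return {
        "action": "login",
        "status": "failure",
        "user": fields["user"],
        "src_ip": fields["ip"],
        "service": "sshd",
    }

def normalise_windows_4625(fields):
    # Windows event ID 4625 = failed logon
    return {
        "action": "login",
        "status": "failure",
        "user": fields["TargetUserName"],
        "src_ip": fields["IpAddress"],
        "service": "winlogon",
    }

events = [
    normalise_sshd({"user": "root", "ip": "203.0.113.7"}),
    normalise_windows_4625({"TargetUserName": "admin", "IpAddress": "203.0.113.7"}),
]

# Once normalised, a single query correlates across sources, e.g.
# failed logins from the same address against different services:
by_ip = {}
for e in events:
    if e["action"] == "login" and e["status"] == "failure":
        by_ip.setdefault(e["src_ip"], []).append(e["service"])
print(by_ip)  # {'203.0.113.7': ['sshd', 'winlogon']}
```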
International Conference on Information Assurance and Security | 2013
Andrey Sapegin; David Jaeger; Amir Azodi; Marian Gawron; Feng Cheng; Christoph Meinel
The differences in log file formats employed by a variety of services and applications remain a problem for security analysts and developers of intrusion detection systems. The commonly proposed solution, the use of a common log format, sees only limited adoption within existing security management solutions. In this paper, we reveal the reasons for this limitation and show the disadvantages of existing common log formats for the normalisation of security events. To address them, we have created a new log format that suits intrusion detection purposes and can be extended easily. Taking previous work into account, we propose the new format as an extension to existing common log formats, rather than as a standalone specification.
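To illustrate what such an extensible normalised record could look like, here is a hypothetical example in the spirit of the proposal; the concrete field names are assumptions, not the published specification.

```python
# Illustrative sketch only: a normalised event with a fixed core and an
# open extension area. Field names are invented for this example.
event = {
    # core fields every event carries
    "timestamp": "2013-09-14T08:21:05Z",
    "host": "web01.example.org",
    "service": "sshd",
    "severity": "warning",
    "message": "Failed password for root from 203.0.113.7 port 52944 ssh2",
    # extension fields that specific event types may add
    "ext": {
        "auth.user": "root",
        "auth.result": "failure",
        "net.src_ip": "203.0.113.7",
        "net.src_port": 52944,
    },
}
```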
International Conference on Cyber Security and Cloud Computing | 2015
David Jaeger; Martin Ussath; Feng Cheng; Christoph Meinel
Looking at recent cyber-attacks in the news, a growing complexity and sophistication of attack techniques can be observed. Many of these attacks are performed in multiple steps to reach the core of the targeted network. Existing signature detection solutions focus on detecting a single step of an attack, but they do not see the big picture. Furthermore, current signature languages cannot integrate valuable external threat intelligence, which would simplify the creation of complex signatures and enable the detection of malicious activities already seen at other targets. We extend an existing multi-step signature language to support attack detection on normalized log events collected from various applications and devices. Additionally, the extended language supports the integration of external threat intelligence and allows us to reference current threat indicators. With this approach, we can create generic signatures that stay up to date. Using our language, we were able to detect various login brute-force attempts on multiple applications with only one generic signature.
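The following sketch suggests, in plain Python structures, what a multi-step signature with a threat intelligence reference might look like; the syntax and field names are invented for illustration and are not the paper's actual signature language.

```python
# Hedged sketch of a two-step brute-force signature over normalized
# events, with an external indicator-list reference. All keys invented.
brute_force_signature = {
    "name": "generic-login-brute-force",
    "steps": [
        {   # step 1: a burst of failed logins for one user from one source
            "match": {"action": "login", "status": "failure"},
            "group_by": ["src_ip", "user"],
            "count": 10,
            "within_seconds": 60,
        },
        {   # step 2: a later successful login from the same source and user
            "match": {"action": "login", "status": "success"},
            "same_as_previous": ["src_ip", "user"],
            "within_seconds": 300,
        },
    ],
    # reference to an external, continuously updated indicator list,
    # so the signature stays current without being rewritten
    "enrich": {"src_ip": "threat_intel:known_scanner_ips"},
}
```

Because the steps match on normalized fields rather than raw log syntax, one such signature can cover brute-force attempts against SSH, web logins, and Windows accounts alike, which mirrors the paper's claim of one generic signature for multiple applications.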
Conference on Information Sciences and Systems | 2016
Martin Ussath; David Jaeger; Feng Cheng; Christoph Meinel
Advanced persistent threats (APTs) pose a significant risk to nearly every infrastructure. Due to the sophistication of these attacks, they are able to bypass existing security systems and largely infiltrate the target network. Preventing and detecting APT campaigns is also challenging, because the attackers constantly change and evolve their advanced techniques and methods to stay undetected. In this paper we analyze 22 different APT reports and give an overview of the techniques and methods used. The analysis focuses on the three main phases of APT campaigns, which allows us to identify the relevant characteristics of such attacks. For each phase we describe the most commonly used techniques and methods. Through this analysis we reveal several relevant characteristics of APT campaigns, for example that the use of 0-day exploits is not common in APT attacks. Furthermore, the analysis shows that the dumping of credentials is a relevant step in the lateral movement phase of most APT campaigns. Based on the identified characteristics, we also propose concrete prevention and detection approaches that make it possible to identify crucial malicious activities performed during APT campaigns.
International Conference on Information Security Theory and Practice | 2015
David Jaeger; Amir Azodi; Feng Cheng; Christoph Meinel
An important technique for attack detection in complex company networks is the analysis of log data from various network components. As networks grow, the number of produced log events increases dramatically, sometimes to multiple billions of events per day. The analysis of such big data relies heavily on a full normalization of the log data in real time. Until now, the important issue of fully normalizing large numbers of log events has been only insufficiently handled by many software solutions and not well covered in existing research. In this paper, we propose and evaluate multiple approaches for handling the normalization of a large number of typical logs more efficiently. The main idea is to organize the normalization in multiple levels by using a hierarchical knowledge base (KB) of normalization rules. In the end, we achieve a performance gain of about 1000x with the presented approaches, in comparison to the naive approach typically used in existing normalization solutions. With this improvement, big log data can be handled much faster and used to find and mitigate attacks in real time.
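A minimal sketch of the hierarchy idea, assuming syslog-style input: a cheap first-level dispatch on the program name narrows the candidate rules before any detailed pattern is tried. The rules and patterns are illustrative, not the paper's knowledge base.

```python
import re

# Two-level lookup: level 1 is an O(1) dispatch on the program name,
# level 2 runs only that branch's detailed rules. Rules are illustrative.
KB = {
    "sshd": [
        ("ssh_failed", re.compile(r"Failed password for (?P<user>\S+) from (?P<ip>\S+)")),
        ("ssh_accepted", re.compile(r"Accepted \S+ for (?P<user>\S+) from (?P<ip>\S+)")),
    ],
    "su": [
        ("su_session", re.compile(r"session (?P<state>opened|closed) for user (?P<user>\S+)")),
    ],
}

# syslog header: month, day, time, host, then "program[pid]:"
PROGRAM = re.compile(r"^\S+ \S+ \S+ \S+ (?P<prog>[\w-]+)(\[\d+\])?:")

def normalise(line: str):
    head = PROGRAM.match(line)
    if not head:
        return None
    for rule_id, pattern in KB.get(head.group("prog"), []):
        m = pattern.search(line)
        if m:
            return rule_id, m.groupdict()
    return None

print(normalise("Sep 14 08:21:05 web01 sshd[4711]: Failed password for root from 203.0.113.7"))
# -> ('ssh_failed', {'user': 'root', 'ip': '203.0.113.7'})
```

With thousands of rules, the dispatch step keeps the per-line cost roughly proportional to the size of one branch rather than to the whole rule set, which is the kind of saving behind the reported speedup.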
International Conference on Passwords | 2014
David Jaeger; Hendrik Graupner; Andrey Sapegin; Feng Cheng; Christoph Meinel
The number of identity data leaks has been increasing drastically in recent times. Not only smaller web services, but also established technology companies are affected. However, it is not commonly known that the incidents covered by the media are just the tip of the iceberg. Accordingly, a more detailed investigation of not just the publicly accessible parts of the web but also the deep web is imperative to gain greater insight into the large number of data leaks. This paper presents methods and experiences from our deep web analysis. We give insight into commonly used platforms for data exposure, formats of identity-related data leaks, and the methods of our analysis. On the one hand, a lack of security implementations among Internet service providers exists; on the other hand, users still tend to generate and reuse weak passwords. By publishing our results, we aim to increase awareness on both sides and foster the establishment of countermeasures.
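As an illustration of the kind of leak formats mentioned, the sketch below classifies two line formats that commonly appear in published credential dumps; the patterns are assumptions for this example, not a catalogue from the paper.

```python
import re

# Two common dump-line shapes: plaintext "email:password" pairs and
# pairs with an unsalted 32-hex-digit MD5 hash. Patterns are illustrative.
PLAINTEXT = re.compile(r"^(?P<email>[^:;\s]+@[^:;\s]+)[:;](?P<password>\S+)$")
MD5_HASH = re.compile(r"^(?P<email>[^:;\s]+@[^:;\s]+)[:;](?P<hash>[0-9a-fA-F]{32})$")

def classify(line: str):
    line = line.strip()
    m = MD5_HASH.match(line)  # check the more specific format first
    if m:
        return "md5", m.groupdict()
    m = PLAINTEXT.match(line)
    if m:
        return "plaintext", m.groupdict()
    return "unknown", {}

print(classify("alice@example.org:hunter2"))
print(classify("bob@example.org:5f4dcc3b5aa765d61d8327deb882cf99"))
```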
Concurrency and Computation: Practice and Experience | 2017
Andrey Sapegin; Marian Gawron; David Jaeger; Feng Cheng; Christoph Meinel
Modern security information and event management systems should be capable of storing and processing a high volume of events or log messages in different formats and from different sources. This requirement often prevents such systems from using computationally heavy algorithms for security analysis. To deal with this issue, we built our system on an in-memory database with an integrated machine learning library, namely SAP HANA. Three approaches, that is, (1) deep normalisation of log messages, (2) storing data in main memory, and (3) running data analysis directly in the database, allow us to increase processing speed to the point where machine learning analysis of security events becomes possible nearly in real time. Besides that, we developed a universal anomaly detection algorithm, which uses a vector space model to represent and cluster textual log messages. Together with the deep normalisation approach, this algorithm solves the problem of correlation for heterogeneous security events containing many text fields. To prove our concepts, we measured the processing speed of the developed system on data generated using an Active Directory testbed, compared it with a classical system architecture based on a PostgreSQL database, and showed the efficiency of our approach for high-speed analysis of security events.
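The paper runs its analysis inside SAP HANA's integrated machine learning library; the sketch below only approximates the vector space clustering idea with scikit-learn, treating messages that fall into no cluster as candidate anomalies. All parameters are illustrative.

```python
# Approximation of the idea: represent log messages in a vector space
# (TF-IDF) and cluster them; unclustered points are candidate anomalies.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

messages = [
    "Accepted password for alice from 10.0.0.5",
    "Accepted password for bob from 10.0.0.6",
    "Accepted password for carol from 10.0.0.7",
    "Failed password for root from 203.0.113.7",
    "Failed password for root from 203.0.113.7",
    "FATAL: replication slot missing on standby node",  # odd one out
]

vectors = TfidfVectorizer().fit_transform(messages)
labels = DBSCAN(eps=0.8, min_samples=2, metric="cosine").fit_predict(vectors)

for msg, label in zip(messages, labels):
    tag = "ANOMALY" if label == -1 else f"cluster {label}"
    print(f"{tag:>9}: {msg}")
```

Because the clustering operates on the text representation rather than on fixed fields, the same procedure applies to heterogeneous events from different sources, which is the correlation problem the paper targets.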
International Symposium on Parallel and Distributed Computing | 2015
Andrey Sapegin; Marian Gawron; David Jaeger; Feng Cheng; Christoph Meinel
Modern Security Information and Event Management systems should be capable of storing and processing a high volume of events or log messages in different formats and from different sources. This requirement often prevents such systems from using computationally heavy algorithms for security analysis. To deal with this issue, we built our system on an in-memory database with an integrated machine learning library, namely SAP HANA. Three approaches, i.e. (1) deep normalisation of log messages, (2) storing data in main memory, and (3) running data analysis directly in the database, allow us to increase processing speed in such a way that machine learning analysis of security events becomes possible nearly in real time. To prove our concepts, we measured the processing speed of the developed system on data generated using an Active Directory testbed and showed the efficiency of our approach for high-speed analysis of security events.
International Conference on IT Convergence and Security (ICITCS) | 2013
Feng Cheng; Amir Azodi; David Jaeger; Christoph Meinel
A huge amount of information about real-time events is generated every second in a running IT infrastructure and recorded in system logs, application logs, and the output of deployed security or management tools, e.g., IDS alerts, firewall logs, and scanning reports. Rapidly gathering, processing, correlating, and analyzing this massive event information is a challenging task. High-performance security analytics is proposed to address this challenge, whereby real-time event information can be normalized, centralized, and correlated to help identify the current running state of the target environment. As an example of a next-generation Security Information and Event Management (SIEM) platform, the Security Analytics Lab (SAL) has been designed and implemented based on newly emerged in-memory data management techniques, which make it possible to efficiently organize, access, and process different types of event information through a consistent central storage and interface. In this paper, a multi-core architecture is introduced for the event correlation module of the SAL platform, by which correlation tasks can be executed in parallel on different computing resources. The k-means algorithm is implemented as an example of possible event clustering and correlation algorithms. Several experiments are conducted and analyzed, showing that the performance of analytics can be significantly improved by applying a multi-core architecture in the event correlation procedure.
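As a rough sketch of the parallelisation idea (not SAL's actual implementation), the following Python example splits the expensive assignment step of k-means, computing the distance of every event vector to every centroid, across worker processes, one chunk per core. The data and parameters are illustrative.

```python
import numpy as np
from multiprocessing import Pool

def assign_chunk(args):
    """Nearest-centroid index for each vector in one chunk of events."""
    chunk, centroids = args
    dists = np.linalg.norm(chunk[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def parallel_kmeans(events, k=3, iterations=10, workers=4):
    rng = np.random.default_rng(0)
    centroids = events[rng.choice(len(events), k, replace=False)]
    with Pool(workers) as pool:
        for _ in range(iterations):
            # assignment step runs in parallel, one chunk per worker
            chunks = np.array_split(events, workers)
            labels = np.concatenate(
                pool.map(assign_chunk, [(c, centroids) for c in chunks])
            )
            for j in range(k):  # sequential centroid update
                if (labels == j).any():
                    centroids[j] = events[labels == j].mean(axis=0)
    return labels, centroids

if __name__ == "__main__":
    # toy "event feature vectors", e.g. numeric features of normalised events
    events = np.vstack([
        np.random.default_rng(1).normal(loc=m, scale=0.3, size=(50, 4))
        for m in (0.0, 3.0, 6.0)
    ])
    labels, _ = parallel_kmeans(events, k=3)
    print(np.bincount(labels))  # roughly 50 events per cluster
```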