Publication


Featured research published by Kim Herzig.


international conference on software engineering | 2013

It's not a bug, it's a feature: how misclassification impacts bug prediction

Kim Herzig; Sascha Just; Andreas Zeller

In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all bug reports to be misclassified - that is, rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in bug prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. We discuss the impact of this misclassification on earlier studies and recommend manual data validation for future studies.


international symposium on software reliability engineering | 2010

Change Bursts as Defect Predictors

Nachiappan Nagappan; Andreas Zeller; Thomas Zimmermann; Kim Herzig; Brendan Murphy

In software development, every change induces a risk. What happens if code changes again and again in some period of time? In an empirical study on Windows Vista, we found that the features of such change bursts have the highest predictive power for defect-prone components. With precision and recall values well above 90%, change bursts significantly improve upon earlier predictors such as complexity metrics, code churn, or organizational structure. As they rely only on version history and a controlled change process, change bursts are straightforward to detect and deploy.
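
To make the notion concrete, here is a minimal Python sketch of burst detection, assuming each component's history is reduced to a list of commit timestamps; the gap size and minimum burst size are illustrative parameters (the study evaluates ranges of both), and the function name is ours:

    from datetime import datetime, timedelta

    def change_bursts(timestamps, gap=timedelta(days=1), min_burst_size=3):
        """Group a component's commit timestamps into change bursts.

        A burst is a maximal run of consecutive changes in which each
        change follows the previous one within `gap`; runs shorter than
        `min_burst_size` are discarded. Both thresholds are illustrative.
        """
        bursts, current = [], []
        for t in sorted(timestamps):
            if current and t - current[-1] > gap:
                if len(current) >= min_burst_size:
                    bursts.append(current)
                current = []
            current.append(t)
        if len(current) >= min_burst_size:
            bursts.append(current)
        return bursts

    # Four commits in tight succession form one burst; the late outlier does not.
    ts = [datetime(2010, 1, d, h) for d, h in [(1, 9), (1, 14), (2, 8), (2, 20), (9, 10)]]
    print(len(change_bursts(ts)))  # -> 1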


mining software repositories | 2013

The impact of tangled code changes

Kim Herzig; Andreas Zeller

When interacting with version control systems, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing the version history, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source Java projects, we found up to 15% of all bug fixes to consist of multiple tangled changes. Using a multi-predictor approach to untangle changes, we show that on average at least 16.6% of all source files are incorrectly associated with bug reports. We recommend better change organization to limit the impact of tangled changes.
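
As a deliberately simplified illustration of untangling, the sketch below partitions the files of one commit using a single hypothetical confidence voter, historical co-change frequency; the paper combines several such voters, and the names and threshold here are assumptions:

    def untangle(files, cochange_counts, threshold=5):
        """Partition the files of one commit into groups of likely related
        changes, using past co-change frequency as the only voter.

        cochange_counts maps frozenset({a, b}) to the number of past
        commits in which files a and b changed together.
        """
        groups = []
        for f in files:
            for group in groups:
                # Join the first group containing a file strongly coupled to f.
                if any(cochange_counts.get(frozenset({f, g}), 0) >= threshold
                       for g in group):
                    group.append(f)
                    break
            else:
                groups.append([f])  # no strong coupling: start a new group
        return groups

    history = {frozenset({"Parser.java", "Lexer.java"}): 12,
               frozenset({"Parser.java", "Docs.md"}): 1}
    print(untangle(["Parser.java", "Lexer.java", "Docs.md"], history))
    # -> [['Parser.java', 'Lexer.java'], ['Docs.md']]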


empirical software engineering and measurement | 2011

Network Versus Code Metrics to Predict Defects: A Replication Study

Rahul Premraj; Kim Herzig

Several defect prediction models have been proposed to identify which entities in a software system are likely to have defects before its release. This paper presents a replication of one such study, conducted by Zimmermann and Nagappan on Windows Server 2003, in which the authors leveraged dependency relationships between software entities, captured using social network metrics, to predict whether those entities are likely to have defects. They found that network metrics perform significantly better than source code metrics at predicting defects. To corroborate the generality of their findings, we replicate their study on three open-source Java projects, viz., JRuby, ArgoUML, and Eclipse. Our results agree with the original study by Zimmermann and Nagappan when using a similar experimental setup (random sampling). However, when we evaluated the metrics using setups more suited for industrial use -- forward-release and cross-project prediction -- we found network metrics to offer no advantage over code metrics. Moreover, code metrics may be preferable to network metrics given that their data is easier to collect: we used only 8 code metrics compared to approximately 58 network metrics.
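
To illustrate what "network metrics" means here: such measures are derived from the dependency graph between software entities. A small sketch using the networkx library on a toy graph computes a few of them; the graph and the choice of three measures are illustrative, not the study's full set of roughly 58:

    import networkx as nx

    # Toy dependency graph: an edge A -> B means entity A depends on B.
    deps = nx.DiGraph([("UI", "Core"), ("UI", "Net"),
                       ("Net", "Core"), ("Tools", "Core")])

    betweenness = nx.betweenness_centrality(deps)  # brokerage position in the graph
    for node in deps.nodes:
        print(node, {
            "in_degree": deps.in_degree(node),    # entities that depend on this one
            "out_degree": deps.out_degree(node),  # entities this one depends on
            "betweenness": betweenness[node],
        })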


international symposium on software reliability engineering | 2013

Predicting defects using change genealogies

Kim Herzig; Sascha Just; Andreas Rau; Andreas Zeller

When analyzing version histories, researchers have traditionally focused on single events: e.g., the change that causes a bug, or the fix that resolves an issue. Sometimes, however, indirect effects count: changing a module may lead to plenty of follow-up modifications in other places, giving the initial change an impact on those later changes. To this end, we group changes into change genealogies, graphs of changes reflecting their mutual dependencies and influences, and develop new metrics to capture the spatial and temporal influence of changes. In this paper, we show that change genealogies offer good classification models for identifying defective source files: with a median precision of 73% and a median recall of 76%, change genealogy defect prediction models not only show better classification accuracy than models based on code complexity, but can also outperform classification models based on code dependency network metrics.
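
A change genealogy can be sketched as a directed graph whose nodes are changes and whose edges run from a change to later changes that depend on it; one simple influence measure is then a change's number of transitive descendants. The edge criterion and the metric below are illustrative, not the paper's exact definitions:

    import networkx as nx

    # Nodes are change IDs; an edge c1 -> c2 means the later change c2
    # depends on (e.g. modifies code introduced by) the earlier change c1.
    genealogy = nx.DiGraph([("c1", "c2"), ("c1", "c3"), ("c3", "c4")])

    # One illustrative genealogy metric: how many later changes a change
    # influenced, directly or transitively.
    influence = {c: len(nx.descendants(genealogy, c)) for c in genealogy.nodes}
    print(influence)  # {'c1': 3, 'c2': 0, 'c3': 1, 'c4': 0}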


international conference on software engineering | 2009

Predicting defects in SAP Java code: An experience report

Tilman Holschuh; Markus Pauser; Kim Herzig; Thomas Zimmermann; Rahul Premraj; Andreas Zeller

Which components of a large software system are the most defect-prone? In a study on a large SAP Java system, we evaluated and compared a number of defect predictors, based on code features such as complexity metrics, static error detectors, change frequency, or component imports, thus replicating a number of earlier case studies in an industrial context. We found the overall predictive power to be lower than expected; still, the resulting regression models successfully predicted 50–60% of the 20% most defect-prone components.
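
The headline figure can be read as a top-20% hit rate: rank components by predicted and by actual defect counts and measure how much of the actual top 20% the prediction recovers. A small sketch of such an evaluation on invented data (the helper name is ours):

    def top20_hit_rate(predicted, actual):
        """Fraction of the actually most defect-prone 20% of components
        that also appear in the predicted top 20%.

        predicted, actual -- dicts mapping component -> defect count;
        predicted counts would come from the regression model.
        """
        k = max(1, len(actual) // 5)
        top = lambda counts: set(sorted(counts, key=counts.get, reverse=True)[:k])
        return len(top(predicted) & top(actual)) / k

    actual    = {"A": 9, "B": 7, "C": 3, "D": 1, "E": 0}
    predicted = {"A": 8, "B": 2, "C": 6, "D": 1, "E": 0}
    print(top20_hit_rate(predicted, actual))  # -> 1.0 (k=1, and 'A' tops both rankings)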


international conference on software engineering | 2015

The art of testing less without sacrificing quality

Kim Herzig; Michaela Greiler; Jacek Czerwonka; Brendan Murphy

Testing is a key element of software development processes for the management and assessment of product quality. In most development environments, the software engineers are responsible for ensuring the functional correctness of code. However, for large, complex software products, there is an additional need to check that changes do not negatively impact other parts of the software and that they comply with system constraints such as backward compatibility, performance, and security. Ensuring these system constraints may require complex verification infrastructure and test procedures. Although such tests are time-consuming, expensive, and rarely find defects, they act as an insurance process to ensure the software is compliant. However, long-running tests increasingly conflict with strategic aims to shorten release cycles. To decrease production costs and to improve development agility, we created a generic test selection strategy called THEO that accelerates test processes without sacrificing product quality. THEO is based on a cost model, which dynamically skips tests when the expected cost of running the test exceeds the expected cost of removing it. We replayed past development periods of three major Microsoft products, resulting in a reduction of 50% of test executions, saving millions of dollars per year while maintaining product quality.
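
The cost rule at THEO's core can be written down compactly: skip a test whenever the expected cost of executing it exceeds the expected cost of removing it. The sketch below is a minimal reading of that rule; the variable names and the way the expectations would be estimated are assumptions, not Microsoft's implementation:

    def should_skip(exec_cost, p_find_defect, escaped_defect_cost):
        """Skip a test when one execution costs more than the expected
        cost of removing the test from the process.

        exec_cost           -- machine/lab cost of a single execution
        p_find_defect       -- historical probability the test finds a defect
        escaped_defect_cost -- cost if that defect escapes to a later stage
        """
        return exec_cost > p_find_defect * escaped_defect_cost

    # A test costing $40 per run that finds a defect 0.1% of the time is
    # skipped when an escaped defect costs $10,000 (0.001 * 10000 = 10 < 40).
    print(should_skip(40.0, 0.001, 10_000.0))  # -> True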


international conference on software testing verification and validation | 2011

An Empirical Study on the Relation between Dependency Neighborhoods and Failures

Thomas Zimmermann; Nachiappan Nagappan; Kim Herzig; Rahul Premraj; Laurie Williams

Changing source code in large software systems is complex and requires a good understanding of dependencies between software components. Modifying components with little regard to their dependencies may adversely impact the quality of those dependencies, i.e., increase their risk of failure. We conduct an empirical study to understand the relationship between the quality of components and the characteristics of their dependencies, such as their frequency of change, their complexity, their number of past failures, and the like. Our study has been conducted on two large software systems: Microsoft VISTA and ECLIPSE. Our results show that components with outgoing dependencies to components of higher object-oriented complexity tend to have fewer field failures in VISTA, but the opposite relation holds for ECLIPSE. Our study also yields further notable observations: (a) certain characteristics of components increase the risk that their dependencies fail, and (b) some of these characteristics are project-specific while others are common across projects. We expect that such results can be leveraged to provide new directions for research in defect prediction, test prioritization, and related fields that utilize code dependencies in their empirical analyses. Additionally, these results give engineers insights into the potential reliability impacts of new component dependencies, based on the characteristics of the components involved.


international conference on software engineering | 2015

Approximating attack surfaces with stack traces

Christopher Theisen; Kim Herzig; Patrick Morrison; Brendan Murphy; Laurie Williams

Security testing and reviewing efforts are a necessity for software projects, but are time-consuming and expensive to apply. Identifying vulnerable code supports decision-making during all phases of software development. An approach for identifying vulnerable code is to identify its attack surface, the sum of all paths for untrusted data into and out of a system. Identifying the code that lies on the attack surface requires expertise and significant manual effort. This paper proposes an automated technique to empirically approximate attack surfaces through the analysis of stack traces. We hypothesize that stack traces from user-initiated crashes have several desirable attributes for measuring attack surfaces. The goal of this research is to aid software engineers in prioritizing security efforts by approximating the attack surface of a system via stack trace analysis. In a trial on Windows 8, the attack surface approximation selected 48.4% of the binaries and contained 94.6% of known vulnerabilities. Compared with vulnerability prediction models (VPMs) run on the entire codebase, VPMs run on the attack surface approximation improved recall from .07 to .1 for binaries and from .02 to .05 for source files. Precision remained at .5 for binaries, while improving from .5 to .69 for source files.
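
The approximation itself is straightforward to sketch: every binary that appears in any frame of any crash stack trace is placed on the approximated attack surface. The plain-text trace format below is invented for illustration:

    import re

    # Invented plain-text trace format: one frame per line, "binary!function+offset".
    FRAME = re.compile(r"^(?P<binary>\S+)!\S+")

    def attack_surface(stack_traces):
        """Approximate the attack surface as the set of binaries seen in
        any frame of any user crash stack trace."""
        surface = set()
        for trace in stack_traces:
            for line in trace.splitlines():
                match = FRAME.match(line.strip())
                if match:
                    surface.add(match.group("binary"))
        return surface

    traces = ["ntdll.dll!RtlFreeHeap+0x3c\nmsvcrt.dll!free+0x1a",
              "kernel32.dll!CreateFileW+0x7e"]
    print(sorted(attack_surface(traces)))  # -> ['kernel32.dll', 'msvcrt.dll', 'ntdll.dll']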


mining software repositories | 2009

Mining the Jazz repository: Challenges and opportunities

Kim Herzig; Andreas Zeller

By integrating various development and collaboration tools into one single platform, the Jazz environment offers several opportunities for software repository miners. In particular, Jazz offers full traceability from the initial requirements via work packages and work assignments to the final changes and tests; all these features can be easily accessed and leveraged for better prediction and recommendation systems. In this paper, we share our initial experiences from mining the Jazz repository. We also give a short overview of the retrieved data sets and discuss possible problems of the Jazz repository and the platform itself.

Collaboration


Dive into Kim Herzig's collaborations.

Top Co-Authors

Nachiappan Nagappan

Indraprastha Institute of Information Technology

Laurie Williams

North Carolina State University


Thomas Zimmermann

Indraprastha Institute of Information Technology
