Sascha Just | Researchain

Publication

Featured researches published by Sascha Just.

foundations of software engineering | 2008

What makes a good bug report

Nicolas Bettenburg; Sascha Just; Adrian Schröter; Cathrin Weiss; Rahul Premraj; Thomas Zimmermann

In software development, bug reports provide crucial information to developers. However, these reports differ widely in their quality. We conducted a survey among developers and users of APACHE, ECLIPSE, and MOZILLA to find out what makes a good bug report. The analysis of the 466 responses revealed an information mismatch between what developers need and what users supply. Most developers consider steps to reproduce, stack traces, and test cases helpful, yet these are also the most difficult for users to provide. Such insight is helpful for designing new bug tracking tools that guide users in collecting and providing more helpful information. Our CUEZILLA prototype is such a tool and measures the quality of new bug reports; it also recommends which elements should be added to improve the quality. We trained CUEZILLA on a sample of 289 bug reports, rated by developers as part of the survey. The participants of our survey also provided 175 comments on hurdles in reporting and resolving bugs. Based on these comments, we discuss several recommendations for better bug tracking systems, which should focus on engaging bug reporters, better tool support, and improved handling of bug duplicates.

international conference on software engineering | 2013

It's not a bug, it's a feature: how misclassification impacts bug prediction

Kim Herzig; Sascha Just; Andreas Zeller

In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all bug reports to be misclassified - that is, rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in bug prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. We discuss the impact of this misclassification on earlier studies and recommend manual data validation for future studies.

IEEE Transactions on Software Engineering | 2010

What Makes a Good Bug Report

Thomas Zimmermann; Rahul Premraj; Nicolas Bettenburg; Sascha Just; Adrian Schröter; Cathrin Weiss

eclipse technology exchange | 2007

Quality of bug reports in Eclipse

Nicolas Bettenburg; Sascha Just; Adrian Schröter; Cathrin Weiß; Rahul Premraj; Thomas Zimmermann

The information in bug reports influences the speed at which bugs are fixed. However, bug reports differ in their quality of information. We conducted a survey among ECLIPSE developers to determine the information in reports that they widely used and the problems they frequently encountered. Our results show that steps to reproduce and stack traces are most sought after by developers, while inaccurate steps to reproduce and incomplete information pose the largest hurdles. Surprisingly, developers are indifferent to bug duplicates. Such insight is useful for designing new bug tracking tools that guide reporters in providing more helpful information. We also present a prototype of a quality-meter tool that measures the quality of bug reports by scanning their content.

symposium on visual languages and human-centric computing | 2008

Towards the next generation of bug tracking systems

Sascha Just; Rahul Premraj; Thomas Zimmermann

Developers typically rely on the information submitted by end-users to resolve bugs. We conducted a survey on information needs and commonly faced problems with bug reporting among several hundred developers and users of the APACHE, ECLIPSE and MOZILLA projects. In this paper, we present the results of a card sort on the 175 comments sent back to us by the respondents of the survey. The card sort revealed several hurdles involved in reporting and resolving bugs, which we present in a collection of recommendations for the design of new bug tracking systems. Such systems could provide contextual assistance, reminders to add information, and, most importantly, assistance in collecting and reporting crucial information to developers.

international symposium on software reliability engineering | 2013

Predicting defects using change genealogies

Kim Herzig; Sascha Just; Andreas Rau; Andreas Zeller

When analyzing version histories, researchers have traditionally focused on single events, e.g. the change that causes a bug or the fix that resolves an issue. Sometimes, however, there are indirect effects that count: changing a module may lead to plenty of follow-up modifications in other places, so that the initial change has an impact on those later changes. To this end, we group changes into change genealogies, graphs of changes reflecting their mutual dependencies and influences, and develop new metrics to capture the spatial and temporal influence of changes. In this paper, we show that change genealogies offer good classification models when identifying defective source files: with a median precision of 73% and a median recall of 76%, change genealogy defect prediction models not only show better classification accuracies than models based on code complexity, but can also outperform classification models based on code dependency network metrics.

Empirical Software Engineering | 2016

The impact of tangled code changes on defect prediction models

Kim Herzig; Sascha Just; Andreas Zeller

When interacting with a source control management system, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing version histories, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source Java projects, we found between 7% and 20% of all bug fixes to consist of multiple tangled changes. Using a multi-predictor approach to untangle changes, we show that on average at least 16.6% of all source files are incorrectly associated with bug reports. These incorrect bug-file associations do not seem to significantly impact models that classify source files as having at least one bug or no bugs. But our experiments show that untangling tangled code changes can result in more accurate regression bug prediction models when compared to models trained and tested on tangled bug datasets; in our experiments, the statistically significant accuracy improvements lie between 5% and 200%. We recommend better change organization to limit the impact of tangled changes.

international symposium on software reliability engineering | 2016

Switching to Git: The Good, the Bad, and the Ugly

Sascha Just; Kim Herzig; Jacek Czerwonka; Brendan Murphy

Since its introduction 10 years ago, GIT has taken the world of version control systems (VCS) by storm. Its success is partly due to creating opportunities for new usage patterns that empower developers to work more efficiently. However, the resulting change in both user behavior and the way GIT stores changes impacts data mining and data analytics procedures [6], [13]. While some of these unique characteristics can be managed by adjusting mining and analytical techniques, others can lead to severe data loss and the inability to audit code changes, e.g. knowing the full history of changes of code related to security and privacy functionality. Thus, switching to GIT comes with challenges to established development process analytics. This paper is based on our experience in attempting to provide continuous process analysis for Microsoft product teams who are switching to GIT as their primary VCS. We illustrate how GIT's concepts and usage patterns create a need for changing well-established data analytic processes. The goal of this paper is to raise awareness of how certain GIT operations may damage or even destroy information about historical code changes necessary for continuous data development process analytics. To that end, we provide a list of common GIT usage patterns with a description of how these operations impact data mining applications. Finally, we provide examples of how one may counteract the effects of such destructive operations in the future. We further provide a new algorithm to detect integration paths that is specific to distributed version control systems like GIT, which allows us to reconstruct the information that is crucial to most development process analytics.

Perspectives on Data Science for Software Engineering | 2016

Gotchas from mining bug reports

Sascha Just; Kim Herzig

Over the years, it has become common practice in empirical software engineering to mine data from version archives and bug databases to learn where bugs have been fixed in the past, or to build prediction models to find error-prone code in the future. However, most of these approaches rely on strong assumptions that need to be verified to ensure that the resulting models are accurate and reflect the intended property; decisions based on flawed models can have serious consequences.

Software Engineering & Management | 2015