Sebastian Baltes | Researchain

Publication

Featured research published by Sebastian Baltes.

foundations of software engineering | 2014

Sketches and diagrams in practice

Sebastian Baltes; Stephan Diehl

Sketches and diagrams play an important role in the daily work of software developers. In this paper, we investigate the use of sketches and diagrams in software engineering practice. To this end, we used both quantitative and qualitative methods. We present the results of an exploratory study in three companies and an online survey with 394 participants. Our participants included software developers, software architects, project managers, consultants, as well as researchers. They worked in different countries and on projects from a wide range of application areas. Most questions in the survey were related to the last sketch or diagram that the participants had created. Contrary to our expectations and previous work, the majority of sketches and diagrams contained at least some UML elements. However, most of them were informal. The most common purposes for creating sketches and diagrams were designing, explaining, and understanding, but analyzing requirements was also named often. More than half of the sketches and diagrams were created on analog media like paper or whiteboards and have been revised after creation. Most of them were used for more than a week and were archived. We found that the majority of participants related their sketches to methods, classes, or packages, but not to source code artifacts with a lower level of abstraction.

international conference on software engineering | 2014

RegViz: visual debugging of regular expressions

Fabian Beck; Stefan Gulan; Benjamin Biegel; Sebastian Baltes; Daniel Weiskopf

Regular expressions are a widely used programming technique, but seem to be neglected by software engineering research. They encode complex string parsing in a very compact notation; this complexity and compactness, however, introduce particular challenges with respect to program comprehension. In this paper, we present RegViz, an approach to visually augment regular expressions without changing their original textual notation. The visual encoding clarifies the structure of the regular expressions and clearly discerns included tokens by function. The approach also provides advanced visual highlighting of matches in a sample text and defining test cases therein. We implemented RegViz as a Web-based tool for JavaScript regular expressions. Expert feedback suggests that the approach is intuitive to apply and increases the readability of regular expressions.
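
To make the idea of highlighting matches in a sample text concrete, here is a minimal text-only sketch. It is not RegViz itself, which is a Web-based tool for JavaScript regular expressions; the function name `highlight_matches` and the bracket markers are assumptions made for illustration, and Python's `re` module stands in for a JavaScript regex engine.

```python
import re


def highlight_matches(pattern: str, text: str,
                      open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap every non-empty match of `pattern` in `text` with marker strings.

    Hypothetical helper for illustration; not part of RegViz.
    """
    parts = []
    last = 0
    for match in re.finditer(pattern, text):
        if match.start() == match.end():
            continue  # ignore empty matches, they have nothing to highlight
        parts.append(text[last:match.start()])               # unmatched text
        parts.append(open_mark + match.group(0) + close_mark)  # marked match
        last = match.end()
    parts.append(text[last:])  # trailing unmatched text
    return "".join(parts)
```

Calling `highlight_matches(r"\d+", "a1 b22 c")` marks the two digit runs and leaves the rest of the sample text unchanged.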

foundations of software engineering | 2014

Linking sketches and diagrams to source code artifacts

Sebastian Baltes; Peter Schmitz; Stephan Diehl

Recent studies have shown that sketches and diagrams play an important role in the daily work of software developers. If these visual artifacts are archived, they are often detached from the source code they document, because there is no adequate tool support to assist developers in capturing, archiving, and retrieving sketches related to certain source code artifacts. This paper presents SketchLink, a tool that aims at increasing the value of sketches and diagrams created during software development by supporting developers in these tasks. Our prototype implementation provides a web application that employs the camera of smartphones and tablets to capture analog sketches, but can also be used on desktop computers to upload, for instance, computer-generated diagrams. We also implemented a plugin for a Java IDE that embeds the links in Javadoc comments and visualizes them in situ in the source code editor as graphical icons.

visual analytics science and technology | 2016

Visual analysis and coding of data-rich user behavior

Tanja Blascheck; Fabian Beck; Sebastian Baltes; Thomas Ertl; Daniel Weiskopf

Investigating user behavior involves abstracting low-level events to higher-level concepts. This requires an analyst to study individual user activities, assign codes which categorize behavior, and develop a consistent classification scheme. To better support this reasoning process of an analyst, we suggest a novel visual analytics approach which integrates rich user data including transcripts, videos, eye movement data, and interaction logs. Word-sized visualizations embedded into a tabular representation provide a space-efficient and detailed overview of user activities. An analyst assigns codes, grouped into code categories, as part of an interactive process. Filtering and searching helps to select specific activities and focus an analysis. A comparison visualization summarizes results of coding and reveals relationships between codes. Editing features support efficient assignment, refinement, and aggregation of codes. We demonstrate the practical applicability and usefulness of our approach in a case study and describe expert feedback.

empirical software engineering and measurement | 2015

Navigate, Understand, Communicate: How Developers Locate Performance Bugs

Sebastian Baltes; Oliver Moseler; Fabian Beck; Stephan Diehl

Background: Performance bugs can lead to severe issues regarding computation efficiency, power consumption, and user experience. Locating these bugs is a difficult task because developers have to judge for every costly operation whether runtime is consumed necessarily or unnecessarily. Objective: We wanted to investigate how developers, when locating performance bugs, navigate through the code, understand the program, and communicate the detected issues. Method: We performed a qualitative user study observing twelve developers trying to fix documented performance bugs in two open source projects. The developers worked with a profiling and analysis tool that visually depicts runtime information in a list representation and embedded into the source code view. Results: We identified typical navigation strategies developers used for pinpointing the bug, for instance, following method calls based on runtime consumption. The integration of visualization and code helped developers to understand the bug. Sketches visualizing data structures and algorithms turned out to be valuable for externalizing and communicating the comprehension process for complex bugs. Conclusion: Fixing a performance bug is a code comprehension and navigation problem. Flexible navigation features based on executed methods and a close integration of source code and performance information support the process.

international conference on software engineering | 2017

Attribution required: Stack Overflow code snippets in GitHub projects

Sebastian Baltes; Richard Kiefer; Stephan Diehl

Stack Overflow (SO) is the largest Q&A website for developers, providing a huge amount of copyable code snippets. Using these snippets raises various maintenance and legal issues. The SO license requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated debate on SO's license model for code snippets and the required attribution, little is known about the extent to which snippets are copied from SO without proper attribution. In this paper, we present the research design and summarized results of an empirical study analyzing attributed and unattributed usages of SO code snippets in GitHub projects. On average, 3.22% of all analyzed repositories and 7.33% of the popular ones contained a reference to SO. Further, we found that developers rather refer to the whole thread on SO than to a specific answer. For Java, at least two thirds of the copied snippets were not attributed.
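
As an illustration of what detecting a reference to SO in a source file might look like, the following sketch scans text for Stack Overflow links and labels each as pointing to a whole question thread or to a specific answer. It is not the study's actual pipeline: the function name `find_so_references` and the regular expression are assumptions, and the pattern only covers the common `/questions/<id>/...`, `/q/<id>`, and `/a/<id>` link forms.

```python
import re

# Assumed link forms: /a/<id>, /q/<id>, /questions/<id>[/<slug>[/<answer id>]]
SO_LINK = re.compile(
    r"https?://(?:www\.)?stackoverflow\.com/"
    r"(?:(?P<kind>a|q)/(?P<short_id>\d+)"
    r"|questions/(?P<question_id>\d+)(?:/[^/\s#]*)?(?:/(?P<answer_id>\d+))?)",
    re.IGNORECASE,
)


def find_so_references(source: str):
    """Return (kind, post_id) pairs for each Stack Overflow link in `source`.

    `kind` is "question" for thread links and "answer" for answer links.
    Hypothetical helper for illustration only.
    """
    refs = []
    for match in SO_LINK.finditer(source):
        if match.group("kind"):
            # Short share links: /a/<id> is an answer, /q/<id> a question.
            kind = "answer" if match.group("kind").lower() == "a" else "question"
            refs.append((kind, int(match.group("short_id"))))
        elif match.group("answer_id"):
            # Long form with a trailing answer id after the title slug.
            refs.append(("answer", int(match.group("answer_id"))))
        else:
            refs.append(("question", int(match.group("question_id"))))
    return refs
```

Counting the `"question"` and `"answer"` labels over a set of files gives a rough view of whether references point to threads or to individual answers. A link found this way only shows that a reference exists, not that the surrounding code was actually copied from the linked post.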

mining software repositories | 2018

SOTorrent: reconstructing and analyzing the evolution of Stack Overflow posts

Sebastian Baltes; Lorik Dumani; Christoph Treude; Stephan Diehl

Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of code snippets and free-form text on a wide variety of topics. Like other software artifacts, questions and answers on SO evolve over time, for example when bugs in code snippets are fixed, code is updated to work with a more recent library version, or text surrounding a code snippet is edited for clarity. To be able to analyze how content on SO evolves, we built SOTorrent, an open dataset based on the official SO data dump. SOTorrent provides access to the version history of SO content at the level of whole posts and individual text or code blocks. It connects SO posts to other platforms by aggregating URLs from text blocks and by collecting references from GitHub files to SO posts. In this paper, we describe how we built SOTorrent, and in particular how we evaluated 134 different string similarity metrics regarding their applicability for reconstructing the version history of text and code blocks. Based on a first analysis using the dataset, we present insights into the evolution of SO posts, e.g., that post edits are usually small, happen soon after the initial creation of the post, and that code is rarely changed without also updating the surrounding text. Further, our analysis revealed a close relationship between post edits and comments. Our vision is that researchers will use SOTorrent to investigate and understand the evolution of SO posts and their relation to other platforms such as GitHub.
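
The core step in reconstructing a block-level version history, deciding which block in a newer post version continues which block in the older one, can be sketched as follows. This is a simplified illustration, not SOTorrent's implementation: `difflib.SequenceMatcher` stands in for the string similarity metrics the paper evaluated, and the greedy strategy, the function names, and the threshold of 0.6 are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity score in [0, 1]; a stand-in metric, not one from the paper."""
    return SequenceMatcher(None, a, b).ratio()


def match_blocks(old_blocks, new_blocks, threshold=0.6):
    """Map each block index of the new version to its predecessor index.

    Greedy sketch: every new block takes the most similar old block that is
    still unmatched and scores at least `threshold`; otherwise it maps to
    None, meaning the block is treated as newly added.
    """
    used = set()
    mapping = {}
    for new_idx, new_text in enumerate(new_blocks):
        best_idx, best_score = None, 0.0
        for old_idx, old_text in enumerate(old_blocks):
            if old_idx in used:
                continue  # each old block can have only one successor
            score = similarity(old_text, new_text)
            if score >= threshold and score > best_score:
                best_idx, best_score = old_idx, score
        if best_idx is not None:
            used.add(best_idx)
        mapping[new_idx] = best_idx
    return mapping
```

Applying `match_blocks` to each pair of consecutive versions chains blocks into per-block histories. The choice of metric and threshold strongly affects the result, which is why evaluating many candidate metrics, as described above, matters.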

Empirical Software Engineering | 2018

Usage and attribution of Stack Overflow code snippets in GitHub projects

Sebastian Baltes; Stephan Diehl

Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of copyable code snippets. Using those snippets raises maintenance and legal issues. SO’s license (CC BY-SA 3.0) requires attribution, i.e., referencing the original question or answer, and requires derived work to adopt a compatible license. While there is a heated debate on SO’s license model for code snippets and the required attribution, little is known about the extent to which snippets are copied from SO without proper attribution. We present results of a large-scale empirical study analyzing the usage and attribution of non-trivial Java code snippets from SO answers in public GitHub (GH) projects. We followed three different approaches to triangulate an estimate for the ratio of unattributed usages and conducted two online surveys with software developers to complement our results. For the different sets of projects that we analyzed, the ratio of projects containing files with a reference to SO varied between 3.3% and 11.9%. We found that at most 1.8% of all analyzed repositories containing code from SO used the code in a way compatible with CC BY-SA 3.0. Moreover, we estimate that at most a quarter of the copied code snippets from SO are attributed as required. Of the surveyed developers, almost one half admitted copying code from SO without attribution and about two thirds were not aware of the license of SO code snippets and its implications.

international conference on agile software development | 2016

Empirical research plan: effects of sketching on program comprehension

Sebastian Baltes; Stefan Wagner

Sketching is an important means of communication in software engineering practice. Yet, there is little research investigating the use of sketches. We want to contribute a better understanding of sketching, in particular its use during program comprehension. We propose a controlled experiment to investigate the effectiveness and efficiency of program comprehension with the support of sketches as well as what sketches are used in what way.

empirical software engineering and measurement | 2016

Worse Than Spam: Issues In Sampling Software Developers

Sebastian Baltes; Stephan Diehl

Background: Reaching out to professional software developers is a crucial part of empirical software engineering research. One important method to investigate the state of practice is survey research. As drawing a random sample of professional software developers for a survey is rarely possible, researchers rely on various sampling strategies. Objective: In this paper, we report on our experience with different sampling strategies we employed, highlight ethical issues, and motivate the need to maintain a collection of key demographics about software developers to ease the assessment of the external validity of studies. Method: Our report is based on data from two studies we conducted in the past. Results: Contacting developers over public media proved to be the most effective and efficient sampling strategy. However, we not only describe the perspective of researchers who are interested in reaching goals like a large number of participants or a high response rate, but we also shed light onto ethical implications of different sampling strategies. We present one specific ethical guideline and point to debates in other research communities to start a discussion in the software engineering research community about which sampling strategies should be considered ethical.
