Daniel M. German | Researchain

Publication

Featured research published by Daniel M. German.

mining software repositories | 2014

The promises and perils of mining GitHub

Eirini Kalliamvakou; Georgios Gousios; Kelly Blincoe; Leif Singer; Daniel M. German; Daniela E. Damian

With over 10 million git repositories, GitHub is becoming one of the most important sources of software artifacts on the Internet. Researchers are starting to mine the information stored in GitHub's event logs, trying to understand how its users employ the site to collaborate on software. However, so far there have been no studies describing the quality and properties of the data available from GitHub. We document the results of an empirical study aimed at understanding the characteristics of the repositories in GitHub and how users take advantage of GitHub's main features, namely commits, pull requests, and issues. Our results indicate that, while GitHub is a rich source of data on software development, mining GitHub for research purposes should take various potential perils into consideration. We show, for example, that the majority of the projects are personal and inactive; that GitHub is also being used for free storage and as a Web hosting service; and that almost 40% of all pull requests do not appear as merged, even though they were. We provide a set of recommendations for software engineering researchers on how to approach the data in GitHub.

software visualization | 2005

On the use of visualization to support awareness of human activities in software development: a survey and a framework

Margaret-Anne D. Storey; Davor Cubranic; Daniel M. German

This paper proposes a framework for describing, comparing and understanding visualization tools that provide awareness of human activities in software development. The framework has several purposes: it can act as a formative evaluation mechanism for tool designers; as an assessment tool for potential tool users; and as a comparison tool so that tool researchers can compare and understand the differences between various tools and identify potential new research areas. We use this framework to structure a survey of visualization tools for activity awareness in software development. Based on this survey we suggest directions for future research.

international conference on software engineering | 2008

Open source software peer review practices: a case study of the Apache server

Peter C. Rigby; Daniel M. German; Margaret-Anne D. Storey

Peer review is seen as an important quality assurance mechanism in both industrial development and the open source software (OSS) community. The techniques for performing inspections have been well studied in industry; in OSS development, peer reviews are less well understood. We examine the two peer review techniques used by the successful, mature Apache server project: review-then-commit and commit-then-review. Using archival records of email discussion and version control repositories, we construct a series of metrics that produces measures similar to those used in traditional inspection experiments. Specifically, we measure the frequency of review, the level of participation in reviews, the size of the artifact under review, the calendar time to perform a review, and the number of reviews that find defects. We provide a comparison of the two Apache review techniques as well as a comparison of Apache review to inspection in an industrial project. We conclude that Apache reviews can be described as (1) early, frequent reviews (2) of small, independent, complete contributions (3) conducted asynchronously by a potentially large, but actually small, group of self-selected experts (4) leading to an efficient and effective peer review technique.

Software Process: Improvement and Practice | 2003

The GNOME project: a case study of open source, global software development

Daniel M. German

Many successful free/open source software (FOSS) projects start with the premise that their contributors are rarely colocated, and as a consequence, these projects are cases of global software development (GSD). This article describes how the GNOME Project, a large FOSS project, has tried to overcome the disadvantages of GSD. The main goal of GNOME is to create a GUI desktop for Unix systems, and the project encompasses close to two million lines of code. More than 500 individuals (distributed across the world) have contributed to the project. This article also describes the software development methods and practices used by the members of the project, and its organizational structure. The article ends by proposing a list of practices that could benefit other global software development projects, both FOSS and commercial.

mining software repositories | 2008

What do large commits tell us?: a taxonomical study of large commits

Abram Hindle; Daniel M. German; Richard C. Holt

Research in the mining of software repositories has frequently ignored commits that include a large number of files (we call these large commits). The main goal of this paper is to understand the rationale behind large commits, and whether there is anything we can learn from them. To address this goal we performed a case study that included the manual classification of large commits from nine open source projects. The contributions include a taxonomy of large commits, which are grouped according to their intention. We contrast large commits against small commits and show that large commits are more perfective while small commits are more corrective. These large commits provide us with a window on the development practices of maintenance teams.

international conference on software maintenance | 2004

An empirical study of fine-grained software modifications

Daniel M. German

Software is typically improved and modified in small increments (we refer to each of these increments as a modification record, or MR). MRs are usually stored in a configuration management or version control system and can be retrieved for analysis. In this study we retrieved the MRs from several mature open software projects. We then concentrated our analysis on those MRs that fix defects and provided heuristics to automatically classify them. We used the information in the MRs to visualize which files are changed at the same time, and which people tend to modify certain files. We argue that these visualizations can be used to understand the development stage a project is in at a given time (new features are being added, or defects are being fixed), the level of modularization of a project, and how developers might interact with each other and with the source code of a system.

2008 Frontiers of Software Maintenance | 2008

The past, present, and future of software evolution

Michael W. Godfrey; Daniel M. German

Change is an essential characteristic of software development, as software systems must respond to evolving requirements, platforms, and other environmental pressures. In this paper, we discuss the concept of software evolution from several perspectives. We examine how it relates to and differs from software maintenance. We discuss insights about software evolution arising from Lehman's laws of software evolution and the staged lifecycle model of Bennett and Rajlich. We compare software evolution to other kinds of evolution, from science and social sciences, and we examine the forces that shape change. Finally, we discuss the changing nature of software in general as it relates to evolution, and we propose open challenges and future directions for software evolution research.

international conference on software engineering | 2009

License integration patterns: Addressing license mismatches in component-based development

Daniel M. German; Ahmed E. Hassan

In this paper we address the problem of combining software components with different and possibly incompatible legal licenses to create a software application that does not violate any of these licenses while potentially having its own. We call this problem the license mismatch problem. The rapid growth and availability of Open Source Software (OSS) components with varying licenses, and the existence of more than 70 OSS licenses, increase the complexity of this problem. Based on a study of 124 OSS packages, we developed a model which describes the interconnection of components in these packages from a legal point of view. We used our model to document integration patterns that are commonly used to solve the license mismatch problem in practice when creating both proprietary and OSS applications. Software engineers with little legal expertise could use these documented patterns to understand and address the legal issues involved in reusing components with different and possibly conflicting licenses.

mining software repositories | 2011

Software bertillonage: finding the provenance of an entity

Julius Davies; Daniel M. German; Michael W. Godfrey; Abram Hindle

Deployed software systems are typically composed of many pieces, not all of which may have been created by the main development team. Often, the provenance of included components, such as external libraries or cloned source code, is not clearly stated, and this uncertainty can introduce technical and ethical concerns that make it difficult for system owners and other stakeholders to manage their software assets. In this work, we motivate the need for the recovery of the provenance of software entities by a broad set of techniques that could include signature matching, source code fact extraction, software clone detection, call flow graph matching, string matching, historical analyses, and other techniques. We liken our provenance goals to those of Bertillonage, a simple and approximate forensic analysis technique based on biometrics that was developed in 19th century France before the advent of fingerprints. As an example, we have developed a fast, simple, and approximate technique called anchored signature matching for identifying library version information within a given Java application. This technique involves a type of structured signature matching performed against a database of candidates drawn from the Maven2 repository, a 150GB collection of open source Java libraries. An exploratory case study using a proprietary e-commerce Java application illustrates that the approach is both feasible and effective.

International Journal of Software Engineering and Knowledge Engineering | 2006

Visualizing the evolution of software using softChange

Daniel M. German; Abram Hindle

A typical software development team leaves behind a large amount of information. This information takes different forms, such as mail messages, software releases, version control logs, and defect reports. softChange is a tool that retrieves this information, analyses and enhances it by finding new relationships within it, and then allows users to navigate and visualize it. The main objective of softChange is to help programmers, their management, and software evolution researchers understand how a software product has evolved since its conception.
