Publications


Featured research published by Erik Arisholm.


Journal of Systems and Software | 2010

A systematic and comprehensive investigation of methods to build and evaluate fault prediction models

Erik Arisholm; Lionel C. Briand; Eivind B. Johannessen

This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high fault probability. The system under consideration is constantly evolving, as several releases a year are shipped to customers. Developers usually have limited resources for their testing and would like to devote extra resources to the fault-prone system parts. The main research focus of this paper is to systematically assess three aspects of how to build and evaluate fault-proneness models in the context of this large Java legacy system development project: (1) compare many data mining and machine learning techniques for building fault-proneness models, (2) assess the impact of using different metric sets, such as source code structural measures and change/fault history (process measures), and (3) compare several alternative ways of assessing the performance of the models, in terms of (i) confusion matrix criteria such as accuracy and precision/recall, (ii) ranking ability, using the area under the receiver operating characteristic curve (ROC area), and (iii) our proposed cost-effectiveness measure (CE). The results of the study indicate that the choice of fault-proneness modeling technique has limited impact on the resulting classification accuracy or cost-effectiveness. There are, however, large differences between the individual metric sets in terms of cost-effectiveness, and although the process measures are among the most expensive ones to collect, including them as candidate measures significantly improves the prediction models compared with models that only include structural measures and/or their deltas between releases, both in terms of ROC area and in terms of CE. Further, we observe that what is considered the best model is highly dependent on the criteria used to evaluate and compare the models. The regular confusion matrix criteria, although popular, are not clearly related to the problem at hand, namely the cost-effectiveness of using fault-proneness prediction models to focus verification efforts so as to deliver software with fewer faults at lower cost.
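As a rough illustration of the three evaluation perspectives compared above, the sketch below scores a fault-proneness classifier using (i) confusion-matrix criteria, (ii) ROC area, and (iii) a simplified cost-effectiveness-style gain. Everything here is an assumption for illustration: the data are synthetic, and the CE variant (share of faults found per share of code inspected, relative to a size-proportional baseline) merely stands in for the paper's exact CE definition.

```python
# Minimal sketch: three ways to evaluate a fault-proneness model.
# Synthetic data; the CE-style measure below is a simplified stand-in,
# not the paper's exact cost-effectiveness definition.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 6))           # structural + process measures (synthetic)
loc = rng.integers(50, 2000, size=n)  # class size in lines of code
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 1.2).astype(int)

X_tr, X_te, y_tr, y_te, loc_tr, loc_te = train_test_split(X, y, loc, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

# (i) confusion-matrix criteria
print("precision:", precision_score(y_te, pred))
print("recall:   ", recall_score(y_te, pred))

# (ii) ranking ability
print("ROC AUC:  ", roc_auc_score(y_te, proba))

# (iii) simplified cost-effectiveness: inspect classes in decreasing order
# of predicted risk and track the share of faults found per share of code
# inspected; compare with a size-proportional (random-by-LOC) baseline.
order = np.argsort(-proba)
faults_found = np.cumsum(y_te[order]) / y_te.sum()
effort_spent = np.cumsum(loc_te[order]) / loc_te.sum()
ce_gain = np.trapz(faults_found, effort_spent) - 0.5  # area above the diagonal
print("CE-style gain over baseline:", ce_gain)
```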


International Symposium on Empirical Software Engineering | 2006

Predicting fault-prone components in a Java legacy system

Erik Arisholm; Lionel C. Briand

This paper reports on the construction and validation of fault-proneness prediction models in the context of an object-oriented, evolving, legacy system. The goal is to help QA engineers focus their limited verification resources on the parts of the system most likely to contain faults. A number of measures, including code quality, class structure, changes in class structure, and the history of class-level changes and faults, are included as candidate predictors of class fault-proneness. A cross-validated classification analysis shows that the obtained model yields fewer than 20% false positives and false negatives, respectively. However, as shown in this paper, statistics regarding classification accuracy tend to inflate the potential usefulness of fault-proneness prediction models. We thus propose a simple and pragmatic methodology for assessing the cost-effectiveness of the predictions in focusing verification effort. On the basis of the cost-effectiveness analysis, we show that change and fault data from previous releases is paramount to developing a practically useful prediction model. When our model is applied to predict faults in a new release, the estimated potential savings in verification effort are about 29%. In contrast, the estimated savings in verification effort drop to 0% when history data is not included.
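As a minimal sketch of the cross-validated classification check described above, the snippet below estimates false-positive and false-negative rates with 10-fold cross-validation. The synthetic data and the choice of logistic regression are placeholders, not the paper's actual setup.

```python
# Cross-validated false-positive and false-negative rates for a
# fault-proneness classifier (synthetic data, placeholder model).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))  # candidate predictors (e.g., structure, history)
y = (X[:, 0] - X[:, 2] + rng.normal(size=400) > 0.8).astype(int)

pred = cross_val_predict(LogisticRegression(), X, y, cv=10)
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
print("false positive rate:", fp / (fp + tn))
print("false negative rate:", fn / (fn + tp))
```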


International Symposium on Empirical Software Engineering | 2002

Conducting realistic experiments in software engineering

Dag I. K. Sjøberg; Bente Anda; Erik Arisholm; Tore Dybå; Magne Jørgensen; Amela Karahasanovic; Espen Frimann Koren; Marek Vokáč

An important goal of most empirical software engineering research is the transfer of research results to industrial applications. Two important obstacles to this transfer are the lack of control of variables in case studies, i.e., their lack of explanatory power, and the lack of realism of controlled experiments. While it may be difficult to increase the explanatory power of case studies, there is a large potential for increasing the realism of controlled software engineering experiments. To convince industry of the validity and applicability of experimental results, the tasks, subjects, and environments of the experiments should be as realistic as practically possible. Such experiments are, however, more expensive than experiments involving students, small tasks, and pen-and-paper environments. Consequently, a change towards more realistic experiments requires a change in the amount of resources spent on software engineering experiments. This paper argues that software engineering researchers should apply for resources enabling expensive and realistic software engineering experiments, similar to how other researchers apply for resources for the expensive software and hardware necessary for their research. The paper describes experiences from recent experiments that varied in size from one software professional for 5 days to 130 software professionals, from 9 consultancy companies, for one day each.


IEEE Transactions on Software Engineering | 2006

The impact of UML documentation on software maintenance: an experimental evaluation

Erik Arisholm; Lionel C. Briand; Siw Elisabeth Hove; Yvan Labiche

The Unified Modeling Language (UML) is becoming the de facto standard for software analysis and design modeling. However, there is still significant resistance to model-driven development in many software organizations because it is perceived to be expensive and not necessarily cost-effective. Hence, it is important to investigate the benefits obtained from modeling. As a first step in this direction, this paper reports on controlled experiments, spanning two locations, that investigate the impact of UML documentation on software maintenance. Results show that, for complex tasks and past a certain learning curve, the availability of UML documentation may result in significant improvements in the functional correctness of changes as well as the quality of their design. However, there does not seem to be any saving of time. For simpler tasks, the time needed to update the UML documentation may be substantial compared with the potential benefits, thus motivating the need for UML tools with better support for software maintenance.


IEEE Transactions on Software Engineering | 2008

A Realistic Empirical Evaluation of the Costs and Benefits of UML in Software Maintenance

Wojciech J. Dzidek; Erik Arisholm; Lionel C. Briand

The Unified Modeling Language (UML) is the de facto standard for object-oriented software analysis and design modeling. However, few empirical studies exist that investigate the costs and evaluate the benefits of using UML in realistic contexts. Such studies are needed so that the software industry can make informed decisions regarding the extent to which they should adopt UML in their development practices. This is the first controlled experiment that investigates the costs of maintaining and the benefits of using UML documentation during the maintenance and evolution of a real, non-trivial system, using professional developers as subjects, working with a state-of-the-art UML tool during an extended period of time. The subjects in the control group had no UML documentation. In this experiment, the subjects in the UML group had on average a practically and statistically significant 54% increase in the functional correctness of changes (p=0.03), and an insignificant 7% overall improvement in design quality (p=0.22) - though a much larger improvement was observed on the first change task (56%) - at the expense of an insignificant 14% increase in development time caused by the overhead of updating the UML documentation (p=0.35).


Information & Software Technology | 2009

The effectiveness of pair programming: A meta-analysis

Jo Erskine Hannay; Tore Dybå; Erik Arisholm; Dag I. K. Sjøberg

Several experiments on the effects of pair versus solo programming have been reported in the literature. We present a meta-analysis of these studies. The analysis shows a small significant positive overall effect of pair programming on quality, a medium significant positive overall effect on duration, and a medium significant negative overall effect on effort. However, between-study variance is significant, and there are signs of publication bias among published studies on pair programming. A more detailed examination of the evidence suggests that pair programming is faster than solo programming when programming task complexity is low and yields code solutions of higher quality when task complexity is high. The higher quality for complex tasks comes at a price of considerably greater effort, while the reduced completion time for the simpler tasks comes at a price of noticeably lower quality. We conclude that greater attention should be given to moderating factors on the effects of pair programming.
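For readers unfamiliar with the pooling step behind such a meta-analysis, here is a minimal DerSimonian-Laird random-effects sketch. The per-study effect sizes and variances are invented for illustration, and the method is the standard textbook one, not necessarily the exact procedure used in the paper.

```python
# DerSimonian-Laird random-effects pooling (illustrative numbers only).
import numpy as np

effects = np.array([0.30, 0.10, 0.45, -0.05, 0.25])   # per-study effect sizes
variances = np.array([0.04, 0.02, 0.09, 0.03, 0.05])  # per-study variances

w = 1.0 / variances                     # fixed-effect weights
fixed = np.sum(w * effects) / np.sum(w)
q = np.sum(w * (effects - fixed) ** 2)  # Cochran's Q (heterogeneity)
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance

w_star = 1.0 / (variances + tau2)       # random-effects weights
pooled = np.sum(w_star * effects) / np.sum(w_star)
se = np.sqrt(1.0 / np.sum(w_star))
print(f"pooled effect = {pooled:.3f}, 95% CI = "
      f"[{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}], tau^2 = {tau2:.3f}")
```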


IEEE Transactions on Software Engineering | 2010

Effects of Personality on Pair Programming

Jo Erskine Hannay; Erik Arisholm; Harald Engvik; Dag I. K. Sjøberg

Personality tests in various guises are commonly used in the recruitment and career counseling industries. Such tests have also been considered as instruments for predicting the job performance of software professionals, both individually and in teams. However, research suggests that other human-related factors, such as motivation, general mental ability, expertise, and task complexity, also affect performance in general. This paper reports on a study of the impact of the Big Five personality traits on the performance of pair programmers, together with the impact of expertise and task complexity. The study involved 196 software professionals in three countries forming 98 pairs. The analysis consisted of a confirmatory part and an exploratory part. The results show that (1) our data do not confirm a meta-analysis-based model of the impact of certain personality traits on performance, and (2) personality traits in general have modest predictive value for pair programming performance compared with expertise, task complexity, and country. We conclude that more effort should be spent on investigating other performance-related predictors, such as expertise and task complexity, as well as other promising predictors, such as programming skill and learning. We also conclude that effort should be spent on elaborating the effects of personality on various measures of collaboration, which, in turn, may be used to predict and influence performance. Insights into such malleable, rather than static, factors may then be used to improve pair programming performance.


Empirical Software Engineering | 1999

Empirical Studies of Object-Oriented Artifacts, Methods, and Processes: State of the Art and Future Directions

Lionel C. Briand; Erik Arisholm; Steve Counsell; F. Houdek; P. Thévenod-Fosse

Object-oriented technologies are becoming pervasive in many software development organizations. However, many methods, processes, tools, or notations are being used without thorough evaluation. Empirical studies aim at investigating the performance of such technologies and the quality of the resulting object-oriented (OO) software products. In other words, the goal is to provide a scientific foundation for the engineering of OO software. This paper summarizes the results of a working group at the Empirical Studies of Software Development and Evolution (ESSDE) workshop in Los Angeles in May 1999. The authors of this paper took part in the working group and have all been involved with various aspects of empirical studies of OO software development. We therefore hope to achieve a good coverage of the current state of the art. We provide an overview of the existing research and results, future research directions, and important issues regarding the methodology of conducting empirical studies. In Section 2, we cover existing empirical studies and relate them to the claims usually associated with OO development technologies. Section 3 describes in depth what we believe are important directions for research and, more concretely, precise research questions. Section 4 identifies what we think are important methodological points and strategies for answering these questions.


Empirical Software Engineering | 2004

A Controlled Experiment Comparing the Maintainability of Programs Designed with and without Design Patterns—A Replication in a Real Programming Environment

Marek Vokáč; Walter F. Tichy; Dag I. K. Sjøberg; Erik Arisholm; Magne Aldrin

Software “design patterns” seek to package proven solutions to design problems in a form that makes it possible to find, adapt and reuse them. To support the industrial use of design patterns, this research investigates when, and how, using patterns is beneficial, and whether some patterns are more difficult to use than others. This paper describes a replication of an earlier controlled experiment on design patterns in maintenance, with major extensions. Experimental realism was increased by using a real programming environment instead of pen and paper, and paid professionals from multiple major consultancy companies as subjects. Measurements of elapsed time and correctness were analyzed using regression models and an estimation method that took into account the correlations present in the raw data. Together with on-line logging of the subjects’ work, this made possible a better qualitative understanding of the results. The results indicate quite strongly that some patterns are much easier to understand and use than others. In particular, the Visitor pattern caused much confusion. Conversely, the patterns Observer and, to a certain extent, Decorator were grasped and used intuitively, even by subjects with little or no knowledge of patterns. The implication is that design patterns are not universally good or bad, but must be used in a way that matches the problem and the people. When approaching a program with documented design patterns, even basic training can improve both the speed and quality of maintenance activities.
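The "estimation method that took into account the correlations present in the raw data" refers to repeated measurements per subject. One common way to handle this is a mixed-effects regression with a random intercept per subject, sketched below on invented data with hypothetical variable names; this illustrates the general technique, not the paper's exact analysis.

```python
# Mixed-effects regression with a random intercept per subject,
# illustrating one standard way to model correlated repeated measures.
# Data and variable names are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
subjects = np.repeat(np.arange(40), 3)  # 3 maintenance tasks per subject
pattern = rng.choice(["visitor", "observer", "decorator"], size=subjects.size)
skill = rng.normal(size=subjects.size)
time = (30 + 10 * (pattern == "visitor") - 5 * skill
        + rng.normal(0, 3, size=40)[subjects]    # per-subject effect
        + rng.normal(0, 2, size=subjects.size))  # residual noise

df = pd.DataFrame({"subject": subjects, "pattern": pattern,
                   "skill": skill, "time": time})
result = smf.mixedlm("time ~ pattern + skill", df, groups=df["subject"]).fit()
print(result.summary())
```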


International Symposium on Software Reliability Engineering | 2007

Data Mining Techniques for Building Fault-proneness Models in Telecom Java Software

Erik Arisholm; Lionel C. Briand; Magnus Fuglerud

This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high fault probability. The system under consideration is constantly evolving as several releases a year are shipped to customers. Developers usually have limited resources for their testing and inspections and would like to be able to devote extra resources to faulty system parts. The main research focus of this paper is two-fold: (1) use and compare many data mining and machine learning techniques to build fault-proneness models based mostly on source code measures and change/fault history data, and (2) demonstrate that the usual classification evaluation criteria based on confusion matrices may not be fully appropriate to compare and evaluate models.
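A sketch of what "use and compare many data mining and machine learning techniques" can look like in practice: ranking several off-the-shelf classifiers by cross-validated ROC AUC on the same data. The specific classifiers and the synthetic features are assumptions, not the paper's exact selection.

```python
# Compare several classifiers by 10-fold cross-validated ROC AUC
# (synthetic data; classifier choice is illustrative).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 8))  # code measures + change/fault history (synthetic)
y = (X[:, 0] + X[:, 3] + rng.normal(size=600) > 1.0).astype(int)

candidates = {
    "logistic regression": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "boosting": AdaBoostClassifier(random_state=0),
}
for name, clf in candidates.items():
    auc = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()
    print(f"{name:20s} mean ROC AUC = {auc:.3f}")
```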

Collaboration


Dive into Erik Arisholm's collaborations.

Top Co-Authors

Bente Anda, Simula Research Laboratory
Magne Jørgensen, Simula Research Laboratory
Jo Erskine Hannay, Simula Research Laboratory
Alf Inge Wang, Norwegian University of Science and Technology
Marek Vokáč, Simula Research Laboratory