An Analysis of Bug Distribution in Object Oriented Systems
Alessandro Murgia, Giulio Concas, Michele Marchesi, Roberto Tonelli, Ivana Turnu
arXiv [cs.SE], May
Department of Electrical and Electronic Engineering, University of Cagliari, piazza d'Armi, 09123 Cagliari, Italy.
SUMMARY
We introduce a new approach to describing Java software as a graph, where each node represents a Java file, called a compilation unit (CU), and each edge represents a relationship between two CUs. The software system is characterized by the distributions of its graph properties, such as in-links and out-links, as well as by the distributions of the Chidamber and Kemerer metrics computed on its CUs. Every CU can be related to one or more bugs during its life. We find a relationship between the software system and the bugs hitting its nodes. We found that the distributions of some metrics, and of the number of bugs per CU, exhibit a power-law behavior in their tails, as does the number of CUs affected by a specific bug. We examine the evolution of software metrics across different releases to understand how the relationships between CU metrics and CU faultness change with time.
key words: Software graphs, object-oriented programming, statistical methods, complexity measures, software metrics, bug distribution.
1. INTRODUCTION
Large software systems can be analysed as graphs so large and intricate that they can be studied using complex network theory. In the case of object-oriented (OO) software systems, the nodes are the classes or the interfaces, and the oriented edges are the various kinds of relationships between them: inheritance, composition and dependence. For OO systems there also exist some consolidated software metrics, also associated with the graph and usually computed at class level, the most used being the Chidamber and Kemerer (CK) suite of metrics [1]. The relationship between metrics and software quality is fuzzy, and is still the subject of ongoing research.
Software bugs are closely related to software quality. Several researchers have analysed software evolution in order to understand the relationship between software management and bug issues. Purushothaman et al. [2] analyzed the software development process to identify the relationships between small changes to the code and bug growth. Kim et al. [3] analyzed micro-pattern evolution in Java classes to identify which patterns are more bug-prone. Śliwerski et al. [4] analyzed fix-inducing changes, i.e. software updates that trigger the appearance of bugs. In their work, the revision history associated with compilation units (CUs) was examined to understand where bugs are introduced during CU evolution. Compilation units, the basic blocks examined in this paper, are files containing one or more classes, for which it is possible to compute software metrics similar to those used for classes.
A complete analysis of the relationships between the graph properties of large software systems, the statistics of software metrics, and the introduction and distribution of bugs in such graphs is, to our knowledge, completely missing. Zimmermann et al. applied network analysis to dependency graphs built on binary files [5], and studied how dependencies correlate with, and predict, defects. Andersson et al. [6] discussed the Pareto distribution of bugs in classes, without entering into the details of the statistical properties of software which determine such a distribution. Zhang found that the bug distribution across compilation packages in the Eclipse Java system seems to follow a Weibull distribution [7].
The aim of this paper is to study OO systems using complex network theory, to improve the knowledge of the causes of bugs and to statistically determine their distribution in the system. We extend the definitions of the CK software metrics to CUs to understand the evolution of faultness, i.e. how a variation in a metric affects the number of bugs hitting a CU. A deeper understanding of the dynamics of software development could help software engineers to identify which system components will be more prone to bugs, and thus to focus testing and code reviews on these components.
We also study the time evolution of software systems and of the related graphs and metrics, analysing both the source code and the bugs of various releases of two large Java systems, Eclipse [8] and Netbeans [9]. For each release we computed the associated software graph and the CK metrics for each class. Furthermore, we study the number of defects associated with CUs, as found in the bug-tracking system used for development.
We computed the correlation between OO metrics and bugs and analyzed the evolution of these metrics between one release and the next, correlating metric changes with the number of defects. We present a scheme for classifying CUs into categories, which allows us to identify which parts of the software are the most fault-prone, and how these are correlated with the CK software metrics. We support our findings with significance tests.
2. METHOD
We analyze the source code of two object-oriented systems written in Java, Eclipse and Netbeans. Both use CVS as their version control system. Eclipse uses Bugzilla as its issue tracking system (ITS), while Netbeans uses Issuezilla. CVS keeps track of the source code history, while Bugzilla and Issuezilla keep track of the bug history.
An oriented graph is associated with an OO software system, where the nodes are the classes and the interfaces, and the edges are the relationships between classes, namely inheritance, composition and dependence.
The number and orientation of the edges allow us to study the coupling between nodes. In this graph the in-degree of a class is the number of edges directed toward the class, and measures how much this class is used by other classes of the system. The out-degree of a class is the number of edges leaving the class, and represents the level of usage the class makes of other classes in the system. In this context the CK suite is a set of metrics commonly employed in the analysis of classes. We calculated for each node the values of the four most relevant CK metrics of the associated class:
• Weighted Methods per Class (WMC). A weighted sum of all the methods defined in a class. We set the weighting factor to one to simplify our analysis.
• Coupling Between Objects (CBO). The number of classes to which a given class is coupled.
• Response For a Class (RFC). The sum of the number of methods defined in the class and the cardinality of the set of methods called by them and belonging to external classes.
• Lack of Cohesion of Methods (LCOM). The difference between the number of non-cohesive method pairs and the number of cohesive pairs.
We also computed the lines of code of the class (LOC), excluding blank and comment lines. This is useful to keep track of CU size, because it is known that a "long" class is more difficult to manage than a short class.
Every system class resides inside a Java file, called a CU. While most files include just one class, there are files including more than one class. In Eclipse 10% of CUs host more than one class, whereas in Netbeans this percentage is 30%. In commit messages, issues and issue fixing always refer to CUs. To make issue tracking consistent with the source code, we decided to extend the CK metrics from classes to CUs.
CUs are therefore the main element of our study. We defined a CU graph whose nodes are the CUs of the system. Two nodes are connected by a directed edge if at least one class inside the CU associated with the first node has a dependency relationship with one class inside the CU associated with the second node. We refer to this graph for computing the in-links and out-links of a CU node. We reinterpreted the CK metrics on this CU graph:
• CU LOCS is the sum of the LOCS of the classes contained in the CU;
• CU CBO is the number of out-links of each node, excluding those representing inheritance. This definition is consistent with that of the CBO metric for classes;
• CU LCOM and CU WMC are the sums of the LCOM and WMC metrics of the classes contained in the CU, respectively;
• CU RFC is the sum of the weighted out-links of each node, each out-link being multiplied by the number of specific distinct relationships between the classes belonging to the CUs connected by the related edge.
For each CU we thus have a set of 7 metrics: In-links, Out-links, CU-LOCS, CU-LCOM, CU-WMC, CU-RFC and CU-CBO. These were computed for all versions of Eclipse and Netbeans.
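The construction of the CU graph described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in the study: the input format (a mapping from class name to CU file, plus a list of class-level dependency pairs) and the names `build_cu_graph` and `link_counts` are hypothetical. The text does not say whether dependencies between classes hosted in the same CU are counted; the sketch ignores them.

```python
def build_cu_graph(class_to_cu, class_deps):
    """Collapse class-level dependencies into a directed CU graph.

    class_to_cu: dict mapping class name -> CU (file) name (hypothetical input)
    class_deps:  iterable of (source_class, target_class) pairs
    Returns a dict mapping each CU to the set of CUs it points to.
    """
    out_edges = {cu: set() for cu in class_to_cu.values()}
    for src, dst in class_deps:
        src_cu, dst_cu = class_to_cu[src], class_to_cu[dst]
        if src_cu != dst_cu:  # assumption: intra-CU dependencies are ignored
            # A set keeps a single edge even if several class pairs connect the CUs.
            out_edges[src_cu].add(dst_cu)
    return out_edges


def link_counts(out_edges):
    """Return (in_links, out_links) dictionaries for every CU node."""
    out_links = {cu: len(targets) for cu, targets in out_edges.items()}
    in_links = {cu: 0 for cu in out_edges}
    for targets in out_edges.values():
        for dst in targets:
            in_links[dst] += 1
    return in_links, out_links


# Toy example (invented data): A.java hosts two classes, B.java and C.java one each.
class_to_cu = {"A": "A.java", "AHelper": "A.java", "B": "B.java", "C": "C.java"}
class_deps = [("A", "B"), ("AHelper", "B"), ("AHelper", "C"),
              ("B", "C"), ("A", "AHelper")]
graph = build_cu_graph(class_to_cu, class_deps)
in_links, out_links = link_counts(graph)
```

In the toy example, the two class-level dependencies from `A.java` to `B.java` collapse into one CU edge, which matches the "at least one class" rule in the definition.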
On the CU graph we look for nodes hit by issues. To obtain this information it is necessary to check the CVS log file and the data contained in the ITS.
We consider a CU to be affected by an issue when it is modified in order to fix that issue. Developers record all fixing activities in the CVS log. All commit operations are tracked in the CVS log as single entries. Each entry contains various data, among which are the date, the developer who made the changes, an annotation giving the reasons for the commit, and the list of CUs affected by the commit. When a commit is associated with an issue-fixing activity, this is written in the annotation, though not in a standardized way. It is therefore not simple to obtain a correct mapping between issues and the related CUs [4] [10].
In our approach, we first analyzed the CVS log to locate commit messages associated with fixing activities. Then, the extracted data were matched with the information found in the ITS. Each issue is identified by a positive integer (ID). In commit messages it can appear in a string such as "Fixed 141181" or "bug [...]
Table I: Number of CUs of Eclipse for each main release

Release        2.1    3.0    3.1    3.2    3.3
Number of CUs  7885   10584  12174  13221  14564
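The mapping between commit messages and issues described in the Method section can be sketched as follows. This is only an illustration under stated assumptions: the regular expression, the function names, and the representation of the ITS as a plain set of known issue IDs are all hypothetical, and the commit entries are invented. The source only says that IDs appear in strings such as "Fixed 141181" and that candidates are matched against the ITS.

```python
import re

# Assumption: an issue ID is a bare number following a word such as
# "fix", "fixed", "fixes" or "bug", optionally preceded by "#".
ISSUE_PATTERN = re.compile(r"\b(?:fix(?:ed|es)?|bug)\s*#?\s*(\d+)", re.IGNORECASE)


def candidate_issue_ids(commit_message):
    """Extract the numbers in a commit message that look like issue IDs."""
    return [int(number) for number in ISSUE_PATTERN.findall(commit_message)]


def issues_per_cu(commits, known_issue_ids):
    """Map each CU to the set of issues it was modified for.

    commits:         iterable of (message, list_of_cu_paths) pairs
    known_issue_ids: set of IDs present in the issue tracker; candidates
                     that are not in this set are discarded
    """
    hits = {}
    for message, cus in commits:
        for issue_id in candidate_issue_ids(message):
            if issue_id not in known_issue_ids:
                continue
            for cu in cus:
                hits.setdefault(cu, set()).add(issue_id)
    return hits


# Invented commit log entries, for illustration only.
commits = [
    ("Fixed 141181", ["ui/Editor.java", "ui/View.java"]),
    ("bug 141181: follow-up, also bug 99", ["ui/Editor.java"]),
    ("refactoring, no issue", ["core/Util.java"]),
]
hits = issues_per_cu(commits, known_issue_ids={141181, 200})
bugs_per_cu = {cu: len(ids) for cu, ids in hits.items()}
```

Storing a set of IDs per CU means that an issue fixed over several commits is counted once for each CU it touches, which is one plausible reading of "number of bugs per CU".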
3. RESULTS
The subjects of our study were the Eclipse and Netbeans projects, both open source, object-oriented, Java-based systems. Tables I and II show the number of CUs involved in the main releases of Eclipse and Netbeans, respectively.

Table II: Number of CUs of Netbeans for each main release

Release        3.2    3.3    3.5    3.6    4.0    5.0    6.0
Number of CUs  3350   4421   7391   8350   9365   12137  37145

A software system usually evolves through subsequent releases. Main releases entail substantial enhancements of the system, and are usually characterized by significant changes in software size, as shown by the data reported in Tables I and II. Between two main releases there may be several "patching releases", intended to fix bugs and to provide minor enhancements. Although we analyzed all the releases, we report results for the main releases and for the patching release immediately preceding the next main release. In fact, most bugs are introduced in upgrading from the last patching release to the next main release.
We computed the statistical distributions of the software metrics underlying the software graph. We compared the metrics for software graphs built using classes as basic units, already observed in the literature, with the ones obtained in this work for software graphs built on CUs. The latter distributions substantially keep the "fat-tail" behavior of the corresponding class metrics [11] in all cases. Fig. 1 reports the log-log plot of the complementary cumulative distribution functions (CCDF) of the CBO metric of Eclipse 3.2 for classes and for CUs. Fig. 2 reports the CCDF of the CBO metric, this time for Netbeans 4.0. All these distributions exhibit a power-law behavior in their tail.

Figure 1: The CCDF of CBO metrics for classes (crosses) and CUs (stars) in Eclipse 3.2. Axes: x (horizontal), Pr(X > x) (vertical).
Figure 2: The CCDF of CBO metrics for classes (crosses) and CUs (stars) in Netbeans 4.0. Axes: x (horizontal), Pr(X > x) (vertical).

We recall that a quantity x obeys a power law if it is drawn from a probability distribution proportional to a negative power of x:

$p(x) \propto x^{-\gamma}$, where $\gamma > 1$. (1)

γ is the power-law coefficient, also known as the exponent or scaling parameter. The corresponding complementary cumulative distribution function (CCDF), i.e. the probability that the random variable is greater than or equal to a given value x, is:

$P(X \geq x) \propto x^{-(\gamma - 1)}$ (2)

A power-law, or Pareto, distribution cannot hold for x = 0, so eligible values of x must be greater than a positive number x_min. This characteristic allows us to consider distributions that are power laws only in their right "tail", that is for x greater than a given value x_min, and not for lower values of x. All the distributions shown in Figs. 1, 3 and 4 show a straight-line behavior in their right tail. Note that the CCDF has the same analytical form as the distribution function, with the exponent reduced by one in magnitude.
Plotting p(x) or P(x) in log-log scale one obtains a straight line, as shown in Figs. 1 and 2.
Figs. 3 and 4 show the CCDF of the WMC metric in Eclipse 3.2 and in Netbeans 4.0, respectively. These distributions are also quite similar, and again present a power-law behavior in their tail, both for classes and for CUs. We found this behavior in all other releases as well, and for all metrics.
The finding that the distributions of CU metrics largely coincide with those of the corresponding class metrics suggests that the considerations that are valid for CUs may also be extended to classes, even in cases where data for the classes are not directly accessible, as in our case for bugs. One goal of this paper is, in fact, to find existing correlations between bugs and metrics by means of the software graph framework. Thus, since bug information for classes is not directly available from the repository, we analyzed the bug metric only for CUs, and use this information to obtain clues about classes.
Fig. 5 shows the CCDF of the number of bugs per CU in Eclipse 3.2. Fig. 6 shows the same distribution in Netbeans 3.4. The meaning of these power-law tail distributions is unequivocal: while most CUs present only very few bugs, there is a non-negligible number of CUs with very many bugs. We also found similar shapes in all other main releases.
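The empirical CCDF of Eq. (2) and an estimate of the tail exponent can be sketched as follows. This is a minimal sketch, not the fitting procedure used in the paper, which the text does not describe. The exponent is estimated here with the standard continuous maximum-likelihood formula, γ = 1 + n / Σ ln(x_i / x_min), taken over the samples with x_i ≥ x_min; for integer-valued metrics such as bug counts this is only an approximation. The function names, the hand-picked x_min and the synthetic sample are assumptions made for the example.

```python
import math
import random


def empirical_ccdf(values):
    """Return (x, P(X >= x)) for each distinct x, in increasing order of x."""
    data = sorted(values)
    n = len(data)
    points = []
    for i, x in enumerate(data):
        if i == 0 or x != data[i - 1]:
            # At the first occurrence of x, exactly n - i samples are >= x.
            points.append((x, (n - i) / n))
    return points


def tail_exponent(values, x_min):
    """Continuous maximum-likelihood estimate of gamma for the tail x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one sample strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, gamma, x_min, rng):
    """Draw n synthetic samples with p(x) ~ x^(-gamma) for x >= x_min."""
    # Inverse-transform sampling from P(X >= x) = (x / x_min)^(-(gamma - 1)).
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
            for _ in range(n)]


rng = random.Random(0)  # fixed seed so the example is reproducible
sample = sample_power_law(20000, gamma=3.0, x_min=1.0, rng=rng)
gamma_hat = tail_exponent(sample, x_min=1.0)  # should be close to 3
ccdf = empirical_ccdf([1, 1, 2, 4])
```

Plotting the pairs returned by `empirical_ccdf` on log-log axes gives the kind of curve shown in the figures; a straight right tail with slope -(γ - 1) is the signature of a power law.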
Figure 3: The CCDF of WMC metrics for classes (crosses) and CUs (stars) in Eclipse 3.2. Axes: x (horizontal), Pr(X > x) (vertical).
Figure 4: The CCDF of WMC metrics for classes (crosses) and CUs (stars) in Netbeans 4.0. Axes: x (horizontal), Pr(X > x) (vertical).
Figure 5: The CCDF of the number of bugs per CU in Eclipse 3.2. Axes: x (horizontal), Pr(X > x) (vertical).
Figure 6: The CCDF of the number of bugs per CU in Netbeans 3.4. Axes: x (horizontal), Pr(X > x) (vertical).
On the basis of these similarities, it looks sensible to hypothesize that the power laws found for the bug distribution among CUs may be extended to classes, as well as to other units such as modules or packages, and that they are a property of the graph structure of the system.
In fact, similar results were obtained by Andersson and Runeson [6], and by Zhang [7]. Andersson and Runeson suggest a Pareto law governing the distribution of bugs across the basic units of a software system that is only partially OO, showing that a few modules contain most of the bugs (the 20-80 rule [12]). Zhang re-examined their results for the Eclipse software system, studying packages instead of modules, and found that a Weibull distribution fits the data better than a power law. Since the tail of a Weibull distribution is often not distinguishable from a power-law tail, their results support our hypothesis.
Let us point out what we consider our most relevant finding. We verified that a power-law distribution may be appropriate to describe the fat-tail distribution of different quantities. Note that the fat tail contains the software units to which most of the information belongs. When a metric is distributed according to a power law, even only in its tail, with a small enough scaling exponent, there are relatively few units with the highest values of the metric, where criticality resides, while most other units are much less critical. The 80-20 Pareto principle is a consequence of this: about 80% of the criticality is held in 20% of all units.
Our analysis is finer than those performed in [6] or in [7], in the sense that we analyzed the software structure and relationships at the level of compilation units, one level deeper than the module or package level used in those works. This allowed us to recover finer information on the distributions of metrics, especially in their tail.
Our results confirmthose of Andersson and Runeson, and of Zhang, showing that the same framework holds atdifferent scales, exhibiting a scale-free structure [13]. This finding qualitatively supports theuse of power-laws. Finally, also Louridas et al. [14], show a large variety of cases in whichpower-laws well account for the distribution of different software properties.Regarding the value of the exponent γ and the corresponding behavior of the number of bugsper CU, this value tends to be between 2.5 and 3.5 in the various releases examined for bothEclipse and Netbeans.According to ref. [14], a mathematical description of the fat-tail may have relevantconsequences on software engineering, for example in helping to carefully select which parts ofthe software project are worth of more care and effort, also from an economical point of view.For instance, given n modules characterized by a metric distributed according to a power-lawwith exponent γ , the average maximum expected value for this metric in the module withhighest metric value, < xmax > , is given by the formula [15] < x max > = n / ( γ − (3)This formula provides a definite expectation of the maximum value taken by the metric, andhence allows to flag specific modules with metric value of this order of magnitude.We studied also the distributions of the number of CUs hit by a single bug, the dual of thedistribution of bugs across CUs. Also in this case, we find a power-law, as shown in Figs. 7and 8 for Eclipse and Netbeans, respectively. This means that, while most bugs affect just oneor a few CUs, there are bugs that affect tens, or ever hundreds of CUs.0
We computed the statistical distributions of software metrics underlying the software graph. We compared the metrics for software graphs built using classes as basic units, already observed in the literature, with the ones obtained in this work for software graphs built considering CUs. The latter distributions substantially keep the "fat-tail" behavior of the corresponding class metrics [11] in all cases. Fig. 1 reports the log-log plot of the complementary cumulative distribution function (CCDF) of the CBO metric of Eclipse 3.2 for classes and for CUs. Fig. 2 reports the CCDF of the CBO metric, this time for Netbeans 4.0. All these distributions exhibit a power-law behavior in their tail.

Figure 1: The CCDF of CBO metrics for classes (crosses) and CUs (stars) in Eclipse 3.2. (Log-log plot of Pr(X > x) versus x.)

Figure 2: The CCDF of CBO metrics for classes (crosses) and CUs (stars) in Netbeans 4.0. (Log-log plot of Pr(X > x) versus x.)

We recall that a quantity x obeys a power law if it is drawn from a probability distribution proportional to a negative power of x:

$p(x) \propto x^{-\gamma}$, where $\gamma > 1$. (1)

Here γ is the power-law coefficient, also known as the exponent or scaling parameter. The corresponding complementary cumulative distribution function (CCDF), i.e. the probability that the random variable is greater than a given value x, is:

$P(X \geq x) \propto x^{-(\gamma - 1)}$ (2)

A power-law, or Pareto, distribution cannot hold for x = 0, so eligible values of x must be greater than a positive number x_min. This characteristic allows us to consider distributions that are power-laws only in their right "tail", that is for x greater than a given value x_min, and not for lower values of x. All the distributions shown in Figs. 1, 3 and 4 show a straight-line behavior in their right tail. Note that the CCDF has the same analytical expression as the distribution function, with a negative exponent offset by one.
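
The definitions in Eqs. (1) and (2) can be made concrete with a minimal sketch. It builds the empirical CCDF of a list of per-CU values and estimates γ in the tail. The fitting procedure is not specified in this section, so the estimator used here (the standard continuous maximum-likelihood formula γ = 1 + n / Σ ln(x_i / x_min)) is an assumption, as are the function names and the sample data.

```python
import math
from collections import Counter


def empirical_ccdf(values):
    """Return (x, Pr(X >= x)) pairs for each distinct value, in increasing x."""
    n = len(values)
    counts = Counter(values)
    ccdf = []
    remaining = n  # number of observations that are >= the current x
    for x in sorted(counts):
        ccdf.append((x, remaining / n))
        remaining -= counts[x]
    return ccdf


def tail_exponent_mle(values, x_min):
    """Estimate gamma of p(x) ~ x^(-gamma) using only values >= x_min.

    Assumption: continuous maximum-likelihood estimator; this is not
    necessarily the fitting method used for the figures in the paper.
    """
    if x_min <= 0:
        raise ValueError("x_min must be positive")
    tail = [v for v in values if v >= x_min]
    log_sum = sum(math.log(v / x_min) for v in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    sample = [1, 1, 1, 2, 2, 4]  # hypothetical per-CU metric values
    for x, p in empirical_ccdf(sample):
        print(x, p)
    print(tail_exponent_mle(sample, x_min=1))
```

Plotting the pairs returned by `empirical_ccdf` on log-log axes gives the kind of curve shown in the figures; a straight right tail has slope −(γ − 1).
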
Plotting p(x) or P(x) in log-log scale one obtains a straight line, as shown in Figs. 1 and 2.

Figs. 3 and 4 show the CCDF of the WMC metric in Eclipse 3.2 and in Netbeans 4.0, respectively. These distributions are also quite similar, and again present a power-law behavior in their tail, both for classes and for CUs. We found this behavior also for all other releases, and for all metrics.

The finding that the distributions of CU metrics largely coincide with those of the corresponding metrics of classes suggests that the same considerations that are valid for CUs may be extended also to classes, even in the cases where data for the classes are not directly accessible, as in our case for bugs. One goal of this paper is, in fact, to find, by means of the software graph framework, existing correlations among bugs and metrics. Thus, since bug information for classes is not directly detectable from the repository, we analyzed the bug metric only for CUs, and used this information to obtain clues about classes.

Fig. 5 shows the CCDF of the number of bugs per CU in Eclipse 3.2. Fig. 6 shows the same distribution in Netbeans 3.4. The meaning of these power-law tail distributions is unequivocal. While most CUs present only very few bugs, there is a non-negligible number of CUs with very many bugs. We also found similar shapes in all other main releases.

Figure 3: The CCDF of WMC metrics for classes (crosses) and CUs (stars) in Eclipse 3.2. (Log-log plot of Pr(X > x) versus x.)

Figure 4: The CCDF of WMC metrics for classes (crosses) and CUs (stars) in Netbeans 4.0. (Log-log plot of Pr(X > x) versus x.)

Figure 5: The CCDF of the number of bugs per CU in Eclipse 3.2. (Log-log plot of Pr(X > x) versus x.)
Figure 6: The CCDF of the number of bugs per CU in Netbeans 3.4. (Log-log plot of Pr(X > x) versus x.)

On the basis of these similarities, it looks sensible to hypothesize that the power-laws found for the bug distribution among CUs may be extended to classes, as well as to other units, like modules or packages, and that they are a property of the graph structure of the system.

In fact, similar results were obtained by Andersson and Runeson [6], and by Zhang [7]. Andersson and Runeson suggest a Pareto law governing the distribution of bugs across basic units of a software system that is only partially OO, showing that few modules contain most of the bugs (the 20-80 rule [12]). Zhang re-examined their results for the Eclipse software system, studying packages instead of modules, and found that a Weibull distribution fits the data better than a power-law. Since the tail of a Weibull distribution is often not distinguishable from a power-law tail, their results support our hypothesis.

Let us point out what we consider our most relevant finding. We verified that a power-law distribution may be appropriate to describe the fat-tail distribution of different quantities. Note that the fat tail contains the software units to which most of the information belongs. When a metric is distributed according to a power-law, even only in its tail, with a small enough scaling exponent, there are relatively few units with the highest values of the metric, where criticality resides, while most other units are much less critical. The 80-20 Pareto principle is a consequence of that: about 80% of the criticality is held in 20% of all units.

Our analysis is finer than those performed in [6] or in [7], in the sense that we analyzed the software structure and relationships at the level of compilation units, one level deeper than the module or the package level presented in the above works. This allowed us to recover finer information on the distributions of metrics, especially in their tail.
Our results confirm those of Andersson and Runeson, and of Zhang, showing that the same framework holds at different scales, exhibiting a scale-free structure [13]. This finding qualitatively supports the use of power-laws. Finally, Louridas et al. [14] also show a large variety of cases in which power-laws account well for the distribution of different software properties.

Regarding the exponent γ of the distribution of the number of bugs per CU, its value tends to be between 2.5 and 3.5 in the various releases examined for both Eclipse and Netbeans.

According to ref. [14], a mathematical description of the fat tail may have relevant consequences for software engineering, for example in helping to carefully select which parts of the software project are worth more care and effort, also from an economic point of view. For instance, given n modules characterized by a metric distributed according to a power-law with exponent γ, the average maximum expected value for this metric in the module with the highest metric value, ⟨x_max⟩, is given by the formula [15]:

$\langle x_{\max} \rangle = n^{1/(\gamma - 1)}$ (3)

This formula provides a definite expectation of the maximum value taken by the metric, and hence allows one to flag specific modules with a metric value of this order of magnitude.

We also studied the distribution of the number of CUs hit by a single bug, the dual of the distribution of bugs across CUs. Also in this case, we find a power-law, as shown in Figs. 7 and 8 for Eclipse and Netbeans, respectively. This means that, while most bugs affect just one or a few CUs, there are bugs that affect tens, or even hundreds, of CUs.
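
As a worked illustration of Eq. (3), the following sketch evaluates ⟨x_max⟩ for a few exponents in the 2.5 to 3.5 range reported above. The function name and the system size of 10,000 units are hypothetical, chosen only to show the order of magnitude the formula yields.

```python
def expected_max(n, gamma):
    """Eq. (3): expected largest value among n units whose metric
    follows a power law p(x) ~ x^(-gamma), with gamma > 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if gamma <= 1:
        raise ValueError("gamma must be greater than 1")
    return n ** (1.0 / (gamma - 1.0))


if __name__ == "__main__":
    n_units = 10_000  # hypothetical number of modules or CUs
    for gamma in (2.5, 3.0, 3.5):
        print(f"gamma={gamma}: <x_max> ~ {expected_max(n_units, gamma):.0f}")
```

For a fixed n, a smaller exponent (a fatter tail) gives a larger expected maximum, which is why the value of γ matters when deciding which units to flag.
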
Figure 7: The CCDF of the number of CUs associated to each bug in Eclipse 3.2. (Log-log plot of Pr(X > x) versus x.)

Figure 8: The CCDF of the number of CUs associated to each bug in Netbeans 3.4. (Log-log plot of Pr(X > x) versus x.)

The value of the exponent γ of the distribution of the number of CUs affected by a bug is consistently between 2.2 and 2.9 in all considered releases, for both Eclipse and Netbeans, meaning an even "fatter" tail of this distribution with respect to the previously studied distribution of bugs per CU.

The finding that the distribution of bugs across CUs satisfies a power-law may suggest a model for the introduction and the spread of bugs in the software system. We already specified that, in our investigation, we name "bug" each numerical identifier found in the repository associated with software "fixing". Thus, generally speaking, a bug reported in a CU means that this CU needed to be partially modified owing to this bug. Now, let us consider the graph structure of the software system. We, and many other authors in the literature, verified an organized structure of such graphs, exhibiting power-law distributions for many properties of the system. In particular, there are nodes linked with many other nodes, playing the role of "hubs" of the system. For example, there are few CUs with a large number of in-links, meaning that they are extensively used by other CUs. If a bug hits such a CU, namely, if the CU code needs modification, it is very likely that the code of the CUs linked to that node also needs to be modified. Such a mechanism may generate a sort of defect propagation in the software graph, very similar to the spread of a contagious disease. The system gets infected by bugs, and a single bug may affect many different CUs if it propagates from a hub node. On the contrary, bugs in CUs with very few links will likely remain confined to a small number of CUs.

Our heuristic conclusion is that the power-laws observed for the bug distribution are probably due to the scale-free structure of the software graph.
Bugs propagate inside a constraining framework, which determines their diffusion across the software system.

From the software engineering point of view, the usefulness of finding power-laws in the tail of the bug distribution may be illustrated following the reasoning of Louridas et al. [14]. Once it is shown that the bug distribution across CUs is in the form of a power-law, CUs in the tail may be identified as the most fault-prone. Thus, after the issue of a new release, the inspection of CUs for bug detection may take advantage of this information. For instance, an inspection of the top-ranked 5% of CUs would imply the inspection of a high percentage of bugs, where the exact percentage is related to the power-law exponent.

We analyzed, for each version of the system, the correlations between the considered software metrics and the number of bugs. This information may be used to understand, from the measure of the metric, which parts of the software are most affected by faults, and to devise possible strategies to apply during software development in order to control metric values, with the goal of reducing bug introduction.

Our analysis started by computing, for various releases R_i of the system, the linear correlation between a particular CK metric and the number of bugs of the same CUs. This is only a preliminary analysis in order to identify which CU metrics are more related to fault proneness. We recall that developers distinguish between "main" and "patching" releases, and that changes from a main release to the next are usually relevant also regarding metrics.

In the first part of our study we referred to the main releases. In the Eclipse project main releases are identified by two-digit numbers, that is: Eclipse 2.1, Eclipse 3.0, Eclipse 3.1, Eclipse 3.2, and Eclipse 3.3.
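
The two quantities discussed above, the share of bugs covered by inspecting the top-ranked CUs and the linear (Pearson) correlation between a metric and the bug count, can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical, and the way ties or rounding of the 5% cut-off were handled in the study is not stated in this section.

```python
import math


def top_share(bug_counts, fraction):
    """Fraction of all bugs found in the top `fraction` of CUs,
    ranked by decreasing bug count (at least one CU is inspected)."""
    ranked = sorted(bug_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:k]) / total


def pearson(xs, ys):
    """Pearson linear correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    bugs = [7, 1, 1, 1]     # hypothetical bugs per CU
    cbo = [30, 12, 8, 5]    # hypothetical CBO values for the same CUs
    print(top_share(bugs, 0.25))
    print(pearson(cbo, bugs))
```
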
We analyzed what can be deduced about bugs from the analysis of the software metrics for this kind of release.

Table III shows the correlations between metrics and bugs for the main releases of Eclipse. The metrics showing the highest correlation with bugs are those taking into account the number of dependencies with other CUs, namely CBO and RFC. This fact highlights the importance of analyzing a software system as a graph. The out-links metric is less correlated with bugs than CBO and RFC. The out-links metric includes not only dependency relationships, but also inheritance and implements relationships. A lower correlation of this metric with bugs may be interpreted as a higher ability of dependency relationships to propagate bugs with respect to the other relationships.

Table III: Pearson correlations between metrics and bugs for some releases of Eclipse.

                 2.1    3.0    3.1    3.2    3.3
bugs-LOCS        0.49   0.57   0.54   0.58   0.48
bugs-CBO         0.55   0.53   0.55   0.55   0.42
bugs-RFC         0.59   0.48   0.44   0.56   0.45
bugs-WMC         0.48   0.45   0.38   0.48   0.40
bugs-LCOM        0.30   0.21   0.15   0.34   0.24
bugs-inlinks     0.1    0.17   0.25   0.28   0.24
bugs-outlinks    0.47   0.38   0.40   0.55   0.42

[...] R_i of the Netbeans system. In Netbeans the distinction between main and patching releases is fuzzier than in Eclipse; moreover, there are various main releases (MR) which are not followed by classic patching releases (PR).

A comparison of Tables III and IV shows that the Netbeans correlation values between metrics and number of bugs are usually lower than in Eclipse.
However, in both systems, LOCS and RFC are the two metrics most correlated with CU faultiness, while LCOM shows, in both cases, a weak correlation with CU faultiness.

These results show that:
• Given a release, there exist metrics that are more correlated with CU faultiness than others;
• Considering all releases, there is no single CK metric which is the most correlated in every release;
• Given a metric, its correlation with the number of bugs changes from release to release.

We also analyzed the evolution of the metrics between two consecutive releases. To this purpose we define different types of CUs, distinguishing among updated, unmodified, and newly introduced ones, and defining all these types with respect to all the different metrics.

In particular, given a release R_i, the next release R_{i+1}, and a metric M, we classified the compilation units in three categories:
• CU.X is the set of compilation units where metric M does not change between R_i and R_{i+1};
• CU.U is the set of compilation units where metric M changes (Updated);
• CU.A is the set of compilation units that exist in R_{i+1} but not in R_i (Added).

It must be pointed out that the U and X categories are defined relative to a specific metric. A CU might exhibit a change in metric M but not in metric M' between the releases R_i and R_{i+1}. Thus, it will belong to class CU.U for M, and to class CU.X for M'. This case is not common, but it is definitely possible. CU.A is defined regardless of any metric M, since it refers to CUs just introduced in the new release. There are also CUs existing in release R_i but not in release R_{i+1}.
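
The classification above can be sketched in a few lines. The representation used here (one dictionary per release, mapping a CU name to the value of metric M) and the function name are assumptions made for illustration; they are not taken from the tooling used in the study.

```python
def classify_cus(metric_old, metric_new):
    """Split the CUs of the newer release into CU.X, CU.U and CU.A.

    metric_old / metric_new: dict mapping CU name -> value of one metric M
    in release R_i and R_{i+1} respectively (hypothetical representation).
    CUs present only in the older release are left out of all three sets.
    """
    unchanged, updated, added = set(), set(), set()
    for cu, value in metric_new.items():
        if cu not in metric_old:
            added.add(cu)        # exists in R_{i+1} but not in R_i
        elif metric_old[cu] == value:
            unchanged.add(cu)    # metric M did not change
        else:
            updated.add(cu)      # metric M changed
    return {"CU.X": unchanged, "CU.U": updated, "CU.A": added}


if __name__ == "__main__":
    cbo_r1 = {"A.java": 4, "B.java": 7, "C.java": 2}   # hypothetical values
    cbo_r2 = {"A.java": 4, "B.java": 9, "D.java": 1}
    print(classify_cus(cbo_r1, cbo_r2))
```

Because the split depends on the metric passed in, running the same function with the values of a different metric M' may place a CU in CU.X instead of CU.U, while CU.A stays the same.
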
Such deleted CUs, absent from R_{i+1}, are not considered in our study.

Given the set of compilation units belonging to the three categories CU.U, CU.X, and CU.A, we compute:
• the fraction of compilation units affected by bugs, which provides an infection probability;
• the average number of bugs of the infected compilation units.

In Table V we show the probability, for CUs belonging to one of the families U, X and A, of being infected, for various release transitions.

The probability that a CU belonging to family CU.U is infected is between 0.6 and 0.7 in Eclipse. This means that there is a high probability that changing the LOCS, CBO, or LCOM metrics of a CU from one release to the next results in injecting at least one error into the compilation unit. This result confirms the study by Purushothaman et al. [2], which highlighted that code correction for defects often introduces new defects. The CUs added to the system in the transition from R_i to R_{i+1} also show a high probability of being infected, clearly larger than for the CUs not modified (set CU.X), and slightly smaller than for the set CU.U. Similar results were obtained for all other metrics.

On the contrary, if the metric does not change there is a low probability that a CU is affected by bugs. These are clearly bugs that were already present in R_i but were found only when checking the R_{i+1} release.

In order to support our findings about the deep differences among the CU.U, CU.X and CU.A families, we performed chi-square significance tests. We formulate the following null hypothesis: "the subdivision of CU in U, X and A does not significantly influence the number of infected
Figure 8: The CCDF of the numberof CUs associated to each bug inNetbeans 3.4.The value of the exponent γ of the distributions of the number of CUs affected by a bug isconsistently between 2.2 and 2.9 in all considered releases, for both Eclipse and Netbeans,meaning an ever “fatter” tail of this distribution with respect to the previously studieddistribution of bugs per CU.The finding that the distribution of bugs across CUs satisfies a power-law, may suggest amodel for the introduction and the spread of bugs in the software system. We already specifiedthat, in our investigation, we name “bug” each numerical identifier found in the repositoryassociated to software “fixing”. Thus, generally speaking, a bug reported in a CU meansthat such a CU needed to be partially modified owing to this bug. Now, let us consider thegraph structure of the software system. We, and many other authors in literature, verified anorganized structure of such graphs, exibiting power-law distributions for many properties ofthe system. In particular, there are nodes linked with many other nodes, playing the role of”hubs” of the system. For example, there are few CUs with a large number of in-links, meaningthat they are extensively used by other CUs. If a bug hits such CUs, namely, the CU codeneed modifications, it is very likely that also the code of CUs linked to that node need to bemodified. Such mechanism may generate a sort of defect propagation in the software graph,very similar to the spread of a contagious disease. The system gets infected by bugs, and asingle bug may affect many different CUs, if it propagates from a hub node. On the contrary,bugs in CUs with very few links will likely remain confined to a small number of CUs.Our heuristic conclusion is that the power-laws observed for the bug distribution is probablydue to the scale-free structure of the software graph. 
Bugs propagate inside a constraining framework, which determines their diffusion across the software system.

From the software engineering point of view, the usefulness of finding power laws in the tail of the bug distribution may be illustrated following the reasoning of Louridas et al. [14]. Once it is shown that the distribution of bugs across CUs has the form of a power law, the CUs in the tail may be identified as the most fault-prone. Thus, after the issue of a new release, the inspection of CUs for bug detection may take advantage of this information. For instance, an inspection of the top 5% ranked CUs would cover a high percentage of the bugs, where the exact percentage is related to the power-law exponent.

We analyzed, for each version of the system, the correlations between the considered software metrics and the number of bugs. This information may be used to understand, from the value of a metric, which parts of the software are most affected by faults, and to devise possible strategies to apply during software development in order to control metric values, with the goal of reducing bug introduction.

Our analysis started by computing, for various releases R_i of the system, the linear correlation between a particular CK metric and the number of bugs of the same CUs. This is only a preliminary analysis, meant to identify which CU metrics are more related to fault-proneness. We recall that developers distinguish between "main" and "patching" releases, and that changes from one main release to the next are usually significant also in terms of metrics.

In the first part of our study we referred to the main releases. In the Eclipse project main releases are identified by two-digit numbers, that is: Eclipse 2.1, Eclipse 3.0, Eclipse 3.1, Eclipse 3.2, and Eclipse 3.3.
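The two computations described above can be sketched as follows. This is only an illustrative sketch on made-up data: `top_share` returns the fraction of all bugs found in the top-ranked CUs (5% by default), and `pearson` is the textbook linear correlation coefficient between a metric and the bug count of the same CUs. The function names and the sample values are assumptions introduced here, not taken from the study.

```python
import math


def top_share(bugs_per_cu, percent=5):
    """Fraction of all bugs located in the top `percent`% of CUs ranked by bug count."""
    ranked = sorted(bugs_per_cu, reverse=True)
    k = max(1, (len(ranked) * percent) // 100)  # inspect at least one CU
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-CU data for one release: a metric value (e.g. CBO) and bug counts.
cbo = [3, 12, 7, 25, 1, 9, 4, 18]
bugs = [0, 4, 1, 9, 0, 2, 1, 5]

print("share of bugs in top 5% of CUs:", round(top_share(bugs), 2))
print("bugs-CBO correlation:", round(pearson(cbo, bugs), 2))
```

A heavier tail of the bug distribution translates into a larger share returned by `top_share`, which is the practical point of the inspection argument.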
We analyzed what can be deduced about bugs from the analysis of the software metrics for this kind of release.

Table III shows the correlations between metrics and bugs for the main releases of Eclipse. The metrics showing the highest correlation with bugs are those taking into account the number of dependencies with other CUs, namely CBO and RFC. This fact highlights the importance of analysing a software system as a graph. The out-links metric is less correlated with bugs than CBO and RFC. The out-links metric includes not only dependency relationships, but also inheritance and implements relationships. The lower correlation of this metric with bugs may indicate that dependency relationships propagate bugs more readily than the other kinds of relationship.

Table III: Pearson correlations between metrics and bugs for some releases of Eclipse.

                2.1    3.0    3.1    3.2    3.3
bugs-LOCS       0.49   0.57   0.54   0.58   0.48
bugs-CBO        0.55   0.53   0.55   0.55   0.42
bugs-RFC        0.59   0.48   0.44   0.56   0.45
bugs-WMC        0.48   0.45   0.38   0.48   0.40
bugs-LCOM       0.30   0.21   0.15   0.34   0.24
bugs-inlinks    0.1    0.17   0.25   0.28   0.24
bugs-outlinks   0.47   0.38   0.40   0.55   0.42

[...] R_i of the Netbeans system. In Netbeans the distinction between main and patching releases is fuzzier than in Eclipse; moreover, there are various main releases (MR) which are not followed by classic patching releases (PR).

A comparison of Tables III and IV shows that the correlation values between metrics and number of bugs are usually lower in Netbeans than in Eclipse.
However, in both systems, LOCS and RFC are the two metrics most correlated with CU faultiness, while LCOM shows, in both cases, a weak correlation with CU faultiness.

These results show that:
• Given a release, there exist metrics that are more correlated with CU faultiness than others;
• Considering all releases, no single CK metric is the most correlated in every release;
• Given a metric, its correlation with the number of bugs changes from release to release.

We also analyzed the evolution of the metrics between two consecutive releases. To this purpose we define different types of CUs, distinguishing among updated, unmodified, and newly introduced ones, and defining these types with respect to each of the different metrics.

In particular, given a release R_i, the next release R_{i+1}, and a metric M, we classified the compilation units into three categories:
• CU.X is the set of compilation units where metric M does not change between R_i and R_{i+1};
• CU.U is the set of compilation units where metric M changes (Updated);
• CU.A is the set of compilation units that exist in R_{i+1} but not in R_i (Added).

It must be pointed out that the U and X categories are defined relative to a specific metric. A CU might exhibit a change in metric M but not in metric M' between the releases R_i and R_{i+1}. Thus, it will belong to class CU.U for M, and to class CU.X for M'. This case is not common, but it is definitely possible. CU.A is defined regardless of any metric M, since it refers to CUs just introduced in the new release. There are also CUs existing in release R_i but not in release R_{i+1}.
These deleted CUs are not considered in our study.

Given the sets of compilation units belonging to the three categories CU.U, CU.X, and CU.A, we compute:
• the fraction of compilation units affected by bugs, which provides an infection probability;
• the average number of bugs of the infected compilation units.

In Table V we show the probability of being infected for CUs belonging to each of the families U, X and A, across various release transitions.

The probability that a CU belonging to family CU.U is infected is between 0.6 and 0.7 in Eclipse. This means that there is a high probability that changing the LOCS, CBO, or LCOM metrics of a CU from one release to the next results in injecting at least one error into the compilation unit. This result confirms the study of Purushothaman et al. [2], which highlighted that code correction for defects often introduces new defects. The CUs added to the system in the transition from R_i to R_{i+1} also show a high probability of being infected, clearly larger than for unmodified CUs (set CU.X), and slightly smaller than for the set CU.U. Similar results were obtained for all other metrics.

On the contrary, if the metric does not change there is a low probability that a CU is affected by bugs. These bugs clearly refer to bugs already present in R_i that were found only when checking release R_{i+1}.

In order to support our findings about the marked differences among the CU.U, CU.X and CU.A families, we performed chi-square significance tests. We formulate the following null hypothesis: "the subdivision of CUs into U, X and A does not significantly influence the number of infected [...]

[...] χ² values have a confidence level larger than 99.9 percent (the confidence level is actually much larger).
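The classification and the test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in the study: each release is represented by a hypothetical dictionary mapping a CU name to the value of one metric, `bugs` maps a CU name to its number of bugs, and the chi-square statistic is computed on the 3x2 contingency table "family versus infected or clean". For such a table there are 2 degrees of freedom, for which the tail probability is exactly exp(-χ²/2).

```python
import math


def classify(metric_old, metric_new):
    """Split the CUs of release R_{i+1} into X (metric unchanged), U (changed), A (added)."""
    families = {"X": set(), "U": set(), "A": set()}
    for cu, value in metric_new.items():
        if cu not in metric_old:
            families["A"].add(cu)
        elif metric_old[cu] == value:
            families["X"].add(cu)
        else:
            families["U"].add(cu)
    return families  # CUs present only in metric_old (deleted) are ignored


def infection_probability(family, bugs):
    """Fraction of CUs in the family that have at least one bug."""
    if not family:
        return 0.0
    return sum(1 for cu in family if bugs.get(cu, 0) > 0) / len(family)


def chi_square(families, bugs):
    """Chi-square statistic and p-value for the table family x (infected, clean)."""
    rows = []
    for name in ("X", "U", "A"):
        infected = sum(1 for cu in families[name] if bugs.get(cu, 0) > 0)
        rows.append((infected, len(families[name]) - infected))
    col = (sum(a for a, _ in rows), sum(b for _, b in rows))
    if min(col) == 0 or any(a + b == 0 for a, b in rows):
        raise ValueError("every family and both outcomes must be non-empty")
    total = col[0] + col[1]
    stat = 0.0
    for a, b in rows:
        for observed, col_total in ((a, col[0]), (b, col[1])):
            expected = (a + b) * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2)  # exact tail probability for 2 degrees of freedom


# Toy data (hypothetical, far too small for a meaningful test).
loc_old = {"A.java": 10, "B.java": 20, "C.java": 5, "D.java": 7}
loc_new = {"A.java": 10, "B.java": 25, "C.java": 5, "E.java": 3, "F.java": 8}
bugs = {"B.java": 2, "C.java": 1, "E.java": 1}

fam = classify(loc_old, loc_new)
for name in ("U", "A", "X"):
    print(name, infection_probability(fam[name], bugs))
print(chi_square(fam, bugs))
```

Because membership in U and X depends on the metric, `classify` has to be run once per metric, while the A family is the same for all of them.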
Therefore we can reject the null hypothesis at a confidence level greater than 99.9%, and confirm that our classification of CUs into families provides significant correlations with the presence of bugs.

In Table VI we report the average number of bugs of the infected CUs. These data confirm that the infected CUs of type U and A have a larger average number of bugs than the compilation units of type X. Note also that, on average, more than one bug is found during a release lifespan even in the CUs that are not changed in the release.

Table VI: Average number of bugs of the bug-affected CUs between two consecutive releases (shown in the top row), for different families, relative to different metrics in Eclipse.

                   Subsequent releases
Metric   Set       2.1.3-3.0   3.0.2-3.1   3.2.2-3.3
LOC      CU.U      4.02        3.16        2.61
         CU.X      1.38        1.22        1.29
CBO      CU.U      3.92        3.88        3.03
         CU.X      2.36        1.86        1.8
LCOM     CU.U      4.34        3.58        2.95
         CU.X      2           2.64        1.66
         CU.A      3.2         2.73        2.51

Thus, in general, irrespective of the metric, we have:
• CU.U infection probability is around 60-70%;
• CU.A infection probability is around 50-60%;
• CU.X infection probability is around 10-30%.

[...]

• the most infected CUs, in both projects, are updated CUs; infection probability values are almost 70% in both systems;
• CUs belonging to the CU.A set exhibit in general a slightly smaller infection probability than those of the CU.U set;
• CUs belonging to the CU.X set are much less infected than CUs belonging to CU.A, and their probability of being hit by a bug never exceeds 30%;
• usually, updated CUs have more bugs than the others; this is always true in Eclipse, whereas it is almost always true in Netbeans;
• in Eclipse, the mean number of bugs of the CU.U sets is often higher than in Netbeans, whereas the opposite holds for the CU.A set.

One of the main differences between the Eclipse and Netbeans projects is the clear subdivision between patching releases and main releases. In Eclipse it is simple to verify that each main release X.0 is always followed by patching releases of type X.0.1, X.0.2, and so on.
This distinction is weaker in Netbeans, and this seems to affect the variation of its statistics.

For the family of updated compilation units (CU.U), we calculated the correlation between the fractional change of some metrics, passing from release R_i to release R_{i+1}, and the number of bugs in R_{i+1}. We were interested in determining whether and how the growth of a metric is associated with an increase in the number of bugs.

In Tables IX and X we report this correlation for the Eclipse and Netbeans projects.

Table IX: Pearson correlation between metric changes and number of defects in the subsequent release in Eclipse.

                     Subsequent releases
Metric variation     2.1.3-3.0   3.0.2-3.1   3.2.2-3.3
ΔCBO-bugs            0.37        0.58        0.49
ΔLOCS-bugs           0.29        0.64        0.53
ΔRFC-bugs            0.39        0.56        0.51
ΔLCOM-bugs           0.33        0.32        0.49

All data in Tables IX and X show positive correlations. Correlation values are quite similar for the same pair of subsequent releases, whereas they show larger fluctuations for different metrics. A comparison of Tables IX and X shows that correlation values in Netbeans are often lower than in Eclipse. This result may be partially due to the less clear subdivision between main and patching releases in the Netbeans project.
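The correlation reported in Tables IX and X can be sketched as follows, under the assumption (made here for illustration, since the exact definition is not restated above) that the fractional change of a metric M is (M_{i+1} - M_i) / M_i. The dictionaries, function names and sample values are hypothetical; CUs whose metric is unchanged, absent from R_i, or zero in R_i are skipped, so that only CU.U members with a well-defined ratio enter the correlation.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fractional_changes(metric_old, metric_new):
    """Map each updated CU to (new - old) / old for the given metric."""
    changes = {}
    for cu, new in metric_new.items():
        old = metric_old.get(cu)
        if old is None or old == 0 or old == new:
            continue  # added CU, undefined ratio, or unchanged metric (CU.X)
        changes[cu] = (new - old) / old
    return changes


def change_bug_correlation(metric_old, metric_new, bugs_new):
    """Correlate the fractional metric change of CU.U with the bugs found in R_{i+1}."""
    changes = fractional_changes(metric_old, metric_new)
    cus = sorted(changes)
    return pearson([changes[c] for c in cus], [bugs_new.get(c, 0) for c in cus])


# Hypothetical CBO values in two consecutive releases and bugs in the newer one.
cbo_old = {"A.java": 10, "B.java": 4, "C.java": 8, "D.java": 6}
cbo_new = {"A.java": 15, "B.java": 5, "C.java": 8, "D.java": 12, "E.java": 3}
bugs_new = {"A.java": 2, "B.java": 1, "D.java": 5}

print(round(change_bug_correlation(cbo_old, cbo_new, bugs_new), 2))
```

A positive value, as in all entries of Table IX, means that CUs whose metric grew more between the two releases tend to carry more bugs in the later release.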
Figure 8: The CCDF of the numberof CUs associated to each bug inNetbeans 3.4.The value of the exponent γ of the distributions of the number of CUs affected by a bug isconsistently between 2.2 and 2.9 in all considered releases, for both Eclipse and Netbeans,meaning an ever “fatter” tail of this distribution with respect to the previously studieddistribution of bugs per CU.The finding that the distribution of bugs across CUs satisfies a power-law, may suggest amodel for the introduction and the spread of bugs in the software system. We already specifiedthat, in our investigation, we name “bug” each numerical identifier found in the repositoryassociated to software “fixing”. Thus, generally speaking, a bug reported in a CU meansthat such a CU needed to be partially modified owing to this bug. Now, let us consider thegraph structure of the software system. We, and many other authors in literature, verified anorganized structure of such graphs, exibiting power-law distributions for many properties ofthe system. In particular, there are nodes linked with many other nodes, playing the role of”hubs” of the system. For example, there are few CUs with a large number of in-links, meaningthat they are extensively used by other CUs. If a bug hits such CUs, namely, the CU codeneed modifications, it is very likely that also the code of CUs linked to that node need to bemodified. Such mechanism may generate a sort of defect propagation in the software graph,very similar to the spread of a contagious disease. The system gets infected by bugs, and asingle bug may affect many different CUs, if it propagates from a hub node. On the contrary,bugs in CUs with very few links will likely remain confined to a small number of CUs.Our heuristic conclusion is that the power-laws observed for the bug distribution is probablydue to the scale-free structure of the software graph. 
Bugs propagate inside a constrainingframework, which determines their diffusion across the software system.From the software engineering point of view, the usefulness of finding power-laws in the tail ofthe bugs distribution, may be illustrated following the reasoning of Louridas et al. [14]. Once0 :0–0 it is shown that bugs distribution across CUs is in the form of a power-law, CUs in the tailmay be identified as the most fault-prone. Thus, after the issue of a new release, the inspectionof CUs for bug detection may take advantage of this information. For instance, an inspectionof the highest 5 % ranked CUs would imply the inspection of a high percentage of bugs, werethe exact percentages is related to the power-law exponent. We analyzed, for each version of the system, the correlations between the considered softwaremetrics and the number of bugs. This information may be used to understand, from themeasure of the metric, which parts of the software are most affected by faults, and to devisethe possible strategies to apply during software development in order to control metrics values,with the goal of reducing bug introduction.Our analysis started computing, for various releases R i of the system, the linear correlationbetween a particular CK metric and the number of bugs of the same CUs. This is only apreliminary analysis in order to identify which CU metrics are more related to fault proneness.We recall that developers distinguish between ”main” and ”patching” releases, and thatchanges from a main release to the next are usually relevant also regarding metrics.In the first part of our study we referred to the main releases. In the Eclipse project mainreleases are identified by two-digit numbers, that is: Eclipse 2.1, Eclipse 3.0, Eclipse 3.1, Eclipse3.2, and Eclipse 3.3. 
We analyzed what can be deduced about bugs from the analysis of the software metrics for this kind of release.

Table III shows the correlations between metrics and bugs for the main releases of Eclipse. The metrics showing the highest correlation with bugs are those taking into account the number of dependencies with other CUs, namely CBO and RFC. This fact highlights the importance of an analysis of a software system as a graph. The out-links metric is less correlated with bugs than CBO and RFC. The out-links metric includes not only dependency relationships, but also inheritance and implements relationships. The lower correlation of this metric with bugs may be interpreted as a greater ability of dependency relationships to propagate bugs, compared with the other relationships.

Table III: Pearson correlations between metrics and bugs for some releases of Eclipse.

                 2.1    3.0    3.1    3.2    3.3
bugs-LOCS        0.49   0.57   0.54   0.58   0.48
bugs-CBO         0.55   0.53   0.55   0.55   0.42
bugs-RFC         0.59   0.48   0.44   0.56   0.45
bugs-WMC         0.48   0.45   0.38   0.48   0.40
bugs-LCOM        0.30   0.21   0.15   0.34   0.24
bugs-inlinks     0.1    0.17   0.25   0.28   0.24
bugs-outlinks    0.47   0.38   0.40   0.55   0.42

[...] R_i of the Netbeans system. In Netbeans the distinction between main and patching releases is fuzzier than in Eclipse; moreover, there are various main releases (MR) which are not followed by classic patching releases (PR).

A comparison of Tables III and IV shows that the Netbeans correlation values between metrics and number of bugs are usually lower than in Eclipse.
However, in both systems, LOCS and RFC are the two metrics most correlated with CU fault-proneness, while LCOM shows, in both cases, a weak correlation with CU fault-proneness.

These results show that:
• Given a release, there exist metrics that are more correlated with CU fault-proneness than others;
• Considering all releases, there is no single CK metric which is the most correlated in every release;
• Given a metric, its correlation with the number of bugs changes from release to release.

We also analyzed the evolution of the metrics between two consecutive releases. To this purpose we define different types of CUs, distinguishing among updated, unmodified, and newly introduced, and defining all these types with respect to all the different metrics.

In particular, given a release R_i, the next release R_{i+1}, and a metric M, we classified the compilation units into the following categories:
• CU.X is the set of compilation units where metric M does not change between R_i and R_{i+1};
• CU.U is the set of compilation units where metric M changes (Updated);
• CU.A is the set of compilation units that exist in R_{i+1} but not in R_i (Added).

It must be pointed out that the U and X categories are defined relative to a specific metric. A CU might exhibit a change in metric M but not in metric M’ between the releases R_i and R_{i+1}. Thus, it will belong to class CU.U for M, and to class CU.X for M’. This case is not common, but it is definitely possible. CU.A is defined regardless of any metric M, since it refers to CUs just introduced in the new release. There are also CUs existing in release R_i but not in release R_{i+1}.
These deleted CUs are not considered in our study.

Given the set of compilation units belonging to the three categories CU.U, CU.X, and CU.A, we compute:
• the fraction of compilation units affected by bugs, which provides an infection probability;
• the average number of bugs of the infected compilation units.

In Table V we show the probability for CUs belonging to one of the families U, X and A of being infected, for various transitions between releases.

The probability that a CU belonging to family CU.U is infected is between 0.6 and 0.7 in Eclipse. This means that there is a high probability that changing the LOCS, CBO, or LCOM metrics of a CU from one release to the next results in injecting at least one error into the compilation unit. This result confirms the study by Purushothaman et al. [2], which highlighted that code correction for defects often introduces new defects. Also the CUs added to the system, in the transition from R_i to R_{i+1}, show a high probability of being infected, clearly larger than for the case of CUs not modified (set CU.X), and slightly smaller than for the set CU.U. Similar results were obtained also for all other metrics.

On the contrary, if the metric does not change there is a low probability that a CU is affected by bugs. These bugs clearly refer to bugs already present in R_i but that were found only when checking the R_{i+1} release.

In order to support our findings about the deep differences among the CU.U, CU.X and CU.A families, we performed chi-square significance tests. We formulate the following null hypothesis: “the subdivision of CUs into U, X and A does not significantly influence the number of infected [...]

[...] χ² values have a confidence level larger than 99.9 percent (the confidence level is actually much larger).
Therefore we can reject the null hypothesis at a confidence level greater than 99.9%, and confirm that our classification of CUs into families provides significant correlations with the presence of bugs.

In Table VI we report the average number of bugs of the infected CUs. These data confirm that the infected CUs of type U and A have an average number of bugs larger than the compilation units of type X. Note also that, on average, more than one bug is found during a release lifespan even in the CUs that are not changed in the release.

Table VI: Average number of bugs of the bug-affected CUs between two consecutive releases (shown in the top row), for different families, relative to different metrics in Eclipse.

                 Subsequent releases
Metric   Set     2.1.3-3.0   3.0.2-3.1   3.2.2-3.3
LOC      CU.U    4.02        3.16        2.61
         CU.X    1.38        1.22        1.29
CBO      CU.U    3.92        3.88        3.03
         CU.X    2.36        1.86        1.8
LCOM     CU.U    4.34        3.58        2.95
         CU.X    2           2.64        1.66
-        CU.A    3.2         2.73        2.51

Thus, in general, irrespective of the metric, we have:
• CU.U infection probability is around 60-70%;
• CU.A infection probability is around 50-60%;
• CU.X infection probability is around 10-30%.

[...]

• the most infected CUs, in both projects, are updated CUs; infection probability values are almost 70% in both systems;
• CUs belonging to the CU.A set exhibit in general a slightly smaller infection probability than the CU.U set;
• CUs belonging to the CU.X set are much less infected than CUs belonging to CU.A, and never exceed a 30% probability of being hit by a bug;
• usually, updated CUs have more bugs than others; this is always true in Eclipse, whereas it is almost always true in Netbeans;
• in Eclipse, the mean number of bugs of the CU.U sets is often higher than in Netbeans, whereas the opposite holds for the CU.A set.

One of the main differences between the Eclipse and Netbeans projects is the clear subdivision between patching releases and main releases. In Eclipse it is simple to verify that each main release X.0 is always followed by patching releases, of type X.0.1, X.0.2, and so on.
This distinction is weaker in Netbeans, and this seems to affect the variation of its statistics.

For the family of updated compilation units (CU.U), we calculated the correlation between the fractional change of some metrics, passing from release R_i to release R_{i+1}, and the number of bugs in R_{i+1}. We were interested in determining if and how the growth of a metric is possibly associated with an increase in the number of bugs.

In Tables IX and X we report this correlation for the Eclipse and Netbeans projects.

Table IX: Pearson correlation between metric changes and number of defects in the subsequent release in Eclipse.

                    Subsequent releases
Metric variation    2.1.3-3.0   3.0.2-3.1   3.2.2-3.3
∆CBO-bugs           0.37        0.58        0.49
∆LOCS-bugs          0.29        0.64        0.53
∆RFC-bugs           0.39        0.56        0.51
∆LCOM-bugs          0.33        0.32        0.49

All data in Tables IX and X show positive correlations. Correlation values are quite similar for the same pair of subsequent releases, whereas they show larger fluctuations for different metrics. A comparison of Tables IX and X shows that correlation values in Netbeans are often lower than in Eclipse. This result can be partially due to the less clear subdivision between main and patching releases in the Netbeans project.
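The family-based analysis of this section (classification into CU.U, CU.X and CU.A, infection probability, average number of bugs, and chi-square test) can be sketched as follows. It is a minimal sketch under stated assumptions rather than the scripts used for the study: the function names, file names and numbers are hypothetical, and the chi-square test is assumed to be run on the 3x2 table of families against infected or clean CUs, which has two degrees of freedom.

```python
import math


def classify(metric_old, metric_new):
    """Split the CUs of release R_{i+1} into families U, X and A for one metric."""
    families = {"U": set(), "X": set(), "A": set()}
    for cu, value in metric_new.items():
        if cu not in metric_old:
            families["A"].add(cu)      # added in the new release
        elif metric_old[cu] != value:
            families["U"].add(cu)      # metric changed (updated)
        else:
            families["X"].add(cu)      # metric unchanged
    return families


def infection_stats(family, bugs):
    """Return (fraction of infected CUs, average bugs per infected CU)."""
    infected = [bugs[cu] for cu in family if bugs.get(cu, 0) > 0]
    if not family or not infected:
        return 0.0, 0.0
    return len(infected) / len(family), sum(infected) / len(infected)


def chi_square(families, bugs):
    """Chi-square statistic and p-value for the table family x (infected, clean)."""
    rows = []
    for cus in families.values():
        infected = sum(1 for cu in cus if bugs.get(cu, 0) > 0)
        rows.append((infected, len(cus) - infected))
    total = sum(a + b for a, b in rows)
    col_totals = (sum(a for a, _ in rows), sum(b for _, b in rows))
    stat = 0.0
    for row in rows:
        for observed, col_total in zip(row, col_totals):
            expected = sum(row) * col_total / total
            if expected == 0:
                raise ValueError("empty row or column: the test is undefined")
            stat += (observed - expected) ** 2 / expected
    # Three families and two outcomes give 2 degrees of freedom, for which
    # the chi-square survival function is exactly exp(-x / 2).
    return stat, math.exp(-stat / 2)


# Hypothetical LOC values in two consecutive releases and bugs found in the newer one.
loc_old = {"A.java": 100, "B.java": 50, "C.java": 70, "E.java": 10}
loc_new = {"A.java": 120, "B.java": 50, "C.java": 70, "D.java": 30}
bugs = {"A.java": 3, "D.java": 1}

families = classify(loc_old, loc_new)
for name, cus in families.items():
    print(name, infection_stats(cus, bugs))
print(chi_square(families, bugs))
```

CUs present only in the older release (E.java here) fall into no family, matching the choice to ignore deleted CUs. With counts as small as in this toy data the chi-square approximation is not reliable; the example only shows the mechanics.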
4. CONCLUSION
A statistical description of large software systems as directed graphs can provide much additional information on the system features with respect to more traditional approaches, from the software engineering perspective. Adopting a graph as a model for the software system, we used the compilation units as the basic software module in order to build a software graph, and redefined the CK suite of metrics to cope with CUs. These metrics were then used to investigate, with a statistical analysis, how and where bugs were introduced into two large OO software projects, Eclipse and Netbeans. We wrote two different parsers to analyze the CVS log file and the issue tracker repositories in order to automatically associate bugs and CUs. In this paper, we introduced the concept of compilation unit graph, and of OO metrics related to compilation units, with the purpose of analyzing software projects managed using a configuration management system and a corresponding bug tracking system.

The picture of the software system as a graph allowed us to detect fat-tail distributions, well described by power-laws, for different features of the system, suggesting the same general underlying framework of many other complex networks. In particular, we found that the bug distribution among CUs, the number of CUs affected by bugs, and the metrics distributions (namely LOCs, number of in-links and out-links of the class graph, and the CK metrics WMC, CBO, RFC and LCOM) all exhibit power-law fat tails.

Inside this framework it is possible to identify strong correlations between bugs and those metrics related to the number of external dependencies which, in the graph representation, are easily described as directed links.
All these findings together indicate a possible strategy to optimize resources and efforts in software engineering for finding, forecasting, and fixing software defects. Once the software graph reveals the fat tail in the relationships between bugs and CUs, one may identify which parts of the software are the most fault-prone and focus fixing efforts on them. Following [14], if one ranks CUs according to these power-laws, the review of a small fraction among the highest ranked may have an exponential impact on the overall amount of software defects detectable and fixable.
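The ranking strategy can be illustrated with a short sketch that measures which share of all recorded bugs falls in the highest-ranked fraction of CUs. It is a hypothetical illustration only: the helper bug_share_of_top and the bug counts are invented, and ranking by past bug count is an assumption made for the example.

```python
import math


def bug_share_of_top(bug_counts, fraction):
    """Share of all bugs found in the top `fraction` of CUs ranked by bug count."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(bug_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Always inspect at least one CU, rounding the fraction up.
    top_k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:top_k]) / total


# Hypothetical, heavily skewed bug counts for 20 CUs.
bug_counts = [40, 12, 6, 3, 2, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

for fraction in (0.05, 0.10, 0.25):
    print(fraction, round(bug_share_of_top(bug_counts, fraction), 2))
```

The more skewed the distribution of bugs per CU, the larger the share covered by a given fraction, which is why the fat tail matters when planning inspections.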
REFERENCES
1. Chidamber S.R, Darcy D.P, Kemerer C.F. Managerial Use of Metrics for Object Oriented Software: An Exploratory Analysis. IEEE Trans. Software Eng., vol. 24, no. 8, pp. 629-639, 1998.
2. Purushothaman R, Perry D.E. Toward Understanding the Rhetoric of Small Source Code Changes. IEEE Trans. Software Eng., vol. 31, no. 6, June 2005.
3. Kim S, Pan K, Whitehead E.J. Jr. Micro Pattern Evolution. MSR’06, May 22-23, 2006, Shanghai, China.
4. Śliwerski J, Zimmermann T, Zeller A. When do changes induce fixes? Proc. International Workshop on Mining Software Repositories (MSR’05), St. Louis, Missouri, U.S., May 2005.
5. Zimmermann T, Nagappan N. Predicting Defects using Network Analysis on Dependency Graphs. ICSE’08, May 10-18, 2008, Leipzig, Germany.
6. Andersson C, Runeson P. A Replicated Quantitative Analysis of Fault Distributions in Complex Software Systems. IEEE Trans. Software Eng., vol. 33, no. 5, May 2007, pp. 273-286.
7. Zhang H. On the Distribution of Software Faults [...]
[...]
10. [...] Analyzing and relating bug report data for feature tracking. In Proc. 10th Working Conference on Reverse Engineering (WCRE’03), Victoria, British Columbia, Canada, Nov. 2003. IEEE.
11. Concas G, Marchesi M, Pinna S, Serra N. Power-Laws in a Large Object-Oriented Software System. IEEE Trans. Software Eng., vol. 33, no. 10, pp. 687-708, 2007.
12. Juran J.M, Gryna F.M. Jr. Quality Control Handbook, fourth ed. McGraw-Hill, 1988.
13. Barabasi A, Albert R. Emergence of Scaling in Random Networks. Science, vol. 286, pp. 509-512, 1999.
14. Louridas P, Spinellis D, Vlachos V. Power Laws in Software. ACM Transactions on Software Engineering and Methodology, vol. 18, no. 1, September 2008.
15. Newman M.E.J. Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, vol. 46, pp. 323-351, 2005.