Code forking in open-source software: a requirements perspective
Neil A. Ernst, Steve Easterbrook and John Mylopoulos
Department of Computer Science, University of Toronto, Toronto, Canada
{nernst,sme,jm}@cs.utoronto.ca

Summary.
To fork a project is to copy the existing code base and move in a direction different from that of the erstwhile project leadership. Forking provides a rapid way to address new requirements by adapting an existing solution. However, it can also create a plethora of similar tools, and fragment the developer community. Hence, it is not always clear whether forking is the right strategy. In this paper, we describe a mixed-methods exploratory case study that investigated the process of forking a project. The study concerned the forking of an open-source tool for managing software projects, Trac. Trac was forked to address differing requirements in an academic setting. The paper makes two contributions to our understanding of code forking. First, our exploratory study generated several theories about code forking in open source projects, for further research. Second, we investigated one of these theories in depth, via a quantitative study. We conjectured that the features of the OSS forking process would allow new requirements to be addressed. We show that the forking process in this case was successful at fulfilling the new project's requirements.
Traditionally, requirements analysis constitutes the initial phase of a software development project, and implementation the last. But what happens when a codebase already exists and serves as the starting point of software development? We study this question in the context of open source software. Open-source software (OSS) has proved its worth. OSS runs some of the largest web sites in the world (Apache), complex corporate servers (Linux/BSD), and is comparable to proprietary desktop application suites (OpenOffice) [1]. However, there remains a perception that OSS is not high-quality, and adoption of OSS in corporate IT environments is still low. Goode [2], in his survey of Australian corporations, discusses several reasons for this. They include a perceived lack of support, hidden installation costs, and finally, a belief that corporate IT requirements are not met by any existing OSS tools.

To fork a project is to copy the existing code base and move in a direction different from that of the erstwhile project leadership. Forking the codebase allows developers to leverage existing functionality while also addressing new requirements. Although flexible, forking has inherent difficulties, such as maintenance, evolution, and social factors concerning the development community. This paper looks at the form these take. To study the problem and evaluate potential solutions we focus on a software project management tool, Trac. Trac's original target audience was developers in a corporate environment; the tool was later released more widely. In 2005 the tool was forked to support a project that targeted undergraduate programming projects. This process, and its implications, are the subject of this research.

We followed a two-phase, mixed-methods sequential approach [3] examining the interaction of forking and requirements. In the exploratory phase (Section 1.4.2), we qualitatively examined the nature of OSS, requirements, and forking by conducting surveys and interviews of participants from both the original and the forked project. Using grounded theory [4] methodology, we generated several theories about the problem.

We then selected one theory and applied confirmatory analysis in the second phase. In this primarily quantitative segment, we tested whether some of the differing requirements that motivated the fork were met. We analyzed this using two non-functional requirements. We first used a variety of code metrics on the two codebases (Section 1.5) to assess maintainability. Second, we used qualitative usability analyses to assess interface complexity, or usability.

We designed this study as an exploratory one because there was little detailed research on the forking of OSS to meet differing requirements. In this way we generate suitable research hypotheses for further testing. The second, more quantitative portion of the research examines whether code forking can successfully address certain new requirements. In some respects, as we argue in Section 1.6, many of the theories we initially generated are best suited to qualitative or case study analysis.

Our contribution is two-fold. One, we describe some of the process issues that arise when forking OSS. We provide theory patterns derived from the data that are a step towards more comprehensive theories about this process.
Two, we describe a case study on a product of a fork, DrProject. In this case study we examined how well the forked tool could be adapted to meet new requirements that did not exist in the original.
We begin with a brief introduction to the tools involved. To provide context, we then briefly review the literature. This research adds to the existing literature because few people have looked at open-source software from a requirements perspective (requirements are rarely mentioned in OSS; often a project is initiated due to a perceived need). As well, there has been little work on the process of forking.
DrProject and Trac
The context of this study is the evolution of a software project management tool named Trac (http://projects.edgewall.com/trac). Trac is currently (as of January 2006) in version 0.9.3 and supports, among other things, Subversion repository integration, Wiki-enabled web pages, tickets and bug filing (a ticket is a to-do item assignable to project members). Trac is developed as an open-source project, under the leadership of two developers at Edgewall Software.

In January of 2005 two faculty members at the University of Toronto (not involved in conducting this research) decided to fork Trac in order to support undergraduate programming. Trac was chosen, according to interviews with the proponents, for three reasons: 1) it was written in a familiar language (Python); 2) it had simple technology (CGI); 3) it had a small codebase and a clean UI. The result of the fork, DrProject (http://pyre.third-bit.com/drproject), is the latest in a series of attempts to develop software to support student programming projects at the undergraduate level. Educators at the University of Toronto believed graduating students were leaving unaware of the importance of bug reports and version control. They determined that a tool should be developed to address that pedagogical objective (notwithstanding reports that a large number of corporate software developers were equally unfamiliar with the tools). There have been a number of previous attempts to develop such a tool at the University of Toronto (refer to Fig. 1.1). This latest iteration has been the most successful, with adoption by two separate courses, a self-hosted development project, and interest from academic research labs in the Department.

We elicited the key requirements through interviews with the proponents. Functional requirements include, in addition to what Trac offers, multiple-project support, test suite integration, and mailing list management. Non-functional requirements are for a codebase that is maintainable by students, and an interface that contains just those features that are necessary for short-duration software projects. These 'early' requirements are presented in the goal graph of Fig. 1.2.
Fig. 1.1. A short history of the incarnations of DrProject. Letters indicate school terms (Fall, Winter, Summer).
Fig. 1.2. Goal decomposition for DrProject. Notation used is based on the NFR framework [5]. Goals addressed by the existing Trac tool are in italics. Other goals were extracted from qualitative studies conducted as part of this research.
Research into the nature of open-source software has spawned a large degree of interest from a variety of disciplines, including economics, management, sociology, and software engineering [1, 6]. Source code accessibility opened the door to rich stores of data on software evolution, development styles, and community building, e.g., [7, 8]. There seems to be scant empirical work on the nature of forking in OSS, beyond discussions of the psychological and sociological issues at play (for example, motivations for contributing to an OSS project). Most literature mentions forking as a pitfall, or something to be avoided, e.g., [9]. There is mention of the process in several books, but it is mostly conjecture or anecdotal in nature, e.g., [10]. One useful distinction, albeit conjectured, is presented by Fogel [11]. He distinguishes between forks that happen due to fundamental differences in project goals (as is the case here), and those which are custody battles over the existing project.

In this paper, we use the term to refer to situations where code from one project is duplicated in a different repository, and significant new requirements are added to the new project. This requirements-centric definition excludes downloaded code that receives minor alterations (e.g., a research tool for gene expression data). We are most interested in code that is altered to address new requirements. That is clearly the case in this project.

There has been some research on requirements in OSS. Scacchi [12] looks at how requirements are handled in different OSS projects and provides a good look at differences between OSS communities and other models. For example, he makes the observation that in OSS, requirements are tightly integrated with tool design and implementation. Scacchi et al. [13] expand on the earlier paper with an ethnographic analysis of the requirements process in the NetBeans product.
Our work differs in its focus on adapting existing tools to new requirements, rather than creating new tools based on (possibly emergent) requirements. Regarding the question of adapting OSS for corporate requirements, Bonaccorsi and Rossi [14] present a survey that suggests most corporate users of OSS adapt the tool to meet customer requirements without returning their contributions to the original project. Adams et al. [15] take a similar tack, but they describe how they adapt a tool using a plugin architecture, rather than working at the internal level of design, as was the case here. Finally, similar themes appear in fairly extensive research into COTS integration with project requirements, for example, [16, 17]. The main difference with OSS seems to be the ability to a) leverage an existing community of enthusiasts, and b) gain full access to the codebase.

Forking is relatively uncommon in open-source projects, perceptions notwithstanding. Two primary reasons are the open, accessible nature of OSS licences, and the social cost of splitting from an existing project. An example illustrates the first reason [18]. The gcc compiler team had decided not to support particular optimizations; another team forked the codebase, then went ahead and implemented the optimizations. This alternative proved popular, and the resulting project, egcs, was eventually merged back into the gcc tree. The second reason, social support, detailed in Raymond ([10], pp. 84-87), is more common. Once a community has developed around a tool, it is difficult to create a variant of that tool and maintain the critical mass needed to keep the quality high. These concerns also appear in the qualitative studies we conducted.
Another important context for this research is the evolution of requirements. Understanding one's requirements is central to understanding what features a software system should implement. Where those requirements overlap with existing projects, the opportunity for a fork exists. Forking software to meet new requirements is equivalent to evolving an existing software product.

For DrProject, there was no initial explicit requirements analysis. Rather, as one of the DrProject proponents mentioned, a "chasing tail-lights" approach was used. This means identifying a product that does something similar, then molding your tool after that. In this sense, many OSS projects can be seen as 'forks' from existing tools, in ideas if not actual code.

There are three possible avenues to satisfy the identified requirements (regardless of how they are identified or specified). One, the proponents can create an entirely new project from scratch. This might include the use of commercial off-the-shelf products. This was tried, unsuccessfully, with earlier ancestors of DrProject. Two, the proponents can identify existing tools that partially satisfy the requirements, and work within those communities to fill the unsatisfied requirements. To be successful, such an approach involves convincing a community of the merits of your needs. In the case of DrProject, the proponents felt this was unlikely, given the different goals they had. Finally, the proponents in OSS projects can fork the code and adapt it on their own to create a variant that meets their needs. This last choice was made for DrProject. Georgas, Gorlick and Taylor [19, p. 3] describe why this last choice makes sense: "In the world of open-source software development, the chances are excellent that someone else, somewhere else has already solved your problem, in which case, it is a foolish waste of effort to solve it again".
The initial review of the literature produced little in the way of clear theories about forking and open-source software. To address this, we chose a mixed-methods research framework for the project. By way of introduction to this approach, we highlight some of the discussion regarding the methodology from Creswell's book on research design [3].

Mixed-methods approaches combine both qualitative and quantitative research methods. This combination can occur either in parallel or in sequence. Primarily this is done to combine the best of both approaches: the open-ended, generative nature of qualitative research and the confirmatory nature of quantitative studies. Mixed-methods research techniques arose from research in psychology, and an interest in mixing different data sets. The approach is growing in popularity in sciences which straddle physical/social boundaries, such as psychology, economics, geography, and, we would argue, requirements engineering, with its strong human-centered focus. Mixed-methods research is a methodology with a strong pragmatic motivation. This can be contrasted with the constructivist approach common to many qualitative studies, and the post-positivist, rationalist approach common to quantitative studies. Pragmatism focuses on real-world implications and consequences of action [3].

The challenges of mixed-methods research include the need to understand both quantitative and qualitative procedures, and the longer time frames possible (particularly in sequential studies, where data from one inquiry, such as the qualitative work here, inform the next stage). However, we found that the qualitative inquiry successfully narrowed and informed both the choice of quantitative method and the questions to focus on. The approach works well when there is little existing research from which to form well-grounded theories about a problem. Figure 1.3 illustrates the combined approach we took.
This phase was a qualitative inquiry into the nature of the process of forking in open-source software.The results of this inquiry are questions, stated as theories, about this process.
We gathered data from both groups involved in the fork. We used a survey to gather responses from the developers most involved with the Trac project (questions can be found at http://neilernst.net/docs/trac-survey.txt). The survey was emailed to five individuals; by the deadline, three had responded. All three were senior developers, although not the 'gatekeepers', for the Trac project. For the DrProject team, we interviewed both leaders in person, transcribing the interview from the audio file. We also examined mailing list records for each project.

Having gathered this dataset, we then used grounded theory methods of open, axial, and selective coding to generate concepts (theories) regarding the problem. 'Coding' a set of data involves establishing commonalities that emerge (open coding); grouping these commonalities, or concepts, into higher-order clusters (axial coding); and finally, establishing potential relationships among the concepts (selective coding) [4]. To do this, we read through the various source materials (emails and transcripts), highlighting terms and ideas that seemed salient. Those which appeared in a majority of sources we used as codes. These codes are discussed at length in Section 1.4.2. Note, we used all the interviews to generate the codes. Our aim was to generate codes grounded in both sets of experiences. Codes in this context can be seen as dimensions, or themes, regarding the process of forking and open-source software.

Fig. 1.3. Process model of the two phases of the research design. Phase 1 is a qualitative theory-generating step about the process of forking. Phase 2 takes one theory about this process and applies it to a product of the process.
Fig. 1.4. Relationships and codes used in theory construction.
Figure 1.4 illustrates the six codes/concepts that we extracted from the data. Again, these represent common themes presented in the data. After deriving them, we returned to the literature to confirm, as best as possible, that the dimensions we identified were not contradicted by the literature. Below, we present the codes we generated, along with summaries of the type of data that gave rise to each code.

Branching – using branches to promote alternate visions for a tool; how a fork differs from a branch.
Community – discussion on the diversity of the community; how community requests drive requirements; how the tool evolves as more people adopt it.
Simplicity – the idea that code should be modular; that student projects require less complexity due to shorter time frames; that the chosen technology (e.g., language choice), if simple, can help attract other developers and users.
Power/control – the 'scratch your own itch' meme; that requirements evolve as the tools are used; that different requirements can imply different directions; that a good community has a 'gatekeeper' to oversee development and drive the project.
Communication – tickets as a source for new ideas and a way of keeping the community informed; concern over appearing closed off by using IRC or chat, which have no easily accessible logs.
Frustration/emotions – passion over existing tools as inadequate; emotion over a fork resulting in your hard work supporting something you have no power over; frustration over design decisions in another project; unease over project directions.

As the figure indicates, these concepts can be connected selectively to some of the others.
Branching helps with community building; other people can get involved without having to convince others of the benefits of their project until they are ready to merge their changes. This is related to the power/control concept as well.
Simplicity helps to build a community, allowing others to develop software more easily. This allows people to take control of the issues they are interested in. For example, DrProject is designed to be simpler than Trac so new developers can quickly understand the project. This is because DrProject is intended for student projects and student maintainers.
Community needs, while facilitated by simplicity and branching, can involve emotions, including frustration, and require good communication support.
Power and control issues can be supported by effective, early communication about what requirements should be addressed; this issue can also lead to frustration, for example if a requirement is not considered important, or if code one person worked on is forked to a different project where that individual has no influence. This is perhaps the most complex of the issues generated by the survey, and requires more investigation.
Frustration is a product of failing to address the other concepts.

At this stage, we have taken the various data elements, generated common concepts, and then created relations among these codes selectively (Fig. 1.3 illustrates this process). Once we have these dimensions in the form of the concept map in Fig. 1.4, questions can be posed regarding the process of forking and open-source software. We went through the concept map, deriving questions suggested by the codes. We present these questions below in the form of simple theories regarding the process. This list is not exhaustive, but does capture many salient aspects of the process. We present a keyword to identify the theory, followed by its description. In brackets, we identify the relevant codes which give rise to that theory.

Divergence – A forked project will diverge more from the main codebase over time than a branched subproject will [Branching, Power/control].
Formality – OSS projects typically consider requirements informally, using branches and feedback to suggest new features [Community, Branching].
Activity – In community-developed projects (i.e., OSS), there will be a high number of branches with frequent commits, representing development on new requirements/features [Community, Branching, Power/control].
Definition – Projects with well-defined requirements will have fewer developers, fewer branches, and less complex code [Simplicity, Community, Branching].
Leadership – Central to every project is the requirement for a leader who has a good grasp of the surrounding issues of control, community, and communication [Power/control, Frustration].
Openness – Publicly accessible chat logs will lead to more open communication about a project, and produce less dysfunction in community building [Simplicity, Community].
Modularity – The more modular a tool is, and the more it uses well-known tools, the more likely it is to be adopted [Simplicity, Power/control].

We discuss some ways to explore these theories in more detail in Section 5. Given scope constraints, we chose one additional theory to develop in more detail. This was generated in the same way the others were, but we focus on it here to emphasize its use in the following section (as per Fig. 1.3). This theory questions why the decision to fork might be made.

Motivation – Forking occurs when the requirements of a system-to-be extend beyond those of an existing system, and there is an inability to address those new requirements within the existing project [Power/control, Community, Simplicity].

In the specific case of Trac and DrProject, the DrProject team understood priorities for Trac to be different. For example, the Trac team were unable to commit to integrating multiple-project support. Bonaccorsi and Rossi [14] describe a similar occurrence in the context of corporate participants in OSS projects. They tended to extend existing projects to address customer needs. These two examples lend some support to this theory. Because Trac, for one, was fairly well-structured, it was relatively simple to adapt the existing tool to new requirements. Most of these requirements arise from the new context – supporting undergrad projects within term-length courses, administered by course instructors and students.
This second phase took the hypothesis generated in the preceding stage and subjected it to empirical tests. Primarily these consisted of quantitative testing; however, we also used qualitative user testing in the second portion, since this technique applies well to the chosen hypothesis.

The original theory (Motivation) centered on the decision to fork. We suggested that this happens when one group identifies additional requirements that the existing tool cannot, or will not, address. To test this theory, we want to establish two things.

• One, were there additional requirements, and were they not being met in the original context? That there were additional requirements is shown by the goal model of Fig. 1.2. That these were not addressed by the Trac project is argued in the preceding section.
• Two, were these additional requirements addressed in the new, forked project? If not, then the rationale for forking seems less clear.

In this section, we tackle the question of whether forking Trac was successful in meeting the new requirements. Recall the goal model of Fig. 1.2. In goal modeling terminology, we wish to assess whether the new requirements are satisficed by the new system. In particular, we focus on two non-functional goals (softgoals), namely usability and maintainability. These were identified a priori as important characteristics for the system-to-be to satisfice.
To assess whether the forking achieved its objective, we can measure success for two of the most important requirements: long-term maintainability by instructors/students (using code complexity as a key indicator), and minimal learning time for undergraduates (using usability testing on paper prototypes as an indicator). We do not claim to provide proof that these were achieved. Indeed, since these two requirements are non-functional, by definition we can never know whether they are fully satisfied. Instead, this study seeks to provide sufficient rationale to conclude that both goals are satisficed by the new system (satisficing indicates that we can be reasonably content with the outcome [20]).

Both requirements can be analyzed empirically. First, we derive a hypothesis to guide this phase of the research. It is:

H – The softgoals of maintainability and usability that partially motivated the decision to fork DrProject have been satisficed.

We formulated several techniques for testing the hypothesis. Below, we outline each approach and the associated claim to be investigated. Results are presented in the following section.
Maintainability
We used the pymetrics tool (http://pymetrics.org) to assess software maintainability. We normalized the two projects by removing any testing code as well as those Python files with fewer than fifteen lines of code (such files consisted of initialization code not central to program comprehension). This left all Python source files that contributed directly to the functionality of each project.

1. We generated McCabe cyclomatic complexity [21] figures for both the Trac and DrProject codebases. This metric counts the number of independent paths through a program's control flow, measuring decision points (e.g., if statements) and exits. It is used to assess the relative complexity of a codebase. For example, more decision paths branching from a particular point in the code indicates that maintenance of that portion of the code may be troublesome (as changes have greater impact). McCabe may not be the best metric for dynamically typed, web-based systems like DrProject. Muthanna et al. [22] suggest it correlates well with maintainability, but they studied larger C-language systems. Rajaraman and Lyu [23] conclude the McCabe metric does not work well for object-oriented C++ systems, but DrProject and Trac exhibit few strong OO tendencies (such as encapsulation or inheritance). We conclude that, while imperfect, the McCabe metric can provide evidence towards concluding whether or not goal satisficing is achieved.

H_a: Trac code is less maintainable than DrProject according to the McCabe cyclomatic complexity metric.

2. We performed a lines of code (LOC) analysis. This is a weak metric for software complexity, but provides another data point. Its most obvious weakness is that what one programmer does in ten lines, another may do in two (and neither number says much about software quality). It nevertheless provides a quantitative assessment of the absolute size of the codebase, and in general, larger codebases are more complex. One aspect of the codebases we have not considered is output handling, that is, the code which controls output to the browser. We discuss this qualitatively in the following section, under type conversion.

H_b: DrProject has fewer lines of code than Trac.

3. We assessed the extent of commenting in each project. A key factor in complexity and maintainability is the degree to which source comments are used. Poorly commented code is more difficult to maintain without assistance from the original author. In Python, comments can be added at the function and class level using 'doc-strings', which are similar to Java's 'Javadoc' mechanism. We measured the proportion of functions with comments.

H_c: DrProject has a greater proportion of functions with comments than Trac.

Usability
As a final means to evaluate H, we introduce the results of qualitative usability testing of a revised DrProject interface. The new interface was a redesign of the existing Trac interface to address workflow issues introduced by several new modules. We used paper prototypes [24] of proposed interfaces to assess usability. We interviewed five participants, assigning them a series of tasks to perform in DrProject. For example, one task was to log in to the system. After completing the tasks we presented them with a series of questions, in a structured interview format. We used this to redesign the interface and tested the updated version with a further five students.
Code complexity – We ran the pymetrics tool on the source files, producing individual results for each one. We then averaged these results to obtain a project-level McCabe complexity metric. We also computed the COCOMO2 SLOC metric [25], which counts source lines of code (excluding comments and empty lines); these we summed to produce a total value. Finally, we measured the proportion of functions which were commented in the source, again averaged to produce a mean value. Results are shown in Table 1.1.
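To illustrate what the SLOC and function-comment measurements capture, the sketch below approximates both using Python's standard ast module. This is our own minimal sketch, not the pymetrics implementation, and the sample functions (`assign`, `escalate`) are hypothetical.

```python
import ast

def doc_coverage(source):
    """Fraction of functions in `source` that carry a docstring
    (a rough analogue of the 'Fn Comments' measurement)."""
    funcs = [n for n in ast.walk(ast.parse(source))
             if isinstance(n, ast.FunctionDef)]
    if not funcs:
        return 0.0
    return sum(ast.get_docstring(f) is not None for f in funcs) / len(funcs)

def sloc(source):
    """Count non-blank, non-comment physical lines (rough SLOC;
    real COCOMO-style counting has more elaborate rules)."""
    return sum(1 for line in source.splitlines()
               if line.strip() and not line.strip().startswith('#'))

# Hypothetical ticket-handling code, for illustration only.
src = '''\
def assign(ticket, owner):
    """Assign a ticket to a project member."""
    ticket.owner = owner

def escalate(ticket):
    ticket.priority = "high"
'''
print(doc_coverage(src), sloc(src))  # → 0.5 5
```

Run over every source file and aggregated (averaging coverage, summing SLOC), this yields project-level figures of the kind shown in Table 1.1.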
DrProject is less complex according to the McCabe metric (mean 44.49 versus 56.25 for Trac, roughly 21% lower). It also has over twice as many functions with comments, enhancing maintainability. Lines of code are marginally lower in DrProject. However, this number includes two additional features not currently part of the production system: a requirements-and-testing component and a graphical dashboard. These add over a thousand lines to the DrProject codebase.
Metric         Trac    DrProject
McCabe (µ)     56.25   44.49
SLOC (sum)     7572    7504
Fn Comments    21.6%   46.2%

Table 1.1. Aggregate normalized software metrics for the two projects
Type conversion – Type conversions increase cognitive load when data is translated from oneform to another. Programmers must remember both the new and existing form, as well as a newreference to the data. We examined the frameworks chosen by both
DrProject and
Trac to assesstype conversions. This was prompted by a claim by one of the leaders of
DrProject that the numberof type conversions was lower in the new codebase. A framework is a tool, or suite of tools, thatmake it easier to accomplish certain tasks. In this case, the principle framework difference is inHTML display.
Trac uses the Clearsilver tool for producing output. Its advantages are speed andcross-language compatibility. The DrProject team moved to Kid , which is only for the Pythonlanguage, and is designed for producing valid XML (of which XHTML is one language). Inspectingthe differences in these two languages shows that Kid reduces type conversions, since it is a nativePython tool. Data remains in the same form in either a Kid template (the view) or Python source file(the model). Clearsilver, by contrast, uses a hierarchical data model that requires the user to insertdata for display by converting to a string, then extract it for use in the template. Another benefit isthat Kid is explicitly designed for XML production, and is easier to secure against injection attacks.Finally, since Kid insists on valid XML, there is no possibility of generating invalid XHTML. Thesechanges contribute positively to the satisficing of the maintainability NFR.Given the results of the preceding three sections, we conclude that hypothesis H provides asufficient explanation of the data; namely, that the maintainability softgoal is satisficed. http://lesscode.org/kid sability While we cannot conclude
Trac is less usable than
DrProject, the iterative redesign does account for the different requirements of the
DrProject tool, namely that a student user group needs a simpler interface. For example,
DrProject has added multiple-project support and substantially reduced interface complexity. The ticket interface, for example, has reduced the available options by three fields (from nine), and the available values for those fields by nine (from eighteen). However, there remains a tension between the functional features in the UI – such as the modal interface to support multiple projects – and the softgoal of usability. While our subjects had little problem completing our walkthrough, they ran into difficulty with how the various projects interacted. For example, one user commented that he wanted “tickets to show up across projects”. Another user, however, liked the custom ticket interface we presented to her. The usability testing needs to be more extensive to support any firm conclusions. This is a well-evaluated area of HCI. However, it would be difficult to do a relative comparison of two interfaces over the time period required for this study. Ethics concerns would prevent the use of two tools of (hypothetically) differing quality. Students use DrProject for course projects, and we would not want to affect marking. However, the qualitative assessment we undertook does provide some support for hypothesis H. We conclude that the usability softgoal is partially satisficed.

We begin with an analysis of the chosen research methodology, including threats to the research validity. Then, we analyse the implications of the study results, including some notes on additional theories and ways to test them.
Good empirical studies include an assessment of how applicable the results are, and how well-founded any conclusions may be. The following discussion is modeled after suggestions in Trochim [26].
Conclusion validity – Conclusion validity assesses whether the conclusion we drew was supported by the evidence. We concluded that there was a relationship between differing requirements and
DrProject's usability and maintainability. The largest threat here is the use of code complexity metrics, as discussed in Section 1.5.1. These provide a concrete number to which it is easy to assign more meaning than appropriate. Similarly, there are no doubt more appropriate metrics – in general, there is a scarcity of suitable metrics for web application frameworks such as those used here. Such metrics would assess factors like framework complexity (J2EE vs. Rails), type conversions, etc. Nonetheless, we maintain that the metrics generated do point to substantive differences in maintainability. While not definitive, they do allow us to claim that the maintainability softgoal is satisficed. The qualitative procedures, while adequate, could have been broadened; the sample sizes were somewhat small (although, we would argue, representative for this project). For more general results, better theoretical sampling is required.

There is a question as to whether there is any possible failure associated with this hypothesis (that is, whether it is tautologous). One might argue that having identified the requirement of, for example, reduced complexity, an outcome of reduced complexity is nothing surprising. We argue differently. During the timeframe of this study, there were other development goals besides the two softgoals we list. For one, all the functional requirements, such as multiple-project support, needed to be addressed. Second, there were students involved in the project who had competing requirements. These were students funded to develop research prototypes, such as project dashboards, that were not part of the core
DrProject platform. These tools distracted from the core purpose of the development effort. Accordingly, there was a very real possibility that the new tool would be neither more usable nor more maintainable.
Internal validity – Was the relationship established a valid one? There may have been hidden factors influencing the results, but properly conducted grounded theory research [4] demands that the researcher examine his or her own sources of bias during the research, in order to accommodate this. This we did by introspection of the methodology and results.
Construct validity – This measures the extent to which the theoretical constructs applied were valid. In this case, the study was largely exploratory; the correct techniques and definitions are as yet largely undefined. We defined two constructs, usability and maintainability. The measures were quantitative assessments of the relative degrees of complexity of each. There was some degree of hypothesis guessing on the part of the survey participants, who attempted to guide conclusions. We addressed this by using only codes which emerged from a majority of the data sources, not just one or two. Finally, codes should be generated by more people to compare the inter-rater reliability of the results.
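Inter-rater reliability of such codes is commonly quantified with Cohen's kappa, which corrects raw agreement between two coders for chance agreement. A minimal sketch (the code labels below are illustrative, not our actual coding scheme):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

a = ["usability", "maintainability", "usability", "community"]
b = ["usability", "maintainability", "community", "community"]
print(round(cohens_kappa(a, b), 2))  # 0.64
```

Values near 1 indicate agreement well beyond chance; values near 0 indicate agreement no better than chance.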
External validity – Qualitative studies can be hard to generalize. However, we believe the results suggest several interesting research opportunities, as the
Trac project seems characteristic of many smaller open-source projects. How well this maps to larger projects, for example Apache, is uncertain. The only generalization we make is that the ability to fork a project to meet requirements which are a superset of the existing ones seems possible given the experiences reported here. However, since this is but a single example, more research is needed before concluding this is the case.
The confirmatory testing conducted allows us to conclude that
DrProject has partially satisficed the usability softgoal, and satisficed the maintainability softgoal. Despite adding several significant functional enhancements, the codebase is less complex than the existing
Trac code (when one might expect it to be as complex or more complex). Its interface has been tested with student users. The code is better documented. Finally, although not explicitly verified, the code has more testing infrastructure to verify software correctness. The fork has been successful in adapting existing features to new requirements. According to our initial, grounded theory of
Motivation, which questions what motivates a fork, we can conclude that a) there was good rationale for forking; b) the forked product addressed new requirements successfully. This implies that, in this case, forking was an acceptable means of addressing the requirements identified previously.

What were the reasons for this? We identify five factors.

• There was a good match between the core requirements (e.g., repository integration, bug tracking, etc.). Indeed, the
DrProject proponents modeled many of the requirements after
Trac or similar tools.
• Existing code, while poorly documented, was well-structured and architected;
• One of the
Trac architects was willing to help transition the codebase during the fork period, answering questions and providing technical assistance;
• The
DrProject initiative had strong leadership with a clear vision of what the tool should be, and dedicated developers during the initial period of the fork;
• Finally, lessons learned over the past years provided valuable experience on what a manageable set of functionality was.

Does this imply that the technique of forking is a viable one for other domains? One could argue that having a strong leader and clearly defined requirements meant that the
DrProject effort wouldsucceed regardless. There are two counter-arguments. One, the same leadership had attempted vir-tually the same project several times in the past, with little success. A strong vision alone is notufficient. Secondly, the scope of the project (as defined by the perceived requirements) was suchthat the four month development period would have been insufficient to fulfill all those requirements.We suggest that what occurred was that this process was actually the evolution of an existingsystem into a modified system, reflecting new external forces (requirements changes). The implicitrequirements model was revised accordingly. Forking the code therefore leverages the need for a first,‘throwaway’ implementation [27]. Assume the requirements for two systems are relatively similar.The newer system can build upon those other requirements ( e.g. , using pre-existing repository in-tegration to build advanced visual interfaces). This reduces the amount of time required to code anentire system. Forking becomes a software (and requirements) evolution and maintenance task. Thedisadvantage is that earlier developers typically will not assist in the process, but this is often thecase in software evolution, where previous developers have moved on.
Costs and benefits of the fork
Concluding that
DrProject met the requirements of improved maintainability and usability for the fork suggests forking was beneficial. However, it is possible some costs were incurred as well. To investigate what these might be, we conducted a second set of interviews with the
DrProject leaders. The interviews were structured along the five concepts described in Section 1.4.2. We used these concepts to guide a separate discussion with each of the two leaders. From transcriptions of these interviews, we elicited the following 'concerns' about forking software.

Branching vs. forking – The fork moved code out of the
Trac repository, but this wasn't felt to be a major cost. Even had the project remained in the repository (branched), the differences in requirements would have made it very difficult to automatically merge code into the mainline. There was some concern that change in the other direction would be difficult, e.g., getting the latest bugfixes and security patches from
Trac.

Community – Forking has meant the loss of the
Trac community of developers. There are plans, however, to allow other schools to use the tool and possibly contribute to its development. In the short term, though, there are fewer users to help identify bugs and fix problems. The tool seems out of reach for undergraduates to participate in – certainly for a senior programming assignment. A student might be happy to receive 80% on an assignment, but to one
DrProject leader, “I regard that as 20% of your code is buggy . . . which is just too high”.

Power/control – Forking the codebase has meant the two
DrProject proponents have had to undertake the leadership of the new project, with concomitant responsibilities for management and future development. There is a hope that other universities will join in, but some involvement from these two seems likely in the immediate term. Tool documentation and code stability are not yet at a point where sharing the tool with others is feasible. Finally, some issues will remain local; e.g., authentication processes are controlled by the organization's existing processes.

Communication – There was some cost to the community associated with less open communication leading to less visible decision-making. Logging of IRC meetings is done, but it would be “wrong-headed ... to shift communication” to something the core team wasn't comfortable with. As with all OSS, however, the fact that the code is accessible means a de facto level of openness that non-OSS projects don't exhibit.

Frustration/emotions – There has been no evidence of negative feedback from the existing
Trac community. There are two suggested explanations for this. One is the work by one individual on both
Trac and
DrProject. This person is a key contributor, and his involvement lends credibility to a project he is involved with. In this case, this implies a validation of the differing requirements for the new project. The second factor is that the fork was made primarily for technical reasons, and not personal ones. This seems to be a less emotional process, where disagreement, if it is found, is principally on technical grounds.

Of these, the factors identified as most costly were the loss of bug fixes from the community, and the potential gap in creating a new community.

1.6.3 Assessment of the research technique
This study used a sequential 'explore and confirm' mixed methodology to explain and understand the nature of forking open-source software. This technique was useful in generating theories about the problem where the literature didn't provide much insight. Many of these theories, however, defy quantitative evaluation. For example, how can we understand how a particular project considered requirements other than qualitatively? We avoided this dilemma by focusing on the quantifiable aspects of the system, such as code metrics. There remain many interesting issues, such as team dynamics, collaborative work, and communication, that are best assessed qualitatively.

Another approach to evaluating forking (and system evolution in general) would consider other non-functional requirements [5] such as testability, modularity, and maintainability. Techniques similar to those mentioned in Yu et al. [28] could be applied: reverse engineer a goal model from the existing source; elicit new requirements for the potential fork; apply tradeoff analysis to derive a new goal model.
This research explored theories about forking, requirements, and open-source software. There were two problems addressed by the work. Firstly, what is the nature of the forking process? We identified some questions in this area, such as the nature of requirements in forking, the attitude between the communities, and whether requirements were a basis for a fork. Secondly, we investigated one question in depth, regarding the motivation to fork. We hypothesized that forking was required to address the softgoals of maintainability and usability. To confirm this, we showed that the new codebase, while a better match for the new functional requirements, satisfices the usability and maintainability softgoals. Given the relative success of the DrProject fork – adapting an existing tool to new requirements – we conclude that this is a potentially useful avenue in software development. We then analysed some threats to the validity of the research, and discussed the findings and their implications. The theories generated in the first stage suggest several possible areas for future research in this problem domain.
We would like to thank Greg Wilson, Karen Reid, and the students who worked on
DrProject, as well as the Trac developers who took the time to respond to our survey. Finally, thanks to Jen Horkoff and Jorge Aranda for their insightful reviews.
Neil A. Ernst is a Ph.D. candidate in software engineering at the University of Toronto. His research interests include requirements engineering, information visualization, and human factors.
Steve Easterbrook is Professor of Computer Science at the University of Toronto. His current research goals focus on the analysis of requirements for complex software-intensive systems.
John Mylopoulos is Professor of Computer Science at the University of Toronto. His research interests include information modelling techniques, covering notations, implementation techniques and applications, knowledge-based systems, semantic data models, information system design and requirements engineering.

References
1. S. Weber, The Success of Open Source. Cambridge, MA: Harvard University Press, 2004.
2. S. Goode, "Something for nothing: Management rejection of open source software in Australia's top firms," Information & Management, vol. 42, no. 5, pp. 669–681, July 2005. [Online]. Available: http://dx.doi.org/10.1016/j.im.2004.01.011
3. J. W. Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods Approaches. Thousand Oaks, CA: Sage, 2003.
4. A. Strauss and J. Corbin, Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Thousand Oaks, CA: Sage, 1998.
5. J. Mylopoulos, L. Chung, and B. Nixon, "Representing and using nonfunctional requirements: A process-oriented approach," IEEE Transactions on Software Engineering, vol. 18, no. 6, pp. 483–497, June 1992. [Online]. Available: ~nernst/papers/mylopoulos92representing.pdf
6. J. Feller and B. Fitzgerald, "A framework analysis of the open source software development paradigm," in ICIS '00: Proceedings of the Twenty-First International Conference on Information Systems. Atlanta, GA, USA: Association for Information Systems, 2000, pp. 58–69. [Online]. Available: http://portal.acm.org/citation.cfm?id=359723
7. D. M. German, "Using software trails to reconstruct the evolution of software," Journal of Software Maintenance and Evolution: Research and Practice, vol. 16, no. 6, pp. 367–384, November 2004. [Online]. Available: http://dx.doi.org/10.1002/smr.301
8. M. W. Godfrey and Q. Tu, "Evolution in open source software: A case study," in ICSM '00: Proceedings of the International Conference on Software Maintenance. Washington, DC, USA: IEEE Computer Society, October 2000, pp. 131–142. [Online]. Available: http://portal.acm.org/citation.cfm?id=853411
9. B. Fitzgerald, "A critical look at open source," Computer, vol. 37, no. 7, pp. 92–94, 2004. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1310249
10. E. S. Raymond, The Cathedral & the Bazaar. Sebastopol, CA: O'Reilly Media, 2001.
11. K. Fogel, Producing Open Source Software: How to Run a Successful Free Software Project. Sebastopol, CA: O'Reilly Media, 2005.
12. W. Scacchi, "Understanding the requirements for developing open source software systems," IEE Proceedings – Software, vol. 149, no. 1, pp. 24–39, 2002. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=999088
13. W. Scacchi, C. Jensen, J. Noll, and M. Elliott, "Multi-modal modeling, analysis and validation of open source software requirements processes," in Intern. Conf. Open Source Software, Genova, Italy, July 2005. [Online]. Available: ~wscacchi/Papers/New/Scacchi-Jensen-Noll-Elliott-MKIDS04.pdf
14. A. Bonaccorsi and C. Rossi, "Intrinsic motivations and profit-oriented firms in open source software: Do firms practise what they preach?" in Intern. Conf. Open Source Software, Genova, Italy, July 2005, pp. 241–245. [Online]. Available: http://oss2005.case.unibz.it/Resources/Proceedings/OSS2005Proceedings.pdf
15. P. Adams, D. Nutter, S. Rank, and C. Boldyreff, "Using open source tools to support collaboration within CALIBRE," in Int. Conf. Open Source Software, Genova, Italy, July 2005, pp. 61–65.
16. N. A. Maiden and C. Ncube, "Acquiring COTS software selection requirements," IEEE Software, vol. 15, no. 2, pp. 46–56, 1998. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=663784
17. M. Morisio, C. B. Seaman, A. T. Parra, V. R. Basili, S. E. Kraft, and S. E. Condon, "Investigating and improving a COTS-based software development process," in Int. Conf. on Software Engineering, Limerick, Ireland, June 2000, pp. 32–41. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=870394
18. R. Moen, "Fear of forking," November 1999. [Online]. Available: http://linuxmafia.com/faq/Licensing_and_Law/forking.html
19. J. C. Georgas, M. M. Gorlick, and R. N. Taylor, "Raging incrementalism: Harnessing change with open-source software," in ICSE Workshop on Open Source Software Engineering, vol. 30, no. 4. New York, NY, USA: ACM Press, July 2005, pp. 1–6. [Online]. Available: http://dx.doi.org/10.1145/1083258.1083263
20. H. A. Simon, "Motivational and emotional controls of cognition," Psychological Review, vol. 74, no. 1, pp. 29–39, 1967.
21. Communications of the ACM, vol. 32, no. 12, pp. 1415–1425, December 1989. [Online]. Available: http://dx.doi.org/10.1145/76380.76382
22. S. Muthanna, K. Kontogiannis, K. Ponnambalam, and B. Stacey, "A maintainability model for industrial software systems using design level metrics," in Working Conference on Reverse Engineering, 2000, pp. 248–256. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=891476
23. C. Rajaraman and M. R. Lyu, "Reliability and maintainability related software coupling metrics in C++ programs," in Intl Symp. on Software Reliability Engineering, 1992, pp. 303–311. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=285898
24. C. Snyder, Paper Prototyping: The Fast and Easy Way to Design and Refine User Interfaces. San Francisco, CA: Morgan Kaufmann, 2003.
25. B. W. Boehm et al., Software Cost Estimation with COCOMO II. Upper Saddle River, NJ: Prentice Hall, 2000.
26. W. M. K. Trochim, The Research Methods Knowledge Base. Cincinnati, OH: Atomic Dog Publishing, 2001.
27. F. P. Brooks, Jr., The Mythical Man-Month: Essays on Software Engineering, 20th Anniversary Edition. Boston, MA: Addison-Wesley, 1995.