Aristotle vs. Ringelmann: On Superlinear Production in Open Source Software
Thomas Maillart ∗ University of Geneva, Switzerland
Didier Sornette † ETH Zurich, Switzerland
(Dated: April 17, 2019)

Organizations exist because they provide additional production gains, in comparison to horizontal ways of allocating resources, such as markets [1], and the open source movement is deemed to be a new kind of peer-production organization somehow in between hierarchically organized firms and markets [2]. However, to thrive as a new kind of organization, open source must provide production gains, which in turn should be measurable. The open source movement is particularly interesting to study for this reason. Here, we confront and discuss two contrasting views, which were reported in the literature recently. On the one hand, Sornette et al. [3] uncovered a superlinear production mechanism, which quantifies Aristotle's adage: "the whole is more than the sum of its parts". On the other hand, Scholtes et al. [4] found opposite results, and referred to Maximilien Ringelmann, a French agricultural engineer (1861-1931), who discovered the tendency for individual members of a group to become increasingly less productive as the size of their group increases [5]. Since Ringelmann, the topic of collective intelligence has interested many researchers in the social sciences and social psychology [6], as well as practitioners in management aiming at improving the performance of their teams [7]. In most research and practical case studies, the Ringelmann effect has been found to hold, while, in contrast, the superlinear effect found by Sornette et al. is novel and may challenge common wisdom [3]. Here, we compare these two theories, weigh their strengths and weaknesses, and discuss how they have been tested with empirical data. We find that they may not contradict each other as much as was claimed by Scholtes et al. [4].
I. INTRODUCTION
In psychology (Gestalt theory [8]), biology (brain functions [9], ecological networks [10]), physics (spontaneous symmetry breaking [11] and the "more is different" concept [12]), and in economics [13, 14], the famous adage by Aristotle, "the whole is more than the sum of its parts", has inspired research in complexity science, in particular regarding emergent behaviors in nature and society [15, 16]. Indeed, the raison d'être of societies is the prospect that people will achieve more together, yet at some individual alienation cost [17]. For a society to thrive, these alienation costs should overall be smaller than the benefits a society can bring to its members. Ideally, fair distribution of benefits should be organized through institutions [18] that implement robust mechanisms to enforce cooperation [19, 20].

One special instance of a society is the firm. A firm is an organization devoted to production, which is born from the internalization of the transaction costs associated with gathering production resources, in particular human resources: at some point it is less costly to permanently hire an individual whose skills are needed often than to source them repeatedly on a market [1]. Hence, the employee enters a permanent contractual relation, and thus an organizational structure, at some alienation cost (less freedom to contract with other parties).

The open source movement operates in a slightly different fashion: peer-production [2] prescribes that participants in an open source project mainly obey two rules: (i) task self-selection and (ii) peer-review. In a nutshell, contribution (i.e., production) enforcement mechanisms are very loose, relying neither on hierarchical organization nor on market mechanisms, and there is no clear counterpart to contributions in open source development.
The lack of explicit organization rules in open source has generated much attention in management science [21, 22], complex systems and network dynamics [23], and law and economics [24, 25], with one overarching question being how self-organized communities gather and, moreover, produce efficiently together in the absence of organizational rules clearly tied to incentives [26].

∗ Electronic address: [email protected]
† Electronic address: [email protected]
The open source movement gathers people with heterogeneous incentives, ranging from hedonism to paid jobs [26]. It is therefore difficult to measure the implications of individual and collective intelligence and coordination on the production of source code. In particular, there is the question of how cumulative innovation emerges from self-selected contributions and peer-review, which on average make software more robust and help the emergence of new functionalities. Measuring the production and productivity of collective intelligence may be a significant addition to the debate. Attempts to measure the productivity of software developers are nearly as old as the software industry [27], with several models developed to measure the efficiency of software programmers, yet under the assumption that programmers work in a corporate environment, which is usually highly scheduled.

The bottom-up and collective intelligence aspects of production have been much less covered, in open source and, more generally, in open collaboration [28]. Dealing with groups such as firms and production units, management science also aims to understand when and how a group can produce more than the sum of its individual contributions, and to design ways to improve team performance [29–32], through the mechanism of complementarity in organizations [33, 34] and innovations [35]. Because most activities in our modern environment require coordination and collaborative actions within groups of widely varying sizes, it is the fundamental aspiration of any manager, be it in the public or private sector, to find and master the determinants of enhanced productivity. Since Ringelmann, the topic of collective intelligence has interested many researchers in the social sciences and social psychology [6], as well as practitioners in management aiming at improving the performance of their teams [7].

Despite their conflicting views, the contributions by Sornette et al. [3] and Scholtes et al.
[4] provide key insights on that matter, in particular, yet not limited to, open source development. We focus on these two papers because Scholtes et al. [4] challenged evidence brought forth by Sornette et al. [3], creating confusion or, perhaps even worse, the sentiment that the superlinear productivity law is stillborn, having been killed just out of its academic womb. In the remainder of this paper, we describe and compare the two approaches (Section II), then discuss the strengths and weaknesses of each approach (Section III), and conclude (Section IV).
II. PRODUCTION AND PRODUCTIVITY MEASURES FOR OPEN SOURCE SOFTWARE
Here, we present the two contrasting perspectives taken by Scholtes et al. [4] on the one hand, and by Sornette et al. [3] on the other hand.
A. The Ringelmann effect in software engineering
There is a common wisdom, supported by a vast majority of studies, that teams of software developers become less productive as they get bigger. In empirical software engineering, this phenomenon is known as Brooks' law of software project management, which states that "adding manpower to a late software project makes it later" [27]. The identified cause of Brooks' law is the increasing coordination costs involved as teams get larger. In social psychology, this phenomenon is also known as the Ringelmann effect, in reference to Maximilien Ringelmann, a French agricultural engineer (1861-1931) who discovered the tendency for individual members of a group to become increasingly less productive as the size of their group increases [5].

Scholtes et al. [4] performed a study using a dataset of 58 open source software (OSS) projects, which amount in total to more than 580,000 commits contributed by more than 30,000 developers. Their study indeed finds that the Ringelmann effect seems to hold on average. Here is the way they proceeded. While in structured organizations a team can be easily defined and measured, a team in OSS projects is more complicated to delineate. Indeed, Scholtes et al. [4] reported that 40% of contributions to OSS projects (i.e., commits) were made by one-time contributors. Researchers have identified different circles of contributors, from a core team (producing up to 90% of the source code), to less involved contributors, to one-shot contributors, and finally to lurkers, who follow the advancement of a project without contributing to the source code, yet participate, e.g., on the mailing list or by posting issues [36]. The heterogeneous, distributed, and uneven proportion of contributions makes the study of OSS project organizations complicated,
particularly across projects, themselves of heterogeneous nature.

TABLE I: Comparison of the datasets used by Scholtes et al. [4] (more than 580,000 commits by more than 30,000 developers) and by Sornette et al. [3].

                       Scholtes et al.                    Sornette et al.
OSS projects studied   58                                 164
Project sampling       > 50 active developers, among      random sampling; power-law
                       the 100 most popular projects      distribution of project sizes
                       on GitHub                          found, Pr(size > S) ∼ 1/S,
                                                          with S the number of contributors

Scholtes et al. provided a dynamic formulation of what a team is, considering that a developer who has not contributed for 295 days has a 10% chance of contributing further. Therefore, they chose a window of 295 days to define the team size at time t, which is the count of contributors who have committed at least once in the last 295 days. The output must also be measured. Various measures of source code production have been developed to account for contribution effort [37, 38]. Scholtes et al. decided to focus on quantifying changes, as measured by the edit distance (also called the Levenshtein distance [39]), i.e., the minimum number of bytes one has to permute/add/delete to go from one version of the source code to a committed update. Scholtes et al. used contributions averaged over time windows of 7 days (rationalized by the fact that in 90% of the cases, two consecutive commits occur within this time window).

Scholtes et al. first measured the output (i.e., number of commits and contributions as defined by edit distance over the last 7 days) as a function of the input (i.e., active developers within the same 7-day window at time t). They found that, when the number of developers increases, the mean contribution per developer decreases. Moreover, when considering the mean contribution per active developer as a function of team size (i.e., developers active in the last 295 days), the results show the same negative scaling, for commits and contributions alike (negative power-law exponents of team size, with p < 0.001 and r² = 0.16 for commits). The low r² values in both cases reflect the high variability of their average measures, and as such, they conclude that it is impossible to make robust predictions from these scaling laws. Considering the output as a function of team size (here, the input is considered as the amount of resources available, i.e., contributors who have roughly more than a 10% chance to contribute), negative scaling is again found (p < 0.001, with r² = 0.44 for commits). They conclude that "OSS communities are indeed no magical exception from the basic economics of collaborative software engineering", and they further attempted to substantiate the observed decreasing returns to scale by considering two commonly accepted causes of the Ringelmann effect: (i) free-loading and (ii) coordination costs. They concentrated on the latter because there is a substantial body of evidence and research work on coordination in software engineering. Although the authors did not mention it, it is indeed hard, if not impossible, to define free-loading when contributors are not actually compelled to contribute (following the general rules of peer-production applicable in OSS). To assess coordination effort and its effects on productivity, Scholtes et al. borrowed from Cataldo et al. [40] and computed the co-edition directed network of all developers (direction stands for chronological influence) as a function of time (i.e., in time windows of 7 days), with a distinction between the out-degree k_out (i.e., one developer has to build on changes by k_out other developers) and the in-degree k_in (i.e., k_in developers must build on changes by one developer). Scholtes et al. first considered the mean out-degree as a function of the size of the coordination network [69]. The mean out-degree and the size of the coordination network seem to be positively correlated, but it is not clear what we can learn from this result (Figure 11 in Scholtes et al. [4]; not discussed in the paper). Finally, Scholtes et al. considered the mean out-degree as a function of the negative productivity scaling exponent described above. They found that projects with "strongly negative and significant slopes for the scaling of productivity also exhibit pronouncedly positive scaling exponents for the growth of the mean (weighted) out-degree" (Figure 12 in Scholtes et al. [4]).

TABLE II: Comparison of the measures used by Scholtes et al. [4] and by Sornette et al. [3].

                             Scholtes et al.                     Sornette et al.
Activity window              7 days                              5 days
Team definition              s := developers with at least       all developers c active
                             one contribution in the last        within a 5-day window
                             295 days
Primary production measure   commit contributions (Levenshtein   R := number of commits
                             distance between commits)
Productivity                 n := mean number of commits and     number of commits R
                             c := mean commit contributions      performed by c active
                             per active developer                developers
Productivity scaling         n ∼ s^α and c ∼ s^α, with           R ∼ c^β, with β̂ ≈ 4/3
                             negative exponents α

Based on these results, Scholtes et al. asserted that OSS projects exhibit dis-economies of scale in production as a function of team size, and hence sub-linear productivity. They rejected the evidence that "the whole is more than the sum of its parts", as evidenced by the superlinear productivity shown by Sornette et al. [3].
B. The Aristotle effect perspective
In contrast to the previous section, by analyzing 164 open source software (OSS) projects of broadly distributed sizes, ranging from 5 to 1,678 contributors, Sornette et al. [3] found that the contribution activity R, defined in terms of the number of commits within a time window of 5 days, is a superlinear function R ∼ c^β of the number of active developers c during the same period. The superlinear exponent is on average β̂ ≈ 4/3 over all projects studied, with a rather large variability, β̂ ranging from 1 to 3. They found that β̂ tends to decrease with the number of contributors in the five-day window, fluctuating around 1 or less for more than 30 to 50 contributors. Moreover, as reported in Sornette et al. [3], the distribution of the total number of developers per project is heavy-tailed, i.e., with many small projects and a few very large ones.

Sornette et al. explored two possible mechanisms generating the observed superlinear phenomenon: (i) an interaction-based mechanism (including interactions leading to a phase transition or to a super-radiance phenomenon [41]) and (ii) a large-deviation mechanism, based on the fact that, in the presence of a heavy-tailed distribution of contributors per project, many developers contribute just a few commits while a minority contribute most of the commits; then, the larger the group size, the more likely it is for a large contributor to be present, leading to the superlinearity phenomenon. The observation that a few developers dominate the overall contribution is well-known in OSS, and is also reported by Scholtes et al. Sornette et al. did not attempt to distinguish which one of these two mechanisms might be at work. They however considered that both the interaction-based and the large-deviation mechanisms can be captured together by a generic cascade process, which has been found to be well described by self-excited Hawkes conditional Poisson processes [42], in particular for human dynamics [43–47], taking into account the specifics of human timing [48]. The Hawkes process is defined by the conditional point process intensity I(t) of events (commits) given by

I(t) = λ(t) + Σ_{i | t_i < t} φ(t - t_i),   (1)

where λ(t) is the exogenous (spontaneous) rate of commits and the memory kernel φ(t - t_i) quantifies the influence of each past commit at time t_i on future activity. Depending on whether the branching ratio (the average number of events directly triggered by a given event) is smaller than, equal to, or larger than 1, the process is respectively sub-critical, critical and super-critical [50, 51]. Interpreting a cluster, or connected cascade, of triggered contributions in a given branching process as a burst of production in a group of developers, the distribution of contributions is thus mapped onto that of triggered cluster sizes [52].

Sornette et al. found and empirically validated that, at criticality, there is a relationship between the power law tail distribution (with exponent γ ≈ 0.75) of activity per contributor per time bin of 5 days, the power law tail distribution of cluster sizes, which is equivalent to the production R per contributor, with renormalized exponent µ = 1/γ, and the superlinear scaling exponent β = 1/γ ≈ 4/3. However, as already mentioned, Sornette et al. found that the superlinear scaling exponent β tends to decrease as a function of the total number of contributors in an OSS project. Likewise, the frequency of productive bursts is reduced for larger projects, suggesting that large projects bear additional coordination costs.

III. ARISTOTLE VERSUS RINGELMANN?
Before considering the fundamental differences between the two approaches presented here, and their validity, we shall highlight some results which, to some extent, bear resemblance. Sornette et al. found that large projects tend to exhibit less powerful and less frequent superlinear productive bursts. This result may look similar to the findings by Scholtes et al., who studied only large projects. However, Sornette et al. do not say that there are dis-economies of scale, but rather that economies of scale appear to be weaker. Similarly to the co-edition directed network model developed by Scholtes et al., the self-excited Hawkes conditional Poisson process measures how past commits influence future commits. It does so in a way that incorporates the influence of all past events, while the network approach by Scholtes et al. relies only on 7-day contribution windows. In other words, Scholtes et al. considered that, in order to be dependent (and bear coordination costs between each other), two commits must occur within the same time window. The network approach, taking into account who changed which file, brings more information regarding how contributions relate to each other. It is also interesting to note the closeness of the short- and long-term time windows used in the two studies: 7 and 295 days for Scholtes et al. versus 5 and 250 days for Sornette et al. While Scholtes et al. provide a justification, Sornette et al. are only concerned with robustness and check that the same results are obtained by varying the short-term window. In Sornette et al., no rationale is provided for the long-term window.

Beyond these resemblances, and arguably a common research question, nearly every other aspect differs in the two studies: the chosen approach, the definitions of productivity, and the data used. This raises a number of questions about the main claim by Scholtes et al. that Sornette et al. were wrong about the superlinear productive bursts [4].
In the following, we thus highlight and discuss these methodological divergences. Moreover, we discuss the approach chosen by Scholtes et al. to take on the results by Sornette et al., in an era that promotes open science and, most importantly, reproducibility of scientific results.
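For intuition on the cascade picture behind the Hawkes-process modeling of Sornette et al., the following toy simulation (our illustration, not their calibrated model; the Poisson offspring law and the branching ratio value are assumptions) generates bursts of triggered commits and shows that, in the sub-critical regime with branching ratio n < 1, the mean burst size is 1/(1 - n):

```python
import math
import random

random.seed(7)

def poisson(lam):
    # Knuth's inversion-by-multiplication sampler (fine for small lam)
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def burst_size(n=0.95, cap=100_000):
    # total progeny of a cascade where each commit triggers
    # Poisson(n) follow-up commits; n -> 1 is the critical regime
    size = frontier = 1
    while frontier and size < cap:
        frontier = sum(poisson(n) for _ in range(frontier))
        size += frontier
    return size

bursts = [burst_size() for _ in range(20_000)]
print(sum(bursts) / len(bursts))  # close to 1 / (1 - 0.95) = 20
```

Close to criticality, the simulated burst sizes become heavy-tailed: most cascades die out quickly while a few grow very large, which is the qualitative signature of the productive bursts discussed below.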
A. Productivity & Team Size
We start with a fundamental conceptual remark that illuminates one key difference between the approach of Scholtes et al. [4] and the one by Sornette et al. [3]. Scholtes et al. consider production in the mean, using as metric the average output per team member (Introduction, 2nd paragraph, line 3), and argue that it increases when synergy effects are present and decreases due to communication and coordination overhead (which surges with larger teams). In contrast, Sornette et al. argued and demonstrated that using an average output is misleading in the presence of highly bursty dynamics characterized by power law tail distributions with small tail exponents. This empirical fact is also cited by Scholtes et al. and well-documented in the open source software production literature and for other open collaboration projects, such as wikis and Wikipedia [53–56]. In open collaboration, a few contributors account for the majority of the performed work, whether counted in lines of code, commits, files modified, and so on. This is one of the features associated with the fact that the distribution of contributions, counted in commits or in lines of code, possesses a power law tail of the form P(X > x) ∼ 1/x^µ with µ < 1. Such distributions are wild [57] in the sense that their first two statistical moments (mean and variance) are undefined and diverge as the sample grows. For such heavy-tailed distributions, reasoning in the mean is fundamentally erroneous, as Scholtes et al. could indeed experience when trying to perform predictions (c.f., Section 4.1 in [4]). For a finite number n of developers in the project, it is easy to show that the average production scales as ∼ n^{1/µ} for µ < 1, and as ∼ n for µ ≥ 1. Defining productivity as the ratio of the total production to the number n of team members, this shows that productivity scales as ∼ n^{1/µ}/n = n^{1/µ - 1} for µ < 1, and as a constant for µ ≥ 1. This latter case is the null hypothesis of an approximately constant output per team member. Superlinear production is quantified by µ < 1, leading to a growing productivity per team member, the larger the team. Searching for a superlinear productivity is different from seeking a superlinear production: the former requires 1/µ - 1 > 1, i.e., µ < 1/2, while the latter just needs µ < 1. In their dataset of 164 projects, Sornette et al. found that only four projects are characterized by µ < 1/2, i.e., only four in 164 (≈ 2.4%) for which the average productivity scales superlinearly with team size.

More generally, the definition of productivity needs to be carefully addressed. Indeed, an open source software community does not come into being fully grown. It starts rather small and then grows progressively, one could say organically, with the project. When growing, the community bears increasing communication and coordination costs, as pointed out by Scholtes et al. While recognizing the importance of different team sizes, Scholtes et al. picked projects meeting the following criteria: (i) at least one year of activity, (ii) at least 50 different active developers, and (iii) being among the 100 most popular projects, as measured by the number of forks on GitHub, a leading online service for open source software production. In contrast, Sornette et al. chose a representative sample of the open source ecosystem, with 134 projects with fewer than 50 developers and 30 projects with more than 50 developers (with a minimum of 5 developers). The representative sampling of projects (see Figure 1 in [3]) showed that superlinear production is usually valid only for projects of sizes no more than 30 to 50 members who are active at a given time. Sornette et al. found statistically significant evidence that the superlinear production tends to fade away to just linear production (i.e., constant productivity per developer) for projects with more than 50 developers (see Figure 8 in [3]). In other words, the sample selection made by Scholtes et al. seems heavily biased towards large projects, which represent the few large (presumably older) projects that are indeed exposed to more communication and coordination costs, and also exhibit fewer synergy effects.

More specifically, Scholtes et al. defined a team as the set of developers who are active at least once within a time window of 295 days, determined by the 90th quantile of the distribution of times between two consecutive commits by the same person.
This definition excludes developers with a unique contribution, who nevertheless account for 40% of all contributions, as reported by Scholtes et al. in Section 3.2 of [4] (end of second paragraph). In line with our above remarks concerning the heavy-tailed distribution of contribution sizes, this definition amounts to throwing the baby out with the bathwater, since it is fundamentally ill-suited to account for the fact that a few, often most senior, developers may not contribute for years between two commits (see Figure 2 in [58]), while at the same time they may account for most of the contribution production. The definition of contributors proposed by Scholtes et al. is thus biased with respect to the special nature of the open source software community, which is, almost by essence, different from a corporate organization, as documented in a number of management science articles (see e.g., [26] and references therein).

Yet, productivity may be defined in a variety of ways, each with its advantages and shortcomings. Scholtes et al. considered productivity as production per active developer within a team (defined as an aggregate of working developers in large, 295-day, time windows), while Sornette et al. considered productivity as production per developer and per time unit (i.e., over a short time period of 5 days). Even though not perfect, the latter definition is more fine-grained than the one proposed by Scholtes et al., and precisely allows capturing the subtle, highly non-linear bursts of activity reported in [3], which could not be observed by averaging developer engagement (over a team aggregate and over time).

In essence, two visions oppose each other. Scholtes et al. adopted a software engineering perspective, which takes root in the necessity to measure the effort and productivity of software developers in a competitive industry. Because of the complexity of information systems, and the importance of outsourcing, the software development industry may suffer from the principal-agent problem [59] and hence may require controlling. The software engineering perspective is reflected in the sampling of only large projects (suggesting that smaller projects are not really worth studying), and in the definition of active contributors and teams, which ignores 40% of the contributors. On the contrary, Sornette et al. considered the OSS ecosystem with no filter, taking a more general approach in project sampling, in the definition of contributions, and in theory elaboration and validation.
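The wildness of distributions with µ < 1 is easy to demonstrate numerically. The sketch below (our illustration; a pure Pareto tail with µ = 0.5 is assumed) shows that the typical total production of n simulated contributors grows like n^{1/µ} = n², i.e., much faster than linearly, so the average per contributor grows with team size instead of converging:

```python
import numpy as np

rng = np.random.default_rng(42)
mu = 0.5  # tail exponent < 1: the theoretical mean diverges

def typical_total(n, trials=2000):
    # median over many trials of the summed production of n contributors,
    # each drawn from a Pareto tail P(X > x) ~ x^(-mu) via inverse transform
    samples = rng.random((trials, n)) ** (-1.0 / mu)
    return np.median(samples.sum(axis=1))

sizes = np.array([10, 40, 160, 640])
totals = np.array([typical_total(n) for n in sizes])

# log-log slope of total production versus team size
slope = np.polyfit(np.log(sizes), np.log(totals), 1)[0]
print(f"production scaling exponent: {slope:.2f} (theory: 1/mu = {1/mu:.1f})")
```

Setting mu to a value of 1 or more instead yields a slope close to 1, i.e., the constant-productivity null hypothesis.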
B. Commits & superlinear scaling
Productivity is the ratio of an output to an input. So far, we have mainly discussed the developer input, i.e., the human capital. Scholtes et al. raised concerns about the output, and claimed that the number of commits is an erroneous measure of production. For that, they bring forth the following argument: the total number of commits contributed by n developers active in a given time period cannot, by definition, be less than n, which is why the total number of commits must scale at least linearly with team size. This apparently common-sensical claim is incorrect, as we demonstrate here. Let us consider n developers. The largest contributor makes N commits (resp. lines of code). The second one contributes N/2^α commits, the third one contributes N/3^α commits, and the n-th one contributes N/n^α commits. If 0 < α < 1, and n and N are such that N/n^α > 1 (i.e., n < N^{1/α}), then the total number of commits contributed by the n developers is given by

S(n) = N/1^α + N/2^α + N/3^α + ··· + N/j^α + ··· + N/n^α ∼ N · n^{1-α}.   (2)

Thus, in this example, the total contributions of these developers grow sub-linearly as a function of the group size n, with exponent 1 - α < 1. Let us illustrate this demonstration with a numerical example, showing that the sublinear effect is clearly visible even for small team sizes. Let us assume that N = 10 and α = 1/2. For n = 5 developers, the total number of commits is 32. For n = 25, the total contribution is equal to 86 commits, which is 2.7 times that of the team of 5 developers (and not 5 times more). Note that for the team of 25 developers, the first contributor makes 10 commits and the last one contributes 2 commits. We believe Scholtes et al. made a very common confusion between absolute numbers and scaling properties. This error is also often found in the fractals literature, which confuses the fractal dimension (here, the scaling exponent) with the density (here, the number of commits per developer), about which the former tells nothing (or very little).

Dismissing commits as a measure of production, Scholtes et al. used the Levenshtein edit distance [39] of source code changes between two consecutive commits (i.e., of the so-called diffs). The Levenshtein edit distance counts the number of substitutions, additions and deletions of characters necessary to match two different strings. Using the Levenshtein edit distance is without doubt more detailed, but it is not sufficient to dismiss commits. Even though they have not used the Levenshtein edit distance, Sornette et al. showed that the superlinear scaling of production holds as well for lines of code (see Figure 3 in [3]), up to an additional scaling factor that defines the relation between commits and lines of code. In order to properly dismiss commits as a measure of contribution, Scholtes et al. may have wanted to show that there is no relation between commits and their contribution metric, for which there is no clear consensus in the scientific literature. The Levenshtein distance is more detailed than commits, but may not necessarily contain additional relevant information. Moreover, at a qualitative level, we should stress that using the more detailed Levenshtein edit distance is not without its own problems.
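To make the metric concrete, the Levenshtein distance can be computed with the classic dynamic programme sketched below (our illustration; the two compared code fragments are purely hypothetical). Note how a one-character fix registers as a distance of 1:

```python
def levenshtein(a: str, b: str) -> int:
    # dynamic programme over prefixes: prev[j] holds the distance
    # between the processed prefix of a and the first j characters of b
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

# an off-by-one fix: tiny edit distance, potentially huge impact
print(levenshtein("if (n < LIMIT)", "if (n <= LIMIT)"))  # 1
```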
One may indeed argue that changing one character or a single line of code in a piece of software, while quantified as minor by the Levenshtein distance, could in some cases be a tremendous output reflecting a major commitment in terms of human capital (think, e.g., of a small edit correcting a security vulnerability) [60–62]. We suggest that a truly faithful measure of input would be the time effectively spent in front of a computer by a contributor in order to achieve a task for the focal open source software project. Unfortunately, this information is not available to open source software researchers and, even if it were available, one could endlessly debate a broad (resp. narrow) definition of time consumption, and whether the coffee break and the ping-pong sessions are actually part of the production time: nearly all Silicon Valley software companies would include this time as truly productive time. Another way of proceeding would be to use a robust approach to attribute value to each contribution instead of assuming value. Such an approach to attribute value to contributions has been previously proposed by Maillart and Sornette [63]. In more conditioned environments, other ways to attribute contribution value to individuals engaged in collective intelligence have been tested and studied [64].

C. From Aristotle to Ringelmann: a missed opportunity for reproducible science?
Open science is nowadays highly promoted to ensure reproducibility of scientific results,and to encourage research groups to “debug” and build upon each other works [66, 67]. Theopen science movement is inspired by the open source software movement, best summarizedby the seminal adage :
Given enough eyeballs, all bugs are shallow [65]. The authorsof the paper
From Aristotle to Ringelmann: a large scale analysis of team productivityand coordination in Open Source Software projects are (or were) members of the Chairof Systems Design. The Chair of Systems Design has been known to be a pioneeringresearch group at ETH Zurich, advocating the use of open source software and contributingsignificantly to the open access movement. Sornette et al. published in PlosOne, which thefirst and leading open access scientific journal. Along with the paper, they submitted andshared the data they used for their study. Scholtes et al. clearly framed their paper as aresponse to Sornette et al. ( “From Aristotle to Ringelmann” in the title), and they claimedthat the results published by Sornette et al. do not hold. The claim by Scholtes et al. isat best misleading as they did not bind to elementary principles of science reproducibility.First, they neither used the same data nor detailed the potential implications of usinga different dataset. Second, they neither invalidated the method by Sornette et al. norcompared thoroughly both approaches, with their pros and cons, as we have done above.Third and foremost, they did not bring compelling arguments for changing the assumptionsunderlying the analyses. These limitations deeply undermine their claims that the resultsof Sornette et al. are incorrect, as we have shown above. As a result, it is challenging toweigh the value of one approach against the other, and in this regard, limits the pertinenceof the contribution by Scholtes et al.Scholtes et al. submitted to and published their paper in the
journal Empirical Software Engineering. First, one would expect that claims questioning the validity of the results obtained by Sornette et al. should have been sent to the same journal (i.e., PLOS ONE), as a comment or a follow-up paper to the editors. Second, it is rather surprising that the editors and the reviewers of the
journal Empirical Software Engineering did not raise any issue concerning the approach by Scholtes et al. to rebut the findings by Sornette et al., in particular given the many problems that we have highlighted above. Third, when the present authors attempted to send an earlier version of this manuscript [68] as a response to the editor, they received the following response from the editors of the
journal Empirical Software Engineering: "the [...] journal does not publish any responses to articles. We encourage you to expand the response to a full research paper, e.g., by repeating the experiments, adding additional research questions, etc." In other words, the editors barred the possibility of reacting to Scholtes et al. in their journal, and they asked the present authors to perform what they should have requested in the first place for the manuscript by Scholtes et al.

IV. CONCLUSION
We have carefully described the two methods and results by Sornette et al. [3] and Scholtes et al. [4], and their apparently opposite results (i.e., the Aristotle vs. Ringelmann effects), with emphasis on their commonalities and differences. Despite claiming that the results by Sornette et al. do not hold, Scholtes et al. did not use the same data (made available by Sornette et al. following open access standards), and they used a totally different methodology (based on averages) that does not allow a direct testing of the methods and results by Sornette et al. (designed to be able to quantify bursty dynamics and large deviations). However compelling and probably valid in its own way, the method followed by Scholtes et al. does not directly invalidate the results and theory of Sornette et al. We believe there is much room for the
Aristotle vs. Ringelmann debate, and we are glad that Scholtes et al. took up the challenge. Our conclusion is that Sornette et al.'s results hold for no more than 30-50 contributors working simultaneously, while Scholtes et al.'s results may apply to larger projects. Yet, we believe that proceeding in a way that follows good practices regarding open and reproducible science, as well as using a more standard publication channel for their challenge, would have helped develop a much more data-grounded, constructive and serene debate.
Acknowledgements
Thomas Maillart acknowledges support from the Swiss National Science Foundation (grants P3P3P2 167694 and P300P2 158462).

[1] Coase, R.H., The Nature of the Firm,
Economica, 386-405 (1937).
[2] Benkler, Y., Coase's Penguin, or, Linux and "The Nature of the Firm", The Yale Law Journal, 369+ (2002).
[3] Sornette, D., Maillart, T. and Ghezzi, G., How Much Is the Whole Really More than the Sum of Its Parts? 1 ⊞ 1 = 2.5: Superlinear Productivity in Collective Group Actions, PLOS ONE, e103023 (2014).
[4] Scholtes, I., Mavrodiev, P. and Schweitzer, F., From Aristotle to Ringelmann: a large-scale analysis of team productivity and coordination in Open Source Software projects, Empirical Software Engineering, 642-683 (2016).
[5] Ringelmann, M., Recherches sur les moteurs animés: Travail de l'homme, Annales de l'Institut National Agronomique, 1 (1913).
[6] Woolley, A.W., Chabris, C.F., Pentland, A., Hashmi, N., and Malone, T.W., Evidence for a collective intelligence factor in the performance of human groups, Science, (6004), 686-688 (2010).
[7] Woolley, A.W., Aggarwal, I., and Malone, T.W., Collective intelligence in teams and organizations, Handbook of Collective Intelligence, MIT Press, 143-168 (2015).
[8] Humphrey, G., The Psychology of the Gestalt, Journal of Educational Psychology, 401 (1924).
[9] Damoiseaux, J., Greicius, M., Greater than the sum of its parts: a review of studies combining structural connectivity and resting-state functional connectivity, Brain Structure & Function, 525-533 (2009).
[10] Jorgensen, S.E., Introduction to Systems Ecology (Applied Ecology and Environmental Management), CRC Press (2012).
[11] Anderson, P.W., Plasmons, Gauge Invariance, and Mass, Physical Review (1963).
[12] Anderson, P.W., More Is Different, Science, 393-396 (1972).
[13] University of Michigan Press (1994).
[14] Krugman, P., The Self Organizing Economy, Wiley-Blackwell, 1st edition (1996).
[15] Sornette, D., Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-organization and Disorder: Concepts and Tools (Springer Series in Synergetics), Springer, 2nd edition (2006).
[16] Perc, M., Self-organization of progress across the century of physics, Scientific Reports.
[17] John Wiley & Sons (2014).
[18] Ostrom, E., Governing the Commons, Cambridge University Press (2015).
[19] Axelrod, R., The Evolution of Cooperation (Revised Edition), Basic Books (2006).
[20] Ostrom, E., Gardner, R. and Walker, J., Rules, Games, and Common-Pool Resources, University of Michigan Press (1994).
[21] Lakhani, K.R., and Wolf, R.G., Why hackers do what they do: Understanding motivation and effort in free/open source software projects, MIT Sloan Working Paper No. 4425-03 (2003).
[22] Roberts, J.A., Hann, I. and Slaughter, S.A., Understanding the motivations, participation, and performance of open source software developers: A longitudinal study of the Apache projects, Management Science.
[23] Physical Review Letters, 218701 (2008).
[24] Sen, R., Subramaniam, C. and Nelson, M.L., Determinants of the choice of open source software license, Journal of Management Information Systems.
[25] Journal of Economic Perspectives, (2), 99-120 (2005).
[26] Von Krogh, G., Haefliger, S., Spaeth, S., and Wallin, M.W., Carrots and rainbows: Motivation and social practice in open source software development, MIS Quarterly, 36(2), 649-676 (2012).
[27] Brooks, F.P., The Mythical Man-Month, Addison-Wesley (1975).
[28] Chesbrough, H.W., Open Innovation: The New Imperative for Creating and Profiting from Technology, Harvard Business Press (2006).
[29] Tziner, A., Eden, D., Effects of Crew Composition on Crew Performance: Does the Whole Equal the Sum of Its Parts?, Journal of Applied Psychology, 85-93 (1985).
[30] Sundstrom, E., De Meuse, K.P., Futrell, D., Work teams: Applications and effectiveness, American Psychologist, 120 (1990).
[31] Cohen, S.G., Bailey, D.E., What Makes Teams Work: Group Effectiveness Research from the Shop Floor to the Executive Suite, Journal of Management, 239-290 (1997).
[32] Neuman, G.A., Wright, J., Team effectiveness: beyond skills and cognitive ability, Journal of Applied Psychology, 376 (1999).
[33] Ennen, E., Richter, A., The Whole Is More Than the Sum of Its Parts, Or Is It? A Review of the Empirical Literature on Complementarities in Organizations, Journal of Management, 207-233 (2010).
[34] Lin, Y., Beyerlein, M.M., Communities of practice: A critical perspective on collaboration, Advances in Interdisciplinary Studies of Work Teams, 53-79 (2006).
[35] Sacramento, C.A., Chang, M.W.S., West, M.A., Team innovation through collaboration, Advances in Interdisciplinary Studies of Work Teams, 81-112 (2006).
[36] David, P.A., and Rullani, F., Dynamics of innovation in an open source collaboration environment: lurking, laboring, and launching FLOSS projects on SourceForge, Industrial and Corporate Change, (4), 647-710 (2008).
[37] Boehm, B.W., Software engineering economics, IEEE Transactions on Software Engineering, (1), 4-21 (1984).
[38] Boehm, B.W., Clark, H., Brown, R., Chulani, M.R., Steece, B., Software Cost Estimation with COCOMO II (with CD-ROM), 1st edn., Prentice Hall PTR, Upper Saddle River (2000).
[39] Levenshtein, V.I., Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics Doklady, 10(8), 707-710 (1966).
[40] Cataldo, M., Wagstrom, P.A., Herbsleb, J.D. and Carley, K.M., Identification of coordination requirements: implications for the design of collaboration and awareness tools, In: Proceedings of the 20th Anniversary Conference on Computer Supported Cooperative Work (CSCW '06), ACM, New York, pp 353-362, doi:10.1145/1180875.1180929 (2006).
[41] Gross, M., Haroche, S., Superradiance: An essay on the theory of collective spontaneous emission, Physics Reports, 301-396 (1982).
[42] Hawkes, A.G. and Oakes, D., A Cluster Process Representation of a Self-Exciting Process, Journal of Applied Probability (1974).
[43] Point Process Modeling of Crime, Journal of the American Statistical Association, 100-108 (2011).
[44] Baldwin, A., Gheyas, I., Ioannidis, C., Pym, D. and Williams, J., Contagion in Cybersecurity Attacks, In: Workshop on the Economics of Information Security (2012).
[45] Aït-Sahalia, Y., Cacho-Diaz, J., and Laeven, R.J.A., Modeling Financial Contagion Using Mutually Exciting Jump Processes, Journal of Financial Economics, (3), 585-606 (2010).
[46] Filimonov, V. and Sornette, D., Quantifying reflexivity in financial markets: Toward a prediction of flash crashes, Physical Review E, 056108 (2012).
[47] Filimonov, V., Bicchetti, D., Maystre, N., and Sornette, D., Quantification of the high level of endogeneity and of structural regime shifts in commodity markets, Journal of International Money and Finance.
[48] Physical Review E, 056101 (2011).
[49] Daley, D.J., Vere-Jones, D., An Introduction to the Theory of Point Processes, Springer, 2nd edition (2003).
[50] Helmstetter, A., Sornette, D., Subcritical and supercritical regimes in epidemic models of earthquake aftershocks, Journal of Geophysical Research, (B10), 2237 (2002).
[51] Helmstetter, A., Sornette, D., Importance of direct and indirect triggered seismicity in the ETAS model of seismicity, Geophysical Research Letters, (11), 1576 (2003).
[52] Saichev, A.I., Helmstetter, A., Sornette, D., Power-law distributions of offspring and generation numbers in branching models of earthquake triggering, Pure and Applied Geophysics, 1113-1134 (2005).
[53] Robles, G., Koch, S., and Gonzalez-Barahona, J.M., Remote analysis and measurement of libre software systems by means of the CVSAnalY tool, In: Proceedings of the 2nd ICSE Workshop on Remote Analysis and Measurement of Software Systems (RAMSS).
[54] In: Proceedings of the 2008 International Working Conference on Mining Software Repositories, 99-108 (2008).
[55] Alali, A., Kagdi, H., and Maletic, J.I., What's a typical commit? A characterization of open source software repositories, In: The 16th IEEE International Conference on Program Comprehension (ICPC 2008), pp 182-191 (2008).
[56] Arafat, O., and Riehle, D., The commit size distribution of open source software, In: 42nd Hawaii International Conference on System Sciences (HICSS'09), 1-8 (2009).
[57] Mandelbrot, B., and Taleb, N.N., Mild vs. wild randomness: focusing on risks that matter, In: Diebold, F. (2007).
[58] Saichev, A., Maillart, T., and Sornette, D., Hierarchy of temporal responses of multivariate self-excited epidemic processes, The European Physical Journal B, (4), 1-19 (2013).
[59] Keil, P., Principal agent theory and its application to analyze outsourcing of software development, ACM SIGSOFT Software Engineering Notes, (4), 1-5 (2005).
[60] Maillart, T., Zhao, M., Grossklags, J., and Chuang, J., Given enough eyeballs, all bugs are shallow? Revisiting Eric Raymond with bug bounty programs, Journal of Cybersecurity, (2), 81-90 (2017).
[61] Kuypers, M.A., Maillart, T. and Paté-Cornell, E., An Empirical Analysis of Cyber Security Incidents at a Large Organization, Stanford Working Paper (2016).
[62] Kuypers, M., and Maillart, T., Designing Organizations for Cyber Security Resilience, In: Workshop on the Economics of Information Security (WEIS 2018) (2018).
[63] Maillart, T., and Sornette, D., Using Prediction Markets to Incentivize and Measure Collective Knowledge Production, arXiv preprint arXiv:1406.7746 (2014).
[64] Gulley, N., and Lakhani, K.R., The determinants of individual performance and collective value in private-collective software innovation, Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 10-065, available at SSRN: https://ssrn.com/abstract=1550352 (2010).
[65] Raymond, E., The cathedral and the bazaar, Knowledge, Technology & Policy, (3), 23-49 (1999).
[66] Nosek, B.A., Alter, G., Banks, G.C., Borsboom, D., Bowman, S.D., Breckler, S.J., ... and Contestabile, M., Promoting an open research culture, Science, (6242), 1422-1425 (2015).
[67] Munafò, M.R., Nosek, B.A., Bishop, D.V., Button, K.S., Chambers, C.D., du Sert, N.P., ... and Ioannidis, J.P., A manifesto for reproducible science, Nature Human Behaviour, (1), 0021 (2017).
[68] Maillart, T., and Sornette, D., Aristotle vs. Ringelmann: A response to Scholtes et al. on Superlinear Production in Open Source Software, arXiv preprint arXiv:1608.03608 (2016).