The scale-free character of the cluster mass function and the universality of the stellar IMF
Draft version December 4, 2018
Preprint typeset using LaTeX style emulateapj v. 03/07/07
THE SCALE-FREE CHARACTER OF THE CLUSTER MASS FUNCTION AND THE UNIVERSALITY OF THE STELLAR IMF.
Fernando J. Selman and Jorge Melnick
European Southern Observatory, Santiago, Chile.
ABSTRACT

Our recent determination of a Salpeter slope for the IMF in the field of 30 Doradus (Selman and Melnick 2005) appears to be in conflict with simple probabilistic counting arguments advanced in the past to support observational claims of a steeper IMF in the LMC field. In this paper we re-examine these arguments and show by explicit construction that, contrary to these claims, the field IMF is expected to be exactly the same as the stellar IMF of the clusters out of which the field was presumably formed. We show that the current data on the mass distribution of clusters themselves is in excellent agreement with our model, and is consistent with a single spectrum by number of stars of the type $n^{\beta}$, with $\beta$ between $-1.8$ and $-2.2$, down to the smallest clusters without any preferred mass scale for cluster formation. We also use the random sampling model to estimate the statistics of the maximal mass star in clusters, and confirm the discrepancy with observations found by Weidner and Kroupa (2006). We argue that rather than signaling the violation of the random sampling model, these observations reflect the gravitationally unstable nature of systems with one very large mass star. We stress the importance of the random sampling model as a null hypothesis whose violation would signal the presence of interesting physics.

Subject headings: galaxies: evolution – galaxies: star clusters – galaxies: stellar content – Galaxy: stellar content – stars: formation – stars: mass function – star formation – initial mass function – IMF – Star clusters

INTRODUCTION
In a recent paper (Selman and Melnick 2005) we measured the Initial Mass Function (IMF) of the field in the 30 Doradus super-association and found that for $7 \le m/M_\odot \le 40$ the field IMF can be characterized as a power law with the Salpeter (1955) slope. This result contradicts claims of a steep IMF for the LMC field (Massey et al. 1995; Massey 2002; Gouliermis et al. 2005), and lends support to the hypothesis of a universal IMF. However, the observation of an initial mass spectrum of the same slope in clusters and the field goes against the probabilistic counting arguments of Vanbeveren (1982) as interpreted by Kroupa and Weidner (2003, henceforth KW2003). Following Vanbeveren, KW2003 posit that if the field population is entirely formed out of disrupted clusters, then the field IMF must be steeper because there are many more low mass clusters than massive ones, and low mass clusters cannot contain stars more massive than the clusters themselves.

Although an extensive review of the literature is beyond the scope of the present work, a brief tour will place it in its proper context. van Albada (1968b) built groups of stars by randomly sampling an IMF $f(m)\,dm$ and gives the formulas for the general order statistics, where the distribution function of the maximum mass star is one particular case. (Footnote: Recently Oey and Clarke (2005b) have shown, using the random sampling model, that the statistics of the maximal mass star in a number of OB associations shows evidence of an upper mass limit in the range 100-200 $M_\odot$.) Reddish (1978) gives the formulation used by Vanbeveren, which appears to be one of the first references that gives the formula for the maximal stellar mass as an integral of the IMF (Equation 16 below). Larson (1982) studied the correlation between the maximum stellar mass and the mass of the parent molecular clouds in star-forming regions. He noted that the observed correlation, $M^{\max}_{*} = 0.\,M^{.}_{\rm cloud}$, could be explained by stochastic sampling of an IMF with Salpeter slope.
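The order-statistics idea mentioned above can be illustrated with a short sketch. If $n$ stellar masses are drawn independently from an IMF with cumulative distribution $F(m)$, the largest of them has cumulative distribution $F(m)^n$. The code below assumes a pure Salpeter power law, $f(m) \propto m^{-(1+x)}$ with $x = 1.35$, truncated at 0.1 and 150 $M_\odot$; these limits and all function names are illustrative assumptions, not the parametrization used in this paper.

```python
def salpeter_cdf(m, m_lo=0.1, m_hi=150.0, x=1.35):
    """CDF of f(m) ∝ m^-(1+x) on [m_lo, m_hi] (assumed illustrative limits)."""
    if m <= m_lo:
        return 0.0
    if m >= m_hi:
        return 1.0
    a, b = m_lo ** -x, m_hi ** -x
    return (a - m ** -x) / (a - b)


def max_mass_cdf(m, n, m_lo=0.1, m_hi=150.0, x=1.35):
    """P(largest of n independent masses <= m) = F(m)^n."""
    return salpeter_cdf(m, m_lo, m_hi, x) ** n


def max_mass_quantile(p, n, m_lo=0.1, m_hi=150.0, x=1.35):
    """Mass below which the maximum falls with probability p (solves F(m)^n = p)."""
    u = p ** (1.0 / n)
    a, b = m_lo ** -x, m_hi ** -x
    return (a - u * (a - b)) ** (-1.0 / x)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 10000):
        print(n, max_mass_quantile(0.5, n))  # median maximum mass for n stars
```

Under these assumptions the median maximum mass grows with $n$ purely as a sampling effect, which is the size-of-sample behaviour that any physical interpretation of a maximum-mass correlation has to be compared against.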
To study dynamical biasing (van Albada 1968a) in binary star formation, McDonald and Clarke (1993) used a two-step process in which they sample stars assuming that a certain fraction $g(N)$ of them come from groups of size $N$, and then sampled a stellar mass spectrum to build several statistics of binary stars. The method was extended by Sterzik and Durisen (1998) to study the decay of gravitational few body systems. To build their clusters they introduced what they called a “two-step” approach, which later led them to the “two-step initial mass function” (Durisen et al. 2001): first draw a cluster mass from a cluster mass function, then draw enough stars from a stellar IMF to add up to the cluster mass. The method presented here is similar, but with the important difference that we do not censor by mass; rather, we work with a cluster spectrum by number, and draw stars from a stellar mass function. This has the important consequence that we know by construction that there are no preferred mass scales other than those present in the stellar mass spectrum or the spectrum of clusters by number. A similar model was used by Oey et al. (2004) to study the distribution of clusters by numbers in the SMC, concluding that the data for the high mass groupings studied is consistent with an $n^{-}$ distribution.

Vanbeveren (1982), using the then assumed Salpeter slope for the field stellar IMF, concludes that “massive aggregates would contain more OB type stars than predicted by the Salpeter IMF.” Interestingly, KW2003 turn the argument around and use the well established Salpeter form for the cluster stellar IMF for $m > M_\odot$ to infer a slope steeper than Salpeter for the field stellar IMF in the same mass range.
With the exception of the LMC work mentioned above, the evidence for a Salpeter slope for the field stellar population is overwhelming, from the original Salpeter (1955) work on the Milky Way to more recent work such as that of Scalo (1986), which slightly steepens the slope of the high mass end from 2.35 to 2.7 (for recent reviews on this topic the reader is referred to Kroupa 2002; Elmegreen 2006). Recently, Weidner and Kroupa (2006; henceforth WK2006) used an extensive set of Monte Carlo simulations to investigate the question of whether clusters could be constructed by sampling stellar IMFs using different sampling prescriptions. Their strong conclusion is that the model in which clusters are built by random sampling of a (Salpeter) stellar IMF is falsified by the statistics of the maximal mass star in clusters, stating: “With this contribution we demonstrate conclusively that the purely statistical notion is false, and that the stellar IMF is sampled to a maximum stellar mass that correlates with the cluster mass.”

The purpose of this paper is to revisit this issue and to examine the question of whether clusters and the field sample a universal stellar mass distribution. We show that what really matters is the (poorly determined) cluster mass spectrum in the range of single star masses ($M < M_\odot$), and that it is more natural to work with the cluster “number of stars spectrum”, $P(n)$, the probability of a cluster having $n$ stars. We address this question in two different ways: by studying it from first principles, and by actually doing Monte Carlo experiments, building clusters by randomly sampling a universal stellar IMF and comparing the results with the observations. The random sampling model has no other physics in it than that input from the stellar mass spectrum and the cluster number spectrum. It should be considered as a null hypothesis for interesting physical processes: its violation signals the presence of interesting physics. Occam’s razor should be used with all models that violate the null hypothesis until strong observational evidence renders the model untenable.

In Section 2 we present the formal framework for the subsequent analysis, and give an analytical parametrization of the stellar IMF that agrees reasonably well with observations at all masses. In that section we also present an analytical relationship between the cluster mass function and the stellar IMF. We use this relation to conduct Monte Carlo experiments to simulate the mass distribution of clusters. In Section 3 we compare our simulations with the observed distribution of embedded clusters presented by Lada and Lada (2003, henceforth LL2003). The claim by LL2003 that there is a preferred mass scale for cluster formation is not borne out by our analysis, and a critical discussion to uncover the sources of this discrepancy is presented. In Sections 3.2 and 4 we challenge the view that all stars form in clusters and argue that our results favor a view where stars form, or at least acquire their final properties, before cluster formation. Section 5 summarizes our results and ends with the usual plea for more observations.

BUILDING A FIELD POPULATION FROM CLUSTERS: THE FORMALISM.
We will use the term population in the statistical sense: a set with infinitely many elements (Brandt 1998). Consider a population of stars with a Salpeter frequency distribution of masses $f(m)$. The mass $m$ of the stars therefore is a random variable with a frequency distribution $f(m)$. Let us draw samples from such a population with a fixed number of stars $n$ and frequency distribution $P(n)$. Each of the samples will be called a cluster, although such “clusters” can contain a single star. This construction is analogous to those used in previous work studying the properties of HII regions in galaxies (Oey and Clarke 2005a), the more general study of Poissonian fluctuations in population synthesis models by Cerviño et al. (2002), and the analysis of the isolated massive stars in the Milky Way by de Wit et al. (2005).

The frequency distribution function of cluster masses will be given by

$$\xi_{cl}(M) = \sum_{n=1}^{\infty} \; \sum_{M = m_1 + m_2 + \cdots + m_n} F_n(m_1, m_2, \cdots, m_n)\, P(n), \tag{1}$$

where $F_n$ is the multivariate frequency distribution of masses for a sample of size $n$, and the summation is understood also as a multiple integral over all masses $m_1, \cdots, m_n$ satisfying the constraint that they add up to $M$ (which implies quite a complex domain of integration). If the sample is random then the following two conditions are satisfied:

(a) the individual $m_i$ must be independent, that is,

$$F_n(m_1, m_2, \cdots, m_n) = f_1(m_1)\, f_2(m_2) \cdots f_n(m_n), \tag{2}$$

(b) the individual marginal distributions must be identical and equal to the frequency distribution of the parent population, that is,

$$f_1(m_1) = f_2(m_2) = \cdots = f_n(m_n) = f(m). \tag{3}$$

We can write an explicit expression for the cluster mass function (we will use lowercase for stellar quantities and uppercase for cluster quantities). Because we consider only random samples, the variable $M = m_1 + m_2 + \cdots + m_n$ is also a random variable.
Thus, the distribution function of $M$, $F_n(M)$, can be written as $F_n(M) = \int_{-\infty}^{+\infty}$ […]

Using the formalism described in the previous section we build clusters by randomly sampling the following “universal” stellar IMF,

$$dN \propto m^{\alpha}\, e^{-(m/m)^{q}}\, (m + m)^{\gamma/}\, dm \tag{15}$$

where $\gamma$, $\alpha$, $m$, $m$ and $q$ are chosen to give the appropriate behavior at low and high masses: $-\alpha - \gamma = -.$ , $m = 0.\,M_\odot$, $m = 150\,M_\odot$, $q = 3$. Figure 1 shows this analytical stellar IMF together with the IMF of the Trapezium cluster (Hillenbrand and Carpenter 2000; Luhman et al. 2000; Muench et al. 2002). Our analytical formula departs from the observations at the higher masses because we have chosen to preserve the Salpeter slope for $1 \le m/M_\odot \le\ \sim$ […] $n$ stars assuming a scale-free frequency distribution $P(n) \propto n^{\beta}$, and we build the cluster mass spectrum using Equation 10,

$$\xi_{cl}(M) = \sum_{n=1}^{\infty} F_n(M)\, P(n),$$

where $F_n(M)$ is the mass distribution function of clusters with exactly $n$ stars. Notice that this process is scale-free only if the sum starts from $n = 1$, in which case the only mass scales of the problem are $m_l$ and $m_u$, the lower and upper mass cut-offs of the stellar IMF. Our Monte Carlo simulations consist of repeatedly drawing $n$ stars from the Salpeter IMF, calculating $M = m_1 + \cdots + m_n$ as the mass of a cluster with $n$ stars, and then obtaining $F_n(M)$.

Fig. 1. The analytical stellar IMF that we use for our MC experiments compared with the IMF of the Trapezium cluster by Hillenbrand and Carpenter (2000), steps; Luhman et al. (2000), segmented solid lines; and Muench et al. (2002), asterisks.

For the value of $\beta$ there have been a multitude of studies of massive clusters in galaxies, which give for the mass functions values ranging between $\beta = -.$ […] $\beta = -.$ […] $\beta = -.7$.
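The Monte Carlo procedure described above (draw a cluster size $n$ from $P(n) \propto n^{\beta}$, draw $n$ stellar masses from the IMF, and add them up) can be sketched as follows. This is a minimal illustration, not the code used for the paper: it assumes a pure Salpeter power law truncated at 0.1 and 150 $M_\odot$ in place of the analytical IMF of Equation 15, it truncates $P(n)$ at an arbitrary `n_max`, and all function names are hypothetical.

```python
import random
from itertools import accumulate


def draw_star_mass(rng, m_lo=0.1, m_hi=150.0, x=1.35):
    """Inverse-transform draw from f(m) ∝ m^-(1+x) on [m_lo, m_hi] (assumed limits)."""
    u = rng.random()
    a, b = m_lo ** -x, m_hi ** -x
    return (a - u * (a - b)) ** (-1.0 / x)


def make_size_sampler(beta=-2.0, n_min=1, n_max=2000):
    """Return a function drawing n from P(n) ∝ n^beta on n_min..n_max (n_max is an assumption)."""
    sizes = list(range(n_min, n_max + 1))
    cum_weights = list(accumulate(n ** beta for n in sizes))

    def sample(rng):
        return rng.choices(sizes, cum_weights=cum_weights, k=1)[0]

    return sample


def build_clusters(n_clusters, beta=-2.0, n_min=1, n_max=2000, seed=0):
    """Build clusters as lists of stellar masses; a 'cluster' may hold a single star."""
    rng = random.Random(seed)
    sample_n = make_size_sampler(beta, n_min, n_max)
    return [[draw_star_mass(rng) for _ in range(sample_n(rng))]
            for _ in range(n_clusters)]


if __name__ == "__main__":
    clusters = build_clusters(1000, beta=-2.0, n_min=1)
    cluster_masses = [sum(c) for c in clusters]   # M = m_1 + ... + m_n
    field = [m for c in clusters for m in c]      # field built from dissolved clusters
    print(len(field), min(cluster_masses), max(cluster_masses))
```

Because every star in every cluster is drawn from the same distribution, the pooled `field` list is by construction a sample of the input IMF, provided single-star clusters are kept. Setting `n_min=35`, for instance, imposes the kind of cut-off in $n$ discussed in connection with the embedded cluster sample.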
For massive clusters one can directly use the same exponent for the mass function as for $P(n)$ because, as discussed above, the total mass scales with $n$ and the width for fixed $n$ scales with $\sqrt{n}$. In this work we will explore $\beta = -1.8$, $-2.0$, and $-2.2$.

COMPARISON WITH OBSERVATIONS

We have identified two observational tests that can be performed to check the validity of our null hypothesis. First, we will see if we can reproduce the form of the embedded cluster mass function; second, we will see if we can reproduce the statistics of the most massive star in clusters. We are aware that we are leaving out tests regarding the characteristics of small $n$ multiple systems, which could falsify it. (Footnote: The study of the statistics of small $n$ multiple systems is beyond the scope of the present work, but even here, where observations of the frequency of high mass doubles appear to violate the simple random sampling model, there are physical mechanisms which explain them while preserving the model, namely, dynamical biasing (see Sterzik and Durisen 1998, and references therein).) But multiple systems, although numerous, are not the main source of stars in the field, so they will not affect the main conclusion of the present work, namely that the cluster and field stellar IMFs can be the same.

The embedded cluster mass function

Due to the difficulty of defining unbiased complete samples, the important range of cluster masses in the regime of stellar masses is not well studied. There are nevertheless two relatively recent sources based on extensive surveys of the literature at the time of publication: Porras et al. (2003) and Lada and Lada (2003). We prefer to use LL2003 for our analysis because they give estimates of the masses of the clusters, although only 4 of the clusters in the Porras et al. list that satisfy the constraint on minimum number of stars of LL2003 are not included in this catalog.
The cluster masses given in LL2003 were obtained by modeling source counts as a function of limiting magnitudes for two model clusters with ages of 0.8 Myr and 2 Myr, corresponding to the ages of the Trapezium and IC 348 clusters respectively. They assumed a universal IMF and used the average of the mass determined for the two assumed ages.

Figure 2 shows the empirical data of LL2003 together with the results of six runs of our MC experiments in which we built clusters with the above $P(n)$ for $n \ge$ […] $M \sim .95$ and $\log M \sim .$ […] $\beta = -.$ […] $\sim 40\%$ of the simulations and in almost 100% of the simulations contains less than 2 clusters. LL2003 proposed that the downturn at smaller masses was evidence for a favored cluster formation mass scale at around $M \sim M_\odot$. However, our simulations indicate that this downturn is naturally explained by the cutoff in $n$ they introduced in an otherwise scale-free spectrum, without the need to invoke a special cluster formation scale. The figure shows that the data is best modeled if the cutoff in $n$ is a bit larger than the LL2003 criterion of $n > 35$ to select clusters. This is probably the effect of having a sample with an inhomogeneous magnitude limit, so that $n > 35$ becomes only a lower limit to the actual cutoff.

The statistics of the maximal mass star in clusters

WK2006 argued that their observed correlation between the maximal star mass and the total mass of clusters is not consistent with the hypothesis that clusters are formed by random sampling of a universal stellar IMF. WK2006’s sample of clusters is strongly affected by a size-of-sample effect (see Appendix A). Because of the impracticality of finding a large ensemble of small clusters, and thus avoiding the problems introduced by the size-of-sample effect, WK2006 performed Monte Carlo experiments to determine the statistical properties of each of their three sampling methods.
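For the random sampling case, a Monte Carlo experiment of this kind can be sketched as below: clusters are built by random sampling and each one is binned by the logarithm of its total mass and the logarithm of its most massive star. The bin widths of 0.5 and 0.1 dex follow the values quoted in this section for the bivariate distribution; the pure Salpeter power law, its 0.1 and 150 $M_\odot$ limits, the truncation `n_max`, and the function names are illustrative assumptions rather than the setup of WK2006 or of this paper.

```python
import math
import random
from collections import Counter
from itertools import accumulate


def draw_mass(rng, m_lo=0.1, m_hi=150.0, x=1.35):
    """Inverse-transform draw from f(m) ∝ m^-(1+x) on [m_lo, m_hi] (assumed limits)."""
    u = rng.random()
    a, b = m_lo ** -x, m_hi ** -x
    return (a - u * (a - b)) ** (-1.0 / x)


def bivariate_counts(n_clusters=20000, beta=-2.0, n_min=1, n_max=1000,
                     seed=0, dlog_M=0.5, dlog_m=0.1):
    """Count clusters per (log M bin, log m_max bin) for randomly sampled clusters."""
    rng = random.Random(seed)
    sizes = list(range(n_min, n_max + 1))
    cum_weights = list(accumulate(n ** beta for n in sizes))
    counts = Counter()
    for _ in range(n_clusters):
        n = rng.choices(sizes, cum_weights=cum_weights, k=1)[0]
        masses = [draw_mass(rng) for _ in range(n)]
        i = math.floor(math.log10(sum(masses)) / dlog_M)   # cluster-mass bin index
        j = math.floor(math.log10(max(masses)) / dlog_m)   # maximal-star bin index
        counts[(i, j)] += 1
    return counts


if __name__ == "__main__":
    counts = bivariate_counts()
    i_bin = 3  # clusters with 1.5 <= log M < 2.0
    for (i, j), c in sorted(counts.items()):
        if i == i_bin:
            print(round(j * 0.1, 1), c)
```

Printing the counts along one column of fixed cluster mass, as in the usage lines above, is one way to look for a second local maximum near the point where a single star carries nearly all of the cluster mass.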
We were puzzled by their Figure 3, which shows that for their random sampling method (which corresponds to our Monte Carlo models), the curves of maximal mass star versus cluster mass ($M_{ecl}$) have two maxima in the range between $\sim M_\odot$ and $\sim M_\odot$. For example, for $M_{ecl} = 100\,M_\odot$ the curve peaks at $\sim M_\odot$ and then again at $\sim M_\odot$. Since it is precisely in this mass range that the model curve departs most strongly from the data points, we thought that the double peaks could be the result of a computational error. We therefore decided to repeat the calculations using our independent algorithm to build the bivariate (maximal stellar mass – cluster mass) probability distribution. The results of our simulations are shown in Figure 3, where, much to our surprise, we reproduce the double peaks obtained by WK2006!

Fig. 2. The Lada and Lada (2003) spectrum of masses of embedded clusters together with the results of Monte Carlo simulations in which we sample the stellar IMF with a number probability distribution $\beta = -1.8$, top row; $\beta = -2.0$, middle row; $\beta = -2.2$, bottom row; $n \ge 35$, left column; and $n \ge 70$, right column.

Figure 3 shows the results of many MC experiments of the random sampling model, from which we have calculated the bivariate probability distribution to have a cluster in a given log-mass bin of width 0.5, with a star of maximal mass in a log-mass bin of width 0.1. Lighter areas correspond to a higher probability. Figure 3a corresponds to MC experiments in which anything is considered a cluster, even systems with $n = 1$. Figure 3b considers only clusters with $n > 50$. The vertical line in Figure 3a marks the position of $\log M = 1.8$. Notice that as one moves from bottom to top along this line one will cross contour levels that at first increase until the bivariate distribution reaches a maximum at $\log m^{\max}_{*} \approx .9$. If one continues to move it will reach a minimum at approximately $\log m^{\max}_{*} \approx .4$, and then it starts increasing again, reaching a maximum at the point in which all the cluster mass is in a single star. Notice that this double (local) maxima feature comes from the nature of the probability distribution of star masses conditioned to cluster mass (Fig. 4, see below).

Interestingly, the clusters from the compilation of Weidner and Kroupa (2006) (crosses in Fig. 3) all concentrate in the ridge of the distribution defined by the first (lower mass) peak described above and do not cover the full mass range allowed by our models. To a lesser extent this is also true of the models by Weidner and Kroupa (2006), whose Figures 4 and 5 show the data to have a much smaller dispersion around the mean than the models. Nevertheless, the random sampling method deviates most from the data due to its “double peaked” mass distribution. With our preferred random sampling method, in clusters with masses in the stellar mass range the most massive stars can have masses similar to the total cluster mass. Moreover, the conditional probability distribution of stellar masses, $P(m|M)$, for clusters in the stellar mass range shows that some clusters can be dominated by one or at most a few high mass stars (Fig. 4). This increase in the probability distribution of stellar masses near the cluster mass is also visible in Figure 2 of Durisen et al. (2001), so the critical questions are whether this effect is real, and if so, whether it is significant. The fact that the effect is seen in three independent investigations argues strongly for the reality of the peak in the random sampling model, but its significance is debatable. On the one hand, the effect arises from partitioning the data into mass bins, and it is forced into existence by the need to satisfy Equation 14, so it is of no physical significance.
On the other hand, it biases (toward large values) the mean maximal stellar mass versus cluster mass curve used by Weidner and Kroupa (2006) to falsify the random sampling model, so it is highly significant.

Is this a real violation of our null hypothesis signaling the presence of some interesting physical effect, or is it the effect of improper data, or its analysis, or both? Although WK2006’s conclusions are based on a very limited data set, it is unlikely that this alone can explain the difference between the distribution of the data points and that predicted by the models: the number of clusters in each mass bin is small, but the total sample is not that small, and at all masses the data points delineate the lowest maximal stellar mass ridge of the distribution.

One possible explanation could come from the highly hierarchical nature of young stellar systems and the somewhat arbitrary way in which the parent object of the maximal star is chosen. For example, the cluster Westerlund 2 has the well known massive binary WR20a, both components of which have masses $\sim M_\odot$ (Rauw et al. 2004, 2005). Considering that the ratio of its components’ separation to the cluster size is smaller than the ratio of the cluster size to that of the Milky Way, there is no objective reason not to have such a binary star as a single point in Figure 3, in which case we would have a data point in the area devoid of points in that Figure. An objective algorithm to identify clusters of different numbers of stars is needed. A step in that direction is the work of Oey et al. (2004), which used the algorithm by Battinelli (1991) to identify groupings of stars and showed that the distribution of clusters by number of stars is consistent with $n^{-}$ down to single stars.
What is needed is then to determine the maximal mass star and total mass of those clusters to build the bivariate probability distribution.

Finally, another explanation is that these data do indeed falsify our null hypothesis and that they signal the presence of some interesting physical effect that makes the mass of the most massive star depend on the mass of its parent cluster. One possibility could be that stars form in an ordered fashion, less massive stars forming first. Once a high mass star is formed it cleans the cluster of its placental material and the cluster rapidly disintegrates (Elmegreen 1983). This is the explanation favoured by WK2006. But there is another possibility: clusters whose maximal mass star is too massive can be more gravitationally unstable.

(Footnote on the limited data set of WK2006: For example, their favored sorted sampling model predicts no single star clusters at all, while de Wit et al. (2004, 2005) find truly isolated massive stars of spectral types ranging between O5 and O9 (see also Zinnecker and Yorke 2007). None of these “single star clusters” are included in WK2006.)

Fig. 3. (a, top) The bivariate probability distribution function of maximum stellar mass and cluster mass, plotted as $\log m^{\max}_{*}/M_\odot$ against $\log M/M_\odot$. The overlaid points correspond to the data in Weidner and Kroupa (2006). This figure shows the result with a cluster $P(n) \sim /n$ starting at $n = 1$. The grey levels correspond to the bivariate probability of finding a cluster in a log-mass bin of size $0.5 \times 0.1$. The white circle represents WR20a taken as a system. The points represent the data points in WK2006 with a few additions from Weidner (2007). The vertical line is drawn at a $\log M$ value of 1.8. (b, bottom) Same as (a) but with $n > 50$. For details see main text.
A cluster with a maximal mass star of very large mass is characterized also by having a smaller total number of stars, and its conditional stellar mass function is flatter (see Figure 4). Terlevich (1987) found that a flatter mass spectrum results in a considerably smaller half-life: her model XV evolves an order of magnitude faster than a cluster with a normal Salpeter IMF. This explanation has the feature that it does not violate the null hypothesis, because clusters are formed according to the random sampling model but their lifetimes, and thus the probability of observing them, depend on the mass of the maximal mass star in them.

Fig. 4. Conditional probability distributions of stellar masses conditioned to cluster mass. The topmost curve corresponds to $P(m)$, where the cluster mass has been marginalized. The other curves correspond, from top to bottom, to $P(m|M)$ for $\log M$ equal to $-1.25$, $-0.75$, $-0.25$, 0.25, 0.75, 1.25, 1.75, 2.25, 2.75, 3.25, 3.75, and 4.25 respectively.

DISCUSSION

As mentioned in the Introduction, our results depart radically from those of KW2003. Their conclusion that the field built by clusters must have a steeper stellar IMF can be criticized on several accounts, as follows. To begin, as noted by Elmegreen (2006), for the observed range of cluster masses, the predicted steepening in the stellar IMF is rather small, going from $\Gamma = -1.35$ in clusters to $\Gamma \sim -.$ […] $\xi_{cl} \sim M^{-}$ for $5 \le M/M_\odot \le$ . As mentioned in the previous section, however, LL2003 claim that the cluster mass spectrum turns abruptly down for masses below $M \sim M_\odot$, that is, it appears to have a preferred mass-scale. This reduces the number of small mass clusters, thus reducing the predicted difference between cluster stellar IMF and the field.
Almost paradoxically, our model invoking a universal stellar IMF shows that this preferred mass-scale is most likely not a physically significant feature of cluster formation!

Another problem is that KW2003 use a Procrustean approach in their modeling of clusters where all clusters are forced to have one maximal mass star, that is, a star of the maximum mass allowed by the stellar IMF. This assumption,

$$1 = \int_{m^{\max}_{*}}^{m_u} f(m)\,dm, \tag{16}$$

forces the upper mass cut-off of the IMF to be an increasing function of cluster mass, varying as $M^{1/x}$. From the discussion leading to Equation 14 it is clear that we cannot recover the input IMF unless we include in our sums clusters for which $m = M$.

Elmegreen (2006) expanded the formalism of Vanbeveren and showed analytically and numerically that for a power-law mass distribution of clusters of slope $\beta \le$ […] $P(m|M)$ adopted by Elmegreen does not satisfy Equation 14 for any value of $\beta$; it just happens that for $\beta \le$ […] almost the same as the cluster IMF (they differ by a logarithmic multiplicative factor). This clearly shows that the finding of KW2003, that for $\beta >$ […] $P(m)$ and $P(m|M)$ have the same functional form (simple power-laws of the same slope in the case of Elmegreen). While this is a very good assumption for very massive clusters, it clearly does not apply for clusters in the stellar mass range, which are the ones responsible for “tilting” the sum-IMF for $\beta > 2$. Thus, even within the Vanbeveren formalism the IMF of clusters and the field can be strictly the same for any cluster mass distribution.

The results of WK2006 are confirmed in this work in the sense that we also find that real clusters cover a significantly smaller part of the maximal stellar mass and cluster mass space than permitted by the random sampling model.
WK2006 then go on to modify their sampling algorithm in such a way as to reduce the probability of having very large mass stars in their clusters, arguing that “Star clusters appears to form in an ordered fashion, starting with the lowest-mass stars until feedback is able to outweigh the gravitationally induced formation process.” Although this is a possible scenario, we favour a different one in which the area of the bivariate distribution allowed by the random sampling model is rendered unstable due to the extreme nature of the mass spectrum therein, in accordance with the simulations of Terlevich (1987). How much “trimming” of the bivariate distribution can actually be accomplished this way will be the subject of a future work.

Our strong conclusion is that the observations of the stellar IMF and the mass spectrum of young clusters are consistent with the hypothesis that clusters form by random sampling of a universal stellar IMF. This conclusion leads us to challenge the received hypothesis that clusters are the fundamental building blocks of the stellar populations in galaxies (de Wit et al. 2004, 2005). In this view clusters are given an independent existence from before the time that stars form. In our view, following the ideas of Elmegreen (1997), stars form in giant molecular clouds in a hierarchy of structures with different numbers and masses. Some of these structures end up forming large clusters (which will later dissolve) and some don’t: they become part of small associations of stars formed in neighboring regions almost by chance. In fact, the observations of de Wit et al. (2004, 2005), which are used as standard references for the view that clusters form first, find that 30% of young massive Galactic field stars are not members of clusters or OB associations. Of these, about 50% are runaway star candidates.
They conclude that $4 \pm 2\%$ of the stars in their sample result from truly isolated high-mass star formation, a number that can be reproduced “assuming that all stars are formed in clusters that follow a universal cluster distribution (by $N_*$) with slope $\beta \sim -.$ down to clusters with a single member.” If we consider single star clusters, the statement that all stars are formed in clusters becomes a tautology.

Our results hint at a strong universality hypothesis for the IMF where not only the power-law part, but the full function may be universal. Clearly this claim has profound implications for understanding how stars form, and therefore its foundations require considerably more observational work than was available for the tests presented in this paper.

SUMMARY

This paper combines four separate results within a single unified view. The unification is actually a result of Equation 11 (derived formally in the first section of the paper), which relates the cluster mass function with the probability distribution of number of stars in clusters, and the (universal) stellar IMF. These results are:

• the IMF of a field stellar population and that of the clusters out of which the field was built can be strictly the same. This result contradicts Kroupa and Weidner (2003), and was implicit in the models by Larson (1982), Oey and Clarke (2005a,b), and de Wit et al. (2004);

• the observations of the lower end of the cluster mass function, as given by Lada and Lada (2003), agree with the random sampling model presented here if: (a) the distribution of clusters by the number of stars they contain is a scale-free power law, $n^{\beta}$, with $\beta$ between $-1.8$ and $-2.2$; (b) the stellar IMF is independent of $n$ and it is given by the Salpeter form;

• the observed special mass scale for cluster formation claimed by Lada and Lada arises from the arbitrary cut-off in the number of stars imposed by them;

• the interpretation of the statistics of the most massive star in clusters is a valuable tool to study cluster formation processes, as the observations, taken at face value, violate the null hypothesis represented by the random sampling model. Although a proper observational study requires a sample including many clusters with only a few members, we believe that the observations presented by WK2006 are at worst compelling. Nevertheless, it is argued in the present work that the discrepancy is due to systems which are rendered gravitationally unstable by the presence of one or more very massive stars.

We would like to thank our anonymous referee, whose comments helped us to improve this work.

Fig. 5. The maximum stellar mass in a sub-sample of the data of WK2006 analyzed as detailed in the main text. The x-axis shows the average mass of the clusters in the 800 $M_\odot$ “super clusters”, while the y-axis shows the mass of the maximal mass star in the super clusters.

APPENDIX

SIZE-OF-SAMPLE EFFECT

The size-of-sample effect arises in many areas of astronomy, and we have studied it in the context of the distribution of sizes of super-associations in galaxies (Selman and Melnick, 2000). The formalism developed there can be translated mutatis mutandis to the present context as follows: Let the whole set of star clusters to be analyzed be denoted by $\mathcal{C} = \{C_i\}_{i=1}^{N}$, where $C_i$ is the $i$-th cluster with mass $M_i$ and maximum stellar mass $m^{\max}_{*i}$.
We will assume $\mathcal{C}$ to be ordered from the most massive to the least massive cluster, that is, $i < j \Rightarrow M_i > M_j$. Let $\mathcal{M}_l = \sum_{i=1}^{l} M_i / l$ be the average mass of the $l$ most massive clusters. From $\mathcal{C}$ we will draw $N_S$ sub-samples, $\mathcal{S}_j$, of equal mass, $\mathcal{M}_l$, defined by $\mathcal{S}_j = \{C_i\}_{i=j}^{n_j}$, where $n_j$ is defined by the expression

$$\sum_{i=j}^{n_j} M_i = \mathcal{M}_l .$$

Thus, the sub-sample $\mathcal{S}_j$ contains the cluster $C_j$ and the next $n_j - j$ clusters, so that its total mass is $\mathcal{M}_l$. We will assign to each sub-sample $j$ two numbers, $\tilde{m}^{\max}_{*j}$ and $\tilde{M}^{\rm avg}_j$, defined as

$$\tilde{m}^{\max}_{*j} = \max_{C_i \in \mathcal{S}_j} m^{\max}_{*i}, \qquad \tilde{M}^{\rm avg}_j = \frac{1}{n_j - j + 1} \sum_{i=j}^{n_j} M_i .$$

$\tilde{m}^{\max}_{*j}$ is equal to the maximum stellar mass of all the members of $\mathcal{S}_j$, and $\tilde{M}^{\rm avg}_j$ is their average mass. We will refer to sub-sample $j$ as “super-cluster” $j$. Because all “super-clusters” thus defined have approximately equal total mass ($\approx \mathcal{M}_l$), we can compare the mass of their maximal star without having a size-of-sample effect.

Regrettably, the data set and the method of analysis used by WK2006 are far from what is needed for this kind of analysis. Their Table 1 lists 17 clusters with masses ranging from 25 $M_\odot$ to $10^5$ $M_\odot$. We have seen that we should actually work with the number of stars instead of the mass of the clusters, as conditioning on cluster mass introduces unphysical mass scales. If we use an average stellar mass of 0.33 $M_\odot$, then the smallest-mass cluster corresponds to a cluster with ≈ 75 stars, while the largest-mass cluster corresponds to ≈ $3 \times 10^5$ stars. The sample should include at least 4000 clusters with 75 stars ($4000 \times 75 = 3 \times 10^5$ stars) to meaningfully compare the maximal mass of this artificial “super-cluster” with the maximal mass of a cluster of $10^5$ $M_\odot$ (such as R136). Figure 5 plots the maximal mass against total mass for the best sub-sample of “super-clusters” that can be constructed from the data of WK2006 using the algorithm described above.
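The super-cluster construction described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for Figure 5: the names `Cluster` and `super_clusters` are hypothetical, the demo masses are invented, and because real cluster masses never sum exactly to the target, the sketch closes each window at the first cluster that brings the running total up to the target mass and drops any trailing window that cannot reach it.

```python
from typing import List, NamedTuple, Tuple


class Cluster(NamedTuple):
    name: str
    mass: float      # total cluster mass M_i, in solar masses
    max_star: float  # mass of its most massive star, in solar masses


def super_clusters(
    clusters: List[Cluster], l: int = 1
) -> List[Tuple[List[str], float, float]]:
    """Return (member names, maximal stellar mass, average cluster mass)
    for each equal-mass sub-sample ("super-cluster")."""
    # Order from the most massive to the least massive cluster.
    ordered = sorted(clusters, key=lambda c: c.mass, reverse=True)
    # Target mass: the average mass of the l most massive clusters.
    target = sum(c.mass for c in ordered[:l]) / l

    result = []
    for j in range(len(ordered)):
        members: List[Cluster] = []
        total = 0.0
        for c in ordered[j:]:
            members.append(c)
            total += c.mass
            # Assumption: close the window once the target mass is reached.
            if total >= target:
                break
        if total < target:
            # The remaining clusters cannot reach the target mass.
            break
        m_max = max(c.max_star for c in members)
        m_avg = total / len(members)
        result.append(([c.name for c in members], m_max, m_avg))
    return result


# Invented demo data (not from WK2006): name, cluster mass, maximal star mass.
demo = [
    Cluster("A", 800.0, 20.0),
    Cluster("B", 500.0, 30.0),
    Cluster("C", 300.0, 10.0),
    Cluster("D", 200.0, 25.0),
    Cluster("E", 100.0, 5.0),
]
groups = super_clusters(demo, l=1)
for names, m_max, m_avg in groups:
    print(names, m_max, m_avg)
```

With `l=1` the target is the mass of the single most massive cluster, so the first super-cluster is that cluster alone and later super-clusters are sliding windows of progressively smaller clusters. Consecutive windows may share members, which is why the same cluster can appear in more than one super-cluster.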
The best sub-sample consists of NGC 6530 as the first “super-cluster”; NGC 2264, Mon R2, and σ Ori in the second; Mon R2, σ Ori, NGC 2024, and IC 348 in the third; and σ Ori, NGC 2024, IC 348, ρ Oph, NGC 1333, Ser SVS2, and Taurus-Auriga in the fourth. Each of these four “super-clusters” has a total mass of approximately 800 $M_\odot$. This “best set” shows no correlation between maximal star mass and cluster richness: the claimed correlation was a size-of-sample effect. (It is possible to construct other sub-samples having a more massive star in the upper mass bin, but these sub-samples contain only 2 or 3 super-clusters.)

REFERENCES

P. Battinelli. A&A, 244:69, 1991.
S. Brandt. Data Analysis. Springer-Verlag, New York, 1998.
M. Cerviño, D. Valls-Gabaud, V. Luridiana, and J. M. Mas-Hesse. A&A, 381:51, 2002.
R. de Grijs and P. Anders. MNRAS, 366:295, 2006.
W. J. de Wit, L. Testi, F. Palla, L. Vanzi, and H. Zinnecker. A&A, 425:937, 2004.
W. J. de Wit, L. Testi, F. Palla, and H. Zinnecker. A&A, 437:247, 2005.
R. H. Durisen, M. F. Sterzik, and B. K. Pickett. 952:952, 2001.
B. G. Elmegreen. MNRAS, 203:1011, 1983.
B. G. Elmegreen. ApJ, 648:572, 2006.
D. Gouliermis, W. Brandner, and Th. Henning. ApJ, 623:846, 2005.
L. A. Hillenbrand and J. M. Carpenter. ApJ, 540:236, 2000.
D. A. Hunter, B. G. Elmegreen, T. J. Dupuy, and M. Mortonson. AJ, 126:1836, 2003.
M. Kendall and A. Stuart. The advanced theory of statistics. Charles Griffin and Company, London, 1977.
P. Kroupa. Science, 295:82, 2002.
P. Kroupa and C. Weidner. ApJ, 598:1076, 2003.
C. J. Lada and E. A. Lada. ARA&A, 41:57, 2003.
R. B. Larson. MNRAS, 200:159, 1982.
K. L. Luhman, G. H. Rieke, E. T. Young, A. S. Cotera, H. Chen, M. J. Rieke, G. Schneider, and R. I. Thompson. ApJ, 540:1016, 2000.
P. Massey. ApJS, 141:81, 2002.
P. Massey, C. C. Lang, K. DeGioia-Eastwood, and C. Garmany. ApJ, 438:188, 1995.
J. M. McDonald and C. J. Clarke. MNRAS, 262:800, 1993.
P. M. Morse and H. Feshbach. Methods of theoretical physics. McGraw-Hill, New York, 1953.
A. A. Muench, E. A. Lada, C. J. Lada, and J. Alves. ApJ, 573:366, 2002.
M. S. Oey and C. J. Clarke. AJ, 115:1543, 2005a.
M. S. Oey and C. J. Clarke. ApJ, 620:43, 2005b.
M. S. Oey, N. L. King, and J. Wm. Parker. AJ, 127:1632, 2004.
A. Porras, C. Micol, J. Di Francesco, T. S. Megeath, and P. C. Myers. AJ, 126:1916, 2003.
G. Rauw, M. De Becker, Y. Nazé, P. A. Crowther, E. Gosset, H. Sana, K. A. van der Hucht, J.-M. Vreux, and P. M. Williams. A&A, 420L:9, 2004.
G. Rauw, P. A. Crowther, M. De Becker, E. Gosset, Y. Nazé, H. Sana, K. A. van der Hucht, J.-M. Vreux, and P. M. Williams. A&A, 432:985, 2005.
V. C. Reddish. Star formation. Pergamon Press, 1978.
E. E. Salpeter. ApJ, 121:161, 1955.
J. Scalo. Fund. Cosmic Phys., 11:1, 1986.
F. Selman and J. Melnick. A&A, 443:851, 2005.
M. F. Sterzik and R. H. Durisen. A&A, 339:95, 1998.
E. Terlevich. MNRAS, 224:193, 1987.
T. S. van Albada. Bull. Astr. Inst. Netherlands, 19:479, 1968a.
T. S. van Albada. Bull. Astr. Inst. Netherlands, 20:57, 1968b.
D. Vanbeveren. A&A, 115:65, 1982.
C. Weidner and P. Kroupa. MNRAS, 365:1333, 2006.
H. Zinnecker and H. W. Yorke.