Self-Organizing Teams in Online Work Settings
Ioanna Lykourentzou, Federica Lucia Vinella, Faez Ahmed, Costas Papastathis, Konstantinos Papangelis, Vassilis-Javed Khan, Judith Masthoff
IOANNA LYKOURENTZOU, Utrecht University, The Netherlands
FEDERICA LUCIA VINELLA, Utrecht University, The Netherlands
FAEZ AHMED, Massachusetts Institute of Technology, USA
COSTAS PAPASTATHIS, University of Peloponnese, Greece
KONSTANTINOS PAPANGELIS, Rochester Institute of Technology, USA
VASSILIS-JAVED KHAN, Eindhoven University of Technology, The Netherlands
JUDITH MASTHOFF, Utrecht University, The Netherlands

As the volume and complexity of distributed online work increase, collaboration among people who have never worked together in the past is becoming increasingly necessary. Recent research has proposed algorithms to maximize the performance of such teams by grouping workers according to a set of predefined decision criteria. This approach micro-manages workers, who have no say in the team formation process. Depriving users of control over who they will work with stifles creativity, causes psychological discomfort and results in less-than-optimal collaboration results. In this work, we propose an alternative model, called Self-Organizing Teams (SOTs), which relies on the crowd of online workers itself to organize into effective teams. Supported but not guided by an algorithm, SOTs are a new human-centered computational structure, which enables participants to control, correct and guide the output of their collaboration as a collective. Experimental results, comparing SOTs to two benchmarks that do not offer user agency over the collaboration, reveal that participants in the SOTs condition produce results of higher quality and report higher teamwork satisfaction. We also find that, similarly to machine learning-based self-organization, human SOTs exhibit emergent collective properties, including the presence of an objective function and the tendency to form more distinct clusters of compatible teammates.

CCS Concepts: • Human-centered computing → Collaborative and social computing.

Additional Key Words and Phrases: online teams, distributed work, complex work, macro-task, self-organization
ACM Reference Format:
Ioanna Lykourentzou, Federica Lucia Vinella, Faez Ahmed, Costas Papastathis, Konstantinos Papangelis, Vassilis-Javed Khan, and Judith Masthoff. 2020. Self-Organizing Teams in Online Work Settings. 1, 1 (February 2020), 39 pages. https://doi.org/10.1145/1122445.1122456
As online work increases in complexity, crowdsourcing research and practice turn more and more toward collaboration. Examples of problems where large-scale crowd collaboration has proven valuable include scientific research and article authoring [116], designing software prototypes [98], writing stories [68, 69], and collaborative idea generation [110].

Authors' addresses: Ioanna Lykourentzou, Utrecht University, The Netherlands, [email protected]; Federica Lucia Vinella, Utrecht University, The Netherlands, [email protected]; Faez Ahmed, Massachusetts Institute of Technology, USA, [email protected]; Costas Papastathis, University of Peloponnese, Greece, [email protected]; Konstantinos Papangelis, Rochester Institute of Technology, USA, [email protected]; Vassilis-Javed Khan, Eindhoven University of Technology, The Netherlands, [email protected]; Judith Masthoff, Utrecht University, The Netherlands, [email protected].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
XXXX-XXXX/2020/2-ART $15.00
https://doi.org/10.1145/1122445.1122456
Crowd teams differ from traditional teams working face-to-face or in online corporate settings in that they (i) consist largely of people who have not worked together prior to the crowdsourcing task, (ii) need to perform efficiently in relatively little time, and (iii) cannot be assumed to share a common set of workplace values, such as loyalty to a specific organization.

The scale of crowdsourcing and its online nature mean that algorithms are often involved in forming crowd teams. This is in contrast to typical team formation in face-to-face settings, where a manager or an expert user may assign members to a team based on knowledge of their skills or of their past history of working together [47]. Algorithm-based methods for team formation attempt to do the same at scale, either before the task starts, by pre-profiling workers and then assigning them to teams [94], or during the task, by changing the group synthesis to improve collaboration elements such as viewpoint diversity [104] or interpersonal compatibility [80]. Broadly speaking, team formation algorithms belong to the category of crowdsourcing management algorithms, whose objective is to match persons to other persons or persons to tasks, and in this way to optimize the speed or efficiency of crowdsourced task production.

The problem with existing algorithms is that they micro-manage workers by design. Workers have little to no say in who they collaborate with, for how long, or how. In the case of algorithms that assign individuals to teams before the task starts, the algorithm deduces the individual's performance within their future team from a set of pre-calculated profiling features, such as expected performance. Unfortunately, these features are often incomplete and subject to significant noise, given the sparsity of data regarding, for example, complex skills.
In addition, elements such as interpersonal compatibility, which are inevitably revealed only after the team has worked together, are not taken into account. If the collaboration does not go as expected, workers cannot signal so, and they cannot change teams during the process. In the case of algorithms that do change the teams during the task, these changes again happen by deduction and in an invasive manner, without explicitly asking the workers or requesting feedback from them. Workers literally receive a message informing them that they have been placed with a new team, and they have to adapt to this decision. User control of the collaboration process is largely absent, with repercussions that can range from psychological discomfort [95] to less-than-optimal collaboration results [31].

As one may expect, not actively involving the workers, but rather assigning them directly to a group (or a task), comes with disadvantages. Recent research in management sciences [76] and in crowdsourcing [97] indicates that too close a monitoring can stifle worker creativity and initiative-taking: two features that are absolutely necessary in creative, complex teamwork. In parallel, a substantial body of literature on the nature of collaboration indicates that providing agency to the workers can result in better collaboration and increased feelings of control [52]. Finally, trusting the workers to co-design the task workflow, which in the case of collaborative work means to co-decide who works with whom, has proven beneficial for solving complex problems for which no evident solution is to be found [27].

A problem therefore arises: (how) can we balance algorithm-based team formation, which is necessary given the scale of crowdsourcing, with the need to give each individual online worker a say in who they will work with to best accomplish the task?
In this work we explore a new concept: algorithm-assisted self-organization, an approach designed to empower online workers with the opportunity to choose their teammates, and thus to “guide” the algorithmic process of team formation. Self-organization is a management principle that has often been used in other types of collaboration settings, such as Open Source Software communities and agile corporate teams [11]. It has, however, never been used in an online collaboration setting that requires the parallel involvement of a team formation algorithm.
Self-Organizing Teams (SOTs), which are the result of our approach, can be better understood through a metaphor from the machine learning field. In a similar way that Self-Organizing Maps (SOMs) gradually re-arrange their input data to form highly coherent clusters based on Euclidean distances, people in the SOTs concept gradually discover the best teammate(s) to work with, based on reputation and personal experience of working together as the task progresses. An algorithm assists this process, using people's explicit teammate preferences to form teams that are dictated not by an assumption of which teams could work, but by which teams the workers themselves indicate will work. The fact that the algorithm does not rely on any pre-established assumption, e.g. which worker features to use to form the teams, means that SOTs are largely task-agnostic and can thus be applied to a variety of complex online work tasks. Being task-agnostic is a second advantage of this method, given that currently complex task workflows tend to be both expensive to develop and over-adapted to a specific task [111].

Self-organization in crowdsourcing is a new research line, and many alternatives can be explored regarding its implementation. In this paper we work with a collaborative-competitive setting, where the task is accomplished in discrete steps, called rounds. The specific task that we work with is collaborative short story writing, chosen because it is a complex task that does not require specialised expertise, does not have one evident solution, and can highly benefit from co-creation [34]. During each round, workers collaborate in small teams to continue a main story. At the end of each round, workers decide as a collective on a single “winning” team, by voting for their preferred story continuation.
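The SOM side of this metaphor can be made concrete with a minimal sketch (ours, not code from the paper): a handful of map units, trained on 2-D input data, gradually drift toward whatever clusters the data contains, much as workers drift toward compatible teammates.

```python
import random

# Minimal 1-D Self-Organizing Map (illustrative sketch): units are pulled
# toward input samples, the winner most strongly, its neighbours less so.

def train_som(data, n_units=4, epochs=50, lr=0.5):
    random.seed(0)
    units = [list(random.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)          # decaying learning rate
        for x, y in data:
            # best-matching unit = smallest squared Euclidean distance
            bmu = min(range(n_units),
                      key=lambda i: (units[i][0] - x) ** 2 + (units[i][1] - y) ** 2)
            # move the winner and its 1-D neighbours toward the sample
            for i in (bmu - 1, bmu, bmu + 1):
                if 0 <= i < n_units:
                    h = 1.0 if i == bmu else 0.5  # simple neighbourhood kernel
                    units[i][0] += rate * h * (x - units[i][0])
                    units[i][1] += rate * h * (y - units[i][1])
    return units

# two well-separated clusters of input points
data = [(0.1, 0.1), (0.2, 0.0), (0.0, 0.2), (0.9, 0.9), (1.0, 0.8), (0.8, 1.0)]
units = train_som(data)
```

Since each update moves a unit toward a sample, the units remain inside the bounding box of the data and settle near its dense regions.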
Before the next round starts, they can opt to stay with the same team or to change, and an algorithm accommodates this process based on their choices. Gradually, round by round, workers get to explore the “space” of candidate teammates to discover those with whom they might work best to win. We juxtapose this setting with one where the crowd teams remain fixed throughout the process, without any choice for self-organization or agency. Results shed light on how people select their teammates and how they gradually form desirable team “clusters”, on the effect that self-organization has on their collaboration, and also on their self-perceived effectiveness, sense of control and, in general, the way they decide.

The rest of this paper is organized as follows. In the next section we present related literature, including the limitations of current algorithm-driven team formation. We also summarize related-work insights about self-organization from other domains, and present the hypotheses of this study. Next, we go through the study design, including the self-organizing team framework and its supporting SOT algorithm. Then we describe the experimental process and the main experimental results. Finally, we discuss the study findings, with special emphasis on future directions given the novelty of this approach, and conclude the paper.
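The round transition can be sketched roughly as follows. This is a hypothetical simplification: the function name, the mutual-stay rule, and the random re-pairing of the remainder are our assumptions, not the paper's exact SOT algorithm.

```python
import random

def reform_teams(pairs, wants_to_stay, rng=random.Random(42)):
    """pairs: list of (a, b) worker dyads; wants_to_stay: workers opting to stay."""
    new_pairs, pool = [], []
    for a, b in pairs:
        if a in wants_to_stay and b in wants_to_stay:
            new_pairs.append((a, b))      # mutual wish to stay: dyad survives
        else:
            pool.extend([a, b])           # otherwise both re-enter the pool
    rng.shuffle(pool)                     # re-pair everyone left over
    new_pairs.extend(zip(pool[::2], pool[1::2]))
    return new_pairs

pairs = [("w1", "w2"), ("w3", "w4"), ("w5", "w6")]
new = reform_teams(pairs, wants_to_stay={"w1", "w2", "w3"})
```

Here the dyad ("w1", "w2") survives because the wish to stay is mutual, while the remaining four workers are shuffled into two new dyads.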
One may distinguish two types of algorithms guiding the online team formation process: i) algorithms which select which worker should work with whom before the task begins, which we hereby call crowd team building algorithms, and ii) algorithms that manage the team processes after the task has begun, which we call crowd team coordination algorithms.

Team building algorithms view team formation as a mathematical optimisation problem. They have been developed for the scale of online work and crowdsourcing, which makes it impossible for traditional approaches (e.g. a human manager) to put together an effective team. Assuming a large pool of workers with known profiles (e.g. skill level) and a large pool of tasks, the objective of a crowd team building algorithm is to match each task with a group of workers to accomplish the task optimally within given constraints, such as deadline, upper budget, or minimum quality.
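As a toy illustration of such an objective (our own simplification, not an algorithm from the cited works), a greedy heuristic can repeatedly pick the worker who covers the most still-missing skills per unit of cost, until the skills are covered or the budget runs out:

```python
# Illustrative greedy team building under a skill-coverage objective and a
# budget constraint. Worker fields and the greedy rule are our assumptions.

def build_team(workers, required_skills, budget):
    """workers: list of dicts with 'name', 'skills' (set), 'cost'."""
    team, covered, spent = [], set(), 0
    while covered < required_skills:
        # candidate adding the most uncovered skills per unit of cost
        best = max(
            (w for w in workers if w not in team and spent + w["cost"] <= budget),
            key=lambda w: len(w["skills"] - covered) / w["cost"],
            default=None,
        )
        if best is None or not best["skills"] - covered:
            break                         # no affordable worker adds coverage
        team.append(best)
        covered |= best["skills"]
        spent += best["cost"]
    return team, covered >= required_skills

workers = [
    {"name": "A", "skills": {"write", "edit"}, "cost": 5},
    {"name": "B", "skills": {"illustrate"}, "cost": 3},
    {"name": "C", "skills": {"write"}, "cost": 2},
]
team, ok = build_team(workers, {"write", "edit", "illustrate"}, budget=10)
```

Greedy heuristics like this one are fast but not optimal in general, which is one reason the cited works formulate team building as a full optimisation problem.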
In this line, Rahman et al. [94] propose an algorithm that utilises affinity and upper critical mass to recommend tasks to groups of online workers, taking also into account the aggregated worker skills and cost of effort. In addition, Liu et al. [77] propose a task pricing algorithm that attempts to assemble a team of crowd workers to complete a given task at the lowest cost. Both these works rely on predictive learning algorithms, which make their team formation choices based on a limited set of pre-calculated worker profiling features, without worker feedback. The risk of relying on such algorithms is to reduce workers to a set of dimensions that do not account for a person's or, since teams consist of people, a team's evolution, intentionality, and needs [5, 38], and therefore to create rigid, incomplete and less-than-optimal team structures.

Crowd team coordination algorithms manage the team structures during the task. In this research direction, Kim et al. [68] propose Ensemble, a methodology to create stories through the crowd. Ensemble is coordinated in a top-down manner, with teams that feature story “leaders” directing a higher-level story vision, and workers materializing this vision into concrete story pieces. Workers do not get to decide on the final story, and their contribution is limited to proposing drafts, comments and votes, i.e. assisting the leader. Valentine et al. [117] and Retelny et al. [98] propose Foundry, a crowd management system that also relies on a hierarchically structured approach, to “assemble” workers into role-based teams. Workers can request a change in the initial team structures, but the final decision is taken in a top-down manner by a small number of expert workers and the task requester. Eventually, workers are notified as to which team they belong. Although this system does incorporate worker feedback, it does so in the form of worker suggestions and not decisions.

Salehi et al. [104] propose an algorithm that rotates workers across teams based on viewpoint diversity, to improve the quality of a creative design task. Workers are not asked whether they would like to switch teams. Although the use of the rotation algorithm did produce higher-quality task results, the authors acknowledge that forcing participants to work with specific people led to discomfort, and that the teams actually wished for the ability to overrule the algorithm's rotation decisions. Zhou et al. [126] propose an algorithm based on multi-armed bandits with temporal constraints, which explores different team structures and the timing at which to apply these structures to a team. The algorithm explores various exploration-exploitation trade-offs and chooses, from a finite set of possible structural changes, when and which change it should make in the structure of a team. However, in this case too, the team formation algorithm is the driver and principal decision-maker behind the team structure. On the analogous subject of crowd-authored content, Kim et al. [69]'s Mechanical Novel system provides AMT crowd workers the opportunity to create short fiction stories in loops of reflection and revision, in a manner that decentralizes the decision-making process much more than past systems. Although these past works, and especially [104] and [69], touch on the subject of user agency, the present study explores team behavior in a setting that incorporates user control systematically, as the most prominent aspect of the system design.

Finally, Salehi et al. [105] and Lykourentzou et al. [79] both propose systems of automated team formation which take worker feedback into account regarding the quality of past collaborations. In both these systems, the workers evaluate their peers after having worked with them, and the system uses these ratings to calculate an overall benefit function that drives team formation on a new, self-contained task.
Our study shares similarities with these works in that it also actively requests worker feedback regarding past teammates. It differs in that the objective function to be optimized is not decided a priori by the algorithm designer, but is generated on-the-fly by the self-organisation decisions of the user collective.

Recent research [97] acknowledges that externally pre-defined planning (for example, by an algorithm) of the way a team will work, or of its structure, is not optimal, especially for open-ended, complex tasks. The reason is that such planning can inhibit workers from adapting, in real time, to the needs of the complex problem they need to solve. The authors suggest that complex problems require approaches that enable open-ended adaptation. This work is in line with prior research on accountable governance work models, which shows that creativity is fostered when individuals and teams have relatively high autonomy in their everyday work processes, as well as a sense of ownership and control over their own work and ideas [4, 7]. Allowing independence around work processes also enables workers to resolve and adapt to problems in ways that better utilize their expertise and creative thinking skills [3].

SOTs are an approach in this direction. In contrast to the centralised manner of organizing crowd work, SOTs enable workers to collectively decide, rather than merely suggest, the best course of action as the task progresses, flexibly adapting both the involved team structures and the output solution as the task unfolds.
User agency has been studied under at least two major approaches in the literature: as a product of direct user negotiation (or reciprocal agreement), and as a product of algorithmic mediation. The first approach demands mutual agreement among the workers before forming teams. The second approach uses an algorithm to mediate this process and determine the teams based on preferences.

The problem of agency as the product of user negotiation has been investigated through agent-based modeling in research such as that of Guimera et al. [49]. Their work simulates the emergence of collaboration networks in creative enterprises based on the users' propensity to collaborate under multiple constraints (team size, the fraction of newcomers in new productions, and incumbents' tendency to repeat previous collaborations). The explicit intent to remain in a team, ergo direct negotiation, is part of a study on self-assembled teams by Zhu et al. [127]. The study enabled online gamers to join or leave virtual teams over some period. The players could only join teams sequentially, and their decision to remain was determined by: a) whether they played together synchronously, b) whether the team did not change in size during the cooperation, and c) whether the team became inactive for longer than 30 minutes (after which the team was dismantled). The study of Tacadao and Toledo [113] observed the emergence of team self-assembly under the different approach of algorithmic mediation. The model in [113] is designed for collaborative learning scenarios where the cohorts produced are evaluated on the parameter constraints and the number of teammate preferences they satisfy. Finally, the work of Meulbroek et al.
[86] studies algorithm-supported matchmaking in the context of student teams; the authors developed a system based on the CATME algorithm to determine the preference ranking of the students.

In our study, the algorithm supports the users' choice whilst easing the complexity of the negotiation (e.g., cognitive overload, group size), which could slow down the execution of the task and deplete the users' working-memory resources. In contrast to the studies mentioned above, which examine user agency through simulations or in settings other than online work, we propose a solution for self-organization that is deployed with real users and in the particular setting of online work.

In the particular area of online system design, research has only recently started exploring the perceptions of users when it comes to choosing their teammates, if they are given the choice. Gómez-Zará et al. [45] examine how people search for and choose teammates in online platforms. Their research indicates that users search for teammates based on competence, common values, similarity in social skills and creativity levels, as well as prior familiarity. They also find that users eventually choose teammates who are well-connected, with many prior collaborators. Their study concludes that future systems should be hybrid, augmenting user agency with algorithms. Jahanbakhsh et al. [62] examine the perceptions of users regarding automated team formation in educational settings. Their findings reveal that although users valued the rational basis of using an algorithm to form teams, they did identify mismatches between their preferred team formation
criteria and those of the algorithm, and expressed the need for having a say in the process. This study, too, closes with the recommendation to give users more agency in the selection of their teammates, and even advocates constrained team self-formation, in line with earlier works in the educational domain [9].
Self-organization is a term often used to describe the governance of software development teams, either within a company or in Open Source Software development communities [66]. In fact, self-organization is one of the 12 principles behind the Agile Manifesto [11]. In this context, self-organization is defined as a process followed by teams that manage their own workload, shift tasks based on needs and best fit, and participate in group decision making [57]. The ability to self-manage has been found to significantly affect the performance of agile teams, because it “brings the decision making authority to the operational level, thus increasing speed and accuracy of problem solving” [89]. It has repeatedly been found to help software development teams cope with the increased dynamism, resource-control autonomy, and decentralisation that are inherent in today's globalised environments [32].

Self-organizing teams have certain characteristics [114]. First, they are driven by “zero information”, where prior knowledge does not apply. This enables the team members to challenge the existing knowledge status quo and gives them the potential to create something truly novel. Second, they exhibit autonomy: they do not have a top-down appointed leader; rather, leadership is a property that emerges as the team members divide their roles [59]. Third, they exhibit self-transcendence, i.e. the team pursues ambitious goals, and fourth, cross-fertilization, i.e. team members have a variety of backgrounds, viewpoints and knowledge.

In summary, self-organization seems to improve the effectiveness of development teams, by providing them the ability to innovate from zero and by enabling them to maintain the autonomy and diversity of their members' backgrounds. We use these two characteristics of self-organization in the notion of online-work SOTs that we introduce.
SOTs, however, have one important difference from self-organized teams in a software development context: whereas the members of agile teams are hand-picked and then given self-organized autonomy, we apply self-organization from the team formation stage onward, allowing individuals to explore the “space” of candidate teammates available to them and discover those with whom they might work best, through a trial-and-error approach.
Self-organization in complex systems can be defined as the emergence of global structures out of local interactions, in a process that is collective, parallel and distributed [56]. Emergence is a property of complex dynamical systems whereby global processes cannot be reduced to the sum of the fundamental units, and are generated without the need to tune control parameters to precise values. Self-organization, as a mechanism, emerges in numerous contexts, from ants' food foraging and molecule formation to the collective dynamics of flocks of birds and schools of fish.

Similarly, self-organization in complex dynamic systems is a principle applied in computer science and engineering by relying on autonomous entities to achieve robustness and adaptability of the simulated environments [108]. Agent-based models (ABMs) are micro-scale models [50] that simulate the appearance of complex phenomena through the concurrence of micro-scale behaviours and macro-scale states. Nature-inspired algorithms with ABMs can be seen, for instance, in particle swarm optimisation, which simulates the flocking of birds [67]. Another known example of self-organization-inspired computation is the artificial neural network of the Self-Organizing Map (SOM), trained by unsupervised learning. SOMs identify spatio-temporal patterns in complex systems through dimensionality reduction and the clustering of observations based on similarity.

At the topological level of social networks, some units exhibit a tendency to optimise their utility by clustering together with other units. This clustering, when based on preferential choice, is observable in a number of emerging social patterns, such as the variation of the classic Schelling Segregation model with Strategic Agents [22].
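To make the nature-inspired ABM idea concrete, here is a minimal particle swarm optimisation sketch (a standard textbook PSO we add for illustration; it is not code from the cited works): particles "flock" toward the best positions found so far, here minimising a simple quadratic.

```python
import random

# Standard PSO with inertia 0.7 and cognitive/social weights 1.5:
# each particle is attracted to its own best and to the swarm's best.

def pso(f, n_particles=10, iters=100, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-10, 10) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    pbest = xs[:]                                  # each particle's best position
    gbest = min(xs, key=f)                         # swarm's best position
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vs[i] = (0.7 * vs[i]
                     + 1.5 * r1 * (pbest[i] - xs[i])   # pull toward own best
                     + 1.5 * r2 * (gbest - xs[i]))     # pull toward swarm best
            xs[i] += vs[i]
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i]
            if f(xs[i]) < f(gbest):
                gbest = xs[i]
    return gbest

best = pso(lambda x: (x - 3.0) ** 2)
```

No particle is told where the minimum is; the near-optimal swarm position emerges from purely local attraction rules, which is the sense of emergence used above.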
The property of emergence will be extensively explored in this paper within the practice of self-organized team formation and coordination, in particular under the conditions of complex macro-tasking and pairwise matching.
Matching users in multiplayer online games has been investigated from the perspective of multiple fields, from networking [1] to algorithmic optimization [39]. Most of these studies focus on predictive, feature-based matchmaking, which accounts for the skill levels of the players [88], their personalities [21], their play style [39], their ranking [48], or an ensemble of the above [99]. There is, however, a gap in the related work around the concept of self-organization for online virtual gaming. The most pertinent investigation of self-organization comes from analyzing the network structures of popular social network gaming platforms such as Steam [12] or 100.io [106]. Platforms of this kind take advantage of both the pre-existing social networks of the players and the use of lobbies. Steam's peer-to-peer matchmaking is built around the concept of a lobby that can be queried through the ISteamMatchmaking API interface [?]. Skill-based matchmaking is added on top of this system, while players can search for other available players based on their geographic proximity. Social networking is a strong component of gaming platforms, as they provide the option to play with friends and friends of friends from the accounts list. On the contrary, crowdsourcing platforms do not supply the crowd or the requester with support to build social ties. This lack of networking reinforces the need for user agency in crowd matchmaking, to stimulate the workers' engagement and satisfaction [79].

In online dating settings, collaborative filtering recommender systems have been widely used to limit information overload [19], propose best matches, and provide some level of serendipity. Although engineered to fit the concept of user-centered systems, popular dating apps have been accused of reinforcing popularity biases [101] and of intrinsically disfavoring minority groups [13].
In our study, we deliberately remove some of the most sensitive user details, such as the workers' real names and their profile pictures, to lower the number of user choices that are based on cognitive and social biases. We do not use recommender system methods in the current framework, since the pool of workers is limited to a manageable size. Our study on the self-organization of smaller batches of workers collaborating in dyads follows the upward-trending research on the “renaissance of small groups”, as seen in studies from 2018 onward [55].

Lykourentzou et al. [80] explore how to dynamically create crowd teams from a large population of potential workers without any prior knowledge of worker profiles, which they call team dating. The idea behind team dating is that the task authors delegate team building to the crowd workers themselves and ask them to try out different candidate co-workers, evaluate them, indicate those they like working with, and form crowd teams based on these indications. Rokicki et al. [100] also discuss self-organization for crowd team building, studying various team competition strategies, including balanced teams, self-organizing teams (built upon one worker as the first administrator, accepting or denying the contributions of other members) and a combination of team and individual strategies. The results show that teams outperform individuals at task annotation without impacting the quality of the end product. In this paper we explore in particular the effects that self-organized dyadic team formation has on the individual's sense of entitlement, reward and creative outreach, without the presence of a single-handed administrator to moderate and steer the team towards effectiveness.
Some works have started exploring the notion of exploration-exploitation in crowd work settings, albeit with a different meaning. Johari et al. [65] consider the exploration-exploitation trade-off in the context of labor platforms, where flash teams and on-demand tasks can be improved with the assistance of a matching algorithm modelled on a binary, pair-wise classification of the agents. Exploration is herein defined as learning the performance of untested teams, against the exploitation of repeating previously tested teams. Their stylised model relies on known and unknown features for near-optimal matching, in consideration of the population distribution (a-priori knowledge) and the payoff structure, under an aggregated performance objective with the lowest regret. Unlike Johari et al., we present a self-organized team framework which does not reduce the workers to simplistic terms (analogous to binomial label classification problems) and does not preclude existing knowledge of the performance of the collaborators prior to the start of the job. Moreover, the intensification and diversification of the strategy is not led centrally by a coordinating algorithm, but defined by the workers' initiative only, as we allow the collaborators to guide the systemic changes occurring across the work phases.

Another application of exploration and exploitation trade-offs in the search space is presented by Xiang et al. [42], with a mathematical model of social team building optimization (STBO) and the adoption of swarm theory and swarm intelligence.
According to this research, the team-building states, or phases, lead to a converging point that depends on the exploration of new solutions and the exploitation of already visited neighbourhoods, considering the energy and entropy of the team. This team building optimisation algorithm greatly depends on a defined social team hierarchy, which is not a prerequisite in the model we propose; our model instead enables the process of team formation to functionally generate a hierarchy as a result. In Zhou et al. [126], exploration-exploitation means changing the team structure or keeping it the same, and the choice is made by an algorithm, based on a finite set of (five) decision elements. In our case it means changing teammate or staying with the same one, and the choice is made by the users, based on an unknown set of decision elements, constrained only by the number of team dynamics cues they can process through their text-based interaction with another user. Humans in this case take up the role of exploring the decision space, instead of an algorithm.
As we saw in the previous sections, current approaches in algorithm-based large-scale team formation are primarily top-down: they do not allow workers freedom regarding whom they will collaborate with, nor let them decide on the teamwork structures and the collective task outcome. Related literature also highlights that such top-down, micro-management techniques may work for routine, very well-defined tasks, but they fail for ill-defined, complex and creative work, such as innovation generation or creative writing. In contrast, bottom-up approaches have been found to promote creativity and team performance in complex task settings, because they motivate workers to own and take responsibility for the creative process and its outcome. Aiming to address the above, in this work we explore, for the first time, a new human-centered computational structure, namely Self-Organizing Teams (SOTs). The SOT structure does not just give users agency over who they will work with, or help them form these teams at scale using an algorithm. It also enables users to control, correct and guide the task output solution as a collective, by competitively filtering out weak candidate solutions and selecting the most promising one as the task progresses. Through self-organization and collective control over the task solution – supported but not guided by an algorithm – users gradually build a consensus-based and community-approved global solution.
To motivate users to win and to consider switching teams if needed, the workflow includes an element of competition, in the form of a bonus payment (for paid workers) or an increase of the obtained score (for non-paid users). More specifically, every time a team wins, its members gain an extra amount equal to the base pay (for paid participants) or the base score (for non-paid participants). In line with the latest recommendations for academic requesters regarding fair payment (https://wearedynamo.fandom.com/wiki/Guidelines_for_Academic_Requesters), the base pay for Amazon Mechanical Turk (AMT) workers was […]. For non-paid participants, the monetary payment was transformed into a base score; these participants would get a base score of 5 points for participating in the task and 5 more points every time their team won. As a further element driving competition among non-paid participants, a leaderboard was shown at the end of the task, illustrating their placement relative to the other participants. As we will discuss later in the results and in the Discussion section, this reward choice directly affects the teammate selection decisions made by the participants throughout the task, i.e. it determines to a large extent the objective function of the SOT as a collective.
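The reward rule above can be summarized in a few lines. The function below is an illustrative sketch, not the platform's actual code; the name and signature are ours:

```python
def total_reward(base, wins, rounds=3):
    """One base reward (pay or score) for participating, plus one extra
    base reward for every round won; with 3 rounds the maximum is 4x base."""
    if not 0 <= wins <= rounds:
        raise ValueError("wins must be between 0 and the number of rounds")
    return base * (1 + wins)
```

For example, a volunteer with a base score of 5 who wins all three rounds ends with total_reward(5, 3) = 20 points, the 4x maximum.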
Group size is critical to team performance, especially when it comes to creative tasks, with research showing that the number of creative ideas per person increases as team size decreases [16, 96]. Thornburg further shows that a group’s Creative Production Percent (the percent performance of a group compared to the performance of an individual) improves as the team size decreases, until it reaches its peak at a group size of two, i.e. dyads. The reason is that dyads have a unique one-to-one capability to share and exchange ideas, while the inhibitors that typically occur in larger teams, like social loafing, groupthink and production bottlenecks, are less likely to occur in these groups [17, 70]. Furthermore, dyad interactions permit the observation of key team processes like coalition formation, inclusion/exclusion, power balances and imbalances, leadership and followership, cohesiveness, and performance [119], all of which have been linked with expressions of team member agency and autonomy in various settings [26, 40, 60, 74]. Taking the above into account, in this study we work with dyads. We elaborate on how our model can be extended to accommodate triads or larger groups in Section 5.5 of the Discussion.

In addition to team size, it is also important to decide the size of the batch, i.e. how many people will be recruited for a single SOT lifecycle. The batch size needs to be an even number, so that all participants find a team to join. Given that we work with dyads across three rounds, the minimum batch size for a meaningful choice is 6 (with fewer, participants are highly likely to end up with a teammate they do not prefer). Beyond that, the larger the batch size, the more options a participant has in selecting candidate teammates.
However, too large a batch means that people will not be able to process and compare all candidate teammates effectively, due both to short-term memory limitations and to the limited time they have to choose a teammate. With the above in mind, we opted for batches of 6 to 12 people. This allows for an adequate number of different dyadic team formations, while keeping the cognitive load of processing multiple user profiles manageable [15, 71].
In defining the appropriate task for the study, we needed to take into account a number of requirements. First, we needed a task that involves complex, open-ended work for which no single solution is evident, that cannot be easily decomposed into fixed workflow structures, and that requires workers to maintain the global context and full semantic overview of the problem while iteratively refining it [2, 81]. Recent crowdsourcing literature refers to these tasks as macrotasks [78, 107], distinguishing them from microtask-based work. Examples of candidate macrotasks include brainstorming, writing, prototyping, product development, innovation development, formulating an R&D approach, and so on. These macrotasks require combining, through trial and error, the diverse knowledge, skills and creativity of multiple collaborating individuals [78]. As such, these tasks can benefit the most from the SOT structure, the purpose of which is precisely to enable the continuous ad-hoc adaptation of the solution output and work processes to the task needs. On the other hand, tasks that are close-ended, tasks with known knowledge and skill interdependencies [8], or tasks for which a specific work process can be determined a priori [93] would not be appropriate candidates, as these can be optimally solved through workflow management and crowd coordination algorithms, like the ones described in Section 2.1.
Furthermore, the task needed to adhere to three key criteria for an online creative work setting involving people working online together for the first time: no requirement of prior expertise, short duration, and the ability to express creativity [35]. The task selected to fulfill the above criteria is a creative writing challenge, inspired by the exquisite corpse method [18], where participant teams co-create a fictional story by gradually building on each other’s contributions, across multiple rounds. Creative writing tasks of this type can be used for applications such as rapid game scenario design (e.g. to provide more truthfulness and content to online gaming AI) or to generate content for the creative industries (film making, advertisement, etc.). In line with the SOT framework, the task allows for cycles of collaboration, where teams work internally to produce candidate story continuations, and competition, where teams compete for the single best story continuation through peer review. The pre-authored story used as input to the creative writing task of this study is the following:

At a restaurant, Mary receives an SMS and reads the following message: “Your life is in danger. Say nothing to anyone. You must leave the city immediately and never return. Repeat: say nothing.” Mary thinks for a second and then ...
The proposed framework is designed to work with an ongoing flow of users joining the task at slightly different times, as is typically the case when working with commercial crowdsourcing platforms. The system is programmed to wait for a minimum threshold of registrations (between 8 and 12, depending on the flow of workers) and a maximum waiting time, after which it redirects the workers to unique experiment batches. By monitoring and assessing the registration flow of the workers across multiple trial runs, we were able to determine the average batch size for the experiments without incurring delays that could consume a large portion of the task. Even though the job fitted some of the characteristics of microtasks (real-time, short-term, and unique), so that workers could be hired from the Amazon Mechanical Turk platform, its core is designed to be executed as a macrotask (complex, collaborative, and analytical).
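A rough sketch of this admission logic follows. It is illustrative only: the names and thresholds are ours, closure is checked on each arrival for simplicity (the real platform would use timers), and redirection and synchronization are omitted:

```python
def assign_batches(arrivals, min_batch=8, max_batch=12, max_wait=120.0):
    """Group a stream of (worker_id, arrival_time) pairs into batches.
    A batch closes when it is full (max_batch) or when it has reached
    min_batch and the earliest waiting worker has waited max_wait seconds."""
    batches, pending = [], []
    for worker_id, t in arrivals:
        pending.append((worker_id, t))
        full = len(pending) >= max_batch
        waited_out = len(pending) >= min_batch and (t - pending[0][1]) >= max_wait
        if full or waited_out:
            # keep the batch size even so every participant can be paired
            size = len(pending) - (len(pending) % 2)
            batches.append([w for w, _ in pending[:size]])
            pending = pending[size:]
    return batches, [w for w, _ in pending]
```

For instance, twelve simultaneous arrivals close immediately as one full batch, while a slow trickle closes a batch of eight once the first registrant has waited long enough.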
Commercial crowdsourcing platforms do not encourage collaboration and do not permit worker allocation to self-organized teams. For this reason, we designed a tailor-made framework and its supporting platform. The framework is designed to give individuals the ability to leverage collaboration with a certain degree of freedom, making creative agency a structural part of the collaboration process.
Fig. 1. The SOT collaborative-competitive framework. Participants collaborate in dyads to progress a creative task, which for this study is continuing a short fictional story, and then they evaluate their collaboration (story collaboration and rating-teammate phases). Next, teams compete for the best story continuation through peer review (story voting and winning story phases). Finally, participants indicate which teammates they want to collaborate with in the next round (teammate voting phase). The SOT algorithm forms the teams based on these choices, facilitating self-organization. The process repeats for three rounds.
The SOT framework presented below is illustrated in Fig. 1. It is designed to function on the basis of a collaborative-competitive setting, completed across several rounds. During each round (collaboration phase), workers form teams and collaborate with their teammates to progress a creative task, which in our case is the continuation of a pre-authored short fictional story, the same for all teams. The task is described in more detail in Section 3.3. At the end of each round (competition phase), users employ peer review to vote for their favorite story continuation, without the possibility to vote for their own team’s story. The winning story is appended to the main story, the winning team is announced, and its members receive an award, which for paid crowd workers is monetary and for non-paid volunteer participants takes the form of a score. Then, before a new round starts, users decide whether they wish to continue with their previous team, or change. The
SOT algorithm receives this input and forms the teams, aiming to accommodate, to the best possible extent, the teammate preferences of each user in the batch. The algorithm is described in detail in Section 3.6. The cycle of collaboration and competition continues, with users each time assessing the benefits and risks of switching teammates (e.g. a higher probability of winning when leaving an under-performing team, but a steeper curve for learning to work together), making their team formation decisions, forming new or previous teams, and continuing the main story as it was formed in the previous round. At the end of the final round, users are presented with the outcome of their collective work (the finalised main story) and a ranked user list showing the number of times each user won, in descending order. Finally, users fill in a final questionnaire on their experience.
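The peer-review step of each round — every user votes for a story continuation other than their own team's, and the plurality winner is appended to the main story — can be sketched as follows (a minimal illustration; the names are ours, not the platform's code):

```python
from collections import Counter

def winning_story(votes, team_of):
    """votes: voter_id -> team_id whose continuation the voter chose;
    team_of: voter_id -> the voter's own team_id.
    Votes for one's own team are discarded; returns the team with the
    most valid votes, or None if there are no valid votes. Ties here
    fall to insertion order; the live system would break them explicitly."""
    valid = Counter(t for voter, t in votes.items() if t != team_of[voter])
    return valid.most_common(1)[0][0] if valid else None
```

Discarding self-votes mirrors the rule that users cannot vote for their own team's story.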
Users register to our experimental platform in two ways. In the case of paid crowd workers, they enter with the credentials of the crowdsourcing platform used to hire them, in order to facilitate their automatic payment once the task finishes.
In the case of volunteer participants, they register with a unique identification number. For each experiment of our study, once the desired batch of people has arrived, the experimental platform stops hiring people, and those registered are moved to the next step. From that step onward, the system is synchronized, meaning that all workers are moved from one step to the next after a specific amount of time has elapsed. Users are always shown, at the top of their screen, the remaining time, the round they are currently in, and the amount they have won so far.
Workers are presented with the task instructions, which briefly present the creative task, its goal, and the reward upon completion. This stage takes just over a minute, and users are given the following instructions:
(1) The task: Users are instructed to work in teams to continue a short story in English. They are informed that the task has 3 rounds and that in each round the story that gets the most votes wins and is appended to the main story. This extended main story is then shown as a prompt to all participants and a new round begins.
(2) The goal: Users are instructed that their goal is to be in the winning team. They are told that in each round the system will automatically match them with the same or with another teammate. They are prompted to do their best so that their team’s story gets the most votes.
(3) The reward: Users are told that they get a [Base reward] for participating in the task and an extra [Base reward] every time they are in a winning team (3 chances for this). They are also told that their maximum gain is [4 x Base reward].

Next, users are asked to fill in a short questionnaire about their demographic information, namely: (i) gender, (ii) age, (iii) ethnicity, (iv) education level, (v) employment status, (vi) prior experience in a creative task like the one they are about to work on, and (vii) self-perceived creativity level. To assess creative self-efficacy, we used the eight-item scale from [20, 23]. In our experiments, this stage takes less than a minute to complete. Then, each user is presented with the start of a pre-authored fictional story (the same for all users), and is asked to write a brief continuation for it. We use this input as a sample of the quality of the individual’s work (“writing sample”) in two ways. First, we add it to their profile, visible to all users of the batch, so that they can themselves determine that individual’s writing skills. Second, we also evaluate it separately, using an external crowd, for comparison purposes during our results’ analysis. Users have three minutes to complete the individual writing sample stage.
Next, users are moved to the teammate selection step, illustrated in Fig. 2. Here they select their preferred teammate(s) from the full list of user profiles in the batch. Users can see each other’s profiles, where each profile contains the following information: (i) username, (ii) demographic information, and (iii) writing sample. They can also see (iv) the average rating each candidate teammate has received from his/her previous teammates (“others’ rating”) and (v) the rating the person looking at the profile page may have given to that particular candidate teammate, if they have already worked together in the past (i.e. “own rating”). Note that items (iv) and (v) are only shown from the second round onward, after users have already collaborated at least once.

The Base reward can be either a Base pay or a Base score, depending on the type of participant batch (paid or volunteer); see Section 3.1 for details. Creative self-efficacy is defined as individuals’ beliefs in their ability to produce creative ideas. Past work [109] has shown its positive relationship with creative outcomes.
Fig. 2. Self-organization – Round 1. Self-organization takes place during the teammate selection phase. Participants vote for their preferred teammates based on their teammates’ profiles (consisting of demographics, writing experience, creativity level and sample story), and the SOT algorithm uses these votes to form the teams. In the next rounds, participants also see the average rating of each person, as well as their own rating for that person (if one exists).
In the teammate selection stage, users are asked whether they want to work with the same teammate or not. Users are also asked to indicate up to two other candidate teammates to work with. These latter choices are useful to the system for two reasons: i) in case the user indicated that they do want to work with their previous teammate, but that teammate is unavailable, or ii) in case the user indicated that they no longer want to work with their previous teammate. The SOT algorithm (Section 3.6) uses these choices to construct a “preferences matrix” and form the teams of the next round.

The teammate selection stage is a critical step in self-organization. It requires users to quickly assess multiple sources of information across multiple users, and to balance potentially conflicting candidate decisions: e.g. the psychological safety of working with a person similar to them [84], versus choosing a highly rated person with whom they might not have a lot in common. In the next rounds, when the information available for making a choice increases, users also need to individually assess their relative gain from continuing with the same teammate (lower communication overhead and the presence of transactive memory [61], since the team has learned to work together) versus the risk of losing the chance to work with a new teammate (for example, a previous round’s winner) who could potentially increase their chances of writing that round’s winning story. Users are given two minutes to choose their preferred teammates.
As soon as the algorithm has placed users in teams based on their indicated preferences, each team is moved to an online synchronous collaboration space, with a text writing area and the capability to chat (the software Etherpad, https://etherpad.org/, was used to facilitate the teams’ synchronous collaboration in this study). Here each team is instructed to continue the story so far (“main story”). In the first round, the main story is simply the initial pre-authored story presented to the users at the individual writing sample stage. In subsequent rounds, the main story gradually grows, since after every round the winning team’s story is appended to it. Team members are free to discuss and work together to continue the main story in any way that they like. This allows us to observe different team dynamics and interaction patterns.
Fig. 3. Participant profiles as seen by other users in our experimental platform interface – teammate selection phase, Round 1. In the next rounds, participants also have the choice to indicate whether they wish to stay with their previous teammate or not.

These include work and collaboration strategies, creativity patterns, and so on. Each collaboration round lasts four minutes. Thirty seconds before time is up, users see a reminder to wrap up their story. Once time is up, the team members are asked to evaluate one another on three axes, on a Likert scale of 1 to 5: (i) skillfulness (“How skillful was [teammate’s username] in continuing the story?”), (ii) collaboration ability (“How good is [teammate’s username] as a collaborator?”), and (iii) helpfulness (“[teammate’s username]’s comments were helpful”). Each team member is also asked to assess their own helpfulness (“My ideas and comments were helpful”) on a Likert scale from 1 to 5. Finally, users are asked to assess the core competencies they noticed having in common with their teammate (“[Teammate’s username] and I were similar in:”), with four possible options (multiple or none can apply): (i) task commitment (“Commitment to working hard on this task”), (ii) work strategy (“How we think the work should be done”), (iii) skill similarity on task (“General abilities to do a task like this”), and (iv) personal values (“Personal values”). These ratings are used to enrich the profile of each user, both in terms of the “others’ ratings” (average rating by previous teammates) and in terms of the “own rating” (of the person looking at that user’s profile), as explained earlier (Section 3.5.4). Team members have half a minute to complete their evaluation of one another.

We determined the timeline of the experiments after several experimental trials with multiple combinations of time slots. When adjusting these time slots, we also took into consideration the AMT outsourcing model, which favors microtasks.
Although extending the time for each phase of the task could have been beneficial to some workers, we noted that most were able to produce their judgment within the given time. Batch sizes did not differ greatly between experiments, and the teams’ stories that needed to be voted on by each worker were no more than three at a time and considerably short in length. With batches significantly larger than the ones used in this study, longer time slots would become more appropriate. We discuss the scalability of the system further in Section 5.
After evaluating their teammate for that round, each individual user votes for their preferred story continuation among the 𝑆 − 1 continuations of the other teams, 𝑆 being the total number of teams from the previous round (users cannot see or vote for their own team’s continuation). In voting for the best story, users can see which team (i.e. which two usernames) produced which story continuation. Users have one and a half minutes to read and decide on their preferred continuation. Once the time is up, the story with the most votes (“winning story”) is presented to them, along with the usernames of the two members of the winning team. The profiles of the winning team members are updated so that the bonus amount for winning is added to their individual total earned reward. Presenting the winning story before users are asked to decide on their teammate for the next round is important, as it gives users an overview of their results so far. From a task point of view, the story peer assessment at this stage allows a collective decision to emerge regarding the outcome of the task, i.e. users collectively have full control over the task result. Peer review is also a proven way of incorporating quality assurance during the task [118]. Alternative ways of evaluating the team result after each collaboration round can be envisioned, and they are straightforward to incorporate without affecting the core of the proposed system. These include assessment by an external crowd or by one person, such as the client who commissioned the task.

Next, and assuming the predetermined total number of rounds is not over, users return to the teammate selection stage. As explained in Section 3.5.4, here they must decide whether they want to continue with the same teammate as in the previous round, or whether they want to change.
In both cases, they are also asked to indicate up to two additional candidate teammates, from the full candidate teammate profile list, for the algorithm to use either in case it cannot accommodate their first choice (if they wanted to stay with their previous teammate), or to match them with a suitable alternative teammate (if they wanted to change). The cycle of self-organization, collaboration and competition continues, with the main story gradually increasing in length as more team continuations are appended to it. After a number of rounds, which for the purposes of this study is set to three, users see the final story, the final user ranking (in descending order, based on the number of times a user has been a member of a winning team), and a final questionnaire about their overall experience. Once they fill in this questionnaire, users are redirected to the crowdsourcing platform and paid.

The SOT algorithm is one of the first to maximize user agency in algorithm-based crowd team formation. Its aim is to assist but not dictate the self-coordination process, matching users with the teammates they most prefer working with. The algorithm receives as input the individual user profile ratings, encoded in tabular form as follows. Assume user 𝐴 has worked with user 𝐶 in the previous round. If user 𝐴 indicates that he/she wants to continue working with 𝐶 in the next round, then 𝐴 → 𝐶 = 3. If user 𝐴 indicates that he/she does not want to continue working with 𝐶, then 𝐴 → 𝐶 = 0. User 𝐴 can also select from the list of other candidate team members, with whom he/she has not worked in the previous round. Assume this list contains users 𝐵 and 𝐷, and assume the user indicates 𝐵 as a preferred teammate for the next round, and does not indicate 𝐷. Then 𝐴 → 𝐵 = 2 and 𝐴 → 𝐷 = 1. In brief, the highest weight is given to those teammates that the user has worked with and wants to continue working with, and the lowest weight is given to those that the user has worked with but does not want to continue working with. Using one such rating vector per user (i.e. two per team), as shown in Fig. 4a, the algorithm then constructs a complete graph (“affinity graph”) with candidate team members as nodes, and the average pairwise ratings between individual users as the edges (Fig. 4b). Next, the algorithm identifies all possible candidate teams, i.e. all possible graph cuts of size two, and ranks the candidate teams on a list based on their average pairwise rating (edge value), from the highest to the lowest (Fig. 4c). In other words, the algorithm ranks the candidate teams starting from those that want to work together again, continuing with those that have not worked together before but would like to, and ending with those that do not want to work together again. From this ranked list, the algorithm selects the first team, and removes all other candidate teams that contain the selected team’s members (as one person can only be in one team at a time). Fig. 4d shows the selected team of 𝐵 and 𝐶 in green, and the removed gray entries containing users 𝐵 or 𝐶 as options. The algorithm continues in this manner until the list of candidate teams is empty and all users have been placed in a team. In case of ties, the algorithm chooses randomly. The pseudo-code for this process is shown in Algorithm 1.

Algorithm 1:
Self-organizing team formation algorithm. The algorithm creates the teams based on the ratings and preferences of the users regarding their candidate teammates.
Data: Individual profile ratings
Result: Final list 𝐹 of teams for the next round
Create complete graph 𝐺 = (𝑉, 𝐸): 𝑉 = candidate team members, 𝐸 = average pairwise ratings;
Find all possible graph cuts of size 2 (candidate teams) → 𝐶;
Sort 𝐶 in descending order of edge value;
𝐹 ← ∅;
while 𝐶 ≠ ∅ do
  Pick the first element 𝐶ᵢ of 𝐶;
  𝐹 ← 𝐹 ∪ {𝐶ᵢ};
  Remove from 𝐶 all candidate teams containing a member of 𝐶ᵢ;
end
return 𝐹
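This greedy process can be made concrete with a short implementation sketch. It uses the preference weights described in the text (3 = wants to keep the previous teammate, 2 = newly indicated, 1 = not indicated, 0 = rejected previous teammate) and treats unexpressed preferences as weight 1; the function and variable names are ours, not the study's actual code:

```python
from itertools import combinations
import random

def form_teams(prefs, seed=None):
    """prefs: dict mapping a directed pair (u, v) to u's preference weight
    for v. Returns a list of dyads, greedily picking the candidate team
    with the highest average pairwise rating at each step."""
    rng = random.Random(seed)
    users = sorted({u for pair in prefs for u in pair})
    # edge value of a candidate dyad = average of the two directed weights
    edges = [((a, b), (prefs.get((a, b), 1) + prefs.get((b, a), 1)) / 2)
             for a, b in combinations(users, 2)]
    rng.shuffle(edges)                       # random tie-breaking
    edges.sort(key=lambda e: e[1], reverse=True)
    teams, placed = [], set()
    for (a, b), _ in edges:
        if a not in placed and b not in placed:
            teams.append((a, b))
            placed.update((a, b))
    return teams
```

For example, with four users where A and C want to stay together (weight 3 both ways) and B and D indicate each other (weight 2), the greedy pass forms the dyad (A, C) first and then (B, D). Shuffling before the stable sort implements the random tie-breaking mentioned in the text.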
A total of 140 people took part in the 18 experiments of this study. Eight of them dropped out due to internet connection issues, resulting in a final total of 132 people who finished the experiment. The study participants were recruited either as university students or as Amazon Mechanical Turk workers, in batches of 4 to 12 people, depending on availability. The batches of participants belonging to these two different user groups (paid and volunteer) were managed in separate sessions and were equally distributed between the conditions. The allocation of people to conditions was made in a round-robin manner to avoid biases due to participant type or batch size, resulting in six batches per condition. The total number of people who participated in the Placebo condition was 52, in the SOT condition 48, and in the No-Agency condition 32.

To further exclude the possibility of confounding factors, we conducted a series of post-hoc checks. First, using the demographics information filled in by the participants at the beginning of the experiment (Section 3), the sample was controlled for statistically significant differences across the conditions in terms of demographics, namely gender, age, ethnicity, education, employment status, prior experience and self-perceived creativity. An Analysis of Variance (ANOVA) showed no significant differences across any of the aforementioned axes. In addition, perceived teammate selection usefulness, from the final questionnaire answers, showed no statistically significant difference across the conditions.

We organize our results as follows. First, we look into the quality of the work produced by the teams in the three conditions, to investigate the question: “Did the teams formed under the SOT condition produce stories of higher quality than those of the Placebo and No-Agency benchmark conditions?”.
Second, we look into the quality of the collaboration, to investigate the question: “Did participating in the SOT condition enable participants to collaborate better and be more satisfied with the process of collaboration, compared to participants in the Placebo and No-Agency benchmark conditions?”. After answering these key questions, we look deeper into the mechanics of self-organization, examining two emergent patterns of self-organization, namely the presence of an objective function that drives the participant collective, and network clustering phenomena.
A total of 196 unique story continuations were produced by the teams. The final winning stories were 18. To evaluate the quality of these stories, we employed a crowd of external judges, hired through AMT. Each story continuation was evaluated by 10 AMT workers, on a ten-point Likert scale (1-10), and on five quality criteria: grammar and syntax (“How grammatically and syntactically correct is the story?”, ranging from “Not correct” to “Very correct”), interest (“How interesting is the story?”, ranging from “Not interesting” to “Very interesting”), originality (“How original is the story?”, ranging from “Not original” to “Very original”), plot structure (“How good is the story plot?”, ranging from “It doesn’t make sense” to “It flows nicely”), and overall impression (“Overall, how much did you like the story?”, ranging from “Not at all” to “Very much”). These criteria were selected as they are among the most frequently used by professional short story evaluators [14], and because they represent a balanced mix of both objective (grammar, plot structure) and subjective (interest, originality, overall impression) axes [33].

An analysis of variance indicated that SOT teams create stories of significantly higher quality than the benchmark condition teams, across all five quality criteria, albeit with slightly different absolute value differences between the conditions (Figure 5). Specifically, the SOT team stories were rated higher than those of both benchmark conditions in terms of grammar, interest, originality, plot structure, and overall impression (see Figure 5 for the per-condition means and standard errors, and the corresponding F statistics and effect sizes).

, Vol. 1, No. 1, Article . Publication date: February 2020.
Fig. 5. SOT teams produced stories of higher quality compared to the teams of the two benchmark conditions, as rated by external evaluators. Results across all five axes are statistically significant.

A mixed-model analysis of the rating data, with round as a random effect, showed that the round does not account for the relationship between the higher performance of the SOT condition compared to the others.

Finally, a Pearson correlation analysis between the ratings for each pair of attributes (such as grammar and overall impression) showed that the raters’ evaluations were significantly correlated (p < 0.01) with each other, indicating the presence of the “halo effect”, which is well-documented in many social judgment settings. The halo effect implies that a rater’s judgment of one quality dimension tends to influence the others, even in the presence of sufficient information to allow for an independent assessment of them [92, 102, 121]. See Table 1 in the Appendix for detailed values.

Overall, the external ratings on multiple factors show that the stories produced in the SOT condition were of better quality compared to the benchmark conditions. Next, we will discuss the perceived quality of their collaboration by participants in different conditions.

One of the major goals of team formation is for individuals in the team to collaborate effectively. While collaboration effectiveness can be studied using many methods, we first focus on the perceived collaboration by individuals in the team.
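The halo-effect check described above boils down to computing pairwise Pearson correlations between the per-story rating dimensions. The sketch below is a minimal, self-contained illustration of that computation; the rating lists are hypothetical placeholders, not the study's data:

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length rating lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-story mean ratings on two of the five criteria.
grammar = [7.1, 5.4, 8.2, 6.3, 7.8]
overall = [6.9, 5.1, 8.5, 6.0, 7.6]

r = pearson(grammar, overall)  # a value near +1 suggests a halo effect
```

Running the same computation over every pair of the five criteria yields the correlation matrix summarized in Table 1 of the Appendix.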
On average, the team members in the SOT condition rated each other significantly higher as collaborators and as more helpful (on a scale of 1 to 5) than the members of the two benchmark conditions (see Figure 6 for the per-condition means, standard errors, and F statistics).
Fig. 6. SOT team members rated their teammates higher in terms of collaboration quality, helpfulness and skillfulness, and perceived their own contributions as more helpful to the team’s final output, compared to both benchmark condition teams (all four axes statistically significant).

Post hoc tests revealed that SOT members rated their teammates as significantly more collaborative and more helpful than the groups formed under the other two benchmark conditions (p < .001 for each comparison), while there was also a statistically significant difference between the Placebo and No-Agency benchmark conditions. Here too, the post hoc test revealed that the Placebo groups perceived their teammates as more collaborative than the groups formed under the No-Agency condition.

SOT members also rated their own contributions as significantly more helpful to the team’s final output than members of the other two conditions did (p < .05 between the SOT and Placebo conditions and p < .001 between the SOT and No-Agency condition). Interestingly, however, there was no statistically significant difference between the Placebo and No-Agency groups on this axis.

Likewise, SOT members rated their teammates as significantly more skillful (p < .05 between the SOT and Placebo conditions, p < .001 between the SOT and No-Agency condition, and p < .05 between the Placebo and No-Agency conditions). Interestingly, the aforementioned higher perception of skillfulness is not because SOT members are indeed more skillful; in fact, as also mentioned in Section 3.8, participants in the three conditions do not
differ statistically in terms of skillfulness as evaluated by external evaluators on their individual writing samples. Previous research [53] demonstrates that when people are more satisfied with their collaboration, they tend to think more highly of their peers, and are thus more prone to associate affective trust with positive expectations about belonging to that team. For similar reasons, group cohesiveness can positively affect the perception of satisfaction and team performance.

We note that of the three conditions, the No-Agency benchmark condition, i.e. the one where people were not given any (not even a placebo) option to choose their teammates, was the one with the lowest intra-team evaluations across all four axes of collaboration, helpfulness and skillfulness.

These results, summarized in Fig. 6, indicate that the teams formed under the SOT condition are more satisfied during their collaboration, and able to collaborate and help each other more, despite not being objectively more skillful than the individuals of the benchmark condition teams.

During their peer evaluations, the participants of each team also indicated which, if any, competencies they had in common with their teammate. As explained in detail in Section 3.5.5, they could report a common sense of commitment to the task, work strategy similarity, skill similarity, and/or common personal values.

A chi-square test of independence was performed to examine the relation between condition and the number of work-style items that the teams reported having in common. The relation between these variables was significant.

Fig. 7. Percentage of teams reporting common competencies across the three conditions. SOT team members reported being significantly more similar in terms of work style, compared to the other two benchmark conditions. The teams did not differ significantly in their perceived skill similarity, personal values or commitment to the task.
Chat
Kristy : hey
Peter : Any ideas on how to start the story?
Kristy : yeah
Kristy : lets make it a thriller
Peter : should it be a terrifying one, a joke?
Peter : ok I agree
Kristy : everyone likes a good thriller
Kristy : think of what happened in the past
Peter : hmm there is limited space
Peter : what do you think?
Story
Authors: Kristy, Peter

sees someone looking weird at her fro the other side of the restaurant. She remembered him from last night, they had a slight disagreement at a bar where they drunk beer for hours. Then she decides to pay the bill, get out the restaurant and leave. She wanted to say many things, ask why this happens to her but there is no
Fig. 8. Overview of a collaboration space and chat under the SOT condition.
This result indicates that when people’s choice of a teammate is honored (SOT condition), they tend to pair with teammates with whom they share common work practices, confirming prior literature [115].
Teams assembled under the SOT condition produced shorter story texts on average (m = 234 characters, compared to 320 and 556 characters for the two benchmark conditions), a difference that is statistically significant (p < .05). One explanation for this could be that the benchmark condition teammates know each other better, since they have worked together in the past rounds, whereas the same is not always true for SOT teammates, who may or may not have worked together in the past. If that were the case, then the chats in the three conditions could be expected to differ statistically in length, with the SOT teams chatting more in an effort to establish common ground in each round, and therefore having less time to work on the actual task. However, an analysis of variance comparing the total chat length between the three conditions showed that there is no statistically significant difference.

Looking deeper into the process of story writing, we examine the way that the teams in the three conditions produce their common story text by measuring a metric for “turn-taking”. Turn-taking is a property of collaboration [103] based on construction contribution, which allows two or more entities to build a discourse from separate units. The metric was chosen for the evaluation of the given experiments as it considers both the amount and the timing of the individual contributions towards the group work.

We measure turn-taking as follows. First, we identify every text piece (every segment of text entered by the users) each team member entered in the common text area, and the order in which this member entered this piece. This gives a sequenced order of contributions. For example, assume a team consisting of person A and person B, and a writing sequence of their collaboration {ABABAAA}. We encode this sequence as {-1,1,-2,2,-3,-4,-5}, sum it, and normalise it by the sequence length. The turn-taking was tracked by the back-end part of the system, which saved the final version of each story continuation when the writing phase ran out of time. The software used for hosting this kind of synchronous collaboration, called Etherpad [37], helped to automatically
Chat
Tom : let's both write 50 words the word count isnot valuable
Bob : yes
Bob : we are over the words right?
Bob : we need to trim it
Tom : is it a problem?
Tom : we have no time to shorten it

Story
Authors: Tom, Bob

She starts singing "jingle bells" in order to confuse whoever is holding her captive. Mary continues singing jingle bells in order to come over as a psychopath, so that whatever her potential use for this kidnapping is seems now irrelevant.

she can't hear or see anything. She tries to find something familiar. Then, footsteps. They get closer and Mary holds her breath. The steps opens a door and turns on the lights. SURPRISE. All her friends are in the room. Its a pizza party for the completion of her successful company. As she is full of joy she gets to her friends only to find out that all the pizza has pineapple on them.

She now escapes to Alaska to set up a new business with people who do not put pineapple on pizza.
Fig. 9. Overview of a collaboration space between two teammates under the Placebo condition.

highlight the text in a different color for each user, meaning that user A could see his or her text in a different color than that of user B. Figures 8 and 9 illustrate the above.

Every time the contribution of one team member is followed (“matched”) by a contribution of the other team member, the value of this metric is equal to zero. The more person A dominates the writing process, the more negative the metric becomes; the more person B dominates the writing process, the more positive it becomes. Hence, values around zero indicate a balanced writing process in terms of turn-taking.

The analysis of the story writing processes for the three conditions, using a random allocation of team members to positions A and B of the metric, shows that the SOT teams have significantly more balanced text logs in terms of turn-taking style: the SOT teams’ mean turn-taking value is balanced around zero, while both benchmark conditions show negative mean values, and the difference across conditions is statistically significant.

Sample Stories.
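The turn-taking metric described above can be sketched in a few lines: each author's i-th contribution is encoded as ±i (negative for person A, positive for person B), and the encoded sequence is summed and normalised by its length. A minimal illustration (the function and variable names are ours, not the system's):

```python
def turn_taking(sequence):
    """Turn-taking balance of a writing sequence such as "ABABAAA".

    Person A's i-th contribution is encoded as -i, person B's j-th as +j;
    the encoded values are summed and normalised by the sequence length.
    Values near zero indicate balanced turn-taking.
    """
    counts = {"A": 0, "B": 0}
    encoded = []
    for author in sequence:
        counts[author] += 1
        encoded.append(-counts[author] if author == "A" else counts[author])
    return sum(encoded) / len(encoded)

# The paper's example: {ABABAAA} -> {-1, 1, -2, 2, -3, -4, -5}
balance = turn_taking("ABABAAA")  # -12 / 7, i.e. person A dominates
```

A perfectly alternating sequence such as "ABAB" yields 0, while a one-sided sequence drifts away from zero in the dominant author's direction.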
Figure 8 shows an example chat and collaboration workspace between two people, Kristy and Peter, in the SOT condition. In line with the statistical results discussed above for SOT teams, we observe that the two teammates collaborate on the story creation by taking turns in both the chat and the story workspaces. The continuation of the sample story is made of equal chunks of text written and discussed by the ad-hoc pair of workers, who discuss the online collaboration through a fairly balanced chat thread and the use of open questions. The workers seem to work synchronously, and build on each other’s contributions in a manner that coherently continues one another’s previous statements.
Figure 9 shows an example chat and collaboration workspace between two people, named Tom and Bob, in the benchmark Placebo condition. We observe that Tom and Bob agree to split the workload into separate chunks and then work individually, each essentially writing a different continuation of the preceding winning story. This collaboration strategy did not produce an optimal outcome, since the resulting plot was incoherent and unbalanced. Also worth noting is that the first paragraph written by Tom (“She starts singing ’Jingle bells’ ..”) got moved to the top of the story just a few seconds before the end of the round, whereas Bob’s contribution got abruptly pushed underneath it. While this example is not representative of all chat logs and stories in the benchmark conditions, it does illustrate the lack of collaboration and quality that was also confirmed by the statistical results discussed above for that condition.
After examining how self-organization affects the work and collaboration quality of participating teams across the conditions, we now take a deeper look into how this method affects more subtle behavioral elements of team formation.
So far, we have seen that SOT teams tend to collaborate better, feel more satisfied by their collaboration, and produce better work results. We now look into what motivates people to team up the way they do. For this, we compare the SOT and Placebo conditions, since these are the only ones where people were given the option to select their teammate (although this option is not honored in the Placebo condition). Since the participants of the Placebo condition do not eventually change teams, we look at voting intention, as revealed during the teammate selection stage of each condition. In this stage, which takes place after each collaboration session as explained in detail in Section 3.5.4, participants get to indicate which teammate they would like to work with in the next round. We first examine whether people tend to select previous winners as their preferred teammates. Indeed, an analysis of variance shows that, regardless of condition, the winners of previous rounds gather significantly more profile votes on average than non-winners.

This strategic voting seems to pay off. In the question “What mattered the most when choosing a teammate?” at the end of the task, we observe that winning participants were about twice as likely to report that they chose the person that would make them win (28% of winners, i.e. 22 out of 79) compared to non-winning participants (8 out of 55, i.e. 15% of non-winners). In contrast to winners, who seem to choose their teammates strategically, non-winners were twice as likely to choose people whose profile information they liked the most (11% of non-winners versus only 4% of winners).

These two elements, i.e. winners driven more by a “playing to win” strategy and non-winners driven more by their teammate’s profile information, were the only answer items that distinguished winners and non-winners, as the participants’ other reported answers to this question received relatively equal percentages (Fig. 10). We also note that among the other teammate selection strategies, choosing based on skill, i.e. on a prospective teammate’s individual writing sample, was the one preferred by most participants (almost 50% of winners and non-winners alike), but, as we saw, it did not in the end make a real difference to the winning probability.
Of the teams that belonged to the SOT condition, 26% (10 out of 39) stayed together across 3 rounds, 28% (11 out of 39) stayed together for two rounds, and the remaining 46% (18 out of 39) of the teams worked together only once across the three
(Answer options in Fig. 10: “Choosing the person that would make me win”, “I chose randomly”, “I chose the people whose initial story I liked the most”, “I chose the people whose profile information I liked the most”, and “Other”.)
Fig. 10. Participants’ self-reported strategies for selecting a teammate. Winners, i.e. people that had been in a winning team at least once, were twice as likely to purposefully select teammates that would help them win. Non-winners were twice as likely to choose people whose profile information they liked the most. The rest of the answers received similar percentages from both winners and non-winners.

rounds. The results show that most of the users changed their teammates often (at least two times), whilst only a minority of them (26%) stayed together for the entire duration of the task. Teams that stayed together for 3 rounds won 55% of the time (10 victories across the total 18 rounds of the SOT experiments), those that stayed together for two rounds won 28% of the time (5 victories across the total 18 rounds), and those that stayed together for only one round won 17% of the time (3 victories across the total 18 rounds). Even when considering only their own cohort (comparing teams that stayed together across the same number of rounds), we see that the teams who stayed together three times had a 33% chance of winning (10 victories out of the possible 30). The teams that stayed together for two rounds had a 22% chance of winning (5 victories out of 22), and the teams that stayed together for one round only had a 17% chance of winning (3 victories out of 18). From the above we see that, when considering users by their winning potential, those who won more are also those that decided to remain in the same team across all the given rounds.
Finally, drawing a parallel with machine learning and data-driven self-organization, we look for emergent patterns in the way that people “cluster” across the three rounds, in the two conditions of SOT and Placebo (the option to self-organise does not apply in the No-Agency condition by default). To do so, we represent the experimental batches as bidirectional affinity graphs, with each affinity graph consisting of a set of users (nodes) and a set of dyadic ties (edges) among user pairs, as explained in Section 3.6. Each edge between a pair of user nodes receives a value, which corresponds to their “pairwise affinity”, i.e. the intent of these two people to collaborate with or avoid one another, as denoted by their voting preferences at the end of each collaboration round. We then apply social network analysis, which provides a set of methods for observing the emerging patterns of the teams.

In our analysis of the graphs, we take into account the process of change produced by the dynamic connectivity of the vertices, with every worker accumulating votes across the rounds. The set V of vertices of the graph is fixed, whereas the set E of edges changes with time, in an incremental fashion. The calculation of the final graphs is the same as the sequential analysis technique used for cumulative sums.

Fig. 11. Network clustering visualisation of the team formation process. Each node is a worker, and each edge is the team formation preference between that worker and a candidate teammate, as denoted by the voting matrices. The closer two nodes are, the more these two workers want to work together. In the SOT condition (left), we notice that participants form a large cluster, where a set of participants tend to prefer each other, while a few are not preferred by most. In the benchmark Placebo condition (right), we do not observe a single large cluster.
The networks formed by the partial sums of weights describe the overall interactions between actors across the phases of the experiments. Where a pair of users consistently voted to remain together, the tie between the two nodes shortens, indicating a stronger attraction for collaboration. Greater distances between nodes are formed by repulsion, when either one or both of the teammates down-voted the other.

The results from a preliminary analysis of the network topology show that the teams that worked under the SOT condition create on average more clusters or chains, unlike the benchmark condition, which displays, for the great majority of the graphs, dyadic clustering. Both in-degree and out-degree weights are considered. Stronger ties between workers, creating larger clusters, mean the potential creation of channels for interpersonal communication, determined by a person’s choice of one or multiple teammates. Moreover, social networks formed under the SOT condition display stronger polarised attraction-repulsion mechanisms, which are less detectable in the alternative condition. Considered together, the above indicate a pattern in human collaboration behavior that is similar to machine learning-based self-organisation: when given the agency to form their own virtual teams, and an “objective function” to maximize (which in this case is winning a reward), people do tend to actively explore their candidate teammate space and to gradually form clusters of “compatible” sub-groups that are fairly distinct and separate from one another.
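The cumulative construction of the affinity graph can be sketched as follows: the vertex set stays fixed while edge weights accumulate the round-by-round votes (positive for attraction, negative for repulsion). This is an illustrative reconstruction under our own data layout, not the study's implementation:

```python
from collections import defaultdict

def cumulative_affinity(rounds):
    """Accumulate pairwise affinities across voting rounds.

    `rounds` is a list of dicts mapping (voter, candidate) pairs to a vote
    weight (+1 upvote, -1 downvote). Edges are undirected, so votes in both
    directions accumulate on the same pair.
    """
    affinity = defaultdict(int)
    for votes in rounds:
        for (voter, candidate), weight in votes.items():
            edge = tuple(sorted((voter, candidate)))
            affinity[edge] += weight
    return dict(affinity)

# Three hypothetical rounds: ann and bob keep choosing each other,
# while ann down-votes cat once.
rounds = [
    {("ann", "bob"): 1, ("bob", "ann"): 1, ("ann", "cat"): -1},
    {("ann", "bob"): 1},
    {("bob", "ann"): 1},
]
graph = cumulative_affinity(rounds)  # {("ann", "bob"): 4, ("ann", "cat"): -1}
```

Feeding such cumulative weights into a force-directed layout reproduces the attraction-repulsion visualisation of Fig. 11, with strongly positive edges pulling nodes together and negative edges pushing them apart.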
As we also explain in the Discussion section, further studies can be conducted in this very interesting direction, to examine whether other patterns observed in machine learning can also be observed in online work settings, such as the gradual stabilization of the compatible self-formed team clusters, how many exploration rounds this would require, etc.
In this work, we make a first attempt to explore the phenomenon of self-organization in the context of online work. Our results show that enabling participants to make their choice of teammate, and honoring this choice through an algorithm that aims to maximize intra-group preference, yields results of higher quality and teams that are more satisfied by their collaboration. A number of points merit further exploration and can be the object of future work. We discuss these points below.
The experimental setup of this study was a collaborative-competitive setting, which actively encouraged people to team up in such a way that their team would write the best story. The setting encouraged people to do so by significantly increasing their score or monetary reward every time their team’s story was voted as the winning one of the round. In view of the above, people chose strategically whom they would pair up with, showing an explicit preference for previous winners. Looking at this result from a macroscopic point of view, the worker collective adapted its objective function (i.e. how the workers make their teaming decisions) to its environment. This is in direct alignment with machine learning self-organization (e.g. self-organizing maps), where the objective function is determined by Euclidean distance, and the learning goal is to group the data points into clusters that minimize intra-group variance. The difference is that in the case of machine learning the objective function is known and given a priori by the algorithm designer, while in the case of human self-organization it can be driven (but not explicitly given) by external factors, such as the reward structure.

Observing the emergence of a collective behavior pattern, in our case the objective function, is also in line with prior work by Woolley and colleagues, who have consistently drawn attention to the presence of a general collective intelligence factor to explain group performance [123, 124] across participant contexts and cultures [36].

In the future, it would be useful to explore how to explicitly affect the objective function of the SOT by changing the environmental context of the task. For example, one could alter the reward structure and, instead of compensating participants more for winning, compensate them equally or with a minimal bonus, regardless of whether their team won the round or not.
This could be expected to alter the way that people select teammates, potentially encouraging them to explore more diverse ways of teaming up and leading to different final work outputs (e.g. stories with different originality levels, plot structures, etc.).

Another factor that can affect the objective function in a self-organization context is the timing of specific prompts given to the participants. In our study, participants saw who the winners of the previous round were directly before choosing their teammates for the next round. This may be linked with the fact that, as we saw in Section 4, many people intentionally tried to pair up with prior winners, in an effort to boost their chances of winning the next round. Changing this order, and asking participants to first indicate their preferred teammates before showing them the round’s winner, could have an effect on the way they form their teammate decisions, and would thus be interesting to explore as part of future work.

Although the most successful strategy, which in the end determined the objective function of the collective, was “playing to win”, this does not necessarily mean that all participants share the same goal. As we also saw in Figure 10, the objective function of an individual may differ from that of the others within a batch.
During the teammate selection stage, when making the decision of whether to switch to a new teammate or to stay with their previous one, participants are trading off between the strategies of exploration and exploitation, respectively. For the participants, opting for exploration means that they have an increased chance of finding the best teammate, for them, in the pool. However, exploring various pairings comes at the cost of a higher collaboration learning curve and more cognitively demanding interactions with the new teammate(s). The benefit of the exploitation strategy is that participants will work with a person they have already worked with, missing, however, the chance of discovering a potentially better teammate. These trade-offs between exploration and exploitation have a direct impact on the performance of the team formation algorithm, and consequently on the performance of the batch as a collective.

First, exploration-exploitation is affected by the behavior patterns of the participants of the particular batch. The teammate decisions taken by the participants during each teammate selection round shape the affinity matrix that the SOT algorithm uses to form the teams. For example, if most users opt for exploitation, meaning that they consistently vote to work with the same teammates, the affinity matrix is sparser, providing fewer options for pairing the participants. One way to address this would be to utilise the network analysis information, such as that illustrated in Fig. 11, to potentially pair participants based on the clusters that are being formed, i.e. recommending “teammates of teammates”. In the future it would be useful to explore whether such an approach, which is similar to the collaborative filtering used in recommender systems, could help address such cold-start problems, while still ensuring user agency in teammate selection.
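The “teammates of teammates” idea mentioned above can be illustrated on an affinity structure of cumulative votes: starting from a worker's positively-weighted neighbours, we recommend those neighbours' other positive ties. This is a hypothetical sketch of such a recommendation step, not part of the SOT algorithm itself:

```python
def recommend_teammates(affinity, user):
    """Recommend second-degree candidates ("teammates of teammates").

    `affinity` maps undirected (worker, worker) pairs to a cumulative vote
    weight; only positively-weighted ties count as collaboration intent.
    """
    def neighbours(node):
        return {
            b if a == node else a
            for (a, b), weight in affinity.items()
            if weight > 0 and node in (a, b)
        }

    direct = neighbours(user)
    recommended = set()
    for teammate in direct:
        recommended |= neighbours(teammate)
    # Exclude the user and people they are already tied to.
    return recommended - direct - {user}

# Hypothetical affinity graph: ann likes bob, bob likes cat, ann dislikes dan.
affinity = {("ann", "bob"): 2, ("bob", "cat"): 1, ("ann", "dan"): -1}
suggestion = recommend_teammates(affinity, "ann")  # {"cat"}
```

In a deployed system, candidates returned this way would only be suggestions, preserving the user's agency to accept or reject them.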
An overall risk-averse batch of workers can also make the team formation algorithm susceptible to local optima: if people stick to their initial teammates, they will not have the chance to work with teammates with whom they might collaborate even better. To amend this, future work could explore allowing a certain degree of randomness in the SOT algorithm, to enable some users to experience working with others, and therefore explicitly favor exploration. In recommender systems, this is achieved through the element of serendipity, and future work could explore introducing it at different degrees and time points.

Second, the exploration and exploitation patterns are affected by the choices of the task designer, for example in terms of the number of rounds and the batch size. In our study we used three rounds, and most participants reported that this number was sufficient, irrespective of condition. Nevertheless, we noted that exploration had not finished at the end of the third round, since participants were still changing their teammate selection preferences. In the future, it would be interesting to examine how many rounds it takes for an average batch to converge, i.e. how many rounds it would take before the SOT collective stops exploring and starts fully exploiting its stabilised team formations. Simulations could be a useful tool to explore this part of future work, as adding more rounds means lengthier and more costly tasks. Similarly, simulations could explore how the batch size affects convergence and the performance of the SOT algorithm.

Attention is one of the scarcest resources of our information-overloaded era [28]. Although estimates of the length of the human attention span, in particular for general sustained tasks, have not yet been proven [120], a number of studies have pointed out that the type of task is the most influential factor in the quality and length of the average adult attention span.
This is very much task-dependent and, as supported by findings [30], it can even be conditioned by the engagement levels and intrinsic motivation of the individual. Attention, according to the attention economy, comes at a cost [28], and when intertwined with multiple concurrent activities, it becomes even more in short supply. In our task design, all of the stages of the workflow were designed to appear in front of the users for only a limited amount of time, usually a few minutes on average. Users could follow the timer at every step of the collaboration, as indicated by a visible countdown. Moreover, various time lengths were tested through a number of trials in development mode. The results helped establish a reasonable timeline that would best accommodate time-on-task and user performance. The choice of restricting the work spaces to finite window frames comes from the need to meet the constraints of task synchronicity, time management and attention retention. This technique is closely related to the concepts of timeboxing [63] and iterative development used in software design and pair programming [75]. The benefit of this sort of time management technique [25], designed around enhancing performance and mono-tasking, is the known reduced impact of internal and external interruptions on focus and flow. It provides a clear structure and a sustained focal point around the single priority at hand. Limiting time on task also tries to address the risk that extended periods of effort on a single task lead to performance decline [73].

(In the question “The game had three rounds. Ideally, how many rounds would you need to decide who is the best teammate for you?” in the questionnaire at the end of the task, most participants reported a value close to 3, with no statistically significant difference between the two examined conditions.)
As with every design choice, drawbacks also exist, and these are mainly related to the inherent limitations on thinking and planning time, as well as the need for users to effectively manage the collaboration space within fixed timelines. The collaborative task requires cooperation between teammates and quick judgment of their characteristics for team formation. Fitting together time management, complex decision making and time-on-task attention depletion into a functional collaborative framework is one of the greatest challenges met by our framework. The complexity of this challenge grows as the number of workers increases. Therefore, to attain a fair level of scalability for our framework, we rely on concepts such as thin-slicing [112] and cognitive heuristics [85] to reduce the mental effort of the workers and aid the path to problem solving, by confining the task allotment to fixed sizes which can be adjusted to the level of complexity of the given subject. The sample story, used in all of the experiments, was designed for simplicity, brevity and ambiguity of meaning, and purposefully crafted to trigger a number of diverse responses from the collaborators. Testing other sample stories and topics, as well as different lengths of the exercises, could further the contribution of our research on team formation for creative and complex tasks, independently of the type of subject.
A typical risk in online work settings is the low level of veracity in worker responses, e.g. in terms of the evaluation scores workers give to their teammates. As shown in [43], it is often difficult to distinguish good from low-performance workers in paid crowdsourcing platforms, because most workers have high reputation scores as a result of social pressure. Competitive self-organisation, on which the SOT approach is based, is in line with the main idea championed in that work, i.e. the need to incentivize accurate feedback by rebounding the consequences of feedback onto the person who gave it. By giving truthful responses regarding their teammates' capabilities, and by selecting the people with whom they really want to work, workers give themselves more chances to collaborate with a good teammate, and thus to win. In contrast, if workers give high scores to all their teammates, regardless of actual skill or collaboration capability, their chances to be paired with an appropriate teammate – and thus win – are lowered. This competitive element, which is built directly into the SOT framework, becomes in this way a powerful driver of truthful behavior. As a second built-in quality mechanism, the proposed SOT framework uses peer assessment at the end of each round, in order to select the story fragment that will be used to continue the main story. As shown in [118], peer assessment is another powerful mechanism to ensure that the best outcome is chosen and to allow the best capabilities of the worker collective to fully emerge.
The above two quality mechanisms, i.e. competitive self-organisation and peer assessment, complement the importance of the element of agency within the SOT framework. Relying on agency alone could mean that workers choose teammates with whom they get along well, but not necessarily those with whom their performance is optimized, especially in the presence, for example, of social pressure. By adding the elements of competition (primarily) and peer assessment (secondarily), workers have a strong motivation to accurately assess the capability of the person they choose. Aside from the above two quality mechanisms, in the future it would be interesting to investigate which additional mechanisms, for example in the form of intrinsic motivators, can be added to the SOT framework to further safeguard truthful responses and high quality of work.
A long-term scholarly debate exists in the small groups literature on whether dyads constitute teams, which inspires our following discussion on applying the proposed model to triads or even larger groups. On the one hand, scholars such as Moreland [90] are of the opinion that the size of a team should be at least three, because dyads are more ephemeral than larger groups, and certain phenomena like majority/minority relations, coalition formation, and group socialization can only be observed in larger groups. Other researchers, such as Williams [119], disagree, arguing that two people can be considered to be a team, since some of the most interesting group processes, like inclusion/exclusion, power dynamics, leadership and followership, cohesiveness, social facilitation, and performance, occur in dyads in the same manner that they do in larger groups, and that, in most instances, dyads operate under the same principles that explain group dynamics in teams of three or more.

In the experimental part of this study we worked with dyads, as we are primarily interested in the team processes that are already present in the dyad setup (such as inclusion/exclusion, leadership/followership, or performance), and less interested in phenomena that occur exclusively in large groups (like majority/minority relations). This choice was also motivated by the fact that in the particular domain of online collaboration, it is not uncommon to work with pairs as the essential foundations for studying team phenomena that have implications for larger groups [24, 51, 79, 83, 87, 91, 125].

Nevertheless, our proposed model can be extended quite straightforwardly to cover groups of three members or more. Specifically, the algorithm that helps form the teams first creates a complete graph with candidate team members as nodes and average pairwise ratings as edges.
The algorithm then produces all possible graph cuts of a given team size, which in our experiments was set to two. Next, the algorithm greedily selects the teams that maximize intra-team affinity (the sum of pairwise ratings within each team), eliminating all alternative teams that the selected individuals could have participated in, until all individuals belong to a team. The size of the cut is a parameter, and the algorithm can be adapted to compute all possible cuts of a given size, equal to the size of the desired teams (naturally, depending on the specified team size, the teams formed last may have fewer members). The specific algorithm is greedy, as we note that the problem of optimally partitioning a complete graph into subsets of equal cardinality falls under the NP-Hard complexity category [41]. Other approaches, including heuristics and metaheuristics, can also be used to create teams efficiently, for example an adaptation of the k-nearest neighbor algorithm. Finally, in case the goal is to specify the number of teams to be created, rather than the team size, polynomial-time algorithms can be adapted, such as the algorithm proposed by Andreev and Racke to solve the Balanced Graph Partitioning problem [6].
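The greedy procedure described above can be sketched as follows. This is a minimal illustration of the idea, not the exact implementation used in the experiments; the data structure for the pairwise ratings and the example scores are our own assumptions.

```python
from itertools import combinations

def greedy_teams(ratings, team_size=2):
    """Greedily partition workers into teams of maximal intra-team affinity.

    ratings: dict mapping frozenset({a, b}) -> average pairwise rating
             (the edge weights of the complete graph).
    Returns a list of teams (tuples of worker ids); the last team formed
    may have fewer members if the pool size is not divisible by team_size.
    """
    workers = set()
    for pair in ratings:
        workers |= pair
    # Enumerate every candidate team of the requested size and score it
    # by the sum of its internal pairwise ratings (its "cut" value).
    candidates = [
        (sum(ratings[frozenset(p)] for p in combinations(team, 2)), team)
        for team in combinations(sorted(workers), team_size)
    ]
    candidates.sort(reverse=True)  # best-scoring candidate teams first
    teams, assigned = [], set()
    for score, team in candidates:
        # Skip any team that reuses an already-assigned worker, i.e.
        # eliminate the alternatives of the selected individuals.
        if assigned.isdisjoint(team):
            teams.append(team)
            assigned |= set(team)
    leftover = tuple(sorted(workers - assigned))
    if leftover:
        teams.append(leftover)
    return teams
```

With `team_size=2` and a batch of 12 workers, as in our experiments, the exhaustive enumeration considers at most 66 candidate pairs, so the greedy pass remains cheap at this scale.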
Our proposed model and system have a number of implications for future system design, particularly for the design of online work platforms. We discuss these in the following.

A first implication concerns the advantages and disadvantages of purely member-driven teams, and, consequently, the desired level of algorithmic involvement. Our work is driven by the need to mitigate the problems brought by the tight algorithmic supervision of current top-down team formation systems, which range from inefficient collaboration to significant psychological discomfort. As we also show in this paper, incorporating agency can improve the outcomes of collaborative work and worker well-being. In the long run, giving workers more control over whom they work with and how can help create online work systems that are more empowering and offer more opportunities for personal development for the participating workers. Despite its many advantages, user agency in teammate selection also comes with ethical concerns and potential risks for the workers, but also for the platforms. Delegating team formation fully to human participants means that some workers may be more sought after than others. This can prove beneficial for the collaboration (e.g., a person knowing that they can work better with a certain teammate), but it may also be detrimental (e.g., exclusion of specific individuals due to demographic factors, similarity, or familiarity [58]). Recent evidence from the Human-Computer Interaction community [46] shows that allowing full individual control may be at least as problematic, because it can replicate systematic inequality, exclude people who do not "look like" good team members, or lead to segregated teams. To mitigate possible selection biases, our system can be expanded to explicitly promote diversity.
For example, in an approach inspired by traditional recommender systems [72], the SOT algorithm could be parameterized to promote candidate teammate profiles that the user has not seen or selected before, based on feature dissimilarity. Alternatively, the system could be parameterized to reward workers (monetarily, or with more time, for example) for collaborating with people outside their "comfort zone". Another potential risk stemming from fully human-led team formation is the exclusion of low-ranked users, for example those that received low scores in early evaluations, or newcomers (for versions of the system that allow this). To help those users recover and avoid segregation, the SOT algorithm could be parameterized to include the element of serendipity, or to explicitly reward "mentorship", i.e. teams that mix low-ranked and high-ranked users. Finally, recommendation diversity and hierarchical clustering techniques can prove useful to ensure the scalability of the system. In our experiments we used batches of up to 12 people; however, batches with more workers may be available or necessary for a particular task. To help workers efficiently process dozens or hundreds of candidate worker profiles, one could envision extending our proposed SOT algorithm with hierarchical clustering recommendations, which start by presenting to the worker diverse teammate "types", and then allow the user to explore clusters of candidate teammates that are similar to their preferred type. It is in such situations that algorithmic involvement can prove beneficial to the teammate selection process rather than blocking it.

Another risk of full self-organization, this time impacting the whole online work system, is platform disintermediation, in which workers negotiate, collaborate, and transact with one another, and potentially with the client, outside the platform boundaries.
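The dissimilarity-based promotion of unseen profiles discussed above could be operationalized as a simple re-ranking pass over the candidate list. The sketch below is purely illustrative: the function names, the diversity weight, and the scoring functions are our own assumptions, not part of the deployed SOT system.

```python
def rerank_with_diversity(candidates, history, affinity, similarity, weight=0.5):
    """Re-rank candidate teammate profiles, discounting those that resemble
    profiles the worker has already seen or selected (`history`).

    affinity(c):      predicted match score for candidate c (higher is better).
    similarity(c, h): feature similarity between two profiles, in [0, 1].
    weight:           trade-off between match quality and novelty.
    """
    def score(c):
        # Penalize a candidate by their closest match in the user's history,
        # so profiles unlike anything seen before are promoted.
        seen = max((similarity(c, h) for h in history), default=0.0)
        return (1 - weight) * affinity(c) - weight * seen
    return sorted(candidates, key=score, reverse=True)
```

With `weight=0` this reduces to plain affinity ranking; raising the weight increasingly surfaces profiles outside the worker's "comfort zone".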
Constrained self-organization, such as the one advocated in our proposed system, where an algorithm assists worker teammate negotiations, has been conceptually proposed [64] as the golden mean between safeguarding worker autonomy and protecting digital work platforms from disintermediation.

Another design implication comes from the form used by the system to grant agency to the workers. One can distinguish two main methods of granting user agency: direct negotiations and mediated approaches. Direct agency negotiations take place when the system enables users to ask others whether they want to become teammates, to answer positively or negatively, and to directly negotiate with several candidate teammates until the groups are formed [49, 127]. The advantage of this approach is that it allows workers to fully explore whether a collaboration should proceed and why. The disadvantage is that the negotiation process can be lengthy, and the number of possible collaborations that can be explored is minimal, due to the human cognitive limitations of processing rich information about one's candidate teammates (including verbal and also non-verbal communication, e.g., through video), as well as the fatigue that inevitably accompanies this process. Another disadvantage of direct negotiation comes from exposing users to the personal disclosure of explicitly declaring their interest in, or rejection of, their teammates. In popular person-to-person recommender systems [29], the user first states their intent at the system level (e.g. dating apps permitting users to swipe left/right) before the newly matched pair is allowed to establish unmediated communication.
Mediated agency approaches, such as the ones explored in [82, 86, 113], elicit users' teammate preferences through methods such as the maximum satisfiability problem (MaxSAT) [54], the comprehensive assessment for team-member effectiveness (CATME) [10], and the Gale-Shapley algorithm [44]. The advantage of this approach is that it allows workers to explore significantly more candidate teammates, since the profile information to be processed per teammate is more concise and faster to process. The disadvantage is that it grants less time for reflection on a candidate teammate's suitability, and, to a certain extent, it is also dependent on algorithmic involvement. The approach adopted in this work fits more in the second category, i.e., mediated agency, since the SOTs system uses the participants' explicitly stated (and not deduced) preferences about their teammates, rather than the outcome of their in-between direct negotiations. This design choice is more appropriate for online and crowd work platforms, where the tasks are time-bounded, and there is a need to maintain scalability in the presence of dozens of candidate teammate profiles to choose from. However, at the same time, our system allows users to directly "negotiate" with one another, in the sense that workers can try out different teammates across the rounds by actually working with them. Future design frameworks could explore how direct negotiations could be better incorporated into large-scale online work systems without negatively impacting the task's completion. One possible solution in this direction could be combining direct and mediated negotiations in individual tasks, and then maintaining a database of the negotiation results across the tasks, which can be progressively enriched by the workers as they discuss with more teammates over time.
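For concreteness, the Gale-Shapley algorithm [44] mentioned above can be sketched as follows. This is the textbook deferred-acceptance procedure applied to elicited teammate preference lists, assuming two equally sized groups where each member ranks everyone in the other group; it is an illustration of the technique, not the SOT system's own matching code.

```python
def gale_shapley(proposer_prefs, acceptor_prefs):
    """Stable matching between two equally sized groups.

    proposer_prefs / acceptor_prefs: dict id -> list of ids of the other
    group, ordered from most to least preferred (complete lists assumed).
    Returns a dict mapping each proposer to their matched acceptor.
    """
    # rank[a][p] = position of proposer p in acceptor a's preference list
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next index to propose to
    engaged = {}                                  # acceptor -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if a not in engaged:
            engaged[a] = p                        # a was free: accept
        elif rank[a][p] < rank[a][engaged[a]]:
            free.append(engaged[a])               # a trades up; old partner freed
            engaged[a] = p
        else:
            free.append(p)                        # a rejects p
    return {p: a for a, p in engaged.items()}
```

The resulting matching is stable: no proposer and acceptor would both prefer each other over their assigned partners, which is precisely the guarantee that makes the method attractive for mediated teammate pairing.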
This work also has a number of limitations. Task type is one of them. In this study we examined a particular task, i.e. fictional story writing. This task belongs to the broader category of complex, open-ended and creative tasks, which require giving workers full task context (in this case about the main story), and which cannot be easily decomposed to the micro-task level. Tasks of this type are not routinely handled through crowd work but are nevertheless crucial to our knowledge economy, for solving problems that range from creative idea generation to creative disaster response, to name just a few. Although the particular task type does not require specialized skills, it does allow distinguishing between different levels of creativity and prior experience specific to the task (in this case in story writing) that the workers may possess in various degrees. In this sense, the task allows participants to filter their candidate co-workers based on their creative skills, and in parallel it is suitable for the workforce available in commercial crowd work platforms.

In the future it would be useful to explore self-organization for other types of tasks that are more frequently encountered in online work settings, such as those requiring stricter workflows, those with predefined team roles, tasks requiring expert skills, or those that can be decomposed into smaller work units. Self-organization in those contexts could mean exploring how workers are given agency not only to choose their teammates, but also to split work responsibilities and delegate task parts among their teammate circles, in line with the latest directions on crowd work reported in the literature [122].

Another limitation is the degree of agency given to the workers, and the fact that worker agency is restricted to teammate choice. Although this is one of the first studies to explore self-organization in an online crowd work context, it did limit worker agency to the decision about which teammates to work with.
The rest of the workflow settings, like the timing of the activities, the number of rounds, the size of the worker batch, etc., were not up to the workers to decide, and this inevitably poses limits on their self-management potential. Future work could explore the relative weight of the aforementioned workflow design settings in the final result, through a series of experiments, each focused on a particular setting. Future work could also explore giving workers agency over the entire workflow design, in order to better tailor their collective work strategy to the needs of the particular task.

Finally, the design used in this study combines the elements of competition and collaboration, which are often seen in real-life crowdsourcing scenarios (e.g., Kaggle competitions). We retained the element of competition and its surrogate product of user popularity, as it is a secondary result of rating users for their output. Users in our study can also see their peers' expertise information, similarly to how people almost always become aware of the expertise of other teammates in real-world team applications. Nevertheless, popularity can also impact teammate choice beyond the necessities of the task, and lead to selection biases. To mitigate this, future systems could offer the option to reveal the users' performance, or any other popularity scores, as an opt-in/opt-out component, depending on the scope of the system.
This paper investigates the effects of a novel online team formation framework, titled Self-Organizing Teams (SOTs). SOTs place increased emphasis on user agency over team formation and rely on the collective decisions of online workers to self-organize into effective teams, while being supported – but not guided – by an algorithm. We compared the SOT framework with two baselines, where individuals are allocated the same teammate throughout a creative online task, either with the illusion of user agency (placebo condition) or without it (no-agency condition). Allocating the same teammate for the entire task duration is a typical approach to forming teams in online work situations. Our findings indicate that the SOTs method leads to higher-quality output, as measured by independent evaluators. Furthermore, we carried out a set of quantitative analyses of the workers' perception of the collaboration, which showed that teams formed under the SOT condition are more satisfied during their collaboration, and able to collaborate and help each other more.

The purpose of this paper is to lay the ground for the analysis of crowd-led collaboration aided by non-intrusive models, which are inspired by bottom-up self-organizing systems. It is our hope that this work will inspire more researchers to look into online team formation systems that move away from the trend of micro-managing workers, and in the direction of making the latter an integral part of the algorithmic decision-making process. In the future, we aim to integrate more complex user-preference elicitation methods into the approach, for example by using explanations to allow users to collectively set the objective function weights and decide which team formation elements are most important to solve the complex problem at hand.
As an example of the above, workers could use the SOTs framework to decide how much diversity or homogeneity a certain task requires, and balance this with the need to optimise for skill or personality complementarity.
REFERENCES [1] Sharad Agarwal and Jacob R Lorch. 2009. Matchmaking for online games and other latency-sensitive P2P systems. In
Proceedings of the ACM SIGCOMM 2009 conference on Data communication . 315–326.[2] Genrikh Saulovich Al’tshuller. 1999.
The innovation algorithm: TRIZ, systematic innovation and technical creativity .Technical innovation center, Inc.[3] Teresa M Amabile. 1998.
How to kill creativity . Vol. 87. Harvard Business School Publishing Boston, MA.[4] Teresa M Amabile. 2018.
Creativity in context: Update to the social psychology of creativity . Routledge.[5] Mike Ananny. 2016. Toward an Ethics of Algorithms: Convening, Observation, Probability, and Timeli-ness.
Science, Technology, & Human Values
41, 1 (2016), 93–117. https://doi.org/10.1177/0162243915606523arXiv:https://doi.org/10.1177/0162243915606523[6] Konstantin Andreev and Harald Racke. 2006. Balanced graph partitioning.
Theory of Computing Systems
39, 6 (2006),929–939.[7] Constantine Andriopoulos. 2001. Determinants of organisational creativity: a literature review.
Management decision
39, 10 (2001), 834–841. , Vol. 1, No. 1, Article . Publication date: February 2020. [8] Linda Argote. 1982. Input uncertainty and organizational coordination in hospital emergency units.
Administrativescience quarterly (1982), 420–434.[9] Donald R. Bacon, Kim A. Stewart, and William S. Silver. 1999. Lessons from the Best and Worst Student TeamExperiences: How a Teacher can make the Difference.
Journal of Management Education
23, 5 (1999), 467–488.https://doi.org/10.1177/105256299902300503 arXiv:https://doi.org/10.1177/105256299902300503[10] Murray R Barrick, Greg L Stewart, Mitchell J Neubert, and Michael K Mount. 1998. Relating member ability andpersonality to work-team processes and team effectiveness.
Journal of applied psychology
83, 3 (1998), 377.[11] K Beck, M Beedle, A VanBennekum, A Cockburn, W Cunningham, M Fowler, J Frenning, J Highsmith, A Hunt, RJeffries, J Kern, B Marick, R.C. Martin, S Mellor, K Schwaber, J Sutherland, and D Thomas. 2001. The Agile Manifesto.https://agilemanifesto.org/[12] Roi Becker, Yifat Chernihov, Yuval Shavitt, and Noa Zilberman. 2012. An analysis of the steam community networkevolution. In . IEEE, 1–5.[13] Jon Birger. 2015.
Date-onomics: how dating became a lopsided numbers game . Workman Publishing.[14] Margaret A Boden. 2004.
The creative mind: Myths and mechanisms . Routledge.[15] Dirk Bollen, Bart P Knijnenburg, Martijn C Willemsen, and Mark Graus. 2010. Understanding choice overload inrecommender systems. In
Proceedings of the fourth ACM conference on Recommender systems . ACM, 63–70.[16] Thomas J Bouchard Jr and Melana Hare. 1970. Size, performance, and potential in brainstorming groups.
Journal ofapplied Psychology
54, 1p1 (1970), 51.[17] Leo F Brajkovich. 2003. Executive Commentary.
Academy of Management Perspectives
17, 1 (2003), 110–111.[18] Alastair Brotchie and Mel Gooding. 1995.
A Book of Surrealist Games . Redstone Press.[19] Lukas Brozovsky and Vaclav Petricek. 2007. Recommender system for online dating service. arXiv preprint cs/0703042 (2007).[20] Abraham Carmeli and John Schaubroeck. 2007. The influence of leaders’ and other referents’ normative expectationson individual involvement in creative work.
The Leadership Quarterly
18, 1 (2007), 35–48.[21] Gerry Chan, Ali Arya, and Anthony Whitehead. 2018. Keeping players engaged in exergames: A personalitymatchmaking approach. In
Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems .1–6.[22] Ankit Chauhan, Pascal Lenzner, and Louise Molitor. 2018. Schelling segregation with strategic agents. In
InternationalSymposium on Algorithmic Game Theory . Springer, 137–149.[23] Gilad Chen, Stanley M Gully, and Dov Eden. 2001. Validation of a new general self-efficacy scale.
Organizationalresearch methods
4, 1 (2001), 62–83.[24] Prerna Chikersal, Maria Tomprou, Young Ji Kim, Anita Williams Woolley, and Laura Dabbish. 2017. Deep structuresof collaboration: Physiological correlates of collective intelligence and group satisfaction. In
Proceedings of the 2017ACM Conference on Computer Supported Cooperative Work and Social Computing . 873–888.[25] Francesco Cirillo. 2009.
The pomodoro technique . Lulu. com.[26] John L Cordery, David Morrison, Brett M Wright, and Toby D Wall. 2010. The impact of autonomy and task uncertaintyon team performance: A longitudinal field study.
Journal of organizational behavior
31, 2-3 (2010), 240–258.[27] Ana Cristina Costa, C Ashley Fulmer, and Neil R Anderson. 2018. Trust in work teams: An integrative review,multilevel model, and future directions.
Journal of Organizational Behavior
39, 2 (2018), 169–184.[28] Thomas H Davenport and John C Beck. 2001. The attention economy.
Ubiquity
Social media+ society
2, 2(2016), 2056305116641976.[30] Md David Cornish, Diane Dukette, et al. 2009.
The essential 20: Twenty components of an excellent health care team .Dorrance Publishing.[31] Carsten KW De Dreu and Michael A West. 2001. Minority dissent and team innovation: The importance of participationin decision making.
Journal of applied Psychology
86, 6 (2001), 1191.[32] Giovanna Di Marzo Serugendo, Noria Foukia, Salima Hassas, Anthony Karageorgos, Soraya Kouadri Mostéfaoui,Omer F. Rana, Mihaela Ulieru, Paul Valckenaers, and Chris Van Aart. 2004. Self-Organisation: Paradigms andApplications. In
Engineering Self-Organising Systems , Giovanna Di Marzo Serugendo, Anthony Karageorgos, Omer F.Rana, and Franco Zambonelli (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–19.[33] Saray Díaz Suarez. 2015. Evaluating creative writing: the criterion behind short stories’ assessment. (2015).[34] Ana Fernández Dobao and Avram Blum. 2013. Collaborative writing in pairs and small groups: Learners’ attitudesand perceptions.
System
41, 2 (2013), 365–378.[35] Steven Dow, Julie Fortuna, Dan Schwartz, Beth Altringer, Daniel Schwartz, and Scott Klemmer. 2011. Prototypingdynamics: sharing multiple designs improves exploration, group rapport, and results. In
Proceedings of the SIGCHIConference on Human Factors in Computing Systems . Acm, 2807–2816., Vol. 1, No. 1, Article . Publication date: February 2020. elf-Organizing Teams in Online Work Settings 35 [36] David Engel, Anita Williams Woolley, Ishani Aggarwal, Christopher F Chabris, Masamichi Takahashi, Keiichi Nemoto,Carolin Kaiser, Young Ji Kim, and Thomas W Malone. 2015. Collective intelligence in computer-mediated collaborationemerges in different contexts and cultures. In
Proceedings of the 33rd annual ACM conference on human factors incomputing systems . ACM, 3769–3778.[37] AYAN Erdal and S Sadi Seferoglu. 2017. Using EtherPad for online collaborative writing activities and learners withdifferent language learning strategies.
Eurasian Journal of Applied Linguistics
3, 2 (2017), 205–233.[38] Samer Faraj, Stella Pachidi, and Karla Sayegh. 2018. Working and organizing in the age of the learning algorithm.
Information and Organization
28, 1 (2018), 62 – 70. https://doi.org/10.1016/j.infoandorg.2018.02.005[39] Shelly D Farnham, Bruce Christopher Phillips, Scott Lee Tiernan, Keith Steury, William B Fulton, and Jens Riegelsberger.2009. Method for online game matchmaking using play style information. US Patent 7,614,955.[40] Maj S Fausing, Hans Jeppe Jeppesen, Thomas S Jønsson, Joshua Lewandowski, and Michelle C Bligh. 2013. Moderatorsof shared leadership: work function and team autonomy.
Team Performance Management: An International Journal (2013).[41] Andreas Emil Feldmann and Luca Foschini. 2015. Balanced partitions of trees and applications.
Algorithmica
71, 2(2015), 354–376.[42] Xiang Feng, Hanyu Xu, Yuanbo Wang, and Huiqun Yu. 2019. The social team building optimization algorithm.
SoftComputing
23, 15 (2019), 6533–6554.[43] Snehalkumar (Neil) S Gaikwad, Durim Morina, Adam Ginzberg, Catherine Mullings, Shirish Goyal, Dilrukshi Gamage,Christopher Diemert, Mathias Burton, Sharon Zhou, Mark Whiting, et al. 2016. Boomerang: Rebounding theconsequences of reputation feedback on crowdsourcing platforms. In
Proceedings of the 29th Annual Symposium onUser Interface Software and Technology . 625–637.[44] David Gale and Lloyd S Shapley. 1962. College admissions and the stability of marriage.
The American MathematicalMonthly
69, 1 (1962), 9–15.[45] Diego Gómez-Zará, Matthew Paras, Marlon Twyman, Jacqueline N. Lane, Leslie A. DeChurch, and Noshir S. Contractor.2019. Who Would You Like to Work With?. In
Proceedings of the 2019 CHI Conference on Human Factors in ComputingSystems (CHI ’19) . ACM, New York, NY, USA, Article 659, 15 pages. https://doi.org/10.1145/3290605.3300889[46] Diego Gómez-Zará, Matthew Paras, Marlon Twyman, Jacqueline N Lane, Leslie A DeChurch, and Noshir S Contractor.2019. Who Would You Like to Work With?. In
Proceedings of the 2019 CHI conference on human factors in computingsystems . 1–15.[47] Narasimhaiah Gorla and Yan Wah Lam. 2004. Who should work with whom?: building effective software projectteams.
Commun. ACM
47, 6 (2004), 79–82.[48] Thore Graepel and Ralf Herbrich. 2006. Ranking and matchmaking.
Game Developer Magazine
25 (2006), 34.[49] Roger Guimera, Brian Uzzi, Jarrett Spiro, and Luis A Nunes Amaral. 2005. Team assembly mechanisms determinecollaboration network structure and team performance.
Science
Mathemat-ical biosciences
Proceedings of the 2004 ACM conference onComputer supported cooperative work . 554–563.[52] Martine Haas and Mark Mortensen. 2016. The secrets of great teamwork.
Harvard business review
94, 6 (2016), 70–76.[53] Mark H Hansen, John L Morrow Jr, and Juan C Batista. 2002. The impact of trust on cooperative membership retention,performance, and satisfaction: an exploratory study.
The International Food and Agribusiness Management Review
5, 1(2002), 41–59.[54] Pierre Hansen and Brigitte Jaumard. 1990. Algorithms for the maximum satisfiability problem.
Computing
44, 4(1990), 279–303.[55] Alexa M Harris, Diego Gómez-Zará, Leslie A DeChurch, and Noshir S Contractor. 2019. Joining together online:the trajectory of CSCW scholarship on group formation.
Proceedings of the ACM on Human-Computer Interaction
Encyclopedia of library and information sciences
Agile Project Management: Creating Innovative Products . Addison-Wesley.[58] Pamela J Hinds, Kathleen M Carley, David Krackhardt, and Doug Wholey. 2000. Choosing work group members:Balancing similarity, competence, and familiarity.
Organizational behavior and human decision processes
81, 2 (2000),226–251.[59] Rashina Hoda, James Noble, and Stuart Marshall. 2010. Organizing Self-Organizing Teams.
Association of ComputerManufacturers Journal (2010). https://doi.org/10.1109/NSREC.2017.8115448, Vol. 1, No. 1, Article . Publication date: February 2020. [60] Martin Hoegl and Praveen Parboteeah. 2006. Autonomy and teamwork in innovative projects.
Human ResourceManagement: Published in Cooperation with the School of Business Administration, The University of Michigan and inalliance with the Society of Human Resources Management
45, 1 (2006), 67–79.[61] Andrea B Hollingshead and David P Brandon. 2003. Potential benefits of communication in transactive memorysystems.
Human communication research
29, 4 (2003), 607–615.[62] Farnaz Jahanbakhsh, Wai-Tat Fu, Karrie Karahalios, Darko Marinov, and Brian Bailey. 2017. You Want Me to Workwith Who?: Stakeholder Perceptions of Automated Team Formation in Project-based Courses. In
Proceedings ofthe 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17) . ACM, New York, NY, USA, 3201–3212.https://doi.org/10.1145/3025453.3026011[63] Pankaj Jalote, Aveejeet Palit, Priya Kurien, and VT Peethamber. 2004. Timeboxing: a process model for iterativesoftware development.
Journal of Systems and Software
70, 1-2 (2004), 117–127.[64] Mohammad Hossein Jarrahi, Will Sutherland, Sarah Beth Nelson, and Steve Sawyer. 2020. Platformic Management,Boundary Resources for Gig Work, and Worker Autonomy.
Computer Supported Cooperative Work (CSCW)
29, 1(2020), 153–189.[65] Vijay Kamble, Anilesh K. Krishnaswamy, and Hannah Li. 2018. Exploration vs. Exploitation in Team Formation.
CoRR abs/1809.06937 (2018). arXiv:1809.06937 http://arxiv.org/abs/1809.06937[66] Athina Karatzogianni and George Michaelides. 2009. Cyberconflict at the edge of chaos: Cryptohierarchies andself-organisation in the open-source movement.
Capital & Class
33, 1 (2009), 143–157. https://doi.org/10.1177/030981680909700108 arXiv:https://doi.org/10.1177/030981680909700108[67] James Kennedy. 2010. Particle swarm optimization.
Encyclopedia of machine learning (2010), 760–766.[68] Joy Kim, Justin Cheng, and Michael S. Bernstein. 2014. Ensemble: Exploring Complementary Strengths of Leaders andCrowds in Creative Collaboration. In
Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work& . ACM, New York, NY, USA, 745–755. https://doi.org/10.1145/2531602.2531638[69] Joy Kim, Sarah Sterman, Allegra Argent Beal Cohen, and Michael S Bernstein. 2017. Mechanical novel: Crowdsourcingcomplex work through reflection and revision. In
Proceedings of the 2017 ACM Conference on Computer SupportedCooperative Work and Social Computing . 233–245.[70] David Knights and Fergus Murray. 1994.
Managers divided: Organisation politics and information technology manage-ment . John Wiley & Sons, Inc.[71] Bart P Knijnenburg, Martijn C Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the userexperience of recommender systems.
User Modeling and User-Adapted Interaction
22, 4-5 (2012), 441–504.[72] Matevž Kunaver and Tomaž Požrl. 2017. Diversity in recommender systems–A survey.
Knowledge-Based Systems
[73] Behavioral and Brain Sciences 36, 6 (2013), 661–679.
[74] Claus W Langfred. 2000. The paradox of self-management: Individual and group autonomy in work groups. Journal of Organizational Behavior 21, 5 (2000), 563–585.
[75] Craig Larman and Victor R Basili. 2003. Iterative and incremental developments: a brief history. Computer 36, 6 (2003), 47–56.
[76] Edward E Lawler III and Christopher G Worley. 2006. Designing Organizations That Are Built to Change. MIT Sloan Management Review (2006).
[77] Q. Liu, T. Luo, R. Tang, and S. Bressan. 2015. An efficient and truthful pricing mechanism for team formation in crowdsourcing markets. In . 567–572. https://doi.org/10.1109/ICC.2015.7248382
[78] Ioanna Lykourentzou, Vassillis-Javed Khan, Konstantinos Papangelis, and Panos Markopoulos. 2019. Macrotask Crowdsourcing: An Integrated Definition. In Macrotask Crowdsourcing. Springer, 1–13.
[79] Ioanna Lykourentzou, Robert E Kraut, and Steven P Dow. 2017. Team dating leads to better online ad hoc collaborations. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 2330–2343.
[80] Ioanna Lykourentzou, Shannon Wang, Robert E Kraut, and Steven P Dow. 2016. Team dating: A self-organized team formation strategy for collaborative crowdsourcing. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 1243–1249.
[81] Ann Majchrzak and Arvind Malhotra. 2013. Towards an information systems perspective and research agenda on crowdsourcing for innovation. The Journal of Strategic Information Systems 22, 4 (2013), 257–268.
[82] Felip Manya, Santiago Negrete, Carme Roig, and Joan Ramon Soler. 2017. A MaxSAT-based approach to the team composition problem in a classroom. In International Conference on Autonomous Agents and Multiagent Systems. Springer, 164–173.
[83] David W McDonald and Mark S Ackerman. 2000. Expertise recommender: a flexible recommendation system and architecture. In Proceedings of the 2000 ACM conference on Computer supported cooperative work. 231–240.
, Vol. 1, No. 1, Article . Publication date: February 2020.
[84] Miller McPherson, Lynn Smith-Lovin, and James M Cook. 2001. Birds of a feather: Homophily in social networks.
Annual Review of Sociology 27, 1 (2001), 415–444.
[85] Miriam J Metzger and Andrew J Flanagin. 2013. Credibility and trust of information in online environments: The use of cognitive heuristics. Journal of Pragmatics 59 (2013), 210–220.
[86] Daniel Meulbroek, Daniel Ferguson, Mathew Ohland, and Frederick Berry. 2019. Forming More Effective Teams Using CATME TeamMaker and the Gale-Shapley Algorithm. In . IEEE, 1–5.
[87] Robert C Miller, Haoqi Zhang, Eric Gilbert, and Elizabeth Gerber. 2014. Pair research: matching people for collaboration, learning, and productivity. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing. 1043–1048.
[88] Tom Minka, Ryan Cleven, and Yordan Zaykov. 2018. TrueSkill 2: An improved Bayesian skill rating system. Tech. Rep. (2018).
[89] Nils Brede Moe and Torgeir Dingsøyr. 2008. Scrum and team effectiveness: Theory and practice. In International Conference on Agile Processes and Extreme Programming in Software Engineering. Springer, 11–20.
[90] Richard L Moreland. 2010. Are dyads really groups? Small Group Research 41, 2 (2010), 251–267.
[91] Bonnie Nardi and Justin Harris. 2006. Strangers and friends: Collaborative play in World of Warcraft. In Proceedings of the 2006 20th anniversary conference on Computer supported cooperative work. 149–158.
[92] Richard E Nisbett and Timothy D Wilson. 1977. The halo effect: evidence for unconscious alteration of judgments. Journal of Personality and Social Psychology 35, 4 (1977), 250.
[93] Gerardo A Okhuysen and Beth A Bechky. 2009. Coordination in organizations: An integrative perspective. Academy of Management Annals 3, 1 (2009), 463–502.
[94] Habibur Rahman, Senjuti Basu Roy, Saravanan Thirumuruganathan, Sihem Amer-Yahia, and Gautam Das. 2019. Optimized Group Formation for Solving Collaborative Tasks.
The VLDB Journal 28, 1 (Feb. 2019), 1–23. https://doi.org/10.1007/s00778-018-0516-7
[95] Thomas H Rasmussen and Hans Jeppe Jeppesen. 2006. Teamwork and associated psychological factors: A review. Work & Stress 20, 2 (2006), 105–128.
[96] Joseph S Renzulli, Steven V Owen, and Carolyn M Callahan. 1974. Fluency, flexibility, and originality as a function of group size. The Journal of Creative Behavior (1974).
[97] Daniela Retelny, Michael S. Bernstein, and Melissa A. Valentine. 2017. No Workflow Can Ever Be Enough: How Crowdsourcing Workflows Constrain Complex Work. Proc. ACM Hum.-Comput. Interact. 1, CSCW, Article 89 (Dec. 2017), 23 pages. https://doi.org/10.1145/3134724
[98] Daniela Retelny, Sébastien Robaszkiewicz, Alexandra To, Walter S. Lasecki, Jay Patel, Negar Rahmati, Tulsee Doshi, Melissa Valentine, and Michael S. Bernstein. 2014. Expert Crowdsourcing with Flash Teams. In Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology (UIST '14). ACM, New York, NY, USA, 75–85. https://doi.org/10.1145/2642918.2647409
[99] Jens Riegelsberger, Scott Counts, Shelly D Farnham, and Bruce C Philips. 2007. Personality matters: Incorporating detailed user attributes and preferences into the matchmaking process. In . IEEE, 87–87.
[100] Markus Rokicki, Sergej Zerr, and Stefan Siersdorfer. 2015. Groupsourcing: Team Competition Designs for Crowdsourcing. In Proceedings of the 24th International Conference on World Wide Web (WWW '15). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 906–915. https://doi.org/10.1145/2736277.2741097
[101] Christian Rudder. 2014. Dataclysm: Love, Sex, Race, and Identity – What Our Online Lives Tell Us about Our Offline Selves. Crown.
[102] Frank E Saal, Ronald G Downey, and Mary A Lahey. 1980. Rating the ratings: Assessing the psychometric quality of rating data. Psychological Bulletin 88, 2 (1980), 413.
[103] Harvey Sacks, Emanuel A Schegloff, and Gail Jefferson. 1978. A simplest systematics for the organization of turn taking for conversation. In Studies in the organization of conversational interaction. Elsevier, 7–55.
[104] Niloufar Salehi and Michael S. Bernstein. 2018. Hive: Collective Design Through Network Rotation. Proc. ACM Hum.-Comput. Interact. 2, CSCW, Article 151 (Nov. 2018), 26 pages. https://doi.org/10.1145/3274420
[105] Niloufar Salehi, Andrew McCabe, Melissa Valentine, and Michael Bernstein. 2017. Huddler: Convening Stable and Familiar Crowd Teams Despite Unpredictable Availability. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW '17). ACM, New York, NY, USA, 1700–1713. https://doi.org/10.1145/2998181.2998300
[106] Michael Helfried Schiller, Gunter Wallner, Christopher Schinnerl, Alexander Monte Calvo, Johanna Pirker, Rafet Sifa, and Anders Drachen. 2018. Inside the group: Investigating social structures in player groups and their influence on activity. IEEE Transactions on Games (2018).
[107] Heinz Schmitz and Ioanna Lykourentzou. 2018. Online Sequencing of Non-Decomposable Macrotasks in Expert Crowdsourcing.
ACM Transactions on Social Computing 1, 1 (2018), 1.
[108] Giovanna Di Marzo Serugendo, Noria Foukia, Salima Hassas, Anthony Karageorgos, Soraya Kouadri Mostéfaoui, Omer F Rana, Mihaela Ulieru, Paul Valckenaers, and Chris Van Aart. 2003. Self-organisation: Paradigms and applications. In International Workshop on Engineering Self-Organising Applications. Springer, 1–19.
[109] Shung J Shin, Tae-Yeol Kim, Jeong-Yeon Lee, and Lin Bian. 2012. Cognitive team diversity and individual team member creativity: A cross-level interaction. Academy of Management Journal 55, 1 (2012), 197–212.
[110] Pao Siangliulue, Joel Chan, Steven P. Dow, and Krzysztof Z. Gajos. 2016. IdeaHound: Improving Large-scale Collaborative Ideation with Crowd-Powered Real-time Semantic Modeling. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (UIST '16). ACM, New York, NY, USA, 609–624. https://doi.org/10.1145/2984511.2984578
[111] Jan Henrik Sieg, Martin W. Wallin, and Georg Von Krogh. 2010. Managerial challenges in open innovation: a study of innovation intermediation in the chemical industry. R&D Management 40, 3 (2010), 281–291. https://doi.org/10.1111/j.1467-9310.2010.00596.x
[112] Manu Sridharan, Stephen J Fink, and Rastislav Bodik. 2007. Thin slicing. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation. 112–122.
[113] Grace Tacadao and Ramon Prudencio Toledo. 2015. A Generic Model for the Group Formation Problem Using Constraint Logic Programming. In . IEEE, 307–308.
[114] Hirotaka Takeuchi and Ikujiro Nonaka. 1986. The new product development game. Journal of Product Innovation Management (1986). https://doi.org/10.1016/0737-6782(86)90053-6
[115] Elizabeth R Tenney, Eric Turkheimer, and Thomas F Oltmanns. 2009. Being liked is more than having a good personality: The role of matching. Journal of Research in Personality 43, 4 (2009), 579–585.
[116] Rajan Vaish, Snehalkumar (Neil) S. Gaikwad, Geza Kovacs, Andreas Veit, Ranjay Krishna, Imanol Arrieta Ibarra, Camelia Simoiu, Michael Wilber, Serge Belongie, Sharad Goel, James Davis, and Michael S. Bernstein. 2017. Crowd Research: Open and Scalable University Laboratories. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology (UIST '17). ACM, New York, NY, USA, 829–843. https://doi.org/10.1145/3126594.3126648
[117] Melissa A. Valentine, Daniela Retelny, Alexandra To, Negar Rahmati, Tulsee Doshi, and Michael S. Bernstein. 2017. Flash Organizations: Crowdsourcing Complex Work by Structuring Crowds As Organizations. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI '17). ACM, New York, NY, USA, 3523–3537. https://doi.org/10.1145/3025453.3025811
[118] Mark E Whiting, Dilrukshi Gamage, Snehalkumar (Neil) S Gaikwad, Aaron Gilbee, Shirish Goyal, Alipta Ballav, Dinesh Majeti, Nalin Chhibber, Angela Richmond-Fuller, Freddie Vargus, et al. 2017. Crowd guilds: Worker-led reputation and feedback on crowdsourcing platforms. In
Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 1902–1913.
[119] Kipling D Williams. 2010. Dyads can be groups (and often are). Small Group Research 41, 2 (2010), 268–274.
[120] Karen Wilson and James H Korn. 2007. Attention during lectures: Beyond ten minutes. Teaching of Psychology 34, 2 (2007), 85–89.
[121] David J Woehr and Allen I Huffcutt. 1994. Rater training for performance appraisal: A quantitative review. Journal of Occupational and Organizational Psychology 67, 3 (1994), 189–205.
[122] Alex J Wood, Mark Graham, Vili Lehdonvirta, and Isis Hjorth. 2019. Networked but Commodified: The (Dis)Embeddedness of Digital Labour in the Gig Economy. Sociology (2019), 0038038519828906.
[123] Anita Williams Woolley, Ishani Aggarwal, and Thomas W Malone. 2015. Collective intelligence and group performance. Current Directions in Psychological Science 24, 6 (2015), 420–424.
[124] Anita Williams Woolley, Christopher F Chabris, Alex Pentland, Nada Hashmi, and Thomas W Malone. 2010. Evidence for a collective intelligence factor in the performance of human groups. Science
[125] Proceedings of the 2013 conference on Computer supported cooperative work. 1375–1386.
[126] Sharon Zhou, Melissa Valentine, and Michael S. Bernstein. 2018. In Search of the Dream Team: Temporally Constrained Multi-Armed Bandits for Identifying Effective Team Structures. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18). ACM, New York, NY, USA, Article 108, 13 pages. https://doi.org/10.1145/3173574.3173682
[127] Mengxiao Zhu, Yun Huang, and Noshir S Contractor. 2013. Motivations for self-assembling into project teams. Social Networks 35, 2 (2013), 251–264.
A ANNEX
Table 1. The story quality ratings by the external evaluators were significantly correlated at the 0.01 level (2-tailed), N = 1960.

Pearson Correlations
              Grammar   Interest   Originality   Plot   Overall
Grammar          1        .∗∗         .∗∗         .∗∗     .∗∗
Interest        .∗∗        1          .∗∗         .∗∗     .∗∗
Originality     .∗∗       .∗∗          1          .∗∗     .∗∗
Plot            .∗∗       .∗∗         .∗∗          1      .∗∗
Overall         .∗∗       .∗∗         .∗∗         .∗∗      1