HyMap: eliciting hypotheses in early-stage software startups using cognitive mapping

Abstract

Context: Software startups develop innovative, software-intensive products. Given the uncertainty associated with such an innovative context, experimentation is a valuable approach for these companies, especially in the early stages of development, when implementing unnecessary features represents a higher risk to the companies' survival. Nevertheless, researchers have argued that the lack of clearly defined practices has led to limited adoption of experimentation. In this regard, the first step is to define the hypotheses based on which teams will create experiments. Objective: We aim to develop a systematic technique to identify hypotheses for early-stage software startups. Methods: We followed a Design Science approach consisting of three cycles in the construction phase, which involved seven startups in total, and an evaluation of the final artifact within three startups. Results: We developed HyMap, a hypotheses elicitation technique based on cognitive mapping. It consists of a visual language to depict a cognitive map representing the founder's understanding of the product, and a process to elicit this map consisting of a series of questions the founder must answer. Our evaluation showed that the artifacts are clear, easy to use, and useful, leading to hypotheses and helping founders visualize their idea. Conclusion: Our study contributes to both descriptive and prescriptive bodies of knowledge. Regarding the first, it provides a better understanding of the guidance founders use to develop their startups and, for the latter, a technique to identify hypotheses in early-stage software startups.


Jorge Melegati a,∗, Eduardo Guerra a, Xiaofeng Wang a

a Free University of Bozen-Bolzano, Piazza Domenicani 3, Bolzano, Italy

∗ Corresponding author


Keywords: hypotheses engineering, software startups, experimentation

1. Introduction

“Good sense is, of all things among men, the most equally distributed.” With this phrase, Descartes begins his Discourse on the Method, which changed science and human history. In that book, the philosopher introduces the scientific method, a systematic way to derive knowledge based on experiments. Recently, a new tendency strengthens this idea in software engineering: experimentation. This approach is a process of continuously validating product assumptions, transforming them into hypotheses, prioritizing them, and applying the scientific method to test these hypotheses, supporting or refuting them [1]. In this context, practitioners can employ several techniques, such as iterations with prototypes, gradual rollouts, and controlled experiments [2], but also problem and solution interviews [1].

In a recent position paper [3], we compared different models of experimentation and observed that, at the beginning of the process, they suggest that the team identify, specify, and prioritize hypotheses. Drawing a parallel to the Requirements Engineering activities employed in a requirement-driven approach, we argued for the need for a set of practices called Hypotheses Engineering (HE) to identify, specify, and prioritize hypotheses in experimentation.

Given the similarity between the terms assumption and hypothesis, it is fundamental to differentiate them. Throughout this paper, “assumption” refers to a personal or team-wise, generally implicit, understanding taken as truth without being questioned or proved. Meanwhile, “hypothesis” is an explicit statement that has not been proved yet but could be tested through an experiment. That is, assumptions are cognitive and abstract ideas, while hypotheses are concrete elements employed in experimentation.

The natural first step of HE is to elicit or define hypotheses. In this paper, we targeted this problem in the context of software startups. Software startups are organizations looking for a sustainable business model for an innovative product or service they develop where software is a core element [4]. Although we can easily identify success stories like Airbnb or Uber, most of these companies fail [5]. Reasons for the lack of success are various: demanding market conditions, lack of team commitment, financial issues [6], and inaccurate business development [7]. Since a defining characteristic of software startups is developing an innovative solution, experimentation is a key element in this context [8]. Such a value is corroborated by the fact that Lean Startup, the most well-known methodology among practitioners, has a strong emphasis on experimentation [9, 10].

Nevertheless, these companies still focus on developing their proposed solution instead of focusing on the necessary learning process [11]. This aspect is essential, especially in early-stage startups, for which developing the wrong features may exhaust resources and consequently end the company. One of the reasons for this limited adoption of experimentation is the lack of clearly defined practices [1]. Therefore, an essential step towards a better implementation of experimentation practices is a systematic way to specify and handle hypotheses [3]. In this study, our goal is to develop a novel technique to identify the hypotheses on which early-stage software startups base their products. Based on hypotheses, these companies could perform experiments and progress with more precise information about user and market needs. Therefore, to guide this study, we came up with the following research question:

RQ: How can early-stage software startups define hypotheses to support experimentation?

To achieve our goal, we followed a design science research (DSR) approach based on Hevner et al.’s guidelines [12], composed of three cycles. The goal of the first cycle was to understand how the assumptions on which startups base their products are formed. In the second and third cycles, we proposed, evaluated, and improved HyMap, a technique to elicit hypotheses based on a cognitive map systematically created through a set of questions. We evaluated the practice using a multiple-case study with three software startups. The results indicated that the technique is clear, easy to use, independent from the facilitator applying it, and useful, leading to hypotheses of three types: problem, value, and product. This paper extends a previous paper [13] that presented the first two cycles of this study. This paper’s main original contributions are the improvement in the graphical notation and process performed in the third cycle and the evaluation of the technique with three new software startups.

The remainder of this paper is organized according to Gregor and Hevner’s guidelines to present a DSR study [14]. Section 2 presents a literature review, including the justificatory knowledge to support the artifact’s effectiveness. Section 3 presents the DSR method and Section 4 the artifact development process. Section 5 describes the final artifact and Section 6 presents its evaluation. In Section 7, we discuss the results and, finally, Section 8 concludes the paper.

2. Literature review

Gregor and Hevner [14] made a distinction between descriptive knowledge (Ω) and prescriptive knowledge (Λ). While the first concerns the “what” about phenomena, including laws and theories to describe natural, artificial, or human phenomena, the second is focused on the “how” of human-built artifacts, including constructs, models, and methods. According to the authors, in DSR, it is important to review both areas to avoid a lack of novelty and, consequently, of contribution to knowledge. Besides that, this review should include the justificatory knowledge, that is, elements used to inform the artifact construction and explain its effectiveness. We organized this section according to this distinction: Section 2.1 describes the available solutions, indicating the gap, and Section 2.2 presents the justificatory knowledge that supports our proposed technique.

In the software engineering literature, there are some models to describe experimentation in general. We can mention RIGHT [15], HYPEX [16], and QCD [17]. The authors of these models analyzed the existing literature and current practices of companies applying experimentation. These models presented the process as cyclical approaches consisting of some steps executed continuously: identify, specify, and prioritize hypotheses, design an experiment, execute it, analyze the results, and update the hypotheses accordingly [3]. Nevertheless, these models do not describe how hypotheses could be systematically identified.

Other valuable pieces of Λ knowledge come from practitioner-oriented literature. Here, we could mention Customer Development [18] and the Lean Startup [19]. The latter had huge success among practitioners and consisted of taking the founders’ assumptions as hypotheses, building experiments to evaluate them, and, based on the results, persevering or pivoting to another idea. One criticism of the Lean Startup is precisely its lack of operationalization. For instance, Bosch et al. [20] proposed the Early-Stage Software Startup Development Model (ESSSDM) to tackle this problem. It consisted of three parts: idea generation, a prioritized backlog, and a funnel in which ideas are validated. To generate ideas, the authors suggested exploratory interviews, brainstorming, or following potential customers to understand their needs.

To the best of our knowledge, the only technique explicitly focused on hypotheses elicitation is Assumption Mapping. It is a technique recently proposed by Bland et al. [21] consisting of a series of canvases, including the Business Model Canvas (BMC) [22], to create hypotheses. Although the BMC was initially based on a systematically developed ontology [23], Assumption Mapping has been neither derived nor evaluated scientifically. In summary, to date, there is no hypothesis elicitation technique that has been systematically derived and evaluated.

The definition of software startup is not a matter of consensus among published papers, but the most common aspects are innovation and uncertainty [24]. Blank [18] proposed a definition generally adopted in practice: a startup is an organization formed to search for a repeatable and scalable business model. Therefore, searching for a business model for a novel software-intensive product is a key defining aspect of a software startup, contrasting these organizations with other development teams [25].

The business model concept has a plethora of different definitions in the academic literature [26]. Furnari [27] described two theoretical perspectives in business model research: an activity-based perspective that describes a business model as “a system of activities that firms use to create and capture value”, and a cognitive perspective that considers it as a cognitive instrument to represent those activities.

Based on the cognitive perspective, Furnari [27] proposed the use of cognitive maps to represent business models.
Cognitive maps are visual representations of causal aspects of a person’s belief system as a graph where nodes represent the concepts individuals use and arrows represent causal links between them [27]. The arrows are generally labeled according to the type of relationship: ‘+’ for a positive one, ‘-’ for a negative one, and ‘/o/’ for a neutral one. Cognitive maps are supported by Kelly’s Personal Construct Theory [28]. According to the theory, a person looks at the world through patterns or templates, which Kelly called constructs, that she creates and into which she tries to fit reality [29]. Kelly also described the person-as-a-scientist idea: “as a scientist, man seeks to predict, and thus control, the course of events” and these constructs “are intended to aid him in his predictive efforts” [29]. Brannback et al. [30] have already discussed the relationship of cognitive mapping and the personal construct theory with entrepreneurship. The authors argued that “an entrepreneur needs to make sense of his/her reality to predict and control - to find and solve problems” [30].

One aspect of software startups essential to this discussion is the founders’ influence on the product definition. Seppanen et al. [31] investigated the competencies of initial teams in software startups. They observed a strong influence from founders on the actions and competencies related to the business and product creation in these nascent companies. Based on what we have exposed so far, software startups’ business models are strongly influenced by how their founders perceive and mentally model the environment, and by how they use these models to predict the market and how the future product will behave. Research has shown that this influence is strong enough to prevent the use of experimentation. For instance, while investigating enablers and inhibitors for experimentation in software startups, Melegati et al. [32] identified as an inhibitor the fact that founders are often “in love” with the idea, deeming experiments to evaluate it unnecessary and focusing on developing the solution. Giardino et al. [11] argued that this focus is one of the key challenges early-stage software startups face.

Cognitive mapping could be used to materialize these assumptions and put them in a position to be challenged. As Eden [28] points out: “by seeing their own ideas in this form of visualization [people] are being encouraged to ‘change their mind.’” This technique has been used in Software Engineering, for instance, in problem structuring in requirements engineering [33] or in a decision model for distributed software development with agile [34].

The methods to elicit cognitive maps can be divided into two groups depending on how data is obtained [35]. Initially, researchers used documents or other sources of evidence to perform content analysis to develop these maps. Another way is through direct methods, where researchers develop the map in situ by interacting with subjects. These direct methods can employ two divergent approaches: pairwise judgments of causal relationships or capture through visual forms. In the first form, subjects answer questionnaires where all combinations of concepts are evaluated and, based on the answers, a map is built. In the second form, with the subject’s help, a facilitator builds a visual representation using paper and pencil or software solutions. The pairwise approach has better coverage at the expense of being more difficult, less engaging, and less representative than the freehand technique [35]. Since the startup context is defined by time and resource constraints [24], an effective approach targeted at these companies should not be time-consuming. Therefore, a visual approach is more suitable.

Regarding the population of startups, it is essential to mention that they may be in different development stages.
Based on previous works in the literature, Klotins et al. [36] proposed a life-cycle model to analyze startups’ progress, composed of four stages: inception, stabilization, growth, and maturity. The first stage starts with the idea and ends with the first product release. In the next stage, the startup prepares to scale regarding technical and operational perspectives. In summary, during the early stages, teams focus on “finding a relevant problem” and “a feasible solution.” In the growth stage, the startup aims to reach a desired market participation, and, finally, in the last stage, it progresses into an established company. That is, in the later stages, the focus is on marketing and efficiency. We decided to initially focus on developing a technique for early-stage startups, mainly for two reasons. First, the lack of testing of assumptions about the customer and market represents a higher risk to the startup’s survival at this stage, where the company generally does not have many resources. Second, in a later stage, the hypotheses obtained by the technique might have already been validated or refuted by the product usage from previous stages.

Since early-stage startups focus on finding the problem and evaluating the proposed solution, they are essentially testing their value proposition. In an ontological analysis of the value proposition concept, Sales et al. [37] defined “a value proposition as a value assertion a company makes (as the value beholder) that a given market segment (the beneficiaries) will ascribe a particular value to the experiences enabled by an offering (the value object).” Such a definition is compatible with the cognitive-based perspective of business models.
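
As an illustration of the cognitive maps described in this section, the following minimal Python sketch stores a map as a list of labeled causal links between concepts. The class names (`CausalLink`, `CognitiveMap`), the method names, and the example concepts are assumptions introduced here for illustration only; they are not part of Furnari's proposal [27] or of HyMap. Only the three link labels (‘+’, ‘-’, ‘/o/’) come from the text.

```python
from dataclasses import dataclass, field

# Link labels named in the text: '+' positive, '-' negative, '/o/' neutral.
VALID_LABELS = ("+", "-", "/o/")


@dataclass(frozen=True)
class CausalLink:
    """A causal arrow from one concept to another (hypothetical structure)."""
    source: str
    target: str
    label: str

    def __post_init__(self):
        # Reject labels outside the three relationship types.
        if self.label not in VALID_LABELS:
            raise ValueError(f"unknown link label: {self.label!r}")


@dataclass
class CognitiveMap:
    """A cognitive map stored as a list of labeled causal links."""
    links: list = field(default_factory=list)

    def add_link(self, source, target, label):
        self.links.append(CausalLink(source, target, label))

    def concepts(self):
        """Return every concept that appears at either end of a link."""
        found = set()
        for link in self.links:
            found.update((link.source, link.target))
        return found

    def effects_of(self, concept):
        """Return the links whose arrow starts at the given concept."""
        return [link for link in self.links if link.source == concept]


# Illustrative concepts only; they are not taken from any studied startup.
example = CognitiveMap()
example.add_link("online product catalogue", "ability to compare options", "+")
example.add_link("ability to compare options", "time spent choosing", "-")
print(sorted(example.concepts()))
```

Storing each arrow explicitly keeps every believed causal relationship visible as a separate item, which is in line with the idea of materializing assumptions so that they can be challenged.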

3. Research method

Given our research goals and question, we aim to solve a real-world problem. Instead of trying to understand how a defined phenomenon unfolds, our goal is to develop an artifact to act on the world. In this regard, Design Science Research (DSR) is a suitable method. This approach is often used in Information Systems research, as shown by the several methodological guidelines (e.g., Hevner et al. [12], Peffers et al. [38], and Wieringa [39]) and even a special issue in MIS Quarterly (i.e., [40]). Although its use is often not explicitly mentioned in Software Engineering research, in an analysis of awarded papers in the International Conference on Software Engineering, Engstrom et al. [41] showed that most of these studies could actually be classified as DSR even though they did not explicitly use the term. More recently, though, researchers have explicitly used the methodology to tackle problems in software engineering like gamification (e.g., [42]) and requirements (e.g., [43]).

In this research, we followed the guidelines proposed by Hevner and colleagues in [12] and [14]. According to the authors, DSR seeks to develop innovative artifacts relying on existing kernel theories “that are applied, tested, modified, and extended through the experience, creativity, intuition, and problem solving capabilities of the researcher” [12]. These artifacts could be constructs, models, methods, or instantiations. Constructs represent the language used to describe the world, and models use them to represent real-world situations. Methods define processes to guide how to solve problems and, finally, instantiations demonstrate how the previous elements could be used in a scenario. Based on this classification, we can categorize the expected artifact of this study as a method: one that guides how an early-stage software startup can identify the hypotheses that will drive its experiments.

Hevner et al. [12] proposed seven guidelines for DSR:

G1. Design as an artifact: DSR must produce a viable artifact;
G2. Problem relevance: the goal should be to develop a solution to relevant problems;
G3. Design evaluation: the “utility, quality, and efficacy” of the artifact should be rigorously demonstrated;
G4. Research contributions: the DSR project should provide “clear and verifiable contributions” regarding the “design artifact, design foundations, and/or design methodologies”;
G5. Research rigor: DSR should apply rigorous methods both in the construction and in the evaluation of the artifact;
G6. Design as a search process: the DSR process is inherently iterative and the search for the best, optimal solution is unfeasible. The goal should be feasible, good designs representing satisfactory solutions;
G7. Communication of research: DSR solutions should be presented effectively.

Since the target artifact is a method, we satisfy G1 and G4. Our argument about the importance of experimentation to early-stage software startups fulfills G2. In the following sections, following Gregor and Hevner’s guidelines, we describe the design artifact and its search (or development) process as a way to cope with G5 and G6. To present them in a logical order, in Section 4, we describe the development process and then, in Section 5, we present the final artifact. Section 6 presents the evaluation (G3 and G4) and, in Section 7, we discuss the research contributions (G4). This paper and the previous one [13] communicate our results (G7).

To increase the research rigor (G5), we guided our development and evaluation processes according to defined criteria. First, to fulfill the utility criterion, the achievements that the artifact aims for should have “value outside the development environment” [14]. Therefore, using the artifact, we should be able to create hypotheses for real situations, that is, for startups other than those that participated in the study. This concept is associated with perceived usefulness, which is generally used in research on the adoption of software development methodologies (e.g., [44, 45]) and technology in general, as in the Technology Acceptance Model [46]. This concept “refers to the degree to which a developer expects that following a methodology will improve his or her individual job performance” [45]. In the context of experimentation in software startups, we can operationalize this concept as obtaining hypotheses to build experiments.

Regarding the artifact quality, we consider several aspects: ease of use, independence from the facilitator, and clearness. Since our ultimate goal is to impact real startups, we should consider the future adoption of this method.
In this regard, taking the artifact as an innovation, complexity is one factor influencing adoption [47]. Additionally, given that startups are generally time- and resource-constrained, we expect that they would not be keen to spend a large amount of time learning and applying a new method. It should be independent of who is applying or facilitating it, providing all the details for proper use and allowing anyone to use it rather than depending on its authors. Finally, the method description should be clear, making its comprehension straightforward.

Finally, the artifact should be effective; that is, it should produce the expected result. In our case, our goals are to reveal hidden assumptions that founders had about the product’s environment and why it should have value for potential customers, and to systematically elicit hypotheses that would work as the basis for experiments.

4. Artifact design process

The design artifact development process consisted of an initial exploratory cycle and two design cycles, referred to below as cycles 0, 1, and 2. The first cycle (0) had the goal of understanding how teams form the assumptions on which they base their products. To achieve this goal, we performed a multiple-case study with two early-stage software startups. Our results indicate that requirements are based on the team’s, especially the founder’s, assumptions about the market and customer behavior. In the first design cycle (1), we used cognitive mapping to make founders’ assumptions explicit. At this stage, the technique consisted of using boxes and arrows as described in cognitive mapping and an open-ended talk in which the founder described her understanding and a facilitator drew the map. We evaluated this initially proposed method in two other software startups. Our results indicate that this approach could be the basis of a comprehensive practice to elicit hypotheses in software startups. However, it still lacked some operationalization. In the final design cycle (2), we improved the technique by defining specific notations for different concepts (customer, product, and features) and creating a list of questions to guide the cognitive map creation. Below, we describe the three cycles in detail, including the research method, data collection, analysis, and results obtained.

The goal of cycle 0 was to understand how teams form the assumptions on which they base their products. Given that such a phenomenon is contemporary and the boundaries between it and the context are not evident, a case study is a suitable research method [48]. According to Yin [48], one rationale for this research approach is the representative or typical case. Therefore, we selected software startups where, as mentioned before, the founder is the one who had the initial idea. Besides that, we followed Klotins et al.’s life-cycle model [36] and selected startups in the inception and stabilization phases. Through our contact network, we selected two startups, referred to from now on as A and B. Both companies were based in the same city in Italy and located in a technological park. Nevertheless, at the moment of data collection, startup B participated in the incubation process while startup A only used the space available.

Data collection consisted of semi-structured interviews that followed a previously defined guide. For both cases, we interviewed the founders and, for case B, also the software developer. The interview questions aimed to understand the interviewees’ background, the idea, the motivation to build the product, and how they changed throughout the company history. Since the goal was to understand where the assumptions used to create the product came from, data analysis consisted of explanation building, where cause-effect relationships are sought [49], looking for an explanation for the cases [48].

As the first step in the analysis, we developed case descriptions, as suggested by Yin [48]. Then, we performed a cross-case analysis.

At the time of data collection, startup A was developing a library to be added to software development projects. The company would provide a dashboard showing live software run-time issues, like exceptions, detected or inferred from data collected within the target system. The dashboard would also show solutions to similar problems found on websites focused on programming issues, like Stack Overflow, and a list of freelance developers who could solve the problem. In some cases, the system would be able to fix some issues automatically. The startup team was composed of five people working part-time on the project, spread across software development, business plan, and marketing strategy.

The founder had worked as a software development consultant for an extended period. While participating in third-party projects, he observed that such a tool could help him work more effectively. Besides that, he believed that the technical level of software developers was decreasing. Therefore, it would make sense to develop such a tool. In the founder’s words: “the idea came to me during my work as a consultant because I observed this need... let’s say the idea came from there, that is, seeing that my customers don’t [collect data from bugs], but I saw that some of them were starting to do something in that direction, and I also observed that the developer job market is growing, but the average know-how is probably decreasing.”

At the time of data collection, startup A had an initial prototype consisting of a dashboard with some dummy data and a website that displayed the idea.

Startup B was running a website to help hotel owners and managers find the best software solutions for their businesses. Its initial focus was on the Italian market, but it aimed to reach international markets. The team was composed of two founders/partners, one developer who founded the company but left the partnership, and an intern to help with administrative tasks. We performed one interview with one of the founders and another with the developer, both in the company office to allow further observations. The interviewed founder is the one who had the original idea.

The interviewed founder had a background in online marketing. He had worked in a company that handled web marketing and websites before spending twelve years in a big web agency. In his last job, he worked as the director of the company’s technology business unit. Throughout his work life, he had extensive contact with the tourism sector, especially the hospitality industry.

He claimed that the idea came to him based on the needs he observed among hotel owners, who have many technological tools available in the market to run the business, and among software vendors, who have to reach these customers. He was inspired by American software review websites and the lack of a specific one for the hospitality sector. Therefore, the original idea was to list available software with users’ reviews, bring hotel owners to the website, and receive a fee for each lead (an interested customer who visited the vendor website) generated. In his own words, “Let’s say, I have worked for many years in the touristic sector first as a consultant and later as a software producer, specifically in the hotel sector. I made the match between these two competencies, the problem of the hotel owners who have many technologies in the hotel and the problem of the software vendor who needs a showcase, a marketplace where to sell their products, and then I created this hotel technology marketplace.”

The founder said that after the original version went online, the team started observing the website usage data and realized it was not going as expected. The team observed that the hotel owners were not able to compare different software solutions because these products rarely have the same set of features and, sometimes, hotels needed more than one software system to fulfill their requirements. Then, the startup changed the website: now, the hotel owner fills in a form giving details about her business, and the system matches it, through a simple algorithm, with solutions adapted to the business needs.

In the founder’s words: “Initially, it started as a project similar to [review or comparison websites] then we started to collect data that was telling us that the hotel owners did not have the analytical capability to compare the features between one software and another [then, the product became] a system that allowed the hotel owner to tell us what were her needs fill in a form, and we, based on a simple algorithm, made the match between her needs and the database of software solutions that we had.”

In the interview with the developer, it was clear that his influence on the idea was limited. He thought that it was a good idea but did not have experience in the market. He trusted the founders regarding the business and focused on developing the solution.

At the moment of data collection, the startup’s customer base was growing, and it was close to break-even. It was looking to expand to other markets.

From the startup product and the founder’s background descriptions above, it is clear that the latter shaped a set of beliefs in the founders about their target customers and market. Through these sets of beliefs, the founders made sense of the specific business environment and its players, explaining their behavior and, ultimately, trying to forecast it. Specifically, in startup B, the founder considered that hotel owners wanted to buy software solutions, and that they were able to compare the different alternatives and select the best suited for their cases. Based on that, the founder foresaw that a website with a list of available software solutions would be useful to hotel owners. They would be able to see all the solutions and select the one that would fit their needs. Fig. 1 summarizes this process and compares it with the idea of the founder being the innovation owner and her experience being the motivator for the startup product idea, as discussed in the literature [31].

Figure 1: The process of idea creation. The dashed lines represent the previous understanding that the background of the founder led her to have the idea. Adapted from [13]. [Diagram elements: Previous experience; Assumptions about customers and market; Forecast about customers and market; Product idea; Background of founder; Startup idea.]

In summary, the assumptions a founder has about customers and the market guide requirements elicitation, that is, the beliefs that the founder has about the customers and market are the basis for the definition of software features.

In startup B, it was possible to see what could happen next. After the software was ready and the website went online, usage data showed that the results were not as predicted. Hence, the founder had to update his assumptions about the customers and change the product accordingly. Now, this new “implicit theory” has emerged from the experiment result and led the company to better results within the market. Nevertheless, to reach this stage, the startup spent resources developing the whole product; it could have reached it earlier if the team had analyzed the customers.

Such rearrangement exposed an implicit process model for development in software startups. In such a process, the founder’s assumptions guide the elicitation of requirements, and the data generated by the software usage may impose changes on this set of assumptions. Then, the founder uses such an updated representation of the world to elicit new requirements. Fig. 2 depicts this process through a causal chain [50].

Figure 2: The founder’s assumptions being updated, as shown in [13]. [Diagram elements: Assumptions; Requirements; Software; Data. Link labels: “which use generated”; “was not totally compatible with and updated the”.]

Based on the Cycle 0 results, a hypothesis elicitation approach should make explicit the founders’ underlying assumptions about the context in which their startups are included. Cognitive mapping is a valuable tool in this regard.

As a first version, we proposed to adapt the approach proposed by Furnari [27]. Using a whiteboard to depict the current status of the mapping and with the founder’s help, we aimed to create a cognitive map representing how and why the founder believes the startup’s business model works. The detailed steps were:
1. ask the founder to describe the business model concerning the value proposition and customers;
2. extract concepts and causal relationships;
3. dig into each concept to check whether it was, in reality, based on an underlying assumption;
4. check with the founder if the map represented the way she thought about the problem at the moment.

We evaluated this initial proposal in two other software startups, C and D. Both startups are located in the same Italian city as A and B. We performed interview sessions following the protocol defined below:
1. Present the concept of hypotheses and how Lean Startup is related to it.
2. Ask the interviewee to describe his business or product idea, especially regarding customer segments and value proposition.
3. Ask on which hypotheses the founder believed his idea is based.
4. Using a whiteboard and interacting with the founder, draw a cognitive map until she feels that the map represents her understanding of the market.
5. Create a list of hypotheses based on the cognitive map and compare it with the initially created list.
6. Ask the founder for feedback on the process.

Below, we describe the results for each case.

Case C is an early-stage software startup that plans to develop a digital mentor for software developers to increase their happiness and satisfaction. The product would adapt itself to each developer’s needs. Companies interested in improving their developers’ productivity, the customers, would pay a fee to make the solution available to their teams.

When asked about hypotheses, the founder mentioned those they had already worked with and those they were planning. The first one was that software development teams could not organize themselves. Through some interviews, it was invalidated, and they pivoted from the initial idea to the current one. The next hypothesis or, as the interviewee called it, “exploration” was to understand if software developers care about soft skills. When asked about other hypotheses, the founder said that she was waiting for another round of tests.

Fig. 3 displays a representation of the cognitive map derived for this case. Through this process, the founder stated that the main element to increase developers’ productivity would be making their work more fun through gamification.

Figure 3: Cognitive map created during the interview with the founder of startup C. [Diagram elements: Startup tool; Gamification; Make developers work more fun; Developers satisfaction; Developers productivity; Company satisfaction; Company results; arrows labeled “+”.]

The arrows in the figure imply six hypotheses: 1) developers’ productivity improves the company results; 2) developers’ satisfaction improves developers’ productivity; 3) making the development work more fun improves the developers’ productivity and 4) the developers’ satisfaction; 5) gamification could make developers’ work more fun; 6) making the development work more fun would increase the company satisfaction.

Although some identified hypotheses are straightforward and may not demand a proper experiment to be considered validated, the founder acknowledged that “[they] have to see if the correlation between having fun and the productivity [exists], that is a major risk.”
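As an illustration only (it is not part of the study’s artifacts), the reading of arrows as hypotheses can be sketched in a few lines of Python. The edge list is transcribed from the six hypotheses listed for startup C. The names `EDGES` and `to_hypothesis`, the shortened concept labels, the wording of the generated sentences, and the choice of labeling every arrow ‘+’ are assumptions made for this sketch.

```python
# Signed causal arrows (source, sign, target), transcribed from the six
# hypotheses listed for startup C. All signs are assumed to be '+'.
EDGES = [
    ("developers' productivity", "+", "company results"),
    ("developers' satisfaction", "+", "developers' productivity"),
    ("fun in development work", "+", "developers' productivity"),
    ("fun in development work", "+", "developers' satisfaction"),
    ("gamification", "+", "fun in development work"),
    ("fun in development work", "+", "company satisfaction"),
]


def to_hypothesis(source, sign, target):
    """Phrase one signed arrow as a statement (the wording is illustrative)."""
    verb = {"+": "increases", "-": "decreases"}[sign]
    return f"{source} {verb} {target}"


hypotheses = [to_hypothesis(*edge) for edge in EDGES]

# Concepts without inbound arrows can be read as the levers the founder acts on;
# concepts without outbound arrows can be read as the outcomes expected in the end.
sources = {source for source, _, _ in EDGES}
targets = {target for _, _, target in EDGES}
levers = sources - targets
outcomes = targets - sources

for number, text in enumerate(hypotheses, start=1):
    print(f"{number}) {text}")
```

Listing the arrows this way makes it easy to see that every hypothesis in this map ultimately depends on a single starting concept, gamification.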

Figure 4: Cognitive map created during the interview with the founder of startup D. [Diagram elements: Startup tool; Make network more transparent; Increase network efficiency; User's willingness to react; User satisfaction; arrows labeled “+” and “/o/”.]

Case D is developing a software solution to improve network connectivity, especially for situations where the quality of the Internet connection is low. Through an innovative approach that is suppressed here as requested by the interviewee, the solution will make the network status transparent, enabling users to adapt it to their needs and, consequently, improve the quality of service.

At the beginning of the interview, the founder answered that their main hypothesis concerned how large the area with bad quality is and whether providers are willing to fix it in the near future. He mentioned that he had talked to many potential customers regarding the solution, and most of them would like to have it. After that, the cognitive map was developed; it is depicted in Fig. 4.

From the arrows, there are four implied hypotheses: 1) increasing the network efficiency will improve user satisfaction, 2) making the network more transparent will not decrease user satisfaction, 3) making the network more transparent will increase the user’s willingness to react, and 4) the users’ willingness and ability to react will increase user satisfaction.

When confronted with the hypotheses, the founder mentioned they had already thought about them before. Nevertheless, in his words, the process “made them explicit and more structured.”

Although the results were promising, we observed that the process of cognitive map elicitation was not repeatable; instead, it was highly dependent on the interviewer. Besides that, the interviewer felt the lack of guidance on properly conducting this step, as we could observe at the beginning of the interviews. Another aspect that we observed was the lack of uniformity in the boxes’ content: some were concepts or nouns, but others were actions or events. Based on that, we improved the technique in a new research cycle. Regarding the expected attributes defined in Section 3, the artifact was useful, easy to use, and effective, but we should improve two of its qualities: independence from the facilitator and clearness.

Based on the previous cycle’s results, the need for a more systematic approach was evident. To achieve this goal, we focused on developing a proper visual language to build the maps and a systematic method to elicit them from founders. Since artifact development is a design search process [14], we performed a series of map elicitation sessions with potential founders. The subjects in this phase had not necessarily created a startup but had an idea that, in their opinion, could potentially be the basis of a new solution. After each session, the researchers evaluated the process and improved the artifacts to make the language and process more precise. The sessions occurred online, and the researcher shared his screen, where he drew the cognitive map using the software Diagrams.net (https://app.diagrams.net/) with the interviewee’s help. Once the results reached a satisfactory level, we considered the artifact design process completed.

Following this approach, we performed three sessions. While the first two were with Brazilian entrepreneurs located in two different cities, the third was with an entrepreneur based in Italy, in a different city from the cases analyzed in the previous cycles. Table 1 describes the improvements we applied to the technique in each session and the respective results.

5. HyMap: Hypotheses Elicitation using Cognitive Maps for Early-Stage Software Startups

The final artifact consists of two elements: a visual language to depict the founder’s cognitive map and a defined process to help draw the map and extract the hypotheses from it.

To depict the cognitive map, we developed a visual language consisting of the following elements:
• Circles represent the customer segments.
• An ellipse denotes the proposed solution.
• Dotted-line boxes portray the software features.
• Boxes represent concepts, either physical or abstract elements, and are filled with nouns.
• Arrows connect elements and represent relationships among them. There are three types of relationships, offering, influence, and perception, which are defined by the types of the elements connected. Offering arrows connect the solution and its features. Influence arrows are similar to those in cognitive maps, as mentioned before, and should be labeled with one sign, ‘+’, ‘-’, or ‘/o/’, to denote their type. Perception arrows are those that connect the customers with their problems.

There are no restrictions on the number of inbound or outbound arrows from boxes, but the map is expected to form an acyclic graph. Such a pattern for the construction leads to layers of elements in the map, as shown in Fig. 5. In the Product layer, we represent the product. In the Features layer, we represent the features the founders expect for the product. In the Problems layers, one or more layers of elements represent the aspects founders think the product features will solve. Finally, in the Customers layer, we represent the expected customers and users for the product.

Figure 5: A template for the HyMap map. [Layers: Customers layer (Customer Segment 1, Customer Segment 2); Problems layers (Problem 1, Problem 2, Problem 3, ..., Problem n); Features layer (Functionality 1, Functionality 2, Functionality 3); Product layer (Product Name). Influence arrows are labeled “+ or - or /o/”.]

The first step to reach hypotheses is to elicit the founder’s cognitive map. To reach this goal, we propose an iterative approach where, based on the map’s current status, the founder should analyze each of the relationships (arrows) and wonder if there are underlying concepts. At the beginning of this process, an initial map should be created based on the following questions:
• What is the product/solution name?
• What are the customers targeted by the solution?
• For each customer, what are the aspects the actor expects to improve using the solution?
• Which are the solution features envisioned, and which aspects identified in the previous step do they help fulfill?

The answers to these questions and the corresponding relationships lead to an initial version of the map. Then, for each arrow, the founder should judge if there are concepts implicitly used to explain that relationship. Some questions useful in this step are how? and why?. If a new concept is added along with new relationships (arrows), this process is repeated iteratively until the founder is comfortable that no new concepts should be added. A useful question to evaluate if this process is saturated is whether it is possible to create a simple experiment to evaluate that relationship. Additionally, the founder must evaluate if the new concepts added are related to other concepts already present on the map. Throughout the process, the founder should constantly assess if the forming map is coherent with her understanding of the customer and market. To refine the map, the founder can add, remove, or substitute elements.

Table 1: Sessions performed in artifact creation Cycle 2.

| Session | Visual language improvements | Process improvements | Results |
|---|---|---|---|
| 1 | A circle to define the potential customers’ problems to be tackled by the startup solution. | An initial set of questions to guide the map elicitation (product name and customers’ problems the solution aimed to solve) and an iterative approach to connect these two elements. | Much clearer guidance for the interviewer with respect to the previous cycle. But the process of asking about the customers and their problems was still not straightforward to explain to the interviewee. |
| 2 | Different elements for customers and their problems: circles to represent customers or users, while their problems used similar boxes as the other elements. | Questions changed according to the changes in the language. | Improved the process but, based on the analysis of the elicited maps, using the same element to represent software features and value concepts was confusing. |
| 3 | Dashed box to represent features, differentiating them from other elements. | None. | Satisfactory results. |

Once the cognitive map is finished, we can say that each relationship represents an assumption the founder has about its targeted customer, value proposition, and product. Based on them, she can formulate hypotheses from which she can create experiments. These experiments can be pieces of software but also interviews, questionnaires, or other techniques.

Arrows originating in different layers represent diverse types of hypotheses that demand different templates while crafting the hypotheses. Although the definition of systematic templates for hypotheses is beyond this paper’s scope, we defined a simple template for each type. Of course, the templates are only guidelines, and a final inspection of the wording was necessary to create well-formed phrases. Below, we describe each type and the corresponding template.
1. Arrows from the product to the features layer generate hypotheses regarding the team’s capability to develop that feature, which we called product hypotheses. A simple template for this type is: “the team developing <product name> is capable of implementing <functionality>”.
2. Arrows starting from the features layer to the problem layers, or those restricted to the problem layers, represent value hypotheses. In this case, a suitable template is “<Functionality or problem> <increases, decreases or does not affect> <problem>”.
3. Arrows connecting customers to the problem layers lead to problem hypotheses, that is, whether that problem is a real “pain” for the customers that will make them pay for the solution. An initial template for this type of hypothesis is “<Customer segment> <has/would like to> <problem>”.
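The visual language and the three templates lend themselves to a small data model. The following Python sketch is one possible encoding under stated assumptions, not the tooling used in the study: the `HyMap` and `Arrow` classes, the layer names, and the method names are hypothetical; the problem template is fixed to the “has” variant; and the example map only borrows wording from the case E hypotheses quoted in Section 6, with one problem-to-problem arrow added purely for illustration. Acyclicity is checked with the standard library’s `graphlib.TopologicalSorter`.

```python
from dataclasses import dataclass, field
from graphlib import CycleError, TopologicalSorter
from typing import Optional

LAYERS = ("product", "feature", "problem", "customer")
VERBS = {"+": "increases", "-": "decreases", "/o/": "does not affect"}


@dataclass(frozen=True)
class Arrow:
    source: str
    target: str
    sign: Optional[str] = None  # only influence arrows carry a sign


@dataclass
class HyMap:
    layer_of: dict = field(default_factory=dict)  # element name -> layer
    arrows: list = field(default_factory=list)

    def add(self, name: str, layer: str) -> None:
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.layer_of[name] = layer

    def connect(self, source: str, target: str, sign: Optional[str] = None) -> None:
        for name in (source, target):
            if name not in self.layer_of:
                raise KeyError(f"unknown element: {name}")
        self.arrows.append(Arrow(source, target, sign))

    def is_acyclic(self) -> bool:
        # TopologicalSorter takes a mapping from each node to its predecessors.
        graph = {name: set() for name in self.layer_of}
        for arrow in self.arrows:
            graph[arrow.target].add(arrow.source)
        try:
            list(TopologicalSorter(graph).static_order())
        except CycleError:
            return False
        return True

    def hypotheses(self) -> list:
        """Return one (type, sentence) pair per arrow, following the templates."""
        result = []
        for a in self.arrows:
            kinds = (self.layer_of[a.source], self.layer_of[a.target])
            if kinds == ("product", "feature"):
                text = f"the team developing {a.source} is capable of implementing {a.target}"
                result.append(("product", text))
            elif kinds in (("feature", "problem"), ("problem", "problem")):
                # Influence arrows must carry one of the signs in VERBS.
                result.append(("value", f"{a.source} {VERBS[a.sign]} {a.target}"))
            elif kinds == ("customer", "problem"):
                result.append(("problem", f"{a.source} has {a.target}"))
            else:
                raise ValueError(f"arrow not covered by the templates: {a}")
        return result


# Illustrative map; element names are assumptions loosely based on case E.
example = HyMap()
example.add("the product", "product")
example.add("search of nearby people with similar interests", "feature")
example.add("difficulty to find people with similar interests", "problem")
example.add("difficulty to form game tables", "problem")
example.add("the board game player", "customer")
example.connect("the product", "search of nearby people with similar interests")
example.connect("search of nearby people with similar interests",
                "difficulty to find people with similar interests", "-")
example.connect("difficulty to find people with similar interests",
                "difficulty to form game tables", "+")  # assumed arrow
example.connect("the board game player", "difficulty to form game tables")

for kind, text in example.hypotheses():
    print(f"[{kind}] {text}")
```

As the text notes, the generated sentences are only drafts: a final inspection of the wording is still needed to turn them into well-formed hypotheses.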

6. Evaluation

In this section, we describe the HyMap evaluation. Rigorous design evaluation is an essential element of DSR [14].

Since a startup is a complex phenomenon with many variables, like the founders’ background, the product, and the market in which they operate, and the boundary between the phenomenon and the context is blurred, a case study is a suitable choice to evaluate a technique for these companies. To do so, we executed a protocol similar to that of Cycle 1 but online, as in Cycle 2. To evaluate the artifact’s ease of use and independence from the facilitator, a researcher different from the one who performed the sessions in the construction phase was responsible for facilitating the sessions. The other researcher acted as an observer during the elicitation sessions. Another difference was that, instead of doing everything in one session, we divided the protocol into two steps. First, the facilitator, with the help of the founder, created the map. Then, we created the hypotheses list offline and sent it to the founder, and, in a second session, we performed an interview to get her feedback on the hypotheses and the process. If the founder was not available for a second interview, we sent a questionnaire. We also used this instrument as a guide when we performed the second interview. In both situations, we started by asking, for each hypothesis, if the founder believed the hypothesis had been validated and, if so, how, and how she perceived the risk to the business if it was not valid. Then, we asked for feedback about the process’s usefulness, ease of use, and clearness, and whether the process led the founder to think about something she had not thought of before but would consider in the following startup steps.

To sample the cases, we employed a theoretical approach based on the different startup stages, as described in Section 2. Since the focus of HyMap is on early-stage startups, we targeted companies in the inception and stabilization stages. Since, by the end of the inception stage, we expect a startup to have already partially developed the product, we aimed to compare startups at the beginning and at the end of this stage. To reach the startups, we followed a convenience approach, using our network of contacts to recruit interested startups.

We performed the planned case study in three startups that we refer to as E, F, and G, ordered by the development stage they were in at the moment of data collection: beginning of inception, end of inception, and stabilization, respectively. Below, we describe the companies and the results we obtained for each in detail. To preserve the startups’ privacy, a request that founders often make, we do not explicitly put the product or startup names in the descriptions or the cognitive maps.

Case E is a Brazilian startup planning an app to connect board game enthusiasts so they can meet and form groups for playing sessions. The startup also plans to provide services to board game shops that want people to come and play on their premises and to publishers that want to promote their games. At the time of data collection, the startup had already created the brand and started building an online presence. Because of the 2020 coronavirus pandemic, the development halted in the beginning, and the startup lost all team members but the founder. Therefore, we classify the startup as being at the beginning of the inception stage. The interview performed with the founder resulted in the cognitive map depicted in Fig. 6.

Based on the cognitive map, we identified 22 hypotheses, of which four were related to problems (e.g., “board game players have difficulty forming game tables”), 12 to value (e.g., “the search of nearby people with similar interests decreases the difficulty to find people with similar interests”), and six to the product (e.g., “the development team is capable of implementing the search of nearby people with similar interests”).

Regarding the problem hypotheses, the founder said that one (“board game players have difficulty forming game tables”) has a high risk to the business, and she validated it through her own experience within the field and through offline and online surveys. The other three problem hypotheses have medium risk, and she validated them by talking to shops and publishers. Out of the 12 value hypotheses, the founder considered eight with high risk to the business and four with medium risk. She believes that all value hypotheses are validated except one: “creating game tables through the app facilitates bringing people to play at the shop.” For the 11 value hypotheses the founder considered validated, we grouped similar strategies. Since she mentioned more than one strategy per hypothesis, the sum of occurrences is larger than the total of hypotheses. She mentioned that validation came from offline and online surveys for six of them, from the comparison with similar tools for four, from resembling business models for three, and from her own experience with the market for three. Finally, regarding product hypotheses, the founder considered all of them high risk except those regarding the news feed (low risk) and suggestions (medium risk). However, they had not been validated so far because of the lack of a development team.

Case F is a Brazilian startup developing an app to connect patients to health professionals generally not found through insurance companies, like psychologists, nutritionists, and chiropractors. The startup is located in the same city as case E. By the time of data collection, the company had developed most of the software solution and was planning to launch shortly. Therefore, the startup is at the end of the inception stage. The founding team consisted of two people: one concentrated on software development and the other on the product conception and other issues. We performed interviews with the latter. The result of the first interview is the cognitive map depicted in Fig. 7.

Based on the cognitive map, we identified 23 hypotheses, of which eight were related to problems (e.g., “the patient has difficulty finding professionals”), ten to value (e.g., “searching professionals by type and place decreases the difficulty to find professionals”), and five to the product (e.g., “the team developing the product is capable of implementing the search for professionals by type and place”).

When we asked the interviewee to rate the problem hypotheses, he answered that they had validated seven out of the eight hypotheses based on his own experience using those services or talking to professionals they know. The founder regarded five validated hypotheses as having a high risk to the business, one hypothesis as medium risk, and two as low risk. The hypothesis not validated was about the referral program. The founder attributed this classification to the fact that this feature is not essential to product viability. Regarding the value hypotheses, he acknowledged that they had not evaluated them, but it will be possible to evaluate them as soon as they launch the product. Regarding the risk, five were considered high, one was classified as medium, and four as low. Finally, since the product was almost ready, the product hypotheses had already been validated; three of them had high risk, while two had low risk.

Nevertheless, the founder observed that the process did not identify a potential hypothesis: patients have difficulty booking appointments with professionals. He mentioned that this aspect became clear to him while discussing with the facilitator after the diagram elicitation process ended.

Case G is another Brazilian startup, which is developing an online marketplace for second-hand sports gear. The startup is located in a different city from the previous cases. By the time of data collection, the startup had existed for a year, and the service was already online. Therefore, we classify this startup as being in the stabilization stage. We ran a session with the startup founder that led to the cognitive map depicted in Fig. 8.

Based on this map, we generated 16 hypotheses, of which two were related to the problems (e.g., “sports enthusiasts have difficulty accessing sports gear”), ten to value (e.g., “the lack of trust in the products increases the difficulty to access sports gear”), and four to the product (e.g., “the team implementing the solution is capable of implementing seller’s reputation”).

The founder answered a questionnaire about his evaluation of the generated hypotheses. Regarding the two problem hypotheses, the founder answered that they represent a high risk to the business, but they were validated based on “field and online surveys.” The founder had a similar view of the product hypotheses: all represented high risk but were validated based on the actual implementation. Concerning the value hypotheses, out of the 11 hypotheses, the founder considered six validated and five not. The validation came either from “field and online surveys” or from the service’s current users. The founder evaluated the risk as high for one, medium for four, and low for one. For those not validated, two were high risk and three medium.

Figure 6: Cognitive map created for startup E. [Diagram elements: Startup product; Board game players; Board game shops; Publishers; Difficulty to create a game table; Difficulty to find people with similar interests; Compatibility issues with time and place; Attract people to play; Get information about players; Increase the games publicity; Search nearby people with similar interests; Create game table; News feed; Games suggestions; Player profile; Data exportation; arrows labeled “+” and “-”.]

Figure 7: Cognitive map created for startup F. [Diagram elements: Startup product; Patient; Health professional out of health insurance; Difficulty to find professionals; Difficulty to pay for the appointment; Losses because of no-show; Difficulty to get visibility to potential patients; Receive commission for referral; Difficulty to build customer loyalty; Search professionals by type and place; Favorite professionals list; Booking through the app; Payment through the app; Send referral message; arrows labeled “+” and “-”.]

Figure 8: Cognitive map created for startup G. [Diagram elements: Startup product; Sports enthusiasts; Difficulty to access sports gear; Gear high cost; Low level of details of advertised products; Ads low quality; Difficulty to sell used sports gear; Lack of trust in the product; Specialized ad creation guide; Personalized sales fee policy; Broker buy and sell operations; Sellers' reputation; arrows labeled “+” and “-”.]

Table 2 summarizes how the founders perceived the hypotheses identified.

Comparing the different cases, we could observe similar results. Regarding problem hypotheses, founders claimed that, although these statements carried high risk, they had validated them. The exception was two problem hypotheses for case F that were regarded as not validated but with a minor risk since they were related to a feature not essential to the product. Founders claimed to have validated these hypotheses mainly based on their own experiences or interaction with customers from the targeted market.

For value hypotheses, we obtained different results. For startup G, the one in the stabilization stage, the founder claimed that the hypotheses were validated by the product usage. For cases E and F, the picture differed: the founder of F said that he expects to validate these hypotheses with the product usage, whereas the founder of E claimed that most of the hypotheses were validated based on her experience and surveys with potential customers.

An evident aspect regarding these two types of hypotheses is the prevalence of the founders’ previous experience as supporting evidence. Despite the interviewees’ claims of validity, there was no systematic approach to evaluate these hypotheses and, consequently, there is a risk that they are not valid, leading to the development of unneeded solutions.

Regarding product hypotheses, the founders considered these hypotheses validated in the two cases (F and G) where the initial product was ready. For case E, since the product development had not started, the founder believes that the hypotheses have not been validated. In all cases, founders generally considered the hypotheses as high risk to the product viability.

Regarding the feedback about the technique, all founders answered that it allowed them to see their business idea better. Although it did not highlight unnoticed elements, founders claimed that the practice gave a structured form to the product idea.
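The cross-case comparison can be re-checked with simple arithmetic. The sketch below, offered only as a reading aid, copies the “Total” columns of Table 2 into a Python dictionary and computes, per case, the overall number of hypotheses and the share that the founders considered validated for each type. The names `TABLE2`, `total`, and `validated_share` are hypothetical.

```python
# (validated, not validated) counts per hypothesis type, copied from the
# "Total" columns of Table 2.
TABLE2 = {
    "E": {"problem": (4, 0), "value": (11, 1), "product": (0, 6)},
    "F": {"problem": (7, 1), "value": (0, 10), "product": (5, 0)},
    "G": {"problem": (2, 0), "value": (6, 4), "product": (4, 0)},
}


def total(case):
    """Number of hypotheses elicited for one case, over all types."""
    return sum(validated + rest for validated, rest in TABLE2[case].values())


def validated_share(case, kind):
    """Fraction of hypotheses of one type that the founder considered validated."""
    validated, rest = TABLE2[case][kind]
    return validated / (validated + rest)


for case in TABLE2:
    shares = ", ".join(
        f"{kind}: {validated_share(case, kind):.0%}" for kind in TABLE2[case]
    )
    print(f"Case {case}: {total(case)} hypotheses ({shares})")
```

The totals computed this way (22, 23, and 16) match the number of hypotheses reported for cases E, F, and G.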

7. Discussion

With the proposed artifact, we aimed to answer our research question: “How can early-stage software startups define hypotheses to support experimentation?” To verify if the artifact reached this goal, we proposed analyzing three criteria: utility, quality, and effectiveness.

Regarding utility, the use of the technique in the startups, as described in Section 6, demonstrated its capability of eliciting hypotheses even though the more mature the startup becomes, the higher the probability that teams have already confirmed the hypotheses. Nevertheless, the technique identified hypotheses not validated, even for the startup in the stabilization stage. Besides that, all founders mentioned the value of having a graphical overview of their business. For instance, such visualization may help communicate product aspects to other stakeholders like a marketing agency. We can also expect that teams could use the map as a living document, updated as the startup progresses and validates or updates hypotheses about its product, customers, and market.

Table 2: Summary of hypotheses obtained by case. The letters L, M, and H stand for the risk level perceived by the founders: low, medium, and high.

| Case | Stage | Hypotheses | Problem L | Problem M | Problem H | Problem Total | Value L | Value M | Value H | Value Total | Product L | Product M | Product H | Product Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E | Inception (begin) | Validated | - | 3 | 1 | 4 | - | 3 | 8 | 11 | - | - | - | - |
| E | Inception (begin) | Not validated | - | - | - | - | - | 1 | - | 1 | 1 | 1 | 4 | 6 |
| F | Inception (end) | Validated | 2 | - | 5 | 7 | - | - | - | - | 2 | - | 3 | 5 |
| F | Inception (end) | Not validated | - | 1 | - | 1 | 4 | 1 | 5 | 10 | - | - | - | - |
| G | Stabilization | Validated | - | - | 2 | 2 | 1 | 4 | 1 | 6 | - | - | 4 | 4 |
| G | Stabilization | Not validated | - | - | - | - | - | 3 | 1 | 4 | - | - | - | - |

To evaluate quality, we considered three aspects: ease of use, independence from the facilitator, and clearness. The small number of visual language elements and the simplicity of the process are good indicators of ease of use. The amount of time spent creating the map (around one hour in each case) demonstrates that the technique demands few resources, which is essential in the resource- and time-constrained startup context. This aspect is correlated with the independence from the facilitator, which we observed through the ease with which the evaluation sessions ran and the results they reached. Finally, the diagram and the process are clear, as shown by the maps displayed and the process description. An aspect that we have not explicitly evaluated is whether the technique could reach a complete set of hypotheses, that is, whether it could identify all of them. Based on the example of case F, it is clear that this was not achieved. This fact is probably related to our choice of using a freehand approach rather than a pairwise one, which is linked to better coverage [35], as we discussed earlier. Besides that, our goal was to reach an initial set of hypotheses that could be extended and refined throughout the startup’s existence.
Finally, this changing behavior is similar to that of requirements and was one of the reasons behind agile software development [51].

Effectiveness is the most challenging aspect to evaluate. Given that founders were not used to thinking about hypotheses, asking them at the beginning of the session what their hypotheses were was not practical, and we abandoned it in the evaluation. Then, to support the claim of an effective technique, we rely on the founders’ feedback during the whole process of artifact construction and on the lack of validation of hypotheses observed even in later stages. Regarding the latter, our results showed that startups, even with an initial version of the product or with the product live, still had hypotheses without proper backing.

Our evaluation supports several attributes expected of the technique, like ease of use, usefulness, and clearness, but other studies could better evaluate other aspects like effectiveness. Nevertheless, we agree with Hevner [14], who writes: “When a researcher has expended significant effort in developing an artifact in a project, often with much formative testing, the summative (final) testing should not necessarily be expected to be as full or as in-depth as evaluation in a behavioral research project where the artifact was developed by someone else.” As future work, we suggest experiments, possibly comparing HyMap with a different approach like Assumption Mapping [21].

An interesting consequence of the diagram developed is a relationship between hypotheses, or at least some types of them, and requirements. This result is in line with the concept of Dual-track development proposed by Sedano et al. [52]. Through a comprehensive field study in a development company, the authors proposed a conceptual framework to reconcile human-centered design and agile methods. According to them, “a software project comprises two continuous, ongoing, parallel tracks of work” where one generates feature ideas and the other uses these ideas to build the product. Our results give a piece of evidence for a possible opposite flow: potential requirements leading to guidance to better understand the product.

Still regarding the hypotheses types, we can map them to the steps in the idea creation process depicted in Fig. 1. Based on previous personal experience, the founder builds an understanding of the target market and customers that leads to problem hypotheses through the HyMap technique. Based on this understanding, the founder forecasts how the customers and market will behave. The elicitation process extracts these assumptions as value hypotheses. Finally, the product envisioned by the founder that would carry out her expectations leads to the product hypotheses relating to the solution feasibility and the team’s capability of building it. Fig. 9 depicts this comparison.

Figure 9: The relationship between the idea creation process and the hypotheses types identified by the HyMap process. [Diagram elements: Previous experience; Assumptions about customers and market; Forecast about customers and market; Product idea; Background of founder; Startup idea; Problem hypotheses; Value hypotheses; Product hypotheses.]

Besides that, the frequency of their own experience as an answer to how the hypotheses were validated is another piece of evidence to support the idea creation process. It was also evident in these cases that founders based their product ideas on their previous experiences or observations of the target market and not necessarily on proper backing.

Another interesting aspect of hypothesis types is that they can act as elements to help in prioritization, an essential aspect of hypotheses engineering [3]. For instance, regarding the types we identified in HyMap, if customers do not feel the problems, it is hard for the product to succeed, and without them, the whole map would not exist. Therefore, we could expect that problem hypotheses are probably the first to be evaluated.

Our results are also related to the three approaches to software development identified by Bosch et al. [53].
According to the authors, these approaches would be the conventional requirement-driven approach, the outcome- or data-driven approach, which is essentially experimentation, and a rising AI-driven software development, where the software would be automatically updated by machine learning algorithms trained with user data. The authors argue that these approaches co-exist and should be used according to the needs. The relationship among potential features and hypotheses identified in HyMap diagrams suggests an intertwined process where requirements could also lead to hypotheses. This idea is also related to the concept of hybrid development [54].

Regarding the artifact construction process, it is important to summarize the cycles and knowledge flow in the DSR. As inputs in the design process from the descriptive knowledge base, there are the elements already described in Section 2, like the Personal Constructs Theory [29] and the life-cycle of software startups [36]. Besides that, we used other elements in the following cycles, like the value proposition ontology analysis [37]. From the prescriptive knowledge base, we used cognitive mapping techniques and their use to depict business models (e.g., [27]). The description of the cycles also displayed the iterative nature of the DSR. Such an aspect is clear from the improvement of the depicted maps. The next section describes this study’s contributions to the descriptive and prescriptive knowledge base.
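
As an illustration of the prioritization role of hypothesis types discussed earlier in this section, the following minimal Python sketch orders a set of elicited hypotheses by type. It is not part of the HyMap artifact. The names `HypothesisType`, `Hypothesis`, and `prioritize` and the example statements are hypothetical. Only the expectation that problem hypotheses come first is taken from the discussion; the relative order of value and product hypotheses is an assumption made for the sake of the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class HypothesisType(IntEnum):
    # Lower value means "evaluate earlier".
    # Only PROBLEM-first follows from the discussion; placing VALUE
    # before PRODUCT is an assumption of this sketch.
    PROBLEM = 0
    VALUE = 1
    PRODUCT = 2


@dataclass(frozen=True)
class Hypothesis:
    statement: str
    kind: HypothesisType


def prioritize(hypotheses):
    """Order hypotheses by type; sorted() is stable, so hypotheses of
    the same type keep their original relative order."""
    return sorted(hypotheses, key=lambda h: h.kind)


if __name__ == "__main__":
    # Hypothetical hypotheses, for illustration only.
    backlog = [
        Hypothesis("The team can build the scheduling feature", HypothesisType.PRODUCT),
        Hypothesis("Customers will value saving time on scheduling", HypothesisType.VALUE),
        Hypothesis("Customers struggle to schedule appointments", HypothesisType.PROBLEM),
    ]
    for h in prioritize(backlog):
        print(f"{h.kind.name}: {h.statement}")
```

A team that weighs hypotheses differently, for instance by risk or by cost of validation, could replace the sort key without changing the rest of the sketch.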

To summarize the contributions of our study, we will use the concepts of descriptive (Ω) and prescriptive knowledge (Λ) [14]. Regarding the first, Cycle 0 results showed that founders develop an understanding of the customers and markets based on their previous experience and use this perception to develop the idea and forecast how it will behave with customers. The artifact evaluation also corroborated this result. Based on the final artifact, it was possible to observe at least three different types of hypotheses: product, value, and problem. Such a set is probably not complete but brings the idea that there should be different types, handled and validated in different ways. Finally, the relationship between requirements and hypotheses is a novel insight.

Concerning prescriptive knowledge, our contributions are the visual language used to depict the cognitive map and a systematic process for eliciting it. Early-stage software startups could use this technique to guide their initial steps more systematically. We envision that this practice could be useful for other software development teams when creating new features for consolidated products, but investigating this suggestion is beyond this paper’s scope.

Given that we employed multiple-case studies in two cycles of the design phase (as already described in our previous paper [13]) and in the evaluation of the final artifact, we deemed it essential to discuss threats to the validity of these investigations. We followed the definitions given by Runeson and Höst [49]. The authors describe four aspects of validity for a case study: construct validity, internal validity, external validity, and reliability.

Construct validity concerns to what extent the case elements studied represent what the researchers have in mind. A common threat when using interviews is whether the interviewee has the same understanding of terms and concepts used in the questions as the interviewer.
Since the interview guide for Cycle 0 focused on the business model description and evolution, such a threat is minimal. Besides that, the triangulation of data with an interview with a different team member decreased the threat even more.

Triangulation was also important to mitigate threats to internal validity. This aspect relates to causal inferences when the researchers attribute the cause of an effect to a phenomenon, but, in reality, it is caused by another factor not considered in the analysis. In addition to triangulation, we employed peer debriefing; that is, all authors discussed the results.

External validity reflects the extent to which the results can be generalized or are of interest to other people outside the studied case [49]. As mentioned by Runeson et al. [49], in case studies, it is not possible to claim statistical significance. Still, the goal should be an analytical generalization of the results to cases with similar characteristics. As we argued before, the studied cases are typical software startups where the founder is the main innovation owner and, consequently, dictates the requirements. Besides that, these companies generally focus on developing a solution instead of understanding the customer [11, 55]. Therefore, we expect that our results are valuable to describe a large portion of early-stage software startups.

Finally, reliability concerns to what extent the results are dependent on the researchers that performed the study. That is, if another researcher conducted the same study, she should reach similar conclusions. To improve this aspect, we described all the steps for data collection and analysis in all artifact construction cycles and the evaluation.

8. Conclusions

Experimentation is a useful approach to guide software development in startups. However, the lack of defined practices to guide these teams is one reason for the reduced use of this approach. Given that identifying hypotheses is the first step to create experiments, this study focused on developing a practice to perform this task in early-stage software startups. Following a Design Science Research approach, we performed three cycles to build a visual language to depict cognitive maps of startup founders and a systematic process to extract them. We evaluated these artifacts on three startups in different development stages.

As mentioned earlier, an extensive evaluation of HyMap would be valuable future work, probably using controlled experiments or longitudinal case studies. Another interesting work would be to assess the technique outside the startup context, for instance, when adding new features to market-driven products where development teams create software for a market of users rather than specific customers. Besides that, other studies could improve the completeness of the hypotheses set generated by the technique, probably extending the language and the process. These enhanced processes could also tackle other types of hypotheses.

References

[1] E. Lindgren, J. Münch, Raising the odds of success: the current state of experimentation in product development, Information and Software Technology 77 (2016) 80–91. doi:10.1016/j.infsof.2016.04.008.
[2] A. Fabijan, P. Dmitriev, C. McFarland, L. Vermeer, H. Holmström Olsson, J. Bosch, Experimentation growth: Evolving trustworthy a/b testing capabilities in online software companies, Journal of Software: Evolution and Process 30 (12) (2018) e2113. doi:10.1002/smr.2113.
[3] J. Melegati, X. Wang, P. Abrahamsson, Hypotheses Engineering: First Essential Steps of Experiment-Driven Software Development, in: 2019 IEEE/ACM Joint 4th International Workshop on Rapid Continuous Software Engineering and 1st International Workshop on Data-Driven Decisions, Experimentation and Evolution (RCoSE/DDrEE), IEEE, 2019, pp. 16–19. doi:10.1109/RCoSE/DDrEE.2019.00011.
[4] M. Unterkalmsteiner, P. Abrahamsson, A. Nguyen-Duc, G. H. Baltes, K. Conboy, D. Dennehy, R. Sweetman, H. Edison, S. Shahid, X. Wang, J. Garbajosa, T. Gorschek, L. Hokkanen, I. Lunesu, M. Marchesi, L. Morgan, C. Selig, M. Oivo, S. Shah, F. Kon, Software Startups - A Research Agenda, e-Informatica Software Engineering Journal 10 (1) (2016) 1–28. doi:10.5277/e-Inf160105.
[5] B. L. Herrmann, M. Marmer, E. Dogrultan, D. Holtschke, Startup ecosystem report 2012, Telefonica Digital and Startup Genome (2012).
[6] E. Klotins, M. Unterkalmsteiner, T. Gorschek, Software engineering in start-up companies: An analysis of 88 experience reports, Empirical Software Engineering (2018) 1–19. doi:10.1007/s10664-018-9620-y.
[7] M. Cantamessa, V. Gatteschi, G. Perboli, M. Rosano, Startups’ Roads to Failure, Sustainability 10 (7) (2018) 2346. doi:10.3390/su10072346.
[8] W. R. Kerr, R. Nanda, M. Rhodes-Kropf, Entrepreneurship as Experimentation, Journal of Economic Perspectives 28 (3) (2014) 25–48. doi:10.1257/jep.28.3.25.
[9] D. L. Frederiksen, A. Brem, How do entrepreneurs think they create value? A scientific reflection of Eric Ries’ Lean Startup approach, International Entrepreneurship and Management Journal 13 (1) (2017) 169–189. doi:10.1007/s11365-016-0411-x.
[10] R. F. Bortolini, M. Nogueira Cortimiglia, A. d. M. F. Danilevicz, A. Ghezzi, Lean Startup: a comprehensive historical review, Management Decision (August 2018). doi:10.1108/MD-07-2017-0663.
[11] C. Giardino, X. Wang, P. Abrahamsson, Why early-stage software startups fail: A behavioral framework, Lecture Notes in Business Information Processing 182 LNBIP (2014) 27–41. doi:10.1007/978-3-319-08738-2.
[12] Hevner, March, Park, Ram, Design Science in Information Systems Research, MIS Quarterly 28 (1) (2004) 75–106. doi:10.2307/25148625.
[13] J. Melegati, X. Wang, Hypotheses elicitation in early-stage software startups based on cognitive mapping, in: V. Stray, R. Hoda, M. Paasivaara, P. Kruchten (Eds.), Agile Processes in Software Engineering and Extreme Programming, Springer International Publishing, Cham, 2020, pp. 211–220. doi:10.1007/978-3-030-49392-9_14.
[14] S. Gregor, A. R. Hevner, Positioning and Presenting Design Science Research for Maximum Impact, MIS Quarterly 37 (2) (2013) 337–355. doi:10.25300/MISQ/2013/37.2.01.
[15] F. Fagerholm, A. Sanchez Guinea, H. Mäenpää, J. Münch, The RIGHT model for Continuous Experimentation, Journal of Systems and Software 123 (2017) 292–305. doi:10.1016/j.jss.2016.03.034.
[16] H. H. Olsson, J. Bosch, From Opinions to Data-Driven Software R&D: A Multi-case Study on How to Close the ’Open Loop’ Problem, in: 2014 40th EUROMICRO Conference on Software Engineering and Advanced Applications, IEEE, 2014, pp. 9–16. doi:10.1109/SEAA.2014.75.
[17] H. H. Olsson, J. Bosch, Towards Continuous Customer Validation: A Conceptual Model for Combining Qualitative Customer Feedback with Quantitative Customer Observation, in: Lecture Notes in Business Information Processing, Vol. 210, 2015, pp. 154–166. doi:10.1007/978-3-319-19593-3_13.
[18] S. Blank, The four steps to the epiphany: successful strategies for products that win, BookBaby, 2013.
[19] E. Ries, The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses, Crown Business, 2011.
[20] J. Bosch, H. H. Olsson, J. Björk, J. Ljungblad, The Early Stage Software Startup Development Model: A Framework for Operationalizing Lean Principles in Software Startups, Lean Enterprise Software and Systems (2013) 1–15. doi:10.1007/978-3-642-44930-7.
[21] D. Bland, A. Osterwalder, Testing Business Ideas, Wiley, 2019.
[22] A. Osterwalder, Y. Pigneur, T. Clark, Business Model Generation: A Handbook for Visionaries, Game Changers, and Challengers, Alexander Osterwalder & Yves Pigneur, 2009.
[23] A. Osterwalder, Y. Pigneur, C. L. Tucci, Clarifying Business Models: Origins, Present, and Future of the Concept, Communications of the Association for Information Systems 16 (July) (2005). doi:10.17705/1cais.01601.
[24] V. Berg, J. Birkeland, A. Nguyen-Duc, I. O. Pappas, L. Jaccheri, Software startup engineering: A systematic mapping study, Journal of Systems and Software 144 (February) (2018) 255–274. doi:10.1016/j.jss.2018.06.043.
[25] J. Melegati, R. Chanin, A. Sales, R. Prikladnicki, Towards Specific Software Engineering Practices for Early-Stage Startups, Vol. 1, Springer International Publishing, 2020, pp. 18–22. doi:10.1007/978-3-030-58858-8_2.
[26] C. Zott, R. Amit, L. Massa, The Business Model: Recent Developments and Future Research, Journal of Management 37 (4) (2011) 1019–1042. doi:10.1177/0149206311406265.
[27] S. Furnari, A cognitive mapping approach to business models: Representing causal structures and mechanisms, Advances in Strategic Management 33 (2015) 207–239. doi:10.1108/S0742-332220150000033025.
[28] C. Eden, Cognitive mapping, European Journal of Operational Research 36 (1) (1988) 1–13. doi:10.1016/0377-2217(88)90002-1.
[29] G. Kelly, The Psychology of Personal Constructs: Volume One: Theory and Personality, Taylor & Francis, 2002.
[30] M. Brännback, A. Carsrud, Cognitive Maps in Entrepreneurship: Researching Sense Making and Action, in: Understanding the Entrepreneurial Mind, Springer New York, New York, NY, 2009, pp. 75–96. doi:10.1007/978-1-4419-0443-0_5.
[31] P. Seppanen, M. Oivo, K. Liukkunen, The initial team of a software startup: Narrow-shouldered innovation and broad-shouldered implementation, in: 2016 International Conference on Engineering, Technology and Innovation/IEEE International Technology Management Conference (ICE/ITMC), IEEE, 2016, pp. 1–9.
[32] J. Melegati, R. Chanin, X. Wang, A. Sales, R. Prikladnicki, Enablers and inhibitors of experimentation in early-stage software startups, in: X. Franch, T. Männistö, S. Martínez-Fernández (Eds.), Product-Focused Software Process Improvement, Springer International Publishing, Cham, 2019, pp. 554–569. doi:10.1007/978-3-030-35333-9_39.
[33] J. Rooksby, I. Sommerville, M. Pidd, A hybrid approach to upstream requirements: IBIS and cognitive mapping, Rationale Management in Software Engineering (2006) 137–154. doi:10.1007/978-3-540-30998-7_6.
[34] L. H. Almeida, P. R. Pinheiro, A. B. Albuquerque, Applying multi-criteria decision analysis to global software development with scrum project planning, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6954 LNAI (2011) 311–320. doi:10.1007/978-3-642-24425-4_41.
[35] G. P. Hodgkinson, A. J. Maule, N. J. Bown, Causal Cognitive Mapping in the Organizational Strategy Field: A Comparison of Alternative Elicitation Procedures, Organizational Research Methods 7 (1) (2004) 3–26. doi:10.1177/1094428103259556.
[36] E. Klotins, M. Unterkalmsteiner, P. Chatzipetrou, T. Gorschek, R. Prikladnicki, N. Tripathi, L. Pompermaier, A progression model of software engineering goals, challenges, and practices in start-ups, IEEE Transactions on Software Engineering 13 (9) (2019) 1–1. doi:10.1109/TSE.2019.2900213.
[37] T. P. Sales, N. Guarino, G. Guizzardi, J. Mylopoulos, An Ontological Analysis of Value Propositions, Proceedings - 2017 IEEE 21st International Enterprise Distributed Object Computing Conference, EDOC 2017 2017-Janua (2017) 184–193. doi:10.1109/EDOC.2017.32.
[38] K. Peffers, T. Tuunanen, M. A. Rothenberger, S. Chatterjee, A Design Science Research Methodology for Information Systems Research, Journal of Management Information Systems 24 (3) (2007) 45–77. doi:10.2753/MIS0742-1222240302.
[39] R. Wieringa, Design science as nested problem solving, Proceedings of the 4th International Conference on Design Science Research in Information Systems and Technology, DESRIST ’09 (2009). doi:10.1145/1555619.1555630.
[40] S. March, V. Storey, Design Science in the Information Systems Discipline: An Introduction to the Special Issue on Design Science Research, MIS Quarterly 32 (4) (2008) 725. doi:10.2307/25148869.
[41] E. Engström, M.-A. Storey, P. Runeson, M. Höst, M. T. Baldassarre, How software engineering research aligns with design science: a review, Empirical Software Engineering (Apr 2020). doi:10.1007/s10664-020-09818-7.
[42] B. Morschheuser, L. Hassan, K. Werder, J. Hamari, How to design gamification? A method for engineering gamified software, Information and Software Technology 95 (April 2017) (2018) 219–237. doi:10.1016/j.infsof.2017.10.015.
[43] A. Benfell, Modeling functional requirements using tacit knowledge: a design science research methodology informed approach, Requirements Engineering (2020). doi:10.1007/s00766-020-00330-4.
[44] C. K. Riemenschneider, B. C. Hardgrave, F. D. Davis, Explaining software developer acceptance of methodologies: A comparison of five theoretical models, IEEE Transactions on Software Engineering 28 (12) (2002) 1135–1145. doi:10.1109/TSE.2002.1158287.
[45] B. Hardgrave, F. D. Davis, C. K. Riemenschneider, Investigating Determinants of Software Developers’ Intentions to Follow Methodologies, Journal of Management Information Systems 20 (1) (2003) 123–151. doi:10.1080/07421222.2003.11045751.
[46] F. D. Davis, Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology, MIS Quarterly 13 (3) (1989) 319. doi:10.2307/249008.
[47] E. M. Rogers, Diffusion of innovations, Simon and Schuster, 2010.
[48] R. Yin, Case Study Research: Design and Methods, Applied Social Research Methods, SAGE Publications, 2003.
[49] P. Runeson, M. Höst, A. Rainer, B. Regnell, Case Study Research in Software Engineering: Guidelines and Examples, Vol. 283, John Wiley & Sons, Inc., Hoboken, NJ, USA, 2012. doi:10.1002/9781118181034.
[50] M. Miles, A. Huberman, J. Saldana, Qualitative Data Analysis, SAGE Publications, 2014.
[51] L. Williams, A. Cockburn, Agile software development: it’s about feedback and change, Computer 36 (6) (2003) 39–43. doi:10.1109/MC.2003.1204373.
[52] T. Sedano, P. Ralph, C. Peraire, Dual-Track Development, IEEE Software 7459 (c) (2020) 0–0. doi:10.1109/MS.2020.3013274.
[53] J. Bosch, H. H. Olsson, I. Crnkovic, It Takes Three to Tango: Requirement, Outcome/data, and AI Driven Development, in: Software-intensive Business Workshop on Start-ups, Platforms and Ecosystems (SiBW 2018), CEUR-WS.org, Espoo, 2018, pp. 177–192.
[54] M. Kuhrmann, P. Diebold, J. Münch, P. Tell, V. Garousi, M. Felderer, K. Trektere, F. McCaffery, O. Linssen, E. Hanser, C. R. Prause, Hybrid software and system development in practice: Waterfall, scrum, and beyond, ACM International Conference Proceeding Series Part F128767 (2017) 30–39. doi:10.1145/3084100.3084104.
[55] M. Gutbrod, J. Münch, M. Tichy, How Do Software Startups Approach Experimentation? Empirical Results from a Qualitative Interview Study, in: M. Felderer, D. Méndez Fernández, B. Turhan, M. Kalinowski, F. Sarro, D. Winkler (Eds.), Product-Focused Software Process Improvement, Springer International Publishing, Cham, 2017, pp. 297–304. doi:10.1007/978-3-319-69926-4_21.