Hero: On the Chaos When PATH Meets Modules
Ying Wang, Liang Qiao, Chang Xu, Yepang Liu, Shing-Chi Cheung, Na Meng, Hai Yu, Zhiliang Zhu
Ying Wang∗, Liang Qiao∗, Chang Xu†§, Yepang Liu‡, Shing-Chi Cheung¶, Na Meng‖, Hai Yu∗, and Zhiliang Zhu∗
∗ Software College, Northeastern University, China. Email: [email protected], [email protected], {yuhai, zzl}@mail.neu.edu.cn
† State Key Laboratory for Novel Software Technology and Department of Computer Science and Technology, Nanjing University, China. Email: [email protected]
‡ Southern University of Science and Technology, China. Email: [email protected]
¶ The Hong Kong University of Science and Technology, China. Email: [email protected]
‖ Virginia Tech, USA. Email: [email protected]
Abstract: Ever since its first release in 2009, the Go programming language (Golang) has been well received by software communities. A major reason for its success is the powerful support of library-based development, where a Golang project can be conveniently built on top of other projects by referencing them as libraries. As Golang evolves, it recommends the use of a new library-referencing mode to overcome the limitations of the original one. While these two library modes are incompatible, both are supported by the Golang ecosystem. The heterogeneous use of library-referencing modes across Golang projects has caused numerous dependency management (DM) issues, incurring reference inconsistencies and even build failures. Motivated by the problem, we conducted an empirical study to characterize the DM issues, understand their root causes, and examine their fixing solutions. Based on our findings, we developed HERO, an automated technique to detect DM issues and suggest proper fixing solutions. We applied HERO to 19,000 popular Golang projects. The results showed that HERO achieved a high detection rate of 98.5% on a DM issue benchmark and found 2,422 new DM issues in 2,356 popular Golang projects. We reported 280 issues, among which 181 (64.6%) issues have been confirmed, and 160 of them (88.4%) have been fixed or are under fixing. Almost all the fixes have adopted our fixing suggestions.
Index Terms —Golang Ecosystem, Dependency Management
I. INTRODUCTION
The Go programming language (Golang) has been quickly adopted by software practitioners since its release in 2009 [1]. Like other modern languages, Golang allows a project to import and reuse functionalities from another Golang project (i.e., library) by simply specifying an import path [2]. There are four popular sites hosting Golang projects, namely, Bitbucket [3], GitHub [4], Launchpad [5], and IBM DevOps Services [6]. Among them, GitHub hosts nearly 90% of Golang projects (as of June 2020) [7].

While Golang’s library-based development boosts its adoption, its library-referencing mode has undergone a major change as the language evolves. Prior to Golang 1.11, library-referencing was supported by the GOPATH mode. Libraries referenced by a project are fetched using command go get [8]. This mode does not require developers to provide any configuration file. It works by matching the URLs of the site hosting referenced libraries with the import paths specified by the go get command. However, it fetches only a library’s latest version. To overcome this restriction, developers use third-party tools such as Dep [9] and Glide [10] to manage different library versions under the Vendor directory¹. To satisfy developers’ need for referencing specific library versions, in August 2018, Golang 1.11 introduced the Go Modules mode, which allows multiple library versions to be referenced by a module using different paths. A module comprises a tree of Golang source files with a go.mod configuration file defined in the tree’s root directory. The configuration file explicitly specifies the module’s dependencies with specific library versions as well as a module path by which the module itself can be uniquely referenced by other projects. The file must be specified according to the semantic import versioning (SIV) rules [11]. For instance, projects whose major versions are v2 or above should include a version suffix like “/v2” at the end of their module paths.

¹ The Vendor attribute allows a Golang project to reference a library’s different versions and keep them in different folders under a vendor directory.
§ Chang Xu is the corresponding author.
Preprint: arXiv [cs.SE], Feb.

Fig. 1. DM issue examples (panels (a) and (b))

Compared with GOPATH, Go Modules is flexible and allows multiple library versions to coexist in a Golang project [12]. Developers are advised to migrate their projects’ library-referencing modes from GOPATH to Go Modules. However, the migration has taken a long time. We sampled 20,000 popular Golang projects on GitHub. As of June 2020, only 35.9% of the projects had migrated to Go Modules, while the remaining 64.1% were still using GOPATH, resulting in the coexistence of two different library-referencing modes. What’s more, many projects suffered from various dependency management (DM) issues caused by such mixed library-referencing modes. Specifically, we made the following three observations:
• Go Modules is not backward compatible with GOPATH. There are two scenarios. First, a Golang project can be referenced by its downstream projects. After it migrates to Go Modules, its introduced virtual import paths (with version suffixes) cannot be recognized by downstream projects still in GOPATH. This causes build errors to these projects. Second, a downstream project that has migrated to Go Modules may not find its referenced libraries in GOPATH, or may fetch unintended library versions, due to different import path interpretations by the two modes.
• SIV rules can be violated even if a Golang project and its referenced upstream projects both use Go Modules. For instance, a project of major version v2 may not necessarily include a version suffix at the end of its module path. Such violations can be due to developers’ misunderstanding or weak SIV rule enforcement (discussed later in Sec II-A). They can cause a large number of unresolved library references (“cannot find package” errors) when downstream projects are built.
• Resolving DM issues for a Golang project requires up-to-date knowledge of its upstream and downstream projects, as well as their possible heterogeneous uses of two library-referencing modes.
However, such information is not provided by the Golang ecosystem to help developers evaluate a solution’s impact on other projects. Resolving a DM issue in a project locally without considering the ecosystem in a holistic way can easily cause new issues to its downstream projects.

Figure 1(a) shows a DM issue example. Project lz4 [13] migrated to Go Modules in version v2.0.7. Following SIV rules, it declared module path github.com/pierrec/lz4/v2 in its go.mod file with version suffix “/v2”. Although the project could be built successfully after migration, it induced DM issues in downstream projects still in GOPATH, since the latter cannot recognize the version suffix in module paths (e.g., an issue of filebrowser [14]). To fix the problem, lz4 released version v2.2.4, which was still in Go Modules but removed version suffix “/v2” from its module path as a workaround. This resolved the DM issues in its downstream projects in GOPATH, but induced build errors in its downstream projects that had already migrated to Go Modules, since this solution violated SIV rules (e.g., an issue of lz4 [15]). As there is no accurate way to estimate the migration impact on its downstream projects, lz4 chose to roll back to GOPATH in v2.2.6 and suspended its migration until most of its downstream projects had completed their migrations. Such problems are common across Golang projects, imposing unforeseeable risks in mode migration.

Figure 1(b) shows another example from go-i18n [16]. Its version v2.0.1 followed its upstream projects to use Go Modules for finer library-referencing control. However, this change induced at least five DM issues in downstream projects in GOPATH (e.g., an issue of go-i18n [17]), since these projects cannot recognize the version suffix “/v2” in go-i18n’s module path. To fix the problem, go-i18n v2.0.2’s repository provided an additional subdirectory go-i18n/v2 with a copy of implementations to support downstream projects in GOPATH. This is a suboptimal solution since it changes the virtual path in Go Modules into a physical one that requires extra maintenance in every project release. In fact, without a holistic view of all dependencies and the interference between their mixed library-referencing modes, it is hard for developers to find a proper solution to fix DM issues without impacting downstream projects.

Such chaos caused by mixed library-referencing modes is unique to the Golang ecosystem, unlike existing dependency conflict issues in Java [18]–[20], JavaScript [21] and Python projects [22]. Besides, our study of 20,000 Golang projects on GitHub suggests the severity of DM issues caused by the mode migration, since a majority of these projects have chosen to stay with the old GOPATH. To better dig into the problem, we sampled 500 projects from the top 1,000 ones, and collected 151 DM issues from the issue trackers for a deeper study of their characteristics and solutions. We identified three DM issue patterns and summarized eight fixing solutions commonly adopted by developers. Leveraging these findings, we further developed an automated tool, HERO (HEalth diagnosis tool foR the gOlang ecosystem), to detect DM issues. One interesting feature is that HERO can also provide customized fixing suggestions to developers, with analyses of the potential benefits and consequences for the ecosystem.

To evaluate HERO, we collected 132 real DM issues from top 1,000 Golang projects that were not used in our empirical study and conducted experiments using these issues as a benchmark. HERO achieved a detection rate of 98.5% (only missed two cases). We further applied HERO to the remaining 19,000 projects collected from GitHub, and detected 2,422 new DM issues. We submitted 280 of them that were associated with the 1,001st to 2,000th most popular projects, and suggested fixing solutions. Encouragingly, 181 (64.6%) issues have been confirmed, and 160 (88.4%) of them have been fixed or are being fixed using our suggested solutions. Such fixes would cause minimal or acceptable impacts to other projects in the ecosystem. The confirmed issues cover well-known projects, such as github/hub [23] and microsoft/presidio [24], and have promoted 29 projects’ migration to Go Modules.

To summarize, in this paper, we made three contributions:
• Originality. To our best knowledge, we conducted the first empirical study on 20,000 Golang projects to investigate their status of library-referencing mode migration and analyze 151 real DM issues to unveil their characteristics.
• Technique. We developed the HERO tool to diagnose dependency management issues for the Golang ecosystem. It can detect DM issues effectively and provide customized fixing suggestions.
• Reproduction package. We provided a reproduction package on the HERO website for future research, which includes: (1) detailed information of the 20k subjects and 151 DM issues studied in our empirical study; (2) our benchmark dataset (132 DM issues and subjects used for evaluation);
Fig. 2. SIV rules in the Go Modules mode
II. BACKGROUND
We introduce SIV rules in Go Modules and the concept of module-awareness to facilitate our later discussions.

A. SIV Rules in Go Modules
Go Modules introduces SIV to support dependency management of multiple project versions. It has three rules:
1) Golang projects should follow a semantic versioning format (Semver). Figure 2(a) gives an example, where projectA tags a release with a semantic version of v2.7.0 on GitHub.
2) When a project’s major version is v2 or above (denoted as v2+), a version suffix like “/v2” must be included at the end of its module path declared in the go.mod file. As shown in Figure 2(b), projectA v2.7.0’s module path is “github.com/user/projectA/v2”. To reference it, downstream projects must declare this path in the require directives of their go.mod files, as well as in the import directives of their .go source files. Figures 2(c) and (d) give two examples.
3) If a project’s major version is v0/v1, its version suffix should not be included in its module or import paths.

Under these SIV rules in Go Modules, multiple major versions of a library can be separately referenced by different paths. In contrast, a project in GOPATH can reference only the latest version of a library.

To be more flexible, the official Golang documentation [11] suggests two strategies to release a v2+ project, namely, major branch and major subdirectory. The former is to update a project’s module and import paths to include a version suffix like “/v2”. It is not necessary to physically create a new branch labeled with such a version suffix on the version control system of the hosting site. The latter is to physically create a subdirectory (e.g., projectA/v2) with source code and a corresponding go.mod file, and the corresponding module path must end with a version suffix like “/v2” accordingly. As such, module and import paths in the major branch strategy are virtual, but are physical in the major subdirectory strategy. The latter is sometimes used to provide a transition for downstream projects in GOPATH, as shown in Figure 1(b).
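The three SIV rules above can be read as a small consistency check between a release tag and a module path. The following is a minimal sketch under stated assumptions, not HERO's implementation: the function name checkSIV is hypothetical, pre-release and build metadata in tags are ignored, and special path forms (for example gopkg.in-style ".v2" suffixes) are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// semverTag matches a plain vMAJOR.MINOR.PATCH tag. Pre-release and build
// metadata are deliberately left out of this sketch.
var semverTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$`)

// majorSuffix matches a trailing "/vN" element of a module path.
var majorSuffix = regexp.MustCompile(`/v(\d+)$`)

// checkSIV (hypothetical helper) returns nil when the tag and module path
// agree with the three SIV rules, or an error naming the violated rule.
func checkSIV(modulePath, tag string) error {
	m := semverTag.FindStringSubmatch(tag)
	if m == nil {
		return fmt.Errorf("rule 1: tag %q is not in vMAJOR.MINOR.PATCH form", tag)
	}
	major, _ := strconv.Atoi(m[1])
	suffix := majorSuffix.FindStringSubmatch(modulePath)
	if major >= 2 {
		// Rule 2: v2+ releases need a matching "/vN" suffix.
		if suffix == nil || suffix[1] != m[1] {
			return fmt.Errorf("rule 2: %s requires module path suffix /v%d", tag, major)
		}
		return nil
	}
	// Rule 3: v0/v1 releases must not carry a version suffix.
	if suffix != nil {
		return fmt.Errorf("rule 3: %s must not carry suffix /v%s", tag, suffix[1])
	}
	return nil
}

func main() {
	fmt.Println(checkSIV("github.com/user/projectA/v2", "v2.7.0"))
	fmt.Println(checkSIV("github.com/user/projectA", "v2.7.0"))
	fmt.Println(checkSIV("github.com/user/projectA", "release-7"))
}
```

The first call uses the projectA example from Figure 2(b); the other two inputs are made up to trigger rules 2 and 1.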
B. Module-Awareness in Different Golang Versions
To ease discussion, we refer to the capability of recognizing a virtual path ended with a version suffix like “/v2” as module-awareness. This capability is important for referencing libraries in the Golang ecosystem.

Footnote (Semver, Sec II-A): The Semver format is MAJOR.MINOR.PATCH, where MAJOR, MINOR, and PATCH denote incompatible API changes, backward compatible API changes, and backward compatible bug fixes, respectively (https://semver.org/).

Fig. 3. Comparison of module-aware and module-unaware projects

TABLE I
MODULE AWARENESS IN DIFFERENT GOLANG VERSIONS
Category                      Version range                         DM mode      Using DM tools   Module awareness
Legacy Golang versions        [1.0.1, 1.9.7) ∪ [1.10.1, 1.10.3)     GOPATH       Y                N
                                                                    GOPATH       N                N
Compatible Golang versions    [1.9.7, 1.10.1) ∪ [1.10.3, 1.11.1)    GOPATH       Y                N
                                                                    GOPATH       N                Y
New Golang versions           ≥ 1.11.1                              GOPATH       Y                N
                                                                    GOPATH       N                Y
                                                                    Go Modules   –                Y
DM stands for dependency management. “–” means “not applicable”.

As the migration from GOPATH to Go Modules has immense impact on many Golang projects, it was gradually achieved by multiple Golang versions over two years. During migration, “minimal module compatibility” was adopted since Golang 1.9.7 in the series 1.9.* and Golang 1.10.3 in the series 1.10.*, which added module-awareness to projects that had not migrated to Go Modules [25]. As such, we refer to the versions in range [1.0.1, 1.9.7) ∪ [1.10.1, 1.10.3), which manage dependencies in GOPATH without module-awareness, as legacy Golang versions. We refer to those in range [1.9.7, 1.10.1) ∪ [1.10.3, 1.11.1), which manage dependencies in GOPATH with module-awareness, as compatible Golang versions. We refer to those of 1.11.1 or above, which allow projects to adopt either GOPATH or Go Modules and support module-awareness, as new Golang versions. We observe that Golang projects in GOPATH often use third-party tools (e.g., Dep [9], Glide [10], etc.) to help manage dependencies. Since none of the tools supports “minimal module compatibility”, their use actually blocks module-awareness, messing up library-referencing (e.g., issues of olivere [26] about using Dep and Glide, and of migrate [27] about using govendor).

Table I summarizes module-awareness in different Golang versions. Based on this, we give two definitions below:
Definition 1 (Module-aware project): A project is module-aware if and only if it uses a compatible or new Golang version and does not use any DM tool.

Definition 2 (Module-unaware project): A project is module-unaware if and only if it uses a legacy Golang version, or it uses a compatible or new Golang version with a DM tool.

Figure 3 shows how module-aware and module-unaware projects differ in parsing an import path with or without a v2+ version suffix. For an import path like github.com/user/projectA, a module-aware project could reference a specific version v0.*.* or v1.*.* of projectA under v2 (latest version under v2, by default), while a module-unaware project would reference the version on projectA’s main branch (typically the latest version). For an import path like github.com/user/projectA/v2, the former could reference a specific version v2.*.* of projectA (latest version under v3, by default), while the latter would fail to recognize it.

According to the above background knowledge, we formally define the DM issues that occur in Golang projects as follows:

Definition 3 (Dependency management (DM) issue): If an issue is caused by the different interpretations of import paths between module-aware and module-unaware projects, or by Go Modules projects violating SIV rules, we refer to it as a DM issue in the Golang ecosystem.

A project that suffers from a DM issue may fetch unintended versions of its libraries, or may not find its referenced libraries.

III. EMPIRICAL STUDY
We empirically study the characteristics of DM issues and the scale of these issues arising from the varying degrees of module-awareness in different Golang versions. We aim to answer the following three research questions:
• RQ1 (Scale of Module-Awareness): What is the status quo of library-referencing mode migration for projects in the Golang ecosystem? To what extent are they module-aware?
• RQ2 (Issue Types and Causes): What are common types of DM issues? What are their root causes?
• RQ3 (Fixing Solutions): What are common practices for fixing DM issues? How do they affect the ecosystem?

To answer RQ1, we collected the top 20,000 popular and active open-source Golang projects from GitHub to study their migration status. To answer RQ2/3, we randomly selected 500 subjects (denoted as subjectSet1) from the top 1,000 of our collected projects. We then collected real DM issues from these projects plus some additional ones. To dig into these issues, we manually analyzed their issue descriptions, developers’ discussions, code commits, and the Golang official documentation. Note that the remaining 500 projects (denoted as subjectSet2) in the top 1,000 of our collected projects were not used in RQ2/3. They are used to evaluate our DM issue detection technique later (Sec V-A). Below we present our data collection procedure and study results in detail.

A. Data Collection
Step 1: Collecting Golang projects. We collected the top 20,000 popular and active Golang projects from GitHub, which hosts over 90% of Golang projects. A project’s popularity is decided by its star count, and its activeness is decided based on whether 50+ code commits exist in its repository since Jan 2020. Figure 4 shows these projects’ demographics. They are: (1) popular (60.3% having 100+ stars or forks), (2) well-maintained (on average having 339 code commits and 136 issues), and (3) large-sized (on average having 72.3 KLOC). We used these projects for RQ1.

Step 2: Collecting DM issues. For the 500 projects in subjectSet1, after filtering out the ones that have no issue trackers or code repositories, we considered the remaining 484 projects as subjects. We then added to the seed subjects Golang’s official project golang/go [28] and the two most popular dependency management tools, Dep [9] and Glide [10], for better studying DM issues from the perspective of the ecosystem. In total, we obtained 487 projects for RQ2/3.
Fig. 4. Statistics of collected 20,000 Golang projects (log scale): stars, forks, commits, issues, and LOC
Fig. 5. Investigation statistics for RQ1
As these projects contain many issue reports, we filtered using keywords “go modules” and “go.mod” (case insensitive) to locate potential DM issues for manual analysis (the “go.mod” configuration file is a notable new feature in the Go Modules mode). Keyword “go modules” returned 1,342 issue reports, and “go.mod” returned 2,421 ones. We merged overlapping reports and then removed noise. First, we excluded issue reports that did not discuss DM issues (e.g., an issue of gogs [30] only documented developers’ plan to migrate to Go Modules). Second, we excluded issue reports that discussed nothing about root causes of DM issues. Three co-authors cross-checked all collected issue reports, and finally obtained a collection of 151 well-documented DM issues, involving 127 Golang projects. They contain sufficient details for studying RQ2/3.
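The keyword filter described above is a plain case-insensitive substring match. A minimal sketch follows, assuming each issue report is available as a title and a body string; that representation and the name isCandidate are illustrative and not taken from the study's tooling. The subsequent noise removal was manual and is not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// keywords are the two search terms named in the text.
var keywords = []string{"go modules", "go.mod"}

// isCandidate reports whether an issue report mentions any keyword,
// ignoring case. The title/body split is an assumption of this sketch.
func isCandidate(title, body string) bool {
	text := strings.ToLower(title + "\n" + body)
	for _, k := range keywords {
		if strings.Contains(text, k) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isCandidate("Build fails after upgrade", "cannot find package; see Go.Mod"))
	fmt.Println(isCandidate("Typo in README", "fix spelling"))
}
```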
B. RQ1: Scale of Module-Awareness
We analyze the scale of module-awareness as below:
• For all 20,000 projects, we counted the number of projects that have migrated to Go Modules by checking whether go.mod files exist in their latest versions’ repositories.
• For projects that have migrated to Go Modules, we checked whether the major version numbers of their latest releases are v2+. If so, we further checked their adopted strategies (i.e., major branch/subdirectory) in the code repositories.
• For projects still in GOPATH, we checked whether they use third-party tools to manage dependencies by the presence of their configuration files. For example, using the Dep [9] or Glide [10] tool requires a Gopkg.toml or glide.yaml configuration file, respectively.

Results. Figure 5 shows analysis results. To see trends, we divided all projects into six (overlapping) groups based on their popularities: top 500, 1k, 5k, 10k, and 20k (1k = 1,000). From Figure 5(a), we see that the proportion of Go Modules migrations increases with the popularity of projects. This suggests that migrating to Go Modules is a good practice in the ecosystem. Still, 64.1% of the projects are in GOPATH even though two years have passed since Go Modules came into being.

Figures 5(b) and 5(c) show that only 4.5% of the projects that have migrated to Go Modules released v2+ versions (i.e., most are still in v0/v1), and 91.2% of v2+ versions were managed by the major branch strategy. This suggests that the vast majority of v2+ projects should be referenced by virtual module paths ended with version suffixes like “/v2”. They are then likely to induce build failures in module-unaware downstream projects. Besides, for the remaining 95.5% of the projects whose major versions are v0/v1, DM issues can easily occur when they are updated to v2+ versions in future.

Figure 5(d) shows that 79.6% of the projects in GOPATH use third-party tools to manage dependencies. As aforementioned, this will block module-awareness for projects that adopt compatible Golang versions. Therefore, at least 10,205 of the top 20,000 Golang projects (51.0%) are module-unaware.
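The per-project checks and the resulting lower bound can be sketched as follows. This is an illustration under assumptions: classify takes the file names found in a repository root (a hypothetical input shape), only the two tool configuration files named in the text are recognized, and lowerBound reproduces the arithmetic from the reported percentages (64.1% of 20,000 projects in GOPATH, 79.6% of those using a DM tool). By Definition 2, a GOPATH project that uses a DM tool is module-unaware whatever Golang version it builds with, which is why the count is a lower bound.

```go
package main

import "fmt"

// dmToolConfigs maps a configuration file name to the DM tool that needs it.
// Only the two tools named in the text are listed.
var dmToolConfigs = map[string]string{
	"Gopkg.toml": "Dep",
	"glide.yaml": "Glide",
}

// classify inspects the file names in a repository root and returns the
// library-referencing mode plus the detected DM tool ("" if none).
func classify(rootFiles []string) (mode, dmTool string) {
	mode = "GOPATH"
	for _, f := range rootFiles {
		if f == "go.mod" {
			mode = "Go Modules"
		}
		if tool, ok := dmToolConfigs[f]; ok {
			dmTool = tool
		}
	}
	return mode, dmTool
}

// lowerBound computes total * gopath share * tool share, rounded to the
// nearest project. Shares are given in permille to stay in integers.
func lowerBound(total, gopathPermille, toolPermille int) int {
	inGopath := total * gopathPermille / 1000
	return (inGopath*toolPermille + 500) / 1000
}

func main() {
	fmt.Println(classify([]string{"main.go", "Gopkg.toml"}))
	fmt.Println(classify([]string{"main.go", "go.mod", "go.sum"}))
	fmt.Println(lowerBound(20000, 641, 796))
}
```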
Challenges of migration. Our findings may explain why many Golang projects stay with GOPATH. We also investigated how developers of projects still in GOPATH consider this problem. We focused on the GOPATH part (36.6%) of the top 500 out of the 20,000 projects (Fig. 5(a)), and analyzed their issue reports that discuss migration to study reasons for holding back the migration. We obtained 52 issue reports specifically discussing unsuccessful migration, and observed three common reasons:
• Existing versioning scheme incompatible with SIV rules in Go Modules (27/52). Some projects have their own versioning schemes, different from SIV rules in Go Modules. To avoid incompatibility (e.g., an issue of go-tools [31]), developers chose to stay with GOPATH.
• Third-party DM tools hindering the migration plan (15/52). Some projects heavily rely on third-party tools for dependency management. As the tools do not work with Go Modules, developers chose to live with the tools instead of migrating (e.g., an issue of uuid).
• Causing problems to downstream projects in GOPATH (10/52). Many projects are still in GOPATH, and it is inconvenient for them to reference upstream projects in Go Modules. To continue supporting downstream projects, developers chose to stay with GOPATH (e.g., an issue of migrate).

Due to these challenges, we conjecture that GOPATH and Go Modules can co-exist for a long time. This suggests the inevitability of DM issues in the Golang ecosystem and motivates us to study their characteristics and fixing solutions.
Answer to RQ1: Golang projects face challenges in migrating to Go Modules. Up till June 2020, only 35.9% of the top 20,000 projects on GitHub have migrated to Go Modules, and at least 51.0% of the top 20,000 projects are module-unaware. The two library-referencing modes may co-exist for a long time in the ecosystem.

C. RQ2: Issue Types and Root Causes
We observed three common types of DM issues in collected issue reports. Below we introduce them and analyze their root causes with examples.

Type A. DM issues can occur when projects in GOPATH depend on projects in Go Modules (41/151=27.2%). The former are typically module-unaware. Build errors can occur when such projects directly or transitively depend on the latter but cannot recognize their virtual paths with version suffixes, e.g., an issue of glide [33].

Among 41 Type A issues, 35 occurred in module-unaware projects when they upgraded upstream dependencies whose newer versions introduced virtual import paths. This shows that version upgrades of libraries in Go Modules can impose threats to their module-unaware downstream projects and developers should estimate such threats before upgrading. The remaining 6 issues occurred when introducing new upstream projects that transitively depend on virtual import paths.

Type B. DM issues can occur when projects in Go Modules depend on projects in GOPATH (40/151=26.5%). There are two cases. The first (Type B.1) is due to the different import path interpretations between GOPATH and Go Modules, and the second (Type B.2) is due to the interference of the Vendor attribute in GOPATH.

Type B.1 (16/40). Let project PA in Go Modules depend on project PB in GOPATH, and PB further depend on PC in Go Modules with import path github.com/user/PC. Suppose that PC has released a v2+ version with the major branch strategy. From PB’s perspective, it interprets the import path as PC’s latest version (i.e., v2+ version on PC’s main branch). However, in PA’s build environment, the import path is interpreted as a v0/v1 version of PC (no version suffix in the path). As a result, PA fails to fetch PC’s correct version and can encounter errors when building with PB.

Type B.1 issues are difficult to notice, and can easily cause build errors. For example, an issue of cockroach [34] reported that a client project in Go Modules depends on cockroach v19.5.2 in GOPATH, and cockroach further depends on project apd [35] in Go Modules (with a v2+ version). Although cockroach itself correctly referenced apd v2.0.0 (latest version) by interpreting import path github.com/cockroachdb/apd, the client project instead fetched apd v1.1.0 based on its interpretation of this import path. As a result, the client project’s build failed due to a missing important field (not in apd v1.1.0 but in v2.0.0).
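The two interpretations of one import path, which underlie Type A and Type B.1 issues, can be sketched as a small resolver. This is a simplified model under stated assumptions, not the Go toolchain's algorithm: the function name resolve is hypothetical, a module-unaware build is modeled as picking the highest tag (the text says it takes the main branch, typically the latest version), and a module-aware build picks the highest tag within the major version implied by the path.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var suffixRe = regexp.MustCompile(`/v(\d+)$`)

// parse splits "vX.Y.Z" into three integers; malformed tags yield ok=false.
func parse(tag string) (v [3]int, ok bool) {
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if !strings.HasPrefix(tag, "v") || len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// less compares two versions component by component.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// resolve returns the tag a project would pick for importPath.
func resolve(importPath string, moduleAware bool, tags []string) (string, error) {
	m := suffixRe.FindStringSubmatch(importPath)
	if !moduleAware && m != nil {
		// Module-unaware builds treat "/vN" as a real directory and fail.
		return "", errors.New("cannot find package " + importPath)
	}
	wanted := func(major int) bool {
		switch {
		case !moduleAware:
			return true // any major: newest tag overall
		case m == nil:
			return major <= 1 // no suffix means v0 or v1
		default:
			want, _ := strconv.Atoi(m[1])
			return major == want
		}
	}
	best, bestTag := [3]int{-1, 0, 0}, ""
	for _, t := range tags {
		if v, ok := parse(t); ok && wanted(v[0]) && less(best, v) {
			best, bestTag = v, t
		}
	}
	if bestTag == "" {
		return "", errors.New("no matching version for " + importPath)
	}
	return bestTag, nil
}

func main() {
	tags := []string{"v1.0.0", "v1.1.0", "v2.0.0"} // illustrative tag list
	fmt.Println(resolve("github.com/cockroachdb/apd", false, tags))
	fmt.Println(resolve("github.com/cockroachdb/apd", true, tags))
	fmt.Println(resolve("github.com/cockroachdb/apd/v2", false, tags))
}
```

With the illustrative tag list, the same suffix-free path yields different versions in the two modes, which mirrors the apd example above; the third call mirrors a Type A failure.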
Type B.2 (24/40). Let project PA in Go Modules depend on project PB in GOPATH, and PB further depend on project PC, which is managed in PB’s Vendor directory. A Vendor directory is a major feature of GOPATH, which localizes the maintenance of remote dependencies’ specific versions. We note that PA references PC by import path github.com/user/PC declared in PB’s source files rather than from PB’s Vendor directory. Although the build may work for the time being, PA can fail to fetch PC if PC is deleted or moved to another repository (e.g., renaming). Even if the fetching is successful, the version on PC’s hosting site could be different from the one in PB’s Vendor directory, causing potential build errors due to the inconsistency.

Such situations often occur, since there are essentially two versions of a library at two different sites and their consistency is not guaranteed. We witnessed a Type B.2 issue in project moby [36], which has received 57.6k stars on GitHub and ranked third in popularity. To support its large number of downstream projects still in GOPATH, moby has not migrated to Go Modules. An issue of moby reported that it referenced project logrus [38] from its Vendor directory, and logrus had been relocated from github.com/Sirupsen/logrus to github.com/sirupsen/logrus (case sensitive) on GitHub. This incurred DM issues in many of moby’s downstream projects in Go Modules (e.g., issues of testcontainers [39] and shnorky [40]), as they could not fetch logrus by the import path in moby’s source files.
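One way to picture the inconsistency behind Type B.2 is to compare the import path written in a project's source files with the path under which the library is currently hosted. The sketch below is an assumption-laden illustration rather than HERO's detection logic: diagnose and hasPathPrefix are hypothetical names, and the caller is assumed to already know the library's current path.

```go
package main

import (
	"fmt"
	"strings"
)

// hasPathPrefix reports whether path equals prefix or continues it with "/".
// With fold set, letter case is ignored.
func hasPathPrefix(path, prefix string, fold bool) bool {
	if len(path) < len(prefix) {
		return false
	}
	head, rest := path[:len(prefix)], path[len(prefix):]
	same := head == prefix
	if fold {
		same = strings.EqualFold(head, prefix)
	}
	return same && (rest == "" || rest[0] == '/')
}

// diagnose compares an import path found in source files with the path
// under which the library is currently hosted (assumed to be known).
func diagnose(importPath, currentPath string) string {
	switch {
	case hasPathPrefix(importPath, currentPath, false):
		return "consistent"
	case hasPathPrefix(importPath, currentPath, true):
		return "case mismatch"
	default:
		return "relocated"
	}
}

func main() {
	fmt.Println(diagnose("github.com/Sirupsen/logrus", "github.com/sirupsen/logrus"))
	fmt.Println(diagnose("github.com/sirupsen/logrus/hooks", "github.com/sirupsen/logrus"))
}
```

The first call uses the logrus paths from the moby example; a vendored copy hides such a mismatch from the project itself, while downstream projects in Go Modules resolve the path remotely and hit it.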
Type C. DM issues can occur when projects in Go Modules depend on projects also in Go Modules but not following SIV rules (70/151=46.4%). We identified three types of SIV rule violations that caused build failures in downstream projects: (1) lacking version suffixes like “/v2” in module paths or import paths, although the versions of concerned projects are v2+ (37/70) (e.g., an issue of iris); (2) version tags not following the MAJOR.MINOR.PATCH format (18/70) (e.g., an issue of gobgp); (3) module paths in go.mod files that are inconsistent with the URLs associated with concerned projects on their hosting sites (15/70) (e.g., an issue of jwplayer).

While downstream projects can encounter build failures, the projects violating SIV rules do not produce warnings or errors themselves when building. Currently, there is no diagnosis technique to detect the three SIV rule violation types, or mechanism to enforce SIV rules, as discussed in issues of iris [41] and golang/go [44] (by lz4’s [45] users). As a result, projects violating SIV rules can “safely” stay in the Golang ecosystem, despite the unexpected consequences to their downstream projects. Regarding such risk, lz4’s developers commented on its severity in an issue: “we need to fix this issue and figure out how big the crater it brings to the ecosystem.”

Answer to RQ2: DM issues commonly occur due to heterogeneous uses of GOPATH and Go Modules. Their manifestations can be summarized into three types and there are two common root causes: (1) GOPATH and Go Modules interpret import paths in different ways, and (2) SIV rules are not strictly enforced in the Golang ecosystem.

D. RQ3: Fixing Solutions
Out of the 151 DM issues, 144 issues have fixing patches or fixing plans that developers have agreed on. We studied them and observed eight common fixing solutions, which demonstrate different trade-offs.

Fig. 6. Benefits and consequences of the eight fixing solutions

Solution 1: Projects in GOPATH migrate to Go Modules (22/144=15.3%). Migrating from GOPATH to Go Modules can help fix Type A issues, since these issues are caused by projects still in GOPATH, which are unable to recognize import paths with version suffixes. For example, in one issue, redis [47] migrated to Go Modules, but its downstream project benthos was still in GOPATH. Then, benthos was advised to migrate to Go Modules to avoid build errors. This solved benthos’s problem, but caused incompatibility for benthos’s module-unaware downstream projects. As a result, new Type A issues arose.
Solution 2: Projects in Go Modules roll back to GOPATH (13/144=9.0%). Some projects rolled back to GOPATH after migrating to Go Modules for fixing Type A and Type C issues. For example, in one issue (Type A), project uuid’s [46] migration to Go Modules broke the build of many downstream projects in GOPATH. As a compromise, uuid rolled back to GOPATH, waiting for downstream projects to migrate first. In another issue (Type C), gopsutil and its downstream projects were all in Go Modules, but gopsutil violated SIV rules (lacking a version suffix in its module path of v2+ release), causing build errors in downstream projects. As such, gopsutil chose to roll back to GOPATH to make downstream projects work again. This solution solves the problem, but hinders the migration progress of the ecosystem.
Solution 3: Changing the strategy of releasing v2+ projects in Go Modules from major branch to subdirectory (6/144=4.2%). This helps resolve Type A issues, where module-unaware projects cannot recognize virtual import paths for v2+ libraries in Go Modules. The new strategy creates physical paths by cloning code, so that libraries can be referenced by module-unaware projects. However, this is just a workaround and needs extra maintenance in subsequent releases (e.g., the issue of go-i18n [17] discussed in Sec. I).
Solution 4: Maintaining v2+ libraries in Go Modules in downstream projects' Vendor directories rather than referencing them by virtual import paths (6/144=4.2%). Similar to Solution 3, this solution also helps resolve Type A issues. By making a copy of libraries in downstream projects' repositories, it avoids fetching the libraries by virtual import paths. For example, in one issue, radix [51] refused to use the major subdirectory strategy for its v2+ project release in Go Modules. Its downstream projects had to make a copy of radix's code in their Vendor directories, which requires extra maintenance and may cause Type B.2 issues in the future.
Solution 5: Using a replace directive with version information to avoid using import paths in referencing libraries (16/144=11.1%). This addresses Type B.1 (problematic import path interpretations) and Type C (import paths violating SIV rules) issues. For example, in one issue, a replace directive was used to reference goq's version [53]. However, this makes developers no longer able to use the go get command to automatically fetch upgraded libraries.
Solution 6: Updating import paths for libraries that have changed their repositories (24/144=16.7%). This fixes Type B.2 issues, where libraries in a project's Vendor directory may be inconsistent with the ones referenced by their import paths. It updates import paths to help a project's downstream projects in Go Modules fetch consistent library versions. For example, in one issue, go-cloud managed library etcd in its Vendor directory, and etcd later changed its hosting repository from github.com/coreos/etcd to go.etcd.io/etcd. To fix build errors for its downstream projects in Go Modules, go-cloud updated etcd's import path to the latest one for consistency. This fixes the issue and benefits all affected downstream projects without impacting others in the ecosystem.

Solution 7: Projects in Go Modules fix configuration items to strictly follow SIV rules (47/144=32.6%). Projects that have migrated to Go Modules are advised to follow Golang's official guidelines on SIV rules to fix the Type C issues they induce. For example, in one issue, redis [47] added a version suffix "/v7" at the end of its module path to follow SIV rules. However, we noticed that while the issues are fixed, the project's downstream projects in GOPATH may be impacted (they are unable to recognize the version suffixes, e.g., an issue of redis).

Solution 8: Using a hash commit ID for a specific version to replace a problematic version number in library referencing (10/144=6.9%). This fixes Type C issues, where some projects in Go Modules violate SIV rules in version numbers and cause build errors in downstream projects that are also in Go Modules. It avoids referencing problematic version numbers by using a require directive with a specific hash commit ID. For example, in one issue, prometheus's downstream projects in Go Modules chose to use the directive require github.com/prometheus/prometheus 43acd0e to reference the expected version v2.12.0. Similar to Solution 5, this solution also makes developers unable to automatically fetch upgraded libraries using the go get command.

As summarized in Figure 6, these solutions fix their targeted DM issues, but at the same time they may bring additional benefits (the ab items) or undesired consequences (the uc items). When there are multiple fixing solutions for a specific DM issue, developers are advised to carefully consider the relevant dependencies and minimize the impact on other projects in the ecosystem, by weighing consequences against benefits.

Answer to RQ3: We observed eight common fixing solutions for DM issues, covering 95.4% of the studied issues. Most solutions could affect other projects in the ecosystem. When fixing a DM issue, developers should find a trade-off between the benefits and the possible consequences.
IV. HERO: DM ISSUE DIAGNOSIS

Our empirical study reveals the prevalence of DM issues in the Golang ecosystem due to the chaotic use of GOPATH and Go Modules in different projects. This motivates us to develop a tool, named HERO, to help automatically detect DM issues and provide customized fixing solutions. HERO works in two steps. It first extracts dependencies among Golang projects and their library-referencing modes, and then diagnoses DM issues in these projects based on our observed issue types and root causes (RQ2). It further provides customized fixing suggestions leveraging the findings in RQ3. HERO can analyze a single Golang project or monitor the heterogeneous use of the two library-referencing modes in the Golang ecosystem. Below we explain how HERO models project dependencies and detects DM issues.
A. Constructing Dependency Model

We first build a dependency model for the Golang project under analysis. We formally define the model below.

Definition 3 (Dependency model): The dependency model $D(P_v)$ for version $v$ of a project $P$ is a 3-tuple $(Pr, Ds, Us)$:
• $Pr = (ip, md, t, vd)$ records the information of the current project, where $ip$ and $md$ are $P_v$'s declared module path (for $P_v$ to be referenced by downstream projects) and library-referencing mode (GOPATH or Go Modules), respectively. If $P_v$ is in GOPATH, fields $t$ and $vd$ denote, respectively, whether $P_v$ depends on any DM tool (yes or no), and a collection of import paths (a set of URLs) referencing those upstream libraries that are maintained in $P_v$'s Vendor directory but cannot be found in the repositories pointed to by the URLs (e.g., due to removal or renaming). Otherwise, the two fields are set to no and null, respectively.
• $Ds = \{dp_1, dp_2, \cdots, dp_n\}$ is a collection of $P_v$'s downstream projects $dp_i$, where $dp_i = (v_i, ip_i, md_i, t_i)$. Field $v_i$ denotes $dp_i$'s latest version number. Fields $ip_i$, $md_i$, and $t_i$ denote this version's import path, library-referencing mode, and whether any DM tool is used, respectively.
• $Us = \{up_1, up_2, \cdots, up_n\}$ is a collection of $P_v$'s upstream projects $up_i$, where $up_i = (v_i, ip_i, md_i, S_i, I_i)$. Fields $ip_i$ and $md_i$ denote $v_i$'s import path and library-referencing mode, respectively. If $P_v$ is in Go Modules, field $v_i$ denotes $up_i$'s specific version declared in $P_v$'s configuration file. Otherwise (i.e., when $P_v$ is in GOPATH), $v_i$ denotes $up_i$'s latest version number. If $up_i$ is a v2+ project in Go Modules, field $S_i$ denotes whether it is released by the major branch strategy (yes or no), implying whether $ip_i$ is a virtual import path. If both projects $up_i$ and $P_v$ are in Go Modules, field $I_i$ denotes whether $up_i$ is transitively introduced into $P_v$ by any project in GOPATH (yes or no). Otherwise, the two fields are set to null and no, respectively.

[Fig. 7. Three types of DM issues HERO detects. Panels: (a) Type A, (b) Type B.1, (c) Type B.2, (d) Type C.]

[Fig. 8. Templates of customized fixing suggestions for three types of DM issues]

We explain how to obtain these field values, taking GitHub (the most popular Golang project hosting site) as an example:

Step 1: Collecting $Pr$ information. Leveraging GitHub's REST API "repository_url" [58], HERO queries with $P_v$'s repository name to obtain its import path $ip$ and library-referencing mode $md$ by checking if a go.mod file exists in its repository. If $P_v$ is in GOPATH, HERO decides field $t$ by checking whether any DM tool's configuration file exists. Field $vd$ is decided by parsing $P_v$'s source files to collect import paths for libraries maintained in the Vendor directory, and querying via the "repository_url" API with the collected import paths to check whether the corresponding libraries have been deleted or relocated (e.g., by HTTP 404: Not Found errors [58]).
Step 2: Collecting $Ds$ information. Leveraging GitHub's REST API "code_search_url" [58], HERO queries with $P_v$'s repository name to check which projects depend on it. This information comes from the require directives of a project's go.mod file, the import directives of its source files, or a DM tool's configuration file. Each found project corresponds to an item $dp_i$ in the collection $Ds$. Note that HERO collects the latest version $v_i$ for $dp_i$, and decides its associated import path $ip_i$, library-referencing mode $md_i$ (by checking whether $P_v$'s repository name is declared in its go.mod file), and field $t_i$ (by checking whether its DM tool's configuration file exists), respectively. These collected downstream projects depend on $P_v$ and can also reference its earlier versions.

Step 3: Collecting $Us$ information. Information on $P_v$'s upstream projects is collected in two ways, depending on the library-referencing mode of the project:
• $P_v$ in Go Modules: HERO collects $P_v$'s upstream projects $up_i$ with fields $ip_i$ and $v_i$ by parsing its go.mod file, which configures a project's direct and transitive dependencies with import paths and specific version numbers. HERO identifies $up_i$'s library-referencing mode $md_i$ by checking whether a go.mod file exists in its repository via GitHub's "repository_url" API. If $up_i$ is a v2+ project in Go Modules, HERO identifies its release strategy $S_i$ by checking whether a subdirectory like "$up_i$/v2" exists. For projects transitively introduced into $P_v$ by any project in GOPATH, Golang's build tool automatically marks them with a "//indirect" comment at the end of their module paths in $P_v$'s go.mod file [59], with which HERO decides $I_i$.
• $P_v$ in GOPATH: HERO collects $P_v$'s direct dependencies $up_i$ with import paths $ip_i$ from its source files. With the import paths, HERO leverages GitHub's "repository_url" API to look into these dependencies' repositories to collect their latest versions, from which it decides the corresponding version numbers $v_i$ and library-referencing modes $md_i$. Then HERO recursively collects the information of $P_v$'s transitive dependencies declared in go.mod or source files in the concerned repositories, and identifies version numbers, import paths, and library-referencing modes in a similar way.

B. Diagnosing DM Issues
The dependency model built by HERO contains sufficient information for detecting DM issues and suggesting solutions.

Detecting DM issues. Our study disclosed that most DM issues caused build errors and are thus already observable. HERO therefore focuses on detecting DM issues that have not yet manifested, but would probably occur when the concerned projects have their upstream or downstream projects upgraded. Due to the page limit, this paper explains the scenarios for which HERO reports issues; algorithm details are available on our website.
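As a reading aid, the dependency model of Definition 3 can be written down as Go types, together with one check that uses them. This is a hedged sketch, not HERO's algorithm (whose details are on the authors' website): all type, field, and function names are hypothetical, and typeB1Candidates is only one plausible reading of the Type B.1 scenario described in this section, expressed through fields $md_i$, $S_i$, and $I_i$.

```go
package main

import "fmt"

// Mode is a project's library-referencing mode (field md).
type Mode string

const (
	GOPATH    Mode = "GOPATH"
	GoModules Mode = "Go Modules"
)

// Project mirrors Pr = (ip, md, t, vd).
type Project struct {
	ImportPath string   // ip: declared module path
	Mode       Mode     // md
	UsesDMTool bool     // t: meaningful only in GOPATH mode
	VendorOnly []string // vd: import paths surviving only in the Vendor directory
}

// Downstream mirrors dp_i = (v_i, ip_i, md_i, t_i).
type Downstream struct {
	Version    string
	ImportPath string
	Mode       Mode
	UsesDMTool bool
}

// Upstream mirrors up_i = (v_i, ip_i, md_i, S_i, I_i).
type Upstream struct {
	Version     string
	ImportPath  string
	Mode        Mode
	MajorBranch bool // S_i: v2+ released by the major branch strategy
	ViaGOPATH   bool // I_i: introduced transitively through a GOPATH project
}

// Model mirrors D(P_v) = (Pr, Ds, Us).
type Model struct {
	Pr Project
	Ds []Downstream
	Us []Upstream
}

// typeB1Candidates returns the upstream projects that match the Type B.1
// scenario: P_v is in Go Modules and reaches a v2+ Go Modules library
// (major branch strategy) through a module-unaware GOPATH project.
func typeB1Candidates(m Model) []Upstream {
	var out []Upstream
	if m.Pr.Mode != GoModules {
		return out
	}
	for _, up := range m.Us {
		if up.Mode == GoModules && up.MajorBranch && up.ViaGOPATH {
			out = append(out, up)
		}
	}
	return out
}

func main() {
	// Placeholder projects, named after the labels used in Figure 7.
	m := Model{
		Pr: Project{ImportPath: "github.com/user/P", Mode: GoModules},
		Us: []Upstream{
			{Version: "v1.0.0", ImportPath: "github.com/user/upa", Mode: GOPATH},
			{Version: "v2.0.0", ImportPath: "github.com/user/upb", Mode: GoModules, MajorBranch: true, ViaGOPATH: true},
		},
	}
	for _, up := range typeB1Candidates(m) {
		fmt.Println("possible Type B.1 issue via", up.ImportPath)
	}
}
```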
Type A. Figure 7(a) shows a scenario where a module-unaware project $P_v$ references a specific version of its upstream project $up_a$ in Go Modules. This version is older than $up_a$'s latest version, which newly introduces another upstream project $up_b$ in Go Modules with a v2+ version released using the major branch strategy. Build errors do not occur in $P_v$ when it references $up_a$'s old version. However, if $P_v$ updates $up_a$ to reference the latest version, it will not be able to recognize $up_b$'s virtual import path. When seeing such a possibility, HERO reports a warning of a Type A issue for $P_v$.

Type B.1. Figure 7(b) shows a scenario where project $P_v$ in Go Modules transitively references a v2+ upstream project $up_b$ in Go Modules (released by the major branch strategy) through another module-unaware project $up_a$ in GOPATH. Since GOPATH and Go Modules interpret import paths differently, $up_a$ would use $up_b$'s latest version (e.g., v2.0.0), while $P_v$ would use $up_b$'s old v0/v1 version, causing inconsistencies. Thus, HERO reports a warning of a Type B.1 issue for $P_v$.

Type B.2. Figure 7(c) shows a scenario where project $P_v$ in GOPATH references an upstream project $up_a$ maintained only in its Vendor directory (i.e., $up_a$ has already been deleted or relocated). No build errors occur when $P_v$ has no downstream projects in Go Modules. However, if $P_v$ has such downstream projects, the latter would fetch $up_a$ via its import path (i.e., hosting repository) rather than from $P_v$'s Vendor directory, causing build errors due to the failure to fetch $up_a$. Thus, HERO reports a warning of a Type B.2 issue for $P_v$.

Type C. Figure 7(d) shows a scenario where project $P_v$ in Go Modules violates SIV rules (as discussed in Sec. III-C). The violation may not introduce build errors when $P_v$ has no downstream projects in Go Modules. However, build errors would occur if such projects exist in the future. Thus, HERO reports a warning of a Type C issue for $P_v$.

Customized fixing suggestions. Our empirical study has identified applicable fixing solutions for each issue type (Figure 6). We summarize the impacts of these solutions as templates in Figure 8. For each detected DM issue, HERO suggests all applicable solutions to developers by customizing the templates with potential impact analysis based on the associated dependency model.

V. EVALUATION

We study two research questions in our evaluation of HERO:
• RQ4 (Effectiveness): How effective is HERO in detecting DM issues for Golang projects?
• RQ5 (Usefulness): Can HERO detect new DM issues for real-world Golang projects and assist the developers in fixing the detected issues?
For RQ4, we conducted experiments using the 132 DM issues from the 500 Golang projects in subjectSet. Note that none of them overlap with the issues used in our empirical study. Specifically, we constructed a benchmark dataset containing the 132 DM issues and their project versions for evaluating whether HERO can detect these issues in the buggy versions or predict them in earlier versions. It is worth mentioning that issue-fixing versions are not necessarily issue-free, since new DM issues can be introduced after fixing, as we have discussed earlier.

TABLE II
HERO'S EFFECTIVENESS ON DM ISSUE DETECTION

Result          Type A   Type B.1   Type B.2   Type C   Summary
Ground truth    38       15         28         51       132
Detected        36       15         28         51       130
Missed          2        0          0          0        2
Detection rate  94.7%    100%       100%       100%     98.5%

For RQ5, we applied HERO to the remaining 19,000 of the top 20,000 Golang projects (i.e., excluding the 500 used for RQs 2–3 and the 500 used for RQ4). We reported the detected issues together with root cause analyses and fixing suggestions to the respective developers. In our issue reports, we also highlighted the preferred solutions based on their impact on other projects.
A. RQ4: Effectiveness

Experimental setup. The benchmark dataset contains 38 Type A (28.8%), 15 Type B.1 (11.4%), 28 Type B.2 (21.2%), and 51 Type C (38.6%) DM issues. We collected their corresponding project versions to evaluate HERO's capability of detecting or predicting DM issues:
• Type A: These issues occurred when module-unaware projects in GOPATH referenced v2+ dependencies in Go Modules by virtual import paths. Since issue occurrences would already cause build errors, we ran HERO on the previous project versions where such issues had not occurred.
• Type B.1: These issues occurred when projects in Go Modules referenced dependencies in GOPATH, with different import path interpretations for v2+ projects released by the major branch strategy. The inconsistency may not lead to immediate build errors or functional failures, but is nonetheless risky. Thus, we ran HERO on the current project versions to check whether it can detect potential issues.
• Types B.2 and C: The former occurred when the dependencies maintained in the current projects' Vendor directories were deleted or relocated remotely. The latter occurred when the current projects in Go Modules violated SIV rules. In both cases, the current projects would not show symptoms like build errors, but their downstream projects in Go Modules would when referencing them in the future. Thus, we ran HERO on the current project versions to check whether it can detect potential issues.
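The Type C condition referred to above, a v2+ release whose module path lacks the matching version suffix, can be sketched as a small Go predicate. This is an illustrative check under stated assumptions rather than HERO's detector: violatesSIV and majorOf are hypothetical names, only the "missing /vN suffix" case described in this paper is covered, and special cases such as gopkg.in paths are ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major number from a version tag such as "v2.12.0".
func majorOf(version string) (int, bool) {
	if !strings.HasPrefix(version, "v") {
		return 0, false
	}
	head := strings.SplitN(version[1:], ".", 2)[0]
	n, err := strconv.Atoi(head)
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

// violatesSIV reports whether a release tagged with version, whose go.mod
// declares modulePath, is a v2+ release without the matching "/vN" suffix.
// Tags that cannot be parsed and v0/v1 releases are not flagged.
func violatesSIV(modulePath, version string) bool {
	major, ok := majorOf(version)
	if !ok || major < 2 {
		return false
	}
	return !strings.HasSuffix(modulePath, "/v"+strconv.Itoa(major))
}

func main() {
	// Module path and version taken from the prometheus example in Solution 8.
	fmt.Println(violatesSIV("github.com/prometheus/prometheus", "v2.12.0"))
	// Placeholder path that carries the required suffix.
	fmt.Println(violatesSIV("github.com/user/upb/v2", "v2.0.0"))
}
```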
Results. Table II shows our experiment results. HERO reported a total of 130 DM issues (all true positives), covering 98.5% of the issues in the benchmark dataset. HERO achieved such a high detection rate because it constructs a dependency model that captures all necessary information on the characteristics of common DM issues. The only two missed issues are of Type A. HERO failed to detect them due to its conservative nature in identifying module-aware projects in GOPATH that do not use any DM tools. We note that precisely deciding module-awareness requires checking a project's local build environment to know whether it adopts a compatible Golang version. Currently, HERO does not support such checking.
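For illustration, a version-based module-awareness check of the kind mentioned above might look like the following Go sketch. It is not part of HERO. The thresholds are an assumption based on the Go wiki's description of minimal module compatibility (Go 1.9.7+, 1.10.3+, and 1.11+), and the function names are hypothetical. Unparsable version strings are conservatively treated as not module-aware.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGoVersion parses strings such as "go1.10.3" or "go1.11" into
// (major, minor, patch). Pre-release or otherwise unusual strings are rejected.
func parseGoVersion(v string) (nums [3]int, ok bool) {
	parts := strings.Split(strings.TrimPrefix(v, "go"), ".")
	if len(parts) < 2 || len(parts) > 3 {
		return nums, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return [3]int{}, false
		}
		nums[i] = n
	}
	return nums, true
}

// hasMinimalModuleCompat reports whether a toolchain version is assumed to
// understand "/vN" import paths even when building in GOPATH mode.
func hasMinimalModuleCompat(goVersion string) bool {
	v, ok := parseGoVersion(goVersion)
	if !ok {
		return false // conservative default for unknown formats
	}
	major, minor, patch := v[0], v[1], v[2]
	switch {
	case major != 1:
		return major > 1
	case minor >= 11:
		return true
	case minor == 10:
		return patch >= 3
	case minor == 9:
		return patch >= 7
	default:
		return false
	}
}

func main() {
	for _, v := range []string{"go1.9.6", "go1.9.7", "go1.10.3", "go1.14"} {
		fmt.Println(v, hasMinimalModuleCompat(v))
	}
}
```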
B. RQ5: Usefulness

In total, HERO reported 2,422 new issues after analyzing the 19,000 Golang projects. Although the key information on root causes and fixing suggestions can be automatically generated by HERO, reporting these issues to developers involves substantial manual work, such as communicating with developers and helping them submit PRs. As such, we only managed to report 280 issues for the top 1001–2000 popular projects (the top 1–1000 were already used for RQs 2–4) in the projects' issue trackers. Table III summarizes the status of our reported issues. Encouragingly, 181 issues (64.6%) were quickly confirmed by the developers, and 160 confirmed issues (88.4%) were later fixed or are under fixing. For all but two fixed issues, developers adopted our suggested fixes. The other issues are still pending (likely due to the inactive maintenance of the projects). We discuss the feedback from the developers below.
Feedback on issue detection. While different types of DM issues had different confirmation rates (52.0%–74.4%), most confirmed issues received positive feedback from developers. We give some examples below. In an issue (Type A) of kiali [60], a developer mentioned “I have found the same issue as you describe via the commit c453e89 [61]. I just stuck in an older version of this library”. In an issue (Type B.1) of flamingo-commerce [62], developers were previously unaware of the risk and commented “I guess the inconsistency of library version was imported by accident. We will create a PR to remove the occurrence”. In an issue (Type B.2) of tomato [63], a developer commented “Nice catch! I think it is nice to clean up our vendor directory, since library bitly/go-nsq repository is not existed anymore.” We also reported an issue (Type C) [64] to project tidb [65], which violated SIV rules; the issue could affect 341 downstream projects. Our report struck a chord with tidb's downstream projects and was linked to seven real issues that indeed caused build failures (e.g., an issue of parser).

Feedback on fixing suggestions. To ease discussion, we divide the 160 DM issues that have been fixed or are under fixing into three categories: (1) 143 taking our highlighted preferred solutions (with minimal impact on other projects), (2) 15 taking one of our other suggestions (impacting some projects), and (3) the remaining two not taking our suggestions.

As an example for category (1), one issue reported sensu-go's [68] SIV rule violations. HERO warned of potential build errors for sensu-go's 89 downstream projects. This was confirmed by the developers' comment “We are aware of this issue, but the way you have summarized it, including the paths forward and impact analyses, is very valuable.” However, the developers could not follow SIV rules immediately due to some internal restrictions. To minimize the impact on these downstream projects, they tagged our report as “technical-debt”, and extracted part of the project code into a new module that follows SIV rules for use by downstream projects. This code refactoring process was laborious. For category (2), the developers did not take our highlighted preferred fixing solutions. With the information on impacted downstream projects reported by HERO, some developers chose to add notes in their projects' documentation to suggest that the concerned downstream projects work around potential DM issues by using replace directives (Solution 5) or hash commit IDs (Solution 8) (e.g., issues of tidb [64]). For category (3), developers of only two reported issues (of sensu-go [69] and libvirt [70]) did not take our fixing suggestions. Not wanting to get into trouble, they substituted other similar libraries.

The above feedback indicates that HERO is useful in detecting and predicting DM issues for Golang projects, as well as suggesting proper fixes with impact analysis. Developers also showed interest in the HERO tool. For example, one developer commented “I found that you sent many contributions on GitHub for this kind of subjects on many repositories. How do you detect the problems with Go Modules? Do you plan to share a tool or something to manage Go Modules issues?” (an issue of ovh/cds [71]), and another commented “It is a good bot!” (an issue of TheThingsNetwork).

VI. DISCUSSIONS
A. Threats to Validity

One possible threat is the representativeness of the studied Golang projects and DM issues. To reduce the threat, we selected the top 20,000 projects on GitHub for migration status analysis (RQ1), and randomly chose 500 from the top 1,000 projects to investigate DM issues' characteristics (RQs 2–3). These projects are popular, large-sized, and well-maintained. We believe that they are proper subjects for our study.

Another possible threat is the generality of the issues that HERO detects, since the issue types were observed by studying only 500 Golang projects. To mitigate the threat, we used a different set of DM issues to evaluate HERO (RQ4) and found that HERO can detect 98.5% of these issues, which suggests that our findings on issue characteristics are generalizable. Besides, HERO also detected a large number of real DM issues after analyzing 19,000 Golang projects. This further suggests the generality of the findings in this paper.

In addition, our study involves manual work (e.g., identifying and analyzing issue reports). To reduce the threat of human mistakes, three co-authors have cross-validated all results for consistency.

B. HERO's Generalizability Beyond the Golang Ecosystem
Two aspects of our methodology are generalizable to the DM issues induced by incompatible library-referencing modes in other ecosystems:
• The scenarios of issue types and their causes. The three scenarios, namely (1) projects in the legacy library-referencing mode depend on projects in the new library-referencing mode, (2) projects in the new mode depend on those in the legacy mode, and (3) projects in the new mode depend on others also in the new mode, can be generalized to analyze similar situations.
• The formulation of issue fixing patterns. The methodology to construct the dependency model by collecting information about its upstream and downstream projects can be adapted to other ecosystems. With the aid of such a dependency model, fixing suggestions can be structurally formulated based on applicable solutions and their potential impacts.

The generalization of our methodology needs to consider the unique characteristics of the studied programming languages, since our work focuses only on the Golang ecosystem (one of the most influential and fastest growing open-source ecosystems).

TABLE III
STATISTICS OF 280 DM ISSUES REPORTED BY HERO
[Table body not recoverable. Columns: Type; Issue reports (Issue report ID, Project name). Rows: Type A, Type B.1, Type B.2, Type C.]
Status 1: Issues fixed using our suggestions; Status 2: Issues under fixing using our suggestions; Status 3: Issues confirmed, but fixing not decided; Status 4: Issues fixed using other suggestions; Status 5: Issues pending; Issue ID ♠: Migration to Go Modules conducted (desired). Due to the page limit, the detailed information of the reported issues is provided on our homepage.

VII. RELATED WORK
Software dependency management. Software dependency management has inherent complexities [18]–[22], [74]–[95]. Dietrich et al. [75] studied over 70 million dependencies to find out how developers declared dependencies across 17 package managers. Their results guided research into better practices for dependency management. Abate et al. [96] reviewed state-of-the-art dependency managers and their ability to keep up with evolution at the current growth rate of popular component-based platforms, and concluded that their dependency solving abilities are not up to the task. Some studies [79]–[89], [92] focused on upgrading dependency versions, and some [77], [78], [90], [91], [95] investigated how to migrate client code to adapt to changing dependencies. Researchers [18]–[22] also proposed a series of techniques to detect, test, and monitor dependency conflict issues (e.g., misusing versions) for JavaScript, Java, and Python projects. Different from such conflict issues, our studied DM issues are due to incompatible library-referencing modes and their broad impacts on related projects in the Golang ecosystem. Garcia et al.'s work [76] is closely related to HERO. It formally defined eight inconsistent modular dependencies for Java-9 applications on the Java Platform Module System (JPMS), and proposed a technique, DARCY, to detect and repair such inconsistencies. However, their targeted issues are architecture-implementation mapping ones, which are different from our focus.
Health of software ecosystems. The literature on evolving software ecosystems covers Maven [97]–[99], Apache [79], [100], Eclipse [101], Ruby [102]–[104], PyPI [22], GNOME [105], and Npm [104], [106]–[112]. Many of the techniques concerned focus on three aspects: ecosystem modeling and analysis [98], [100], [104], [107], [108], [111]–[113], socio-technical theories within ecosystems [106], [113], and diagnosis and monitoring of ecosystem evolution [22], [97], [114]. For example, Blincoe et al. [113] proposed coupling references to model technical dependencies between projects, and explored characteristics of open-source or commercial software ecosystems. Zimmermann et al. [107] modeled dependencies for the Npm ecosystem, and analyzed potential risks for packages that could be attacked. To the best of our knowledge, our work is the first attempt to study the health of the Golang ecosystem from the perspective of DM issues.

VIII. CONCLUSIONS AND FUTURE WORK
In this paper, we studied DM issues in Golang projects, which are prevalent and have caused confusion and trouble for many Golang developers. In particular, we investigated the characteristics of DM issues, analyzed their root causes, and identified common fixing solutions. We refined our findings into detection algorithms with customizable fixing templates. The evaluation confirmed the effectiveness of the resulting tool implementation, HERO, in detecting and diagnosing DM issues. Leveraging fixing templates and rich diagnostic information, we plan to study DM patch generation in the future.

ACKNOWLEDGMENT
The authors express thanks to the anonymous reviewers for their constructive comments. Part of the work was conducted during the first author's internship at HKUST in 2018. The work is supported by the National Natural Science Foundation of China (Grant Nos. 61932021, 61902056, 61802164, 61977014), Shenyang Young and Middle-aged Talent Support Program (Grant No. ZX20200272), the Fundamental Research Funds for the Central Universities (Grant No. N2017011), the Hong Kong RGC/GRF grant 16207120, MSRA grant, US NSF (Grant No. CCF-1845446) and Guangdong Provincial Key Laboratory (Grant No. 2020B121201001).
REFERENCES
[1] R. M. Yasir, M. Asad, A. H. Galib, K. K. Ganguly, and M. S. Siddik, “Godexpo: an automated god structure detection tool for golang,” in Proceedings of the 3rd International Workshop on Refactoring, 2019, pp. 47–50.
[2] “Import path syntax described in golang documentation,” https://golang.org/cmd/go/ . org/, 2020, accessed: 2020-06-01.
[4] “Github,” https://github.com/, 2020, accessed: 2020-06-01.
[5] “Launchpad,” https://launchpad.net/, 2020, accessed: 2020-06-01.
[6] “Ibm devops services,” hub.jazz.net/git, 2020, accessed: 2020-06-01.
[7] “Popular golang libraries on libraries.io,” https://libraries.io/search?order=desc&platforms=Go&sort=rank, 2020, accessed: 2020-06-01.
[8] “Go get command described in golang documentation,” https://golang.org/cmd/go/ . com/golang/dep, 2020, accessed: 2020-06-01.
[10] “Glide,” https://github.com/Masterminds/glide, 2020, accessed: 2020-06-01.
[11] “Go modules explained in golang wiki,” https://github.com/golang/go/wiki/Modules, 2020, accessed: 2020-06-01.
[12] “Siv rules described in go wiki,” https://github.com/golang/go/wiki/Modules . com/pierrec/lz4, 2020, accessed: 2020-06-01.
[14] “Issue . com/filebrowser/filebrowser/issues/530, 2020, accessed: 2020-06-01.
[15] “Issue . com/pierrec/lz4/issues/39, 2020, accessed: 2020-06-01.
[16] “go-i18n,” https://github.com/nicksnyder/go-i18n, 2020, accessed: 2020-06-01.
[17] “Issue . com/nicksnyder/go-i18n/issues/184, 2020, accessed: 2020-06-01.
[18] Y. Wang, M. Wen, Z. Liu, R. Wu, R. Wang, B. Yang, H. Yu, Z. Zhu, and S. C. Cheung, “Do the dependency conflicts in my project matter?” in Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2018, pp. 319–330.
[19] Y. Wang, M. Wen, R. Wu, Z. Liu, S. H. Tan, Z. Zhu, H. Yu, and S. C. Cheung, “Could i have a stack trace to examine the dependency conflict issue?” in Proceedings of the 41st International Conference on Software Engineering (ICSE), 2019, pp. 572–583.
[20] K. Huang, B. Chen, B. Shi, Y. Wang, C. Xu, and X. Peng, “Interactive, effort-aware library version harmonization,” Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020.
[21] J. Patra, P. N. Dixit, and M. Pradel, “Conflictjs: finding and understanding conflicts between javascript libraries,” in Proceedings of the 40th International Conference on Software Engineering, 2018, pp. 741–751.
[22] Y. Wang, M. Wen, Y. Liu, Y. Wang, Z. Li, C. Wang, H. Yu, S. C. Cheung, C. Xu, and Z. Zhu, “Watchman: Monitoring dependency conflicts for python library ecosystem,” Proceedings of the 42nd International Conference on Software Engineering (ICSE), pp. 125–135, 2020.
[23] “github/hub,” https://github.com/github/hub, 2020, accessed: 2020-06-01.
[24] “microsoft/presidio,” https://github.com/microsoft/presidio, 2020, accessed: 2020-06-01.
[25] “Explanations for minimal module compatibility in go wiki,” https://github.com/golang/go/wiki/Modules, 2020, accessed: 2020-06-01.
[26] “Issue . com/olivere/elastic/issues/878, 2020, accessed: 2020-06-01.
[27] “Issue . com/golang-migrate/migrate/issues/103, 2020, accessed: 2020-06-01.
[28] “golang/go,” https://github.com/golang/go, 2020, accessed: 2020-06-01.
[29] “Issue . com/gogs/gogs/issues/5559, 2020, accessed: 2020-06-01.
[30] “gogs,” https://github.com/gogs/gogs, 2020, accessed: 2020-06-01.
[31] “Issue . com/dominikh/go-tools/issues/328, 2020, accessed: 2020-06-01.
[32] “Issue . com/gofrs/uuid/issues/61, 2020, accessed: 2020-06-01.
[33] “Issue . com/Masterminds/glide/issues/1017, 2020, accessed: 2020-06-01.
[34] “Issue . com/cockroachdb/cockroach/issues/47246, 2020, accessed: 2020-06-01.
[35] “apd,” https://github.com/cockroachdb/apd, 2020, accessed: 2020-06-01.
[36] “moby,” https://github.com/moby/moby, 2020, accessed: 2020-06-01.
[37] “Issue . com/moby/moby/issues/39302, 2020, accessed: 2020-06-01.
[38] “logrus,” https://github.com//Sirupsen/logrus, 2020, accessed: 2020-06-01.
[39] “Issue . com/testcontainers/testcontainers-go/issues/127, 2020, accessed: 2020-06-01.
[40] “Issue . com/simiotics/shnorky/issues/2, 2020, accessed: 2020-06-01.
[41] “Issue . com/kataras/iris/issues/1355, 2020, accessed: 2020-06-01.
[42] “Issue . com/osrg/gobgp/issues/1848, 2020, accessed: 2020-06-01.
[43] “Issue . com/jwplayer/jwplatform-go/issues/9, 2020, accessed: 2020-06-01.
[44] “Issue . com/golang/go/issues/32695, 2020, accessed: 2020-06-01.
[45] “lz4,” github.com/pierrec/lz4, 2020, accessed: 2020-06-01.
[46] “Issue . com/Jeffail/benthos/pull/454, 2020, accessed: 2020-06-01.
[47] “redis,” https://github.com/go-redis/redis, 2020, accessed: 2020-06-01.
[48] “Issue . com/Jeffail/benthos/issues/232, 2020, accessed: 2020-06-01.
[49] “Issue . com/shirou/gopsutil/issues/663, 2020, accessed: 2020-06-01.
[50] “Issue . com/mediocregopher/radix/issues/141, 2020, accessed: 2020-06-01.
[51] “radix,” https://github.com/mediocregopher/radix, 2020, accessed: 2020-06-01.
[52] “Issue . com/andrewstuart/goq/issues/12, 2020, accessed: 2020-06-01.
[53] “goq,” https://github.com/andrewstuart/goq, 2020, accessed: 2020-06-01.
[54] “Issue . com/google/go-cloud/issues/429, 2020, accessed: 2020-06-01.
[55] “Issue . com/go-redis/redis/issues/1149, 2020, accessed: 2020-06-01.
[56] “Issue . com/go-redis/redis/issues/1151, 2020, accessed: 2020-06-01.
[57] “Issue . com/prometheus/prometheus/issues/6048, 2020, accessed: 2020-06-01.
[58] “Rest api v3 standards,” https://developer.github.com/v3/, 2020, accessed: 2020-06-01.
[59] “Go.mod file described in golang documentation,” https://blog.golang.org/v2-go-modules, 2020, accessed: 2020-06-01.
[60] “Issue . com/kiali/kiali/issues/2922, 2020, accessed: 2020-06-01.
[61] “commit c453e89,” https://github.com/kiali/kiali/commit/c453e89dbd76de161930e2996bdc1303c4d22187, 2020, accessed: 2020-06-01.
[62] “Issue . com/i-love-flamingo/flamingo-commerce/issues/256, 2020, accessed: 2020-06-01.
[63] “Issue . com/tomatool/tomato/issues/114, 2020, accessed: 2020-06-01.
[64] “Issue . com/pingcap/tidb/issues/16381, 2020, accessed: 2020-06-01.
[65] “tidb,” https://github.com/pingcap/tidb, 2020, accessed: 2020-06-01.
[66] “Issue . com/pingcap/parser/issues/187, 2020, accessed: 2020-06-01.
[67] “Issue . com/sensu/sensu-go/issues/3754, 2020, accessed: 2020-06-01.
[68] “sensu-go,” https://github.com/sensu/sensu-go, 2020, accessed: 2020-06-01.
[69] “Issue . com/sensu/sensu-go/issues/3970, 2020, accessed: 2020-06-01.
[70] “Issue . com/dmacvicar/terraform-provider-libvirt/issues/770, 2020, accessed: 2020-06-01.
[71] “ovh/cds,” https://github.com/ovh/cds, 2020, accessed: 2020-06-01.
[72] “Issue . com/ovh/cds/issues/5366, 2020, accessed: 2020-06-01.
[73] “Issue . com/TheThingsNetwork/ttn/issues/780, 2020, accessed: 2020-06-01.
[74] C. Xu, Y. Qin, P. Yu, C. Cao, and J. Lv, “Theories and techniques for growing software: paradigm and beyond,” SCIENTIA SINICA Informationis, vol. 50, pp. 1595–1611, 2020.
[75] J. Dietrich, D. Pearce, J. Stringer, A. Tahir, and K. Blincoe, “Dependency versioning in the wild,” in
Proceedings of the 16th InternationalConference on Mining Software Repositories , 2019, pp. 349–359.[76] N. Ghorbani, J. Garcia, and S. Malek, “Detection and repair of architec-tural inconsistencies in java,” in
Proceedings of the 41st InternationalConference on Software Engineering , 2019, pp. 560–571.[77] D. Dig and R. Johnson, “How do apis evolve? a story of refactoring,”
Journal of software maintenance and evolution: Research and Practice ,pp. 83–107, 2006.[78] J. Henkel and A. Diwan, “Catchup! capturing and replaying refactor-ings to support api evolution,” in
Proceedings of the 27th internationalconference on Software engineering , 2005, pp. 274–283.[79] G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, and S. Panichella,“How the apache community upgrades dependencies: an evolutionarystudy,”
Empirical Software Engineering , pp. 1275–1317, 2015.[80] J. Cox, E. Bouwers, M. Van Eekelen, and J. Visser, “Measuringdependency freshness in software systems,” in
Proceedings of the 37thIEEE International Conference on Software Engineering , 2015, pp.109–118.[81] A. Decan, T. Mens, and E. Constantinou, “On the evolution oftechnical lag in the npm package dependency network,” in
InternationalConference on Software Maintenance and Evolution (ICSME) , 2018,pp. 404–414.[82] E. Derr, S. Bugiel, S. Fahl, Y. Acar, and M. Backes, “Keep me updated:An empirical study of third-party library updatability on android,” in
Proceedings of the 2017 ACM SIGSAC Conference on Computer andCommunications Security , 2017, pp. 2187–2200.[83] R. G. Kula, D. M. German, A. Ouni, T. Ishio, and K. Inoue, “Dodevelopers update their library dependencies?”
Empirical SoftwareEngineering , pp. 384–417, 2018.[84] Y. Wang, B. Chen, K. Huang, B. Shi, C. Xu, X. Peng, Y. Liu, andY. Wu, “An empirical study of usages, updates and risks of third-partylibraries in java projects,” arXiv preprint arXiv:2002.11028 , 2020.[85] S. McCamant and M. D. Ernst, “Predicting problems caused bycomponent upgrades,” in
Proceedings of the 9th ACM Joint Meetingon European Software Engineering Conference and Symposium on theFoundations of Software Engineering , 2003, pp. 287–296.[86] D. Foo, H. Chua, J. Yeo, M. Y. Ang, and A. Sharma, “Efficientstatic checking of library updates,” in
Proceedings of the 26th ACMJoint Meeting on European Software Engineering Conference andSymposium on the Foundations of Software Engineering , 2018, pp.791–796.[87] S. Raemaekers, A. van Deursen, and J. Visser, “Semantic versioningand impact of breaking changes in the maven repository,”
Journal ofSystems and Software , pp. 140–158, 2017.[88] S. Raemaekers, A. Van Deursen, and J. Visser, “Measuring softwarelibrary stability through historical version analysis,” in
Proceedingsof the 28th IEEE International Conference on Software Maintenance ,2012, pp. 378–387.[89] I. J. M. Ruiz, M. Nagappan, B. Adams, T. Berger, S. Dienst, and A. E.Hassan, “Analyzing ad library updates in android apps,”
IEEE Software ,pp. 74–80, 2016.[90] S. Kabinna, C.-P. Bezemer, W. Shang, and A. E. Hassan, “Logginglibrary migrations: A case study for the apache software foundationprojects,” in
Proceedings of the 13th Working Conference on MiningSoftware Repositories (MSR) , 2016, pp. 154–164.[91] F. L. de la Mora and S. Nadi, “Which library should i use?: a metric-based comparison of software libraries,” in
Proceedings of the 40thInternational Conference on Software Engineering: New Ideas andEmerging Technologies Results , 2018, pp. 37–40.[92] R. G. Kula, D. M. German, T. Ishio, A. Ouni, and K. Inoue, “Anexploratory study on library aging by monitoring client usage in asoftware ecosystem,” in
Proceedings of the 24th International Con-ference on Software Analysis, Evolution and Reengineering (SANER) ,2017, pp. 407–411.[93] C. Macho, S. McIntosh, and M. Pinzger, “Automatically repairingdependency-related build breakage,” in
Proceedings of the 25th In-ternational Conference on Software Analysis, Evolution and Reengi-neering , 2018, pp. 106–117.[94] C.-P. Bezemer, S. McIntosh, B. Adams, D. M. German, and A. E.Hassan, “An empirical study of unspecified dependencies in make- based build systems,”
Empirical Software Engineering , pp. 3117–3148,2017.[95] S. Mostafa, R. Rodriguez, and X. Wang, “A study on behavioralbackward incompatibilities of java software libraries,” in
Proceedingsof the 26th ACM SIGSOFT International Symposium on SoftwareTesting and Analysis , 2017, pp. 215–225.[96] “Dependency solving: A separate concern in component evolutionmanagement,”
Journal of Systems and Software , vol. 85, no. 10, pp.2228–2240, 2012.[97] C. Soto-Valero, A. Benelallam, N. Harrand, O. Barais, and B. Baudry,“The emergence of software diversity in maven central,” in
Proceedingsof the 16th International Conference on Mining Software Repositories(MSR) , 2019, pp. 333–343.[98] A. Benelallam, N. Harrand, C. Soto-Valero, B. Baudry, and O. Barais,“The maven dependency graph: a temporal graph-based representationof maven central,” in
Proceedings of the 16th International Conferenceon Mining Software Repositories (MSR) , 2019, pp. 344–348.[99] D. Mitropoulos, V. Karakoidas, P. Louridas, G. Gousios, and D. Spinel-lis, “The bug catalog of the maven ecosystem,” in
Proceedings of the11th Working Conference on Mining Software Repositories , 2014, pp.372–375.[100] L. Hernández and H. Costa, “Identifying similarity of software inapache ecosystem–an exploratory study,” in
Proceedings of the 12thinternational conference on information technology-new generations ,2015, pp. 397–402.[101] J. Businge, A. Serebrenik, and M. van den Brand, “Survival ofeclipse third-party plug-ins,” in
International Conference on SoftwareMaintenance , 2012, pp. 368–377.[102] M. M. Syeed, K. M. Hansen, I. Hammouda, and K. Manikas, “Socio-technical congruence in the ruby ecosystem,” in
International Sympo-sium on Open Collaboration , 2014, pp. 1–9.[103] J. Kabbedijk and S. Jansen, “Steering insight: An exploration of theruby software ecosystem,” in
International Conference of SoftwareBusiness , 2011, pp. 44–55.[104] R. Kikas, G. Gousios, M. Dumas, and D. Pfahl, “Structure andevolution of package dependency networks,” in
Proceedings of the14th International Conference on Mining Software Repositories (MSR) ,2017, pp. 102–112.[105] C. Jergensen, A. Sarma, and P. Wagstrom, “The onion patch: migrationin open source ecosystems,” in
Proceedings of the 19th ACM JointMeeting on European Software Engineering Conference and Sympo-sium on the Foundations of Software Engineering (ESEC/FSE 2018) ,2011, pp. 70–80.[106] A. Trockman, S. Zhou, C. Kästner, and B. Vasilescu, “Adding sparkleto social coding: an empirical study of repository badges in the npmecosystem,” in
Proceedings of the 40th International Conference onSoftware Engineering , 2018, pp. 511–522.[107] M. Zimmermann, C.-A. Staicu, C. Tenny, and M. Pradel, “Small worldwith high risks: A study of security threats in the npm ecosystem,” in
Proceedings of the 28th USENIX Security Symposium Security , 2019,pp. 995–1010.[108] F. R. Cogo, G. A. Oliva, and A. E. Hassan, “An empirical study ofdependency downgrades in the npm ecosystem,”
IEEE Transactions onSoftware Engineering , 2019.[109] C.-A. Staicu, M. T. Torp, M. Schäfer, A. Møller, and M. Pradel,“Extracting taint specifications for javascript libraries,” in
Proceedingsof the 42nd International Conference on Software Engineering , 2020.[110] A. Zerouali, E. Constantinou, T. Mens, G. Robles, and J. González-Barahona, “An empirical analysis of technical lag in npm package de-pendencies,” in
International Conference on Software Reuse . Springer,2018, pp. 95–110.[111] N. Lertwittayatrai, R. G. Kula, S. Onoue, H. Hata, A. Rungsawang,P. Leelaprute, and K. Matsumoto, “Extracting insights from the topol-ogy of the javascript package ecosystem,” in
Proceedings of the 24thAsia-Pacific Software Engineering Conference , 2017, pp. 298–307.[112] A. Abdellatif, Y. Zeng, M. Elshafei, E. Shihab, and W. Shang,“Simplifying the search of npm packages,”
Information and SoftwareTechnology , 2020.[113] K. Blincoe, F. Harrison, N. Kaur, and D. Damian, “Reference coupling:An exploration of inter-project technical dependencies and their char-acteristics within large software ecosystems,”
Information and SoftwareTechnology , pp. 174–189, 2019.[114] S. Jansen, “Measuring the health of open source software ecosystems:Beyond the scope of project health,”