Does Link Prediction Help Detect Feature Interactions in Software Product Lines (SPLs)?
Seyedehzahra Khoshmanesh and Robyn Lutz
Department of Computer Science, Iowa State University, Ames, IA 50011, USA
{zkh,rlutz}@iastate.edu

September 17, 2020

Abstract
An ongoing challenge for the requirements engineering of software product lines is to predict whether a new combination of features (units of functionality) will create an unwanted or even hazardous feature interaction. We thus seek to improve and automate the prediction of unwanted feature interactions early in development. In this paper we show how the detection of unwanted feature interactions in a software product line can be effectively represented as a link prediction problem. Link prediction uses machine learning algorithms and similarity scores among a graph's nodes to identify likely new edges. We here model the software product line features as nodes and the unwanted interactions among the features as edges. We investigate six link-based similarity metrics, some using local and some using global knowledge of the graph, for use in this context. We evaluate our approach on a software product line benchmark in the literature, building six machine-learning models from the graph-based similarity data. Results show that the best ML algorithms achieved accuracy of 0.75 to 1 for classifying feature interactions as unwanted or wanted in this small study, and that global similarity metrics performed better than local similarity metrics. The work shows how link-prediction models can help find missing edges, which represent unwanted feature interactions that are undocumented or unrecognized, earlier in development.

Figure 1: Workflow of proposed method to classify unwanted feature interactions

Introduction

Software product lines are widely used in industry to reap the benefits of reuse. A software product line (SPL) is a family of software products that share a set of basic features as a core and differ in other alternative or optional features [32]. A feature in a software product line is a unit of functionality that provides service to users [38] (i.e., different from a feature in machine learning or statistics).
Features are problem-oriented and describe the users' requirements [9, 12].

A software product line tends to evolve as it grows. As more products join the product line over time, new combinations of features get added [10]. However, some features are incompatible, and can even cause hazardous conditions when combined in a single product [5]. These constraints are termed unwanted feature interactions.

For instance, an unwanted feature interaction occurs in a telephony system when we combine the two features call-forwarding and call-waiting [11]. If we enable both features, the system enters an unexplored and unsafe state when, while the line is busy, the system receives another call. In this case there is a requirements conflict and the system does not know whether the call should be delayed or forwarded.

An ongoing challenge for the requirements engineering of software product lines is to predict whether a new combination of features will create an unwanted or even hazardous feature interaction. Detecting such unwanted feature interactions is a difficult and persistent problem for software product lines. Often they are not found until testing [15] or operations. While model checking approaches can catch some unwanted feature interactions earlier, they have been difficult to implement at industrial scales [36].

The work reported here explores a new approach to earlier detection of unwanted feature interactions, inspired by link prediction in networks. Many social, information, and biological systems and networks can be represented as graphs. For example, a social network is a graph in which each edge shows a friendship between two people (i.e., nodes) in the graph, and a co-authorship network is a graph in which each edge shows a paper collaboration between two authors.
(Links in the field of link prediction are between nodes of the same type, e.g., two people, while links in the field of traceability are typically between different types of software artifacts, e.g., a requirement and its source code [40].) Link prediction then uses similarity-based algorithms to predict the likelihood of the creation of a new edge between two nodes in the graph. That is, link prediction detects potential missing links between nodes, such as a missing but likely friendship in a social network [25, 29].

In this paper we show how the detection of unwanted feature interactions in a software product line can be effectively represented as a link prediction problem. We thus model a software product line as a graph of features and the relationships or interactions existing between the features. Each feature in a software product line feature model is represented as a node in the feature interaction graph. The links or edges between features represent the feature interactions between them.

The work reported in this paper employs the knowledge of prior wanted and unwanted feature interactions captured in a product line's feature model and feature constraints, together with similarity measures among product-line features, link prediction, and machine learning algorithms, to improve and automate the detection of missing or new unwanted feature interactions in a new product. As shown in Fig. 1 and described in Sect. 2, we apply link prediction techniques to calculate local and global similarity among the features in a feature interaction graph.
Next, we build, train, and tune a machine learning model to detect potential new or missing unwanted feature interactions in the new product or version. While previous approaches have succeeded at detecting unwanted feature interactions during testing, our approach can find many of them earlier, in the requirements phase.

Similarity is a key metric in our proposed framework and acts as a heuristic tool for detecting a new or missing unwanted feature interaction. This is because similar features have been observed to behave in similar ways. If there is a feature in the feature interaction graph that contributes to some unwanted feature interactions, the features which are similar to this feature often will contribute to the same unwanted interactions [19–21].

We thus target two goals in our paper. First, we want to understand whether information about the feature interaction graph can suffice to detect potential missing or new unwanted feature interactions. Second, we want to investigate whether link prediction and machine learning algorithms can help achieve this detection. To address these issues, the paper aims to answer the following questions:

• RQ1:
How effectively does link prediction help detect unwanted feature interactions in a software product line?

• RQ2:
Which similarity metrics and machine learning algorithms perform better in the context of unwanted feature interaction detection?
We investigate these research questions by applying our approach to a case study, the Electronic Mail system introduced by Hall in [14] and extended as a benchmark in the software product line literature [3].

Results obtained from the application of our approach to the Email benchmark showed a perfect accuracy of 100% in detecting unwanted feature interactions using link prediction techniques with Random Forest, Naive Bayes, and Linear Support Vector Machine. This indicates that the use of link prediction, similarity metrics, and machine learning algorithms in a software product line may help detect missing or new unwanted feature interactions in the requirements phase of a proposed new product in the product line.

The contribution of the paper is a framework which combines link prediction and machine learning techniques to detect unwanted feature interactions in the early-phase development of a new product in a software product line or of a new version of a software-intensive system. While similarity measures and link prediction have been considered widely in software testing and social network systems, to our knowledge they have not been studied previously for detection of unwanted software feature interactions.

The work described in this paper is part of our ongoing effort toward improved detection of feature interactions during the requirements analysis of a new product. In earlier work we studied similarity measures based on features' structural elements (classes, attributes, and methods) [19] and on features' relative positions in a software product line's feature model [19–21]. New work that is first reported in this paper is our representation of the feature interaction problem as a link prediction problem, which makes feature interactions amenable to classification (wanted/unwanted) and the building of a predictive learning model, together with the evaluation results from our initial application of this approach.

The rest of the paper is structured as follows.
Section 2 describes our similarity-based machine-learning method for detecting unwanted feature interactions in a software product line. Section 3 describes results from an evaluation of its application on a small software product line. Section 4 reviews related work, and Section 5 gives concluding remarks. All artifacts, code, and analysis used in this study are available at https://tinyurl.com/y8h5erwp.

Method

In this section, we explain our proposed method for detecting unwanted feature interactions in the requirements phase of a new SPL product or version. We first give an overview of the method, as well as of the intuition on which it is based. We then introduce the software product line case study that we use to evaluate this approach. Finally, we define the similarity metrics whose calculated values are used by the learning algorithms.
We define a software product line as a graph G = (V, E) in which each feature F in the software product line feature model is a node, V, in the graph G. The edge, E, between two features F_i and F_j represents a known feature interaction, either wanted or unwanted, as documented in feature model constraints, between the two features:

  FeatureInteraction = (F_i, F_j)

  G_SoftwareProductLine = (V_Feature, E_FeatureInteraction)

Fig. 1 shows the framework of our proposed method. We briefly describe the steps in its process, identifying each by its number given in the figure. In (1), we gather the requirement-level artifacts, including the software product line feature model and its associated list of existing wanted and unwanted feature interactions from the software product line repository, to pre-process and create a feature interaction graph, shown in (2). The feature interaction graph is the appropriate input on which to apply the link prediction technique. In a feature interaction graph, each edge can be labeled wanted feature interaction or unwanted feature interaction.

The interaction graph shown in Fig. 3 is of the Electronic Mail software product line, the case study we use to investigate our research questions [4, 14]. It is introduced below. The graph is automatically created by the "igraph" package [13].

Figure 2: Intuition behind our similarity-based learning to detect unwanted feature interactions

In the process step (3) of Fig. 1, we apply proximity-based methods for link prediction on the feature interaction graph to obtain the similarity scores for the feature interaction pairs in the graph. Fig. 2 shows the intuition behind how using similarity indexes between two nodes in an interaction graph helps to detect new or missing unwanted feature interactions in a new version or product-line product [18–21]. As shown in Fig. 2, there is an unwanted feature interaction between two features, F_i and F_j, that is shown by the edge between them.
If, in a new version or product of a software product line, a new feature F_k is added that has a high feature similarity score with feature F_i, it is similarly likely to have an unwanted interaction with feature F_j.

To make the intuition described in Fig. 2 more concrete, we extend a classical example of unintended feature interaction, described by Batory et al. and attributed to Kang [8]. Suppose our building control product line has three features, each of which operates correctly in isolation. F_i is a Flood-Control feature with water sensors that, when they detect standing water, turn off the water main. F_j is a Fire-Control feature with sensors that, when they detect fire, activate water sprinklers. There is a known unwanted feature interaction between the Fire-Control feature and the Flood-Control feature, shown in Fig. 2 as an edge between them. This is because the features interfere with each other and create a hazardous situation, turning the water main off while the sprinklers should be active. The third feature, F_k, is a Pipes-Protection feature with sensors that, when they detect sub-freezing temperatures in the building, turn off the water main to avoid burst pipes.

Suppose that the Pipes-Protection feature is being added to a building having a Fire-Control feature for the first time. There is no known feature interaction between these two features as they have never been combined in a product previously. However, we seek to use the high degree of similarity between the Flood-Control feature and the Pipes-Protection feature, both of which turn off the water main, to predict that the Pipes-Protection feature may have an unrecognized and unwanted feature interaction with the Fire-Control feature. This previously unrecognized feature interaction can then be suggested to the requirements analyst.
If confirmed, it can be documented by adding it as an edge in the graph, to also help with future products.

More specifically, the link prediction technique uses similarity scores to predict the missing links, i.e., edges, between nodes in a graph [28]. The link prediction method predicts the new or missing edges, including the edge between F_k and F_j. A similarity score s_{F_i F_j} in an interaction graph is defined as "how much" two nodes in the graph are similar. A higher similarity score means a higher likelihood that the link will appear in the future. We formalize the calculation of similarity below.

We use similarity-based algorithms to detect missing and new unwanted feature interactions when the software product line evolves, such as when features are added or new versions are introduced. Table 1 shows the eight similarity metrics which we used in our investigation. These similarity metrics can capture local and global similarity between two nodes in the feature interaction graph of a software product line. The similarity metrics that only need the local topology of the graph to be calculated belong to the local similarity category, while the similarity metrics that require global topological information for a graph (e.g., shortest path) belong to the global similarity category.
These similarity metrics are widely used in the link prediction literature, and research studies report that they are among the highest-accuracy local and global similarity metrics in fields as varied as network science, electrical power grids, and protein-protein interaction networks [24, 27, 33, 42].

Table 1: Local and global link-based similarity metrics used in detection of unwanted feature interactions

  #  Metric Name                           Category
  1  Common Neighbors                      Local Similarity
  2  Jaccard distance [16]                 Local Similarity
  3  Cosine distance [37]                  Local Similarity
  4  Adar index [1]                        Local Similarity
  5  Resource Allocation Index (RA) [42]   Local Similarity
  6  Katz [17]                             Global Similarity
  7  Random Walk with Restart (RWR) [39]   Global Similarity
  8  Local Path Index (LP) [42]            Quasi-local methods

Step (4) in our framework process is to input the data in the form of a data frame to the machine learning models. Each record in our cleaned data describes the different similarity scores for an edge between two features in the graph. The class variable indicates whether this edge contributes to an unwanted feature interaction or not. Therefore, we have a classification, or supervised learning, problem. We train and tune six different machine learning algorithms, as described below in the Results section. The best machine learning model can then be saved to evaluate on unseen data in the future. We save the optimized final machine learning model in step (5).

Finally, the requirements analyst can use the saved model, shown in (5) of Fig. 1, to check combinations of product-line features proposed for a new product as early as possible in order to learn about possible missing or new unwanted feature interactions. The final report, shown in (6), provides useful information regarding potential missing and new unwanted feature interactions in the new products.
Additionally, as a product evolves over its life cycle, the results of applying our method could be used in an incremental learning model in order to improve model accuracy and generalizability.

Case Study

In this subsection, we describe the case study on which we applied and evaluated our proposed method for detecting unwanted feature interactions early in development. We selected the Electronic Mail System, or Email, software product line from the literature, since it provides multiple feature interactions. The Email system was originally introduced by Hall [14] and later became a product-line benchmark used by Apel and others [3, 5]. The Email software product line models an e-mail communication system having several optional features that can be enabled or disabled, such as encryption, forwarding, and verify email. Its feature model is shown as part of the software product line repository at the top left in Fig. 1. It shows eight available features for any new product. The leftmost feature there, "Email Client," is a commonality that must be present in all products. The other seven features are optional.

Figure 3: Unwanted Feature Interaction Graph for the Email system (automatically created with the "igraph" tool in R)

Selecting some pairs of optional features will cause unwanted feature interactions. Fig. 3 is an unwanted feature interaction graph for the Email product line. It shows the seven optional features as nodes and the unwanted feature interactions as links, or edges. There are 10 unwanted feature interactions in the Email product line.

An example of an unwanted feature interaction comes from an Email product line [4] in which its Encrypt Email feature and its Forward Email feature each work as intended when only one of them is present.
However, when both these features appear in a product, unwanted behavior occurs. Namely, an encrypted email will be forwarded in plain text if the client's public key is not available, violating the product's security requirement.

As we can see in Fig. 3, "Encrypt" has the highest degree among features in the interaction graph, as it participates in five unwanted feature interactions (edges) in the graph. Feature "Forward" similarly contributes to four interactions. Note that the figure does not display the 11 normal, i.e., wanted, feature interactions that would appear in a fully connected feature interaction graph. However, we will use all 21 edges (10 unwanted and 11 wanted interactions) of the fully connected graph to build our models.
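The construction above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' R code: the feature names F1..F7 and the unwanted-pair list are placeholders, since the paper only names a subset of the Email features and constraints; the real 10 unwanted pairs come from the Email feature-model constraints (e.g., Encrypt-Forward).

```python
import itertools

# Placeholder feature names for the 7 optional Email features.
features = [f"F{i}" for i in range(1, 8)]

# Hypothetical subset of unwanted pairs (the real list has 10 entries).
unwanted = {frozenset(p) for p in [("F1", "F2"), ("F1", "F3"), ("F2", "F4")]}

# Fully connected graph: every pair of optional features is an edge, and the
# edge label is the class variable the supervised learner will predict.
edges = {
    frozenset((u, v)): ("unwanted" if frozenset((u, v)) in unwanted else "wanted")
    for u, v in itertools.combinations(features, 2)
}

assert len(edges) == 7 * 6 // 2   # 21 edges, as in the paper
print(sum(1 for lbl in edges.values() if lbl == "unwanted"), "unwanted edges")
```

Using `frozenset` pairs keeps the edges undirected, matching the symmetric notion of a feature interaction.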
We next describe the similarity metrics used in our investigation. All the similarity metrics described here have shown good performance even on very large data and have polynomial time complexity [33]. We selected both local and global similarity indexes. The local similarity indexes are node-based topological similarity metrics, while the global similarity indexes are path-based topological similarity metrics.

• Common Neighbors. Two nodes x and y in a graph are more likely to have a link if they have many common neighbours. In the context of feature interaction, two features F_i and F_j are more likely to have an unwanted feature interaction if these two features have many common neighbours in the feature interaction graph. For a feature F, let Γ(F) denote the set of neighbors of F. The CN (Common Neighbors) score is defined as the number of in-common neighbors of two features in the graph:

  s^CN_{F_i F_j} = |Γ(F_i) ∩ Γ(F_j)|

Many studies have used CN and observed that there is a positive correlation between the number of common neighbours in a graph and possible links between two nodes in a graph, such as in a friendship graph or a scientific collaboration graph [22, 30].

• Jaccard Index [16]. The Jaccard Index measures the similarity between two sets, and is defined as:

  s^Jaccard_{F_i F_j} = |Γ(F_i) ∩ Γ(F_j)| / |Γ(F_i) ∪ Γ(F_j)|

• Cosine Distance [34]. Cosine distance is a metric used to measure how similar nodes are irrespective of their degree. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. It is defined as:

  s^Cosine_{F_i F_j} = |Γ(F_i) ∩ Γ(F_j)| / sqrt(k_{F_i} × k_{F_j})

where k_F is the degree of node F.

• Adamic/Adar (AA) Index [1]. The AA measure is defined as the inverted sum of degrees of common neighbors for two given vertices. This metric measures the closeness of two nodes based on their shared neighbors. A value of 0 indicates that two nodes are not close, while higher values indicate that nodes are close. This index refines the simple counting of common neighbors by assigning the less-connected neighbors more weight, and is defined as:

  s^AA_{F_i F_j} = Σ_{z ∈ Γ(F_i) ∩ Γ(F_j)} 1 / log(k_z)

• Resource Allocation Index (RA) [42]. This index is motivated by resource allocation in networks. Given a pair of nodes x and y which are not directly connected, the node x can send some resource to y with their common neighbors as transmitters. We assume that each transmitter has a unit of resource and will distribute it equally to all of its neighbors. The similarity between x and y is defined as the amount of resource that y receives from x, which is:

  s^RA_{xy} = Σ_{z ∈ Γ(x) ∩ Γ(y)} 1 / k_z

• Katz Index [17]. The Katz centrality of a node is a measure of centrality in a network. Unlike typical centrality measures, which consider only the shortest path between a pair of nodes, Katz centrality measures influence by taking into account the total number of walks between a pair of nodes. It is similar to Google's PageRank.

• Random Walk with Restart (RWR) [28]. This index is a direct application of the PageRank algorithm.

• Local Path Index (LP) [42]. This metric uses local paths and wider common neighbors (neighborhoods of second order) to reduce the complexity of the Katz metric.
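The five local metrics above can be computed directly from neighbor sets. The following Python sketch (again, not the paper's R code) applies them to a small toy graph; node names A-D are illustrative, not Email features, and the Adamic/Adar term assumes every common neighbor has degree greater than 1 so that log(k_z) is nonzero.

```python
import math

# Toy undirected interaction graph as an adjacency-set dictionary.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

def local_scores(g, u, v):
    """Common Neighbors, Jaccard, Cosine, Adamic/Adar, Resource Allocation."""
    common = g[u] & g[v]
    union = g[u] | g[v]
    return {
        "CN": len(common),
        "Jaccard": len(common) / len(union),
        "Cosine": len(common) / math.sqrt(len(g[u]) * len(g[v])),
        # AA weights sparsely connected common neighbors more heavily.
        "AA": sum(1 / math.log(len(g[z])) for z in common),
        # RA: each common neighbor z forwards 1/k_z of its unit resource.
        "RA": sum(1 / len(g[z]) for z in common),
    }

print(local_scores(graph, "B", "D"))
# B and D share the single neighbor C (degree 3), so CN = 1, RA = 1/3, etc.
```

The global metrics (Katz, RWR, LP) need whole-graph information, e.g., matrix power series over the adjacency matrix, and are therefore more expensive than these set operations.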
Results
In this section, we describe the evaluation results for the two research questions presented in the introduction regarding our method. To answer these research questions, we applied the eight link-based similarity metrics in Table 1 to the feature nodes in the feature interaction graph of the Email case study. These similarity metrics are used widely in the link prediction literature, and research studies have shown their usefulness [28].

Figure 4: Performance and tuning results of training machine learning models on 80% (10-fold cross-validation) of the Email data. Panels: (a) Naive Bayes, (b) Random Forest, (c) Neural Net, (d) C5.0, (e) SVMLinear, (f) KNN.

We used the "link prediction" package in R [24] to calculate the eight similarity metrics on the Email product line's feature interaction graph. Fig. 1 shows how the feature interaction graph serves as input for the calculation of similarity indexes for each pair of nodes in it. Each of the eight similarity scores for a node is a variable in the data frame subsequently used to build the machine learning models.

We next built a fully connected version of the graph. Since there are 7 features in the feature interaction graph in Fig. 3, there are (7 × 6)/2 = 21 edges in the fully connected graph: the 10 in Fig. 3 with the unwanted feature interaction label and the other 11 with the wanted feature interaction label. We use the labels for these 21 edges to train a supervised learning model to classify an edge as an unwanted or wanted feature interaction.

To do this, we selected six widely used machine learning algorithms to build models on the existing data, with the goal of detecting missing or new unwanted feature interactions in a new product or new version of a system. These six ML algorithms were Neural Net (NNET), Naive Bayes (NB), Linear Support Vector Machine (SVMLinear), Decision Tree (C5.0), Random Forest (RF), and K-Nearest Neighbour (KNN). We used the following settings in our experiments to train, tune, identify feature importance, and test the data:

• the "link prediction" package in R to calculate the similarity metrics [24].

• the "Caret" package in R to train, tune, and test the machine learning models [23].

• stratified splitting to divide the data into the 80% training set and the 20% test set. Stratified sampling ensured that the training and test sets have approximately the same percentage of samples of each target class as the complete set.

• variable importance for Random Forest, Neural Net, and C5.0 in "Caret" to identify the most important ML features.

• ROC curve variable importance to identify the most important features for those models, such as Naive Bayes, SVMLinear, and KNN, which do not have built-in variable-importance functions.
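The caret workflow above can be approximated as follows. This is a Python stand-in, not the paper's R code, and the similarity table is random placeholder data shaped like the real one: 21 edges by 8 similarity metrics, with 10 unwanted (1) and 11 wanted (0) labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((21, 8))                  # placeholder similarity scores
y = np.array([1] * 10 + [0] * 11)        # 1 = unwanted feature interaction

# Stratified 80/20 split keeps the unwanted/wanted ratio in both partitions.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=0),
    "NB": GaussianNB(),
    "SVMLinear": SVC(kernel="linear"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in models.items():
    # With only ~16 training rows, use a small fold count for cross-validation
    # (the paper's 10-fold setup is also feasible at this data size in caret).
    cv = cross_val_score(clf, X_tr, y_tr, cv=4)
    clf.fit(X_tr, y_tr)
    print(f"{name}: cv={cv.mean():.2f} test={clf.score(X_te, y_te):.2f}")
```

On random placeholder features the scores are meaningless; the point is the shape of the pipeline: stratified split, per-model cross-validated tuning, then a final fit evaluated on the held-out 20%.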
RQ1:
How effectively does link prediction help detect unwanted feature interactions in a software product line?

Training. Fig. 4 shows the performance results of 10-fold cross-validation on the training data for the six machine learning algorithms. We report the accuracy of the final tuned model for each machine learning algorithm. Fig. 4 also shows the tuning parameter and selected parameters for each machine learning model. As shown there, the Naive Bayes model has an accuracy of 1 when we use a Gaussian distribution type. Random Forest has the highest accuracy of 1 even using two randomly selected predictors. Neural Net has at most an accuracy of 0.95 when we use 1 hidden layer and a weight decay of 1e-04. KNN has at most an accuracy of 0.75 when we use k = 5. Boosted Decision Tree (C5.0) has at most an accuracy of 0.90. SVM Linear has the highest accuracy of 1 with a mis-classification cost of 0.25.

Testing.
The results of the final models for each of the six machine learning models on 20% of unseen data (i.e., two edges with unwanted feature interactions and two edges with wanted feature interactions) are shown in Table 2. We see that KNN mis-classified one of the two unwanted feature interactions as a wanted feature interaction. However, the other five machine learning algorithms correctly detected all unwanted feature interactions on unseen data.

Table 2: Performance results of final trained machine learning models on 20% of unseen Email data

  Model Name             Accuracy  Sensitivity  Specificity
  Naive Bayes            1         1            1
  Random Forest          1         1            1
  Neural Net             1         1            1
  Decision Tree (C5.0)   1         1            1
  SVM Linear             1         1            1
  KNN                    0.75      0.5          1

The results indicate that using the machine learning algorithms along with link prediction helped detect new or missing unwanted feature interactions efficiently in the early development of a new product in a software product line.
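Table 2's KNN row follows directly from the four-edge test set. A minimal sketch, treating unwanted interactions as the positive class and reproducing the single reported miss:

```python
# Test set: two unwanted (1) and two wanted (0) edges; KNN predicts one of
# the unwanted edges as wanted, per the reported results.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]

# Confusion-matrix counts computed by hand.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)   # 3 of 4 correct -> 0.75
sensitivity = tp / (tp + fn)         # true-positive rate -> 0.5
specificity = tn / (tn + fp)         # true-negative rate -> 1.0
print(accuracy, sensitivity, specificity)
```

Note that with only two positives in the test set, a single miss halves sensitivity, which is why the small sample size matters when reading Table 2.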
RQ2:
Which similarity metrics and machine learning algorithms per-form better in the context of unwanted feature interaction detection?
With regard to ML algorithms, the results described above showed that Random Forest, Naive Bayes, and SVM Linear had the highest accuracy of 1 in classifying the feature interactions in the Email product line. These three machine learning models are all interpretable, simple, and efficient. It is an open question at this point whether any of the other three models will be useful when we evaluate our approach on much larger datasets. For example, Neural Net generally performs better on large datasets, so may be relevant for more complicated software product lines.

Figure 5: Variable importance of ML models in detecting unwanted feature interactions in Email. Panels: (a) Naive Bayes, (b) Random Forest, (c) Neural Net, (d) C5.0, (e) SVMLinear, (f) KNN.

To answer which similarity scores perform better, we first extracted the feature importance related to each machine learning model described in RQ1. Fig. 5 shows the feature importance plot for these six machine learning models. We used AUC (Area Under the ROC Curve) to extract the feature importance of Naive Bayes, KNN, and SVMLinear. We used the built-in variable importance function for Random Forest, Neural Net, and C5.0.

The most important feature for all models except C5.0 was "Random walk with restart," which is a global similarity metric. "Katz" was the most important variable for C5.0 and was the second most important variable for Random Forest, SVM Linear, KNN, and Naive Bayes. "Katz" is also a global similarity metric. The next most important variables were "Cosine" and "Jaccard," which are local similarity metrics.

The global similarity metrics thus played a more important role than the local similarity scores in the link prediction. Within the global similarity metrics, "Random walk with restart" and "Katz" were the most important features. Within the local similarity metrics, "Cosine" and "Jaccard" performed better compared to their peers.

Figure 6: AUC (Area under the ROC Curve) of 6 different similarity metrics in detecting new unwanted feature interactions

Fig. 6 shows the AUC (Area under the ROC Curve) bar chart of the six similarity scores in detecting each unwanted feature interaction pair in the Email system. AUC provides an aggregate measure of performance across all possible classification thresholds. As an example, to detect the unwanted feature interaction "Forward-Verify", we calculated the scores of 6 different similarity metrics on the 9 remaining unwanted feature interactions and predicted the new edge "Forward-Verify". The highest similarity score could detect the formation of the new pair. For "Forward-Verify", the "Jaccard" metric had the highest Area Under the ROC Curve, so performed better than the other similarity metrics in detecting "Forward-Verify". We see from Fig. 6 that each of "Katz," "Random Walk with Restart," and "Jaccard" could detect 3 unwanted feature interactions, and thus were more important than the other similarity metrics for detecting unwanted feature interactions in the Email product line.
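The ROC-curve importance used above for models without built-in importance can be sketched as scoring each similarity metric alone as a ranker of the edge labels. This Python sketch uses random placeholder data (not the Email scores), so the resulting AUC values are illustrative only:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
metric_names = ["CN", "Jaccard", "Cosine", "AA", "RA", "Katz", "RWR", "LP"]
X = rng.random((21, len(metric_names)))   # placeholder similarity table
y = np.array([1] * 10 + [0] * 11)         # 1 = unwanted interaction

# Filter-style importance: AUC of each single metric as a ranking score.
importance = {
    name: roc_auc_score(y, X[:, j]) for j, name in enumerate(metric_names)
}
for name, auc in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} AUC={auc:.2f}")
```

An AUC of 0.5 means a metric ranks unwanted edges no better than chance; values near 1 mean that sorting edges by that single similarity score already separates unwanted from wanted interactions well.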
Threats to validity:
Internal threats.
We have investigated a single, small case study as an initial investigation into the feasibility of our approach. However, this case study is considered to be a benchmark in the SPL literature and to have correct artifacts based on realistic wanted and unwanted feature interactions, so results on it are a good first step.

The work described in this paper aims to detect unwanted pairwise feature interactions, where the presence of one of two features causes a change in the behavior of the other feature [8]. This raises the question of whether our approach misses many interactions, namely those involving more than two features. However, studies have shown that two-feature interactions are the most common form of feature interactions in software product lines [31, 36, 41]. A study analysing variability in 40 large software product lines also found that structural interactions exist mostly between two features [26]. Therefore, while more study is needed to generalize the results, our method's approach to handling pairwise feature interactions targets most, if not all, feature interactions [7, 8].
External threats.
While our work should apply to other domains generally, we do not yet have any information about the applicability of this approach to large, real-world software product lines, and more work is necessary to confirm and improve these initial results. Evaluation on additional software product lines in application areas beyond that presented here also is needed. Future work will seek to ascertain whether our proposed method of link prediction-based learning models for unwanted feature detection holds up and can be usefully applied to software product lines in other domains.
Related Work
Regarding the use of features as requirements engineering artifacts, Classen et al. [12] defined a "feature" as a problem-level feature, which includes a set of related requirements, specifications, and domain assumptions. They proposed a verification tool for software product lines to uncover feature interactions. Our method differs from their framework in not using formal methods and model verification tools. Instead, we use machine learning algorithms to efficiently uncover unwanted feature interactions early in the development of a proposed new product in a software product line or of a new version in a software-intensive system.

In the area of detecting unwanted feature interactions using formal methods, Apel et al. [4, 5] introduced the "feature-aware verification" method to detect feature interactions automatically using variability encoding. Our approach to dealing with the feature interaction problem differs from their studies in exploiting known unwanted feature interactions and not requiring formal methods.

Atlee, Fahrenberg, and Legay [6] proposed an approach to measure the degree to which features interact in a software product line. They used transition systems and similarity metrics to compute the degree of feature interaction in a featured transition system. Our study differs from theirs in not requiring developers to produce a formal model of the system.

In the area of using similarity measures in a software product line, Henard et al. [15] used similarity measures to prioritize test cases to decrease the number of product configurations in software product-line testing. Our study is different in that we used link prediction to detect unwanted feature interactions during requirements analysis, prior to coding or testing.

Al-Hajjaji et al. [2] suggested a similarity-based prioritization that enhances coverage of SPL test cases to detect errors in a reasonable time.
They compared the result of their algorithm with three sampling algorithms and concluded that the similarity-based prioritization algorithm could compete with them and produce the test cases faster.

Sánchez, Segura, and Ruiz-Cortés [35] investigated five prioritization criteria, including dissimilarity, to generate test cases for software product-line testing. They obtained 87% accuracy with prioritization based on dissimilarity. While we use a similarity model to detect feature interactions, their work differs from ours in that we apply similarity to individual features rather than to the entire product in a software product line, and detect feature interactions during the requirements phase rather than the testing phase.

In the area of using link prediction, Lü and Zhou [28, 42] examined the use of local similarity measures, based on node similarity, for link prediction on six real networks. They described the use of link-prediction methods to find missing links in a network and to classify partially labeled networks.

Rawashdeh and Ralescu [33] investigated structural and semantic similarity metrics in social networks. They compared different similarity metrics based on time and space complexity and highlighted the difficulty of choosing an appropriate similarity metric for link prediction. We are not aware of previous studies of link prediction for detecting missing or new feature interactions in a software product line.

We instead aim by our approach to use existing requirements artifacts, including knowledge of prior unwanted feature interactions and the software product line's feature model. We calculate similarity indexes and apply link prediction, then feed the results to machine learning algorithms to uncover unwanted feature interactions efficiently at an early stage of development of a new product in an evolving product line or of a new version in an evolving system.
An open question is whether this method also could help developers converting a legacy system to a software product line to identify unforeseen, problematic feature interactions.
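As a minimal illustration of the pipeline just described, the sketch below computes two local similarity scores (common neighbors, and Jaccard similarity [16]) over a toy feature interaction graph in which nodes are features and edges are known unwanted interactions. The feature names and the decision threshold are illustrative assumptions, not taken from the paper's benchmark; in the actual approach, such scores are fed to trained machine-learning classifiers rather than compared against a fixed cutoff.

```python
# Hypothetical sketch: local similarity scores on a feature interaction
# graph, where nodes are SPL features and edges are known unwanted
# feature interactions. Feature names and the 0.25 threshold are
# illustrative assumptions only.

def common_neighbors(adj, u, v):
    """Number of features that interact with both u and v."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Jaccard similarity of the two features' interaction neighborhoods."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

# Toy graph: each edge records a known unwanted interaction.
edges = [("weather", "call"), ("weather", "email"),
         ("call", "email"), ("email", "encrypt")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

# Score an unlinked candidate pair; a real model would feed such scores
# to a trained classifier instead of this illustrative threshold rule.
score = jaccard(adj, "weather", "encrypt")
predicted_unwanted = score >= 0.25
```

In the full approach, one such feature vector per candidate feature pair (including global metrics such as the Katz index [17]) becomes a training instance for the ML classifiers.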
Conclusion

This paper described a framework that uses machine learning algorithms along with a novel link prediction-based method to detect potential, unwanted feature interactions during the requirements phase of a new product or version in a software product line.

Representing the software product line as a feature interaction graph enabled us to use similarity-based link prediction and machine learning algorithms to detect unwanted feature interactions much earlier in the development process for a new product than current testing techniques. Results from application and evaluation on a small product line showed that the best ML algorithms achieved good accuracy (0.75 to 1) for classifying product-line feature interactions as unwanted or wanted. Directions for future work are to evaluate our link-prediction method on a larger product line and to investigate whether incorporating structural similarity measures, as in [19], improves the classification of feature interactions.

Acknowledgment
The first author thanks Professor James Bailey for a useful discussion about link prediction. The work in this paper was partially funded by US National Science Foundation Grant CCF:1513717.
References

[1] Lada A. Adamic and Eytan Adar. Friends and neighbors on the web. Social Networks, 25(3):211–230, 2003.
[2] Mustafa Al-Hajjaji, Thomas Thüm, Jens Meinicke, Malte Lochau, and Gunter Saake. Similarity-based prioritization in software product-line testing. In Proceedings of the 18th International Software Product Line Conference, Volume 1, pages 197–206. ACM, 2014.
[3] Sven Apel, Alexander von Rhein, Philipp Wendler, Armin Größlinger, and Dirk Beyer. Strategies for product-line verification: case studies and experiments. In Proceedings of the 2013 International Conference on Software Engineering, pages 482–491. IEEE Press, 2013.
[4] Sven Apel, Hendrik Speidel, Philipp Wendler, Alexander von Rhein, and Dirk Beyer. Detection of feature interactions using feature-aware verification. In ASE, pages 372–375, 2011.
[5] Sven Apel, Alexander von Rhein, Thomas Thüm, and Christian Kästner. Feature-interaction detection based on feature-based specifications. Computer Networks, 57(12), 2013.
[6] Joanne M. Atlee, Uli Fahrenberg, and Axel Legay. Measuring behaviour interactions between product-line features. In Formal Methods in Software Engineering (FormaliSE), 2015 IEEE/ACM 3rd FME Workshop on, pages 20–25. IEEE, 2015.
[7] Don Batory. Feature models, grammars, and propositional formulas. In International Conference on Software Product Lines, pages 7–20. Springer, 2005.
[8] Don Batory, Peter Höfner, and Jongwook Kim. Feature interactions, products, and composition. In ACM SIGPLAN Notices, volume 47, pages 13–22. ACM, 2011.
[9] Jan Bosch. Design and Use of Software Architectures: Adopting and Evolving a Product-Line Approach. Pearson Education, 2000.
[10] Goetz Botterweck and Andreas Pleuss. Evolution of software product lines. In Evolving Software Systems, pages 265–295. Springer, 2014.
[11] Muffy Calder, Mario Kolberg, Evan H. Magill, and Stephan Reiff-Marganiec. Feature interaction: a critical review and considered forecast. Computer Networks, 41(1):115–141, 2003.
[12] Andreas Classen, Patrick Heymans, and Pierre-Yves Schobbens. What's in a feature: A requirements engineering perspective. In International Conference on Fundamental Approaches to Software Engineering, pages 16–30. Springer, 2008.
[13] Gabor Csardi, Tamas Nepusz, et al. The igraph software package for complex network research. InterJournal, Complex Systems, 1695(5):1–9, 2006.
[14] Robert J. Hall. Fundamental nonmodularity in electronic mail. Automated Software Engineering, 12(1):41–79, 2005.
[15] Christopher Henard, Mike Papadakis, Gilles Perrouin, Jacques Klein, Patrick Heymans, and Yves Le Traon. Bypassing the combinatorial explosion: Using similarity to generate and prioritize t-wise test configurations for software product lines. TSE, (7), 2014.
[16] Paul Jaccard. A comparative study of floral distribution in a portion of the Alps and Jura. 37:547–579, 1901.
[17] Leo Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):39–43, 1953.
[18] Seyedehzahra Khoshmanesh. The role of similarity in detecting feature interaction in software product lines (SPLs). 2019.
[19] Seyedehzahra Khoshmanesh and Robyn R. Lutz. The role of similarity in detecting feature interaction in software product lines. In ISSREW, pages 286–292. IEEE, 2018.
[20] Seyedehzahra Khoshmanesh and Robyn R. Lutz. Feature similarity: A method to detect unwanted feature interactions earlier in software product lines. In International Conference on Similarity Search and Applications, pages 356–361. Springer, 2019.
[21] Seyedehzahra Khoshmanesh and Robyn R. Lutz. Leveraging feature similarity for earlier detection of unwanted feature interactions in evolving software product lines. In International Conference on Similarity Search and Applications, pages 293–307. Springer, 2019.
[22] Gueorgi Kossinets. Effects of missing data in social networks. Social Networks, 28(3):247–268, 2006.
[23] Max Kuhn, Jed Wing, Steve Weston, Andre Williams, Chris Keefer, Allan Engelhardt, Tony Cooper, Zachary Mayer, Brenton Kenkel, R Core Team, et al. Package caret. The R Journal, 2020.
[24] Michał Bojanowski and Bartosz Chro. Proximity-based methods for link prediction in graphs with R package linkprediction.
[25] David Liben-Nowell and Jon Kleinberg. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology, 58(7):1019–1031, 2007.
[26] Jörg Liebig, Sven Apel, Christian Lengauer, Christian Kästner, and Michael Schulze. An analysis of the variability in forty preprocessor-based software product lines. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, Volume 1, pages 105–114. ACM, 2010.
[27] Jing Liu, Josh Dehlinger, Hongyu Sun, and Robyn Lutz. State-based modeling to support the evolution and maintenance of safety-critical software product lines. Pages 596–608. IEEE, 2007.
[28] Linyuan Lü and Tao Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications, 390(6):1150–1170, 2011.
[29] Huda Nassar, Austin R. Benson, and David F. Gleich. Pairwise link prediction. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pages 386–393, 2019.
[30] Mark Newman, Albert-László Barabási, and Duncan J. Watts, editors. The Structure and Dynamics of Networks. Princeton University Press, 2006.
[31] Sebastian Oster, Florian Markert, and Philipp Ritter. Automated incremental pairwise testing of software product lines. In International Conference on Software Product Lines, pages 196–210. Springer, 2010.
[32] Klaus Pohl, Günter Böckle, and Frank J. van der Linden. Software Product Line Engineering: Foundations, Principles and Techniques. Springer Science & Business Media, 2005.
[33] Ahmad Rawashdeh and Anca L. Ralescu. Similarity measure for social networks: a brief survey. In MAICS, pages 153–159, 2015.
[34] Gerard Salton and Donna Harman. Information Retrieval. John Wiley and Sons Ltd., 2003.
[35] Ana B. Sánchez, Sergio Segura, and Antonio Ruiz-Cortés. A comparison of test case prioritization criteria for software product lines. In Software Testing, Verification and Validation (ICST), 2014 IEEE Seventh International Conference on, pages 41–50. IEEE, 2014.
[36] Norbert Siegmund, Sergiy S. Kolesnikov, Christian Kästner, Sven Apel, Don Batory, Marko Rosenmüller, and Gunter Saake. Predicting performance via automated feature-interaction detection. In Proceedings of the 34th International Conference on Software Engineering, pages 167–177. IEEE Press, 2012.
[37] Amit Singhal et al. Modern information retrieval: A brief overview. IEEE Data Eng. Bull., 24(4):35–43, 2001.
[38] Larissa Rocha Soares, Pierre-Yves Schobbens, Ivan do Carmo Machado, and Eduardo Santana de Almeida. Feature interaction in software product line engineering: A systematic mapping study. Information and Software Technology, 98:44–58, 2018.
[39] Jimeng Sun, Huiming Qu, Deepayan Chakrabarti, and Christos Faloutsos. Neighborhood formation and anomaly detection in bipartite graphs. In Fifth IEEE International Conference on Data Mining (ICDM'05), pages 8–pp. IEEE, 2005.
[40] Tassio Vale, Eduardo Santana de Almeida, Vander Alves, Uirá Kulesza, Nan Niu, and Ricardo de Lima. Software product lines traceability: A systematic mapping study. Inf. Softw. Technol., 84:1–18, 2017.
[41] Alan W. Williams. Determination of test configurations for pair-wise interaction coverage. In Testing of Communicating Systems, pages 59–74. Springer, 2000.
[42] Tao Zhou, Linyuan Lü, and Yi-Cheng Zhang. Predicting missing links via local information.