Machine Learning for Wireless Link Quality Estimation: A Survey
Gregor Cerar, Halil Yetgin, Mihael Mohorčič, Carolina Fortuna
Department of Communication Systems, Jožef Stefan Institute, SI-1000 Ljubljana, Slovenia; Jožef Stefan International Postgraduate School, Jamova 39, SI-1000 Ljubljana, Slovenia; Department of Electrical and Electronics Engineering, Bitlis Eren University, 13000 Bitlis, Turkey. {gregor.cerar | halil.yetgin | miha.mohorcic | carolina.fortuna}@ijs.si

Abstract—Since the emergence of wireless communication networks, a plethora of research papers have focused their attention on the quality aspects of wireless links. The analysis of the rich body of existing literature on link quality estimation using models developed from data traces indicates that the techniques used for modeling link quality estimation are becoming increasingly sophisticated. A number of recent estimators leverage Machine Learning (ML) techniques that require a sophisticated design and development process, each step of which has great potential to significantly affect the overall model performance. In this paper, we provide a comprehensive survey on link quality estimators developed from empirical data and then focus on the subset that use ML algorithms. We analyze ML-based Link Quality Estimation (LQE) models from two perspectives using performance data. Firstly, we focus on how they address quality requirements that are important from the perspective of the applications they serve. Secondly, we analyze how they approach the standard design steps commonly used in the ML community. Having analyzed the scientific body of the survey, we review existing open source datasets suitable for LQE research. Finally, we round up our survey with the lessons learned and design guidelines for ML-based LQE development and dataset collection.
Index Terms—link quality estimation, machine learning, data-driven model, reliability, reactivity, stability, computational cost, probing overhead, dataset preprocessing, feature selection, model development, wireless networks.
I. INTRODUCTION
In wireless networks, the propagation channel conditions for radio signals may vary significantly with time and space, affecting the quality of radio links [1]. In order to ensure reliable and sustainable performance in such networks, effective link quality estimation (LQE) is required by some protocols and their mechanisms, so that the radio link parameters can be adapted and an alternative or more reliable channel can be selected for wireless data transmission. To put it simply, the better the link quality, the higher the ratio of successful reception and therefore the more reliable the communication. However, challenging factors that directly affect the quality of a link, such as channel variations, complex interference patterns and transceiver hardware impairments, just to name a few, can unavoidably lead to unreliable links [2]. On the one hand, incorporating all these factors in an analytical model is infeasible, and thus such models cannot be readily adopted in realistic networks due to the highly arbitrary and dynamic nature of the propagation environment [3].

Fig. 1: The unified model of data-driven LQE comprising the physical layer (layer 1) and the link layer (layer 2).

On the other hand, effective prediction of link quality can provide great performance returns, such as improved network throughput due to reduced packet drops, prolonged network lifetime due to limited retransmissions [4], constrained route rediscovery, limited topology breakdowns and improved reliability, which reveal that the quality of a link influences other design decisions for higher layer protocols. Eventually, variations in link quality can significantly influence the overall network connectivity.
Therefore, effectively estimating or predicting the quality of a link makes it possible to select the best performing link from a set of candidates for data transmission.

More broadly, the quality of a wireless link is influenced by the design decisions taken for: i) the wireless channel, ii) the physical layer technology, and iii) the link layer, as depicted in Fig. 1. The channel used for communication can be described by several parameters, such as operating frequency, transmission medium (e.g. air, water), environment (e.g. indoor, outdoor, dense urban, suburban) as well as the relative position of the communicating parties (e.g. line-of-sight, non-line-of-sight) [1]. The physical layer technology implemented at the transmitter and receiver comprises several complex and well-engineered blocks, such as the antenna (e.g. single, multiple or array), frequency converter, analog to digital converter, synchronization and other baseband operations. The link layer is responsible for successfully delivering the data frame via a single wireless hop from transmitter to receiver; it therefore comprises frame assembly and disassembly techniques, such as attaching/detaching headers and encoding/decoding the payload, as well as mechanisms for error correction and controlling retransmissions [3]. While the quality of a link is ultimately influenced by a sequence of complex, well studied, designed and engineered processing blocks, the performance of realistic and operational systems is quantified by a relatively limited number of observations [2], the so-called link quality metrics, which are detailed later in Section II-C using Table IV.

In this paper, we refer to the wireless link abstraction as comprising the link layer and the physical layer. More explicitly, link quality refers to the quality of a wireless link as concerned with the link layer and the physical layer.
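The "ratio of successful reception" mentioned above corresponds to the Packet Reception Ratio (PRR), one of the most common link quality metrics discussed throughout this survey. A minimal sketch of how it is computed over an estimation window (the function and variable names are ours, not taken from any surveyed paper):

```python
def prr(received):
    """Packet Reception Ratio: the fraction of packets successfully
    received within an estimation window of transmission outcomes."""
    if not received:
        raise ValueError("empty estimation window")
    return sum(received) / len(received)

# A window of 8 transmission outcomes (True = received, False = lost)
window = [True, True, False, True, True, True, False, True]
print(prr(window))  # 0.75
```

A PRR of 1.0 indicates a perfectly reliable link over the observed window, while values closer to 0.0 indicate a link that should likely be avoided or adapted.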
The LQE models reviewed in this survey paper are based on physical and link layer metrics, namely all potential metrics for the evaluation of link quality that lie within the dotted rectangle of Fig. 1.

To briefly overview, the research on data-driven LQE using real measurement data started in the late 90s [5] and is still carried on with a plethora of publications in the last decade [5]–[16]. Early studies on this particular topic mainly utilized recorded traces, and the models were developed manually [5], [7]–[16]. Over the past few years, researchers have paid a lot of attention to the development of LQE using ML algorithms [6], [17]–[19].

A. Applications of ML in wireless networks
The use of ML techniques in LQE promises to significantly improve the performance of wireless networks due to the ability of the technology to process and learn from large amounts of data traces that can be collected across various technologies, topologies and mobility scenarios. These characteristics of ML techniques empower LQE to become much more agile, robust and adaptive. Additionally, a more generic and high level understanding of wireless links could be acquired with the aid of ML techniques. More explicitly, an intelligent and autonomous mechanism for analyzing wireless links of any transceiver and technology can assist in better handling of current operational aspects of increasingly heterogeneous networks. This opens up a new avenue for wireless network design and optimization [58], [59] and calls for ML techniques and algorithms to build robust, agile, resilient and flexible networks with minimum or no human intervention. A number of contributions for such mechanisms can be found in the literature; for instance, a radio spectrum observatory network is designed in [60] and [61].

The diagram provided in Fig. 2 exhibits a broad picture of what problems are being solved by ML in wireless networks and what broad classes of ML methods are being used for solving these particular problems. It can be observed that improvements on all layers of the communication network stack, from physical to application, are being proposed using classification, regression and clustering techniques. For each technique, algorithms with statistical, kernel, reinforcement, deep learning, and stochastic flavors are being used. The scope of the ML works analyzed in this paper is shaded in gray in Fig. 2 and further detailed later in Fig. 5.
For a more comprehensive and intricate analysis, [54] and [55] survey deep learning in wireless networks, and [62] surveys Artificial Intelligence (AI) techniques, including ML and symbolic reasoning, in communication networks, but without investing any particular effort in LQE.
B. Existing surveys on LQE
To contrast our study against existing survey papers on the aspects of link quality estimation, we have identified a comprehensive list of survey and tutorial papers, summarized in Table I. We have observed that there are existing discussions on the "link quality" considering various wireless networks, as outlined in Table I. However, only Baccour et al. attempted to address LQE in [2]. They highlighted distinct and sometimes contradictory observations coming from a large amount of research work on LQE based on different platforms, approaches and measurement sets. Baccour et al. provide a survey on empirical studies of low power links in wireless sensor networks without paying any special attention to procedures using ML techniques. In this survey paper, we complement the aforementioned survey by analyzing the rich body of existing and recent literature on link quality estimation with a focus on model development from data traces using ML techniques. We analyze ML-based LQE from two complementary perspectives: application requirements and the employed design process. First, we focus on how they address quality requirements that are important from the perspective of the applications they serve in Section III. Second, we analyze how they approach the standard design steps commonly used in the ML community in Section IV. Moreover, we also review publicly available data traces that are most suitable for LQE research.

C. Contributions
Considering recent contributions on LQE using ML techniques, it can be challenging to reveal the relationship between design choices and reported results. This is mainly because each model relying on ML assumes a complex development process [63], [64]. Each step of this process has great potential to significantly affect the overall performance of the model, and hence these steps and their associated design choices must be well understood and carefully considered. Additionally, to provide the means for fair comparison between existing and future approaches, it is of critical importance to be able to reproduce the LQE model development process and results [65]–[67], which indeed also requires open sharing of data traces.

The major contributions of this paper can be summarized as follows.
• We provide a comprehensive survey of the existing literature on LQE models developed from data traces (this survey is also a more recent contribution on link quality estimation models than [2] from 2012; besides, we focus our attention on data-driven LQE models with ML techniques). We
analyze the state of the art from several perspectives, including target technology and standards, purpose of LQE, input metrics, models utilized for LQE, output of LQE, evaluation and reproducibility. The survey reveals that the complexity of LQE models is increasing and that comparing LQE models against each other is not always feasible.
• We provide a comprehensive and quantitative analysis

Fig. 2: Layered taxonomy of machine learning solutions for wireless communication networks. The taxonomy groups the solutions as follows:
• Physical layer: localization via regression (kernel methods [20], deep learning [21], statistical [22]); channel equalization via regression (deep learning [23], statistical [24]) and clustering; modulation and coding via classification (deep learning [25], [26], kernel methods [27]); detection algorithms via regression (deep learning [28]); channel modeling via regression (deep learning [29], [30], kernel methods [31], statistical [32]) and clustering (kernel methods [33]).
• Link layer: access control via classification (reinforcement learning [34]); rate adaptation via classification (stochastic [35]); fault identification via classification (statistical [36], kernel methods [36]); frame size optimization via regression (neural networks [37]); link quality estimation via classification (statistical [14], [17], [38], [39], deep learning [40]) and regression (statistical [9]).
• Network layer: traffic engineering via clustering (statistical [41]); protocol identification via clustering (statistical [42]) and classification (statistical [43]); routing optimization via regression (reinforcement learning [44]).
• Application layer: QoE via classification (kernel methods [45], statistical [45]); anomaly detection via classification (kernel methods [46]); service optimization via classification (deep learning [47]).
TABLE I: Existing surveys and tutorials relating to the terms that can define the quality of a link in the state-of-the-art literature. (Each entry lists: publication and year; summary with particular focus; related context in the relevant publication; its related section.)
• [2], 2012: A survey on empirical studies of low power links in wireless sensor networks as well as on LQE, without paying any special attention to procedures using ML techniques. Related context: characteristics of low-power links and link quality estimation (Section V).
• [48], 2012: A tutorial on improving the reliability of wireless communication links using cognitive radios. Related context: failures in wireless networks (Section II-B).
• [49], 2013: A survey of the techniques and protocols to handle mobility in wireless sensor networks. Related context: prediction of link quality for mobility estimation (Section IV).
• [50], 2014: A survey on fair resource sharing/allocation in wireless networks. Related context: the impact of link quality on packet delay (Section III-B).
• [51], 2016: A survey of communication related issues in unmanned aerial vehicle communication networks. Related context: dynamic topology changes and time-varying links (Sections I-B/I-C).
• [52], 2018: A survey on link- and path-level reliable data transfer schemes in underwater acoustic networks. Related context: channel quality control on the physical layer, as shown in its Table II (Section III).
• [53], 2018: A tutorial on key technologies of cloud access radio network optical fronthaul. Related context: link performance of radio over fiber transport schemes, illustrated in its Table X (Section VII-E).
• [54], 2018: A survey on deep learning applications for different layers of wireless networks. Related context: a brief discussion on deep learning for link evaluation (Section IV-C).
• [55], 2019: A survey on deep learning techniques applied to mobile and wireless networking research. Related context: deep learning driven network control and network-level mobile data analysis (Sections I/VI).
• [56], 2019: A survey of effective capacity models used in various wireless networks. Related context: a brief discussion on selection of better quality links (Section VII-B).
• [57], 2019: A survey of current issues and machine learning solutions for massive machine type communications in ultra-dense cellular Internet of things networks. Related context: learning link quality and reliability to adapt communication parameters (Section VI-A).
• This survey: A comprehensive survey of data-driven LQE models, application quality aspects regarding the development of ML-based LQE models, the ML design process for LQE models and publicly available trace-sets suitable for LQE research. Additionally, we provide comprehensive performance data for wireless link quality classification and for design decisions taken throughout LQE model development. Finally, we also put forward a comprehensive lessons learned section for the development of ML-based LQE models as well as design guidelines for ML-based LQE development and dataset collection. Related context: data-driven link quality estimation models (all sections).

of wireless link quality classification by extracting the approximated per-class performance from the reported results of the literature, in order to enable readers to readily distinguish the performance gaps at a glimpse.
• We analyze the performance of candidate classification-based LQEs and reveal that autoencoders, tree based methods and SVMs tend to consistently perform better than logistic regression, naive Bayes and artificial neural networks, whereas the non-ML TRIANGLE estimator performs considerably well on two of the five classes included in the analysis, i.e., very good and good quality links.
• We identify five quality aspects regarding the development of an ML-based LQE that are important from the application perspective: reliability, adaptivity/reactivity, stability, computational cost and probing overhead. We provide insightful analyses on how ML-based LQE models address these five quality aspects considering the use of ML methods for a diverse set of specific problems.
• Starting from the standard ML design process, we investigate and quantify the design decisions that the existing ML-based LQE models considered and provide insights into their potential impact on the final performance of the LQE using the accuracy as well as the F1 score and precision vs. recall metrics.
• We survey publicly available datasets that are most suitable for LQE research and review their available features with a comparative analysis.
• We provide an elaborated lessons learned section for the development of ML-based LQE models. Based on the lessons learned from this survey paper, we derive generic design guidelines recommended for the industry and research community to follow in order to effectively design the development process and collect trace-sets for the sake of LQE research.

The rest of this paper is structured as portrayed in Fig. 3. Section II provides a comprehensive survey of the state-of-the-art literature on LQE models built from data traces.
Section III and Section IV analyze ML-based LQE models from the perspective of application requirements and of the design process, respectively. Section V then provides a comprehensive analysis of the open datasets suitable for LQE research. As a result of our extensive survey, Section VI provides lessons learned and design guidelines, while Section VII finally concludes the paper and elaborates on future research directions.

II. OVERVIEW OF DATA-DRIVEN LINK QUALITY ESTIMATION
With the emergence and spread of wireless technologies in the early 90s [71], it became clear that packet delivery in wireless networks was inferior to that of wired networks [5]. At the time of the experiment conducted in [5], the wireless transmission medium was observed to be prone to unduly larger packet losses than wired transmission mediums. Up until today, roughly speaking, numerous sophisticated communication techniques, including modulation and coding schemes, channel access methods, error detection and correction methods, antenna arrays, spectrum management, high frequency communications and so on, have emerged. As part of this combination of revolutionary techniques, a diverse number of estimation models for the assessment of link quality, based on actual data traces in addition to or instead of simulated models, have been proposed in the literature.
Fig. 3: Structure overview of this survey paper.

The research of data-driven LQE based on measurement data reaches back into the late 90s [5] and has gained momentum particularly in the last decade [6]. As summarized in the timeline depicted in Fig. 4, early attempts at LQE research mainly hinged on recorded traces, statistical approaches and manually developed models [5], [7]–[16]. On the other hand, only after 2010 did researchers start paying great attention to the development of LQE models using ML algorithms [17]–[19].

To date, many analytical and statistical models have been proposed to mitigate losses and improve the performance of wireless communication. These models include channel models, radio propagation models, modulation/demodulation and encoding/decoding schemes, error correction codes, and multi-antenna systems, just to name a few. Such models essentially lead to model-driven link quality estimators, which calculate predetermined variables based on the communication parameters of the associated environment. However, their one significant shortcoming is that they abstract the real environment, and thus consider only a subset of the real phenomena. Data-driven models, on the other hand, rely on actual measured data that capture the real phenomena. The data are then used to fit a model that best approximates the underlying distribution. As can be readily seen in Fig. 4, up until 2010, statistical approaches were the favored tools for LQE research. From then on, as in other research areas of wireless communication portrayed in Fig. 2, ML-based models replaced the conventional approaches and became the preferred tool for LQE research.

Empirical observation of wireless link traffic is a crucial part of data-driven LQE. An observation of link quality metrics within a certain estimation window, e.g. a time interval or a discrete number of events, allows for constructing different varieties of data-driven link quality estimators.
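The estimation-window idea can be sketched as follows: per-packet observations are aggregated over fixed-size windows into features that a data-driven model can be fitted on. The feature choice (mean RSSI and PRR per window) and all names below are illustrative assumptions, not taken from any particular surveyed estimator:

```python
def window_features(trace, window_size):
    """Aggregate a per-packet trace into per-window features.

    trace: list of (rssi_dbm, received) tuples in arrival order.
    Returns one (mean_rssi, prr) feature pair per full window.
    """
    features = []
    for i in range(0, len(trace) - window_size + 1, window_size):
        window = trace[i:i + window_size]
        mean_rssi = sum(rssi for rssi, _ in window) / window_size
        prr = sum(ok for _, ok in window) / window_size
        features.append((mean_rssi, prr))
    return features

# A toy trace of 8 packets, aggregated into two windows of 4 packets
trace = [(-70, True), (-72, True), (-90, False), (-68, True),
         (-71, True), (-69, True), (-88, False), (-73, True)]
print(window_features(trace, 4))  # [(-75.0, 0.75), (-75.25, 0.75)]
```

The window size is itself a design decision: short windows make the estimator reactive but noisy, long windows make it stable but slow to adapt, a trade-off discussed later in this survey.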
However, there are a few drawbacks of data-driven approaches that need to be taken into account. Since the ultimate model strictly depends on the recorded data traces, the trace collection has to be carefully designed in a way that records adequate information about the underlying distribution of the phenomena. If sufficient measurements of the distribution can be captured, then it is possible to automatically build a model that can approximate that particular distribution. Data-driven LQE models are in no way meant to fully replace or supersede model-driven estimators, but to complement them. It is certainly possible to incorporate a model-driven estimator into a data-driven one as input data.

To some extent, different varieties of data-driven metrics and estimators were studied in [2], where the authors made three independent distinctions among link quality estimators. The first distinction, based on the estimator's origin, i.e., the way the estimates were obtained, separates hardware-based from software-based estimators; the software-based estimators are further split into Packet Reception Ratio (PRR)-based, Required Number of Packets (RNP)-based, and score-based subgroups. The second distinction is based on the mode in which data collection was done, which can be passive, active and/or hybrid, depending on whether dummy packet exchange was triggered by the estimator. The third distinction is based on which side of the communication link was actively involved: LQE metrics can be gathered either on the receiver, the transmitter or both sides.

Going beyond [2], Tables II and III provide a comprehensive summary of the most related publications that leverage a data-driven approach for LQE research. All the studies summarized in Tables II and III rely on real network data traces recorded from actual devices. The first column in Tables II and III contains the title, reference and year of publication.
The second column provides the testbed, hardware and technology used in each publication, whereas the third column lists the objectives of these publications with respect to the LQE approach. Columns four, five and six focus on the characteristics of the estimators, particularly on their corresponding input(s), model and output. The last two columns summarize statistical aspects of the data traces and the public availability of the trace-sets for reproducibility, respectively.
Fig. 4: Timeline of the most prominent models in the evolution of wireless LQE, from traditional approaches to machine learning:
• Link loss of pre-WiFi networks using a Markov model [5].
• Transport layer protocol that uses link loss notification [7].
• Improved multi-hop routing based on a link quality model [8].
• Four bit cross-layer (physical, link and network) information based LQE [10].
• Triangle, a PRR, LQI and SNR based LQE [12].
• Fuzzy logic based LQE [13].
• Bayes, regression and neural network LQE classification [17], [38].
• Topological features + SVM, k-NN, regression trees, Gaussian regression LQE [68].
• Reinforcement learning-based LQE [69].
• LQE for a mmWave base station handover system [70].
• Satellite image + network metrics LQE using SVM [6].
A. Technologies and standards
As outlined in the second column of Tables II and III, earlier studies on LQE were performed on WaveLAN [5], [7], a precursor of modern Wi-Fi. The study in [5] aimed to characterize the loss behavior of the proprietary AT&T WaveLAN. It used packet traces with various configurations for the transmission rate, packet size, distance and the corresponding packet error rate. Then, the authors built a two-state Markov model of the link behavior. The same model was then utilized in [7] to estimate the quality of wireless links in the interest of improving Transmission Control Protocol (TCP) congestion performance. More recently, [70], [72] used the IEEE 802.11 standard in their studies for throughput and online link quality estimators.

Later on, the majority of publications related to LQE focused on wireless sensor networks relying on the IEEE 802.15.4 standard, and only a few targeted other types of wireless networks, such as Wi-Fi (IEEE 802.11) or Bluetooth (IEEE 802.15.1). This can be explained by the fact that IEEE 802.15.4-based wireless sensor networks are relatively cheaper to deploy and maintain. Perhaps the first such larger testbed was available at the University of Berkeley [8], using MicaZ nodes and TinyOS [73], an open source operating system for constrained devices. Other hardware platforms, such as TelosB and TMote, and operating systems, e.g. Contiki, have emerged and enabled researchers to further experiment with improving the performance of single and multi-hop communications for wireless networks composed of battery-powered devices.

Finally, one recent contribution focuses on LoRa technology, a type of Low Power Wide Area Network (LPWAN), for estimating the quality of links, thereby aiming to improve the coverage of the technology [6].

Whereas earlier research on LQE leveraged proprietary technologies [5], wireless sensor networks utilized relatively low cost hardware and open source software, and therefore enabled a broader effort from the research community.
This resulted in a large wave of research focusing on ad-hoc, mesh and multihop communications [8], [10], [13]–[17], [19], [38], [40], [74], all of which rely on the estimation of link quality. The nodes implementing the aforementioned technologies are still being maintained in various university testbeds.
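The two-state Markov link model used in the early trace-based studies [5], [7], [11] can be fitted directly by counting state transitions in a recorded delivery trace. A minimal sketch under the common Gilbert-Elliott convention that a "good" state corresponds to delivered packets and a "bad" state to lost ones (the function name and the toy trace are ours):

```python
def fit_two_state(trace):
    """Estimate P(good -> bad) and P(bad -> good) from a 0/1 delivery
    trace by counting transitions between consecutive outcomes."""
    counts = {("g", "g"): 0, ("g", "b"): 0, ("b", "g"): 0, ("b", "b"): 0}
    for prev, cur in zip(trace, trace[1:]):
        counts[("g" if prev else "b", "g" if cur else "b")] += 1
    from_good = counts[("g", "g")] + counts[("g", "b")]
    from_bad = counts[("b", "g")] + counts[("b", "b")]
    p_gb = counts[("g", "b")] / from_good if from_good else 0.0
    p_bg = counts[("b", "g")] / from_bad if from_bad else 0.0
    return p_gb, p_bg

# Toy delivery trace: 1 = packet received, 0 = packet lost
trace = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1]
print(fit_two_state(trace))
```

The two estimated transition probabilities fully specify the chain, so the fitted model can then be used to predict how likely an error burst is to start or to persist, which is exactly the output reported for [5] in Table II.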
B. Purpose of the LQE
With respect to the research goal summarized in the third column of Tables II and III, the surveyed papers can be categorized into two broad groups. The goal of the first group was to improve the performance of a protocol or process. The goal of the second group of papers was to propose a new or improve an existing link quality estimator. For this class of papers, any protocol improvement in the evaluation process was secondary.
1) LQE for protocol performance improvement:
The authors of [5], [7] investigated TCP performance improvement, whereas others focused on routing protocol performance. This group of papers proposed novel link quality estimators as an intermediate step towards achieving their goal, e.g. performance improvement of TCP, routing optimization and so on.
TABLE II: Existing work on link quality estimation using real network data traces (Part 1 of 2)
(Each entry lists: title, reference and year; technology; goal; input; model; output; data; reproducibility.)
• A trace-based approach for modeling wireless channel behavior [5], 1996. Tech: WaveLAN, BARWAN testbed, BSD 2.1. Goal: maximize throughput, channel error model. Input: SNR, signal quality, throughput, PRR. Model: improved two-state Markov model. Output: probability of error to occur and persist. Data: not specified (< ≈600 000 packets; 8 packets/s, 200 packets/P_Tx). Reproducible: No*.
• (4B) Four-bit wireless link estimation [10], 2007. Tech: Intel Mirage: 85x MicaZ; USC TutorNet: 94x TelosB; IEEE 802.15.4, TinyOS. Goal: improve routing table management. Input: LQI, PRR, broadcast, ACK count. Model: construct 4-bit score of link state. Output: estimated link quality. Data: Mirage: N.A., 40-69 min/experiment; TutorNet: N.A., 3-12 h/experiment. Reproducible: No*.
• A Kalman filter-based link quality estimation scheme for wireless sensor networks [9], 2007. Tech: TelosB, IEEE 802.15.4. Goal: PRR estimation. Input: RSSI, noise floor. Model: Kalman filter + SNR to PRR mapping. Output: PRR estimation. Data: 25 200 000 (500 samples/s, 14 h). Reproducible: No.
• PRR is not enough [11], 2008. Tech: IEEE 802.11, IEEE 802.15.4. Goal: link state estimation. Input: PRR. Model: Gilbert-Elliott model (2-state Markov process) with good and bad states. Output: link quality transition probability. Data: Rutgers and Mirage trace-sets. Reproducible: Yes.
• The triangle metric: fast link quality estimation for mobile wireless sensor networks [12], 2010. Tech: Tmote Sky, Sentilla JCreate, IEEE 802.15.4, Contiki OS. Goal: new LQE. Input: RSSI, noise floor, LQI. Model: Pythagorean equation maps to distance from the origin (hypotenuse). Output: estimated link quality as very good, good, average or bad. Data: 30 000 + N.A. (64 packets/s, all channels, unicast). Reproducible: No.
• F-LQE: A fuzzy link quality estimator for wireless sensor networks, [13] 2010, [75] 2011. Tech: RadiaLE testbed, 49x TelosB, IEEE 802.15.4, TinyOS. Goal: link quality estimation, improve routing. Input: PRR. Model: fuzzy logic maps current to estimated link quality. Output: binary high/low-quality (HQ/LQ) link estimation. Data: N.A. (bursts, packet sizes, channels 20-26). Reproducible: No*.
• Foresee (4C): Wireless link prediction using link features [17], 2011. Tech: 54x Tmote (local), 180x Tmote Sky (Motelab), IEEE 802.15.4. Goal: improve routing. Input: PRR, RSSI, SNR, LQI. Model: logistic regression model. Output: probability of receiving next packet. Data: 80 000 + 80 000 noise floor (≈10 packets/s). Reproducible: No*.
• Fuzzy logic-based multidimensional link quality estimation for multihop wireless sensor networks [14], 2013. Tech: (local) 15x TelosB, TinyOS, IEEE 802.15.4. Goal: improve routing, minimize topology changes. Input: PRR. Model: fuzzy logic link quality estimator. Output: binary high/low-quality link estimation. Data: N.A. (20 min/experiment, 12 h). Reproducible: No.
• Temporal adaptive link quality prediction with online learning, [38] 2012, [18] 2014. Tech: Motelab, Indriya and (local) 54x Tmote testbed, IEEE 802.15.4. Goal: link quality estimation, improve routing. Input: PRR, RSSI, SNR, LQI. Model: logistic regression with SGD and s-ALAP adaptive learning rate. Output: binary, estimates if link quality is above a desired threshold. Data: 480 000 (30 bytes size, 6 000 per experiment, 10/s), Rutgers and Colorado trace-sets. Reproducible: No [38], Yes [18].
• Low-Power link quality estimation in smart grid environments [15], 2015. Tech: IEEE 802.15.4. Goal: improve routing, LQE reactivity. Input: RNP, SNR, PRR. Model: optimized F-LQE [13] with better reactivity. Output: binary high/low-quality link estimation. Data: N.A., 500 kV substation environment data, TOSSIM 2 simulator. Reproducible: No.
• Time series analysis to predict link quality of wireless community networks [68], 2015. Tech: conventional routers, IEEE 802.15.4, IEEE 802.11, AX.25 (FunkFeuer mesh network). Goal: link quality estimation, regression, clustering, time-series analysis. Input: LQ, NLQ, ETX. Model: SVM, k-nearest neighbor, regression trees, Gaussian process for regression. Output: predicted LQ value for different window sizes. Data: N.A. (404 nodes, 2 095 links, 7 days of data). Reproducible: No*.
• Machine-learning based channel quality and stability estimation for stream-based multichannel wireless sensor networks [76], 2016. Tech: CC2420, IEEE 802.15.4, Matlab simulation. Goal: evaluation of a new algorithm with two possible extensions. Input: RSSI, LQI, channel rank, channel. Model: normal equation-based channel quality prediction, weighted input extension, stability extension. Output: channel quality estimation based on a 3-class estimator. Data: simulation. Reproducible: Yes.
• WNN-LQE: Wavelet-neural-network-based link quality estimation for smart grid WSNs [19], 2017. Tech: 10x CC2530 WSNs, IEEE 802.15.4. Goal: improve routing, estimate PRR range. Input: SNR. Model: wavelet-neural-network-based link quality estimator. Output: upper and lower bound of confidence interval for PRR. Data: 2 500 (20 bytes size, 3.33 per second). Reproducible: No.
Note: Asterisk (*) indicates that the experiment was performed on a public testbed, but no data is available.
TABLE III: Existing work on link quality estimation using real network data traces (Part 2 of 2)
Title | Tech. | Goal | Input | Model | Output | Data | Reproduce
A reinforcement learning-based link quality estimation strategy for RPL and its impact on topology management [69], 2017 | Sim.: Cooja simulator (Contiki 3.x); Exp.: 23x TelosB, CC2420, IEEE 802.15.4 | Improve RPL protocol | PER, RSSI, energy consumption | Unsupervised ML | PRR estimation | Sim.: ∞; Exp.: N.A., 178 links, mobile nodes (0.5 m/s), University of Pisa | Sim.: Yes; Exp.: No
Research on Link Quality Estimation Mechanism for Wireless Sensor Networks Based on Support Vector Machine [74], 2017 | 2x TelosB, CC2420, IEEE 802.15.4, TinyOS 2.x | Link quality estimation, comparison | RSSI, LQI, PRR | SVM classifier | Classification, 5 classes | 121 data points | No
Machine-learning-based throughput estimation using images for mmWave communications [70], 2017 | 2x IEEE 802.11ad @ 60 GHz (mmWave), RGB-D camera (Kinect) | Throughput estimation, obstacle detection, comm. handover w/o control frames | Throughput, depth value (Kinect) | Online adaptive regularization of weight vectors (AROW) | Regression, throughput estimation | N.A. | No
Quick and efficient link quality estimation in wireless sensors networks [16], 2018 | Grenoble testbed FIT-IoT, 28x AT86RF231, IEEE 802.15.4 | Analysis of LQI, fast decisions, improve routing | LQI | Classification based on arbitrary values | Classify link as good, uncertain or weak | N.A. (2 000 per link, 16 channels) | No*
Online ML algorithms to predict link quality in community wireless mesh networks [72], 2018 | Conventional routers, IEEE 802.15.4, IEEE 802.11, AX.25 (FunkFeuer mesh network) | Link quality estimation, online regression, compares online ML algorithms | LQ, NLQ, ETX | Online perceptrons, online regression trees, fast incremental model trees, adaptive model rules | Metric estimation, regression | N.A. | No
… [40], 2019 | … | … | SNR, LQI, RSSI, PRR | Neural network (SAE) | Classification, 5 classes (very bad, bad, common, good, very good) | N.A., interior corridors, grove, parking lots, road | No
Automated Estimation of Link Quality for LoRa: A Remote Sensing Approach [6], 2019 | Dragino LoRa 1.3 (RF96 chip), LoRa | Link quality estimation, environment classification | Node/Gateway position, time-stamp, RSSI, SNR, multispectral aerial images | SVM classification of LoRa coverage | Mapping LoRa coverage onto geographical map | 8 642 samples, 23 sites, 1 packet per 40 s, Delft (NL) | No
On Designing a Machine Learning Based Wireless Link Quality Classifier [39], 2020 | 29x IEEE 802.11 | Link quality prediction, importance of preprocessing | RSSI | Logistic regression, SVM, decision trees, random forest, multi-layer perceptron | Classification of future link state as good, intermediate or bad | Rutgers dataset | Yes
Note: Asterisk (*) indicates that the experiment was performed on a public testbed, but no data is available.
One of the earliest publications from this group is [8], which aimed at improving the reactivity of routing tables in constrained devices, such as sensor nodes. They collected traces of transmissions for nodes located at various distances with respect to each other. Then, they computed reception probabilities as a function of distance and evaluated a number of existing link estimation metrics. They also proposed a new link estimation metric called Window Mean with an Exponentially Weighted Moving Average (WMEWMA) and showed an improvement in network performance as a result of more appropriate routing table updates. The improvements were shown both in simulations and in experiments. This study was also among the earliest to introduce the three different grade regions of wireless links, i.e., good, intermediate and bad.

Later, [10] noticed that by considering additional metrics alongside WMEWMA, also from higher levels of the protocol stack, link estimation could be better coupled with data traffic. Therefore, they introduced a new estimator referred to as Four-Bit (4B), which combines information from the physical (PRR, Link Quality Indicator (LQI)), link (ACK count) and network layers (routing), and demonstrated that it performs better than the baseline they chose for the evaluation.

In [13], the authors developed a new link quality estimator named Fuzzy-logic based LQE (F-LQE), which is based on fuzzy logic and exploits the average values, stability and asymmetry properties of PRR and Signal-to-Noise Ratio (SNR). As for the output, the model classifies links as high-quality (HQ) or low-quality (LQ). The same authors compared F-LQE against PRR, Expected Transmission count (ETX) [77], RNP [78] and 4B [10] on the RadiaLE testbed [75]. The comparison of the metrics was performed using different scenarios including various data burst lengths, transmission powers, sudden link degradation and short bursts. Among their findings, they showed that PRR, WMEWMA and ETX, which are PRR-based link quality estimators, overestimate the link quality, while RNP and 4B underestimate it. The authors of [75] demonstrated that F-LQE performed better estimation than the other estimators compared.

The authors of [14] used fuzzy logic and proposed a Fuzzy-logic Link Indicator (FLI) for link quality estimation. The FLI model uses PRR, the coefficient of variance of PRR and a quantitative description of packet loss bursts, which are gathered independently, while the previous F-LQE [13] requires information sharing of PRR. FLI was evaluated in a testbed for 12 hours against 4B [10], and it was reported to perform better.

Foresee (4C) [17] is the first metric from this group focused on protocol improvement that introduced statistical ML techniques. The authors used Received Signal Strength Indicator (RSSI), SNR, LQI, WMEWMA and smoothed PRR as input features for the models. They trained three ML models based on naïve Bayes, neural networks and logistic regression. TALENT [38] then improved on 4C by introducing an adaptive learning rate.

More recently, [69] proposed an enhancement to the RPL protocol, which is used in lossy wireless networks. This backward-compatible improvement (mRPL) for mobile scenarios introduces asynchronous transmission of probes, which observe link quality and trigger the appropriate action.
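To make the windowing-plus-smoothing idea behind WMEWMA concrete, the following is a minimal sketch; the window size and smoothing factor are illustrative choices, not the values tuned in [8]:

```python
def wmewma(packet_events, window=10, alpha=0.6):
    """WMEWMA-style link quality estimate from a packet trace.

    packet_events: iterable of 1 (packet received) / 0 (packet lost)
    per expected packet. Window size and alpha are illustrative
    placeholders; the original work selects its own values.
    """
    estimate = None
    buf = []
    for event in packet_events:
        buf.append(event)
        if len(buf) == window:
            window_prr = sum(buf) / window  # mean PRR over the window
            buf.clear()
            if estimate is None:
                estimate = window_prr
            else:
                # exponentially weighted moving average of window means
                estimate = alpha * estimate + (1 - alpha) * window_prr
    return estimate
```

The smoothing damps short bursts of loss, which is precisely what makes the metric more stable than raw PRR for routing table updates.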
2) New or improved link quality estimator:
Srinivasan et al. [11] proposed a two-state model with good and bad states, and four transition probabilities between the states, to improve on the existing WMEWMA [8] and 4B [10]. Then, Senel et al. [9] took a different approach and developed a new estimator that predicts the likelihood of a successful packet reception. Besides, Boano et al. [12] introduced the TRIANGLE metric, which uses the Pythagorean equation and computes the distance between the instant SNR and LQI. This study identifies four different link quality grades including very good, good, average and bad links. Some of the classifiers propose a five-class model [40], [74] and a three-class model [16], [39] for LQE research. Other LQE models leverage regression rather than classification in order to generate a continuous-valued estimate of the link [6], [70], [72].

C. Input metrics for LQE models
With respect to the input metrics used for estimating the quality of a link, summarized in the fourth column of Tables II and III, we distinguish between single and multiple metric approaches. Single metric approaches use a one-dimensional vector, while multiple metric approaches use a multidimensional vector as input for developing a model.
Single metric input approaches have a number of advantages. The trace-set is smaller and thus often easier to collect, the model typically requires less computational power, and as shown in [17] they can be more straightforward to implement, especially on constrained devices. However, by only analyzing and relying on a single measured variable, such as RSSI, important information might be left out. For this reason, it is better to collect traces with several, possibly uncorrelated metrics, each of them able to contribute meaningful information to the final model. A good example of the latter is using RSSI and spectral images. The surveyed estimators based on a single input metric appear in [8], [11], [16], [19], [39], whereas the estimators based on multiple metrics are considered in [5]–[7], [9], [10], [12]–[15], [17], [38], [40], [69], [70], [72], [74].

One can readily observe from the fourth column of Tables II and III that the most widely used metric, either directly or indirectly, is the PRR, which is used as model input in [5], [8]–[11], [13]–[15], [17], [38]. Other input metrics derived from PRR values are also used in [9], [12]. Looking at the frequency of use, PRR is followed by the hardware metrics, i.e., RSSI, LQI and SNR, in [9], [10], [12], [16], [17], [19], [38]. Other features are less common and tend to appear only in single papers.

Table IV summarizes metrics that can be used for measuring the quality of the link. Every metric from the first column of the table can also be used as input for another new metric. The so-called hardware-based metrics [2], such as RSSI, LQI, SNR and Bit Error Rate (BER), are directly produced by the transceivers, and they also depend on underlying metrics, such as RSS, SNR, noise floor, implementation artifacts and vendor. The so-called software-based metrics are usually computed based on a blend of hardware and software metrics.
It is clear from the first and the last columns of Table IV that the number of independent input variables is limited. However, additional inputs have recently been taken into account. Topological features assuming cross-layer information exchange, where the LQE is informed of node degree, hop count, strength and distance, are considered in [68], while [70] and [6] have shown that imaging data can be used as an alternative source of input for LQE models, as outlined at the bottom of Table IV.

In addition to finding new sources of data, a challenging task would be to analyze a large set of measurements in various environments and settings, from a large number of manufacturers, in order to understand how measurements vary across different technologies and differ across various implementations within the same technology, and to derive the truly effective metrics for an efficient development of LQE models.
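As an illustration of checking whether candidate input metrics actually carry complementary information before combining them, the sketch below computes pairwise correlations on a synthetic trace; the distributions are invented for illustration and do not come from any surveyed trace-set:

```python
import numpy as np

# Hypothetical trace: one row per probe packet, columns are candidate
# input metrics (values are synthetic, purely for illustration).
rng = np.random.default_rng(0)
rssi = rng.normal(-70, 5, 1000)            # dBm-like values
snr = rssi + 95 + rng.normal(0, 1, 1000)   # strongly tied to RSSI here
lqi = rng.uniform(40, 110, 1000)           # independent in this example

X = np.column_stack([rssi, snr, lqi])
corr = np.corrcoef(X, rowvar=False)        # pairwise Pearson correlations

# A high |corr[i, j]| (e.g. > 0.9) suggests metrics i and j carry
# largely redundant information for the LQE model, so keeping both
# adds collection and computation cost without much new signal.
```

In this synthetic case SNR is nearly a shifted copy of RSSI, so a multi-metric model would gain little from including both, whereas LQI contributes an independent dimension.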
D. Models for LQE
Considering the models used for developing LQE, summarized in the fifth column of Tables II and III, the publications surveyed can be distinguished as those using statistical models [5], [7]–[9], [11], rule and/or threshold based models [10], [12], [16], fuzzy ML models [13]–[15], [75], statistical ML models [6], [17], [18], [38], [39], [68], [70], [72], [74], [76], reinforcement learning models [69] and deep learning models [19], [40]. For readers' convenience, the corresponding taxonomy is portrayed in Fig. 5.

With regard to the statistical models, the authors of [5], [7] manually derived error probability models from traces of data using statistical methods. Additionally, Woo et al. [8] derived an exponentially weighted PRR by fitting a curve to an empirical distribution, whereas Senel et al. [9] first used a Kalman filter to model the correct value of the RSS, then extracted the noise floor from it to obtain the SNR, and finally leveraged a pre-calibrated table to map the SNR to a value of the Packet Success Ratio (PSR). Srinivasan et al. [11] used the Gilbert-Elliot model, a two-state Markov process with good and bad states and four transition probabilities. The output of the model is the channel memory parameter that describes the "burstiness" of a link.

TABLE IV: Metrics that can be used to measure the quality of a link.

Metric | Category | Related base-metric(s)
RSSI | Hardware-based | RSS, SNR
LQI | Hardware-based | Vendor-specific
SNR | Hardware-based | RSS, noise floor
BER | Hardware-based | –
PRR | Software-based (PRR-based) | PER
WMEWMA | Software-based (PRR-based) | PER, PRR
4B | Software-based (score-based) | LQI, PRR, ACK, broadcast
LQ, NLQ | Software-based | –
ETX | Software-based (PRR-based) | LQ, NLQ
4C | Software-based (score-based) | LQI, PRR, SNR, RSSI
TRIANGLE | Software-based (score-based) | SNR, LQI
Image-based | Image-based | –
Topological | Topological | –

Considering the rule based models, 4B [10] constructs a largely rule-based model of the channel that depends on the values of the four input metrics, whereas Boano et al. [12] formulate their metric using geometric rules. First, Boano et al. [12] computed the distance between the instant SNR and LQI vectors in a 2D space. Then, they used three empirically set thresholds to identify four different link quality grades: very good, good, average or bad. Finally, [16] manually rules out good and bad links based on LQI values and then, for the remaining links, computes additional statistics that are used to determine their quality with respect to some thresholds.

The first fuzzy model, F-LQE [13], uses four input metrics, namely WMEWMA (an averaged PRR value), the stability factor of PRR, the asymmetry level of PRR and the average SNR, together with fuzzy logic, to estimate the two-class link quality. Rekik et al. [15] adapt F-LQE to smart grid environments with higher than normal electromagnetic radiation, in particular 50 Hz noise and acoustic noise. Finally, Guo et al. [14] proposed a different two-class fuzzy model based on two input metrics, namely the coefficient of variance of PRR and a quantitative description of packet loss bursts, which are gathered independently and differ from the ones used for F-LQE.

One of the earliest statistical ML models, the so-called 4C, was proposed by Liu et al. [17], where 4C amalgamated RSSI, SNR, LQI, WMEWMA and smoothed PRR to train three ML models based on naïve Bayes, neural networks and logistic regression algorithms. Then, Liu et al. [18], [38] introduced TALENT, an online ML approach, where the model built on each device adapts to each new data point as opposed to being precomputed on a server.
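A minimal sketch of the kind of online logistic-regression update that such estimators build on is shown below; a fixed learning rate stands in for an adaptive schedule such as s-ALAP, and the choice of feature vector is an illustrative assumption:

```python
import math

def sgd_logreg_step(w, b, x, y, lr=0.1):
    """One online SGD update of a logistic-regression link quality model.

    x: feature vector (e.g. [RSSI, SNR, LQI, WMEWMA] -- an illustrative
    choice), y: 1 if the link was above the quality threshold, else 0.
    A production estimator would use an adaptive learning rate instead
    of the fixed `lr` used here.
    """
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))   # predicted P(link above threshold)
    grad = p - y                      # gradient of the log loss w.r.t. z
    w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
    b = b - lr * grad
    return w, b, p
```

Because each update touches only the current sample, the model can run on the constrained device itself and adapt as the link evolves, which is the key difference from a precomputed offline model.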
TALENT yields a binary output (i.e., whether PRR is above the pre-defined threshold), while 4C produces a multi-class output. TALENT also uses state-of-the-art models for LQE, such as Stochastic Gradient Descent (SGD) [79] and the smoothed Almeida–Langlois–Amaral–Plakhov algorithm [80] for the adaptive learning rate, together with logistic regression.

Among other statistical ML models, Shu et al. [74] used the Support Vector Machine (SVM) algorithm along with RSSI, LQI and PRR as input to develop a five-class model of the link. Besides, Okamoto et al. [70] used an online learning algorithm called adaptive regularization of weight vectors to learn to estimate throughput from throughput and images. Then, Bote-Lorenzo et al. [72] trained online perceptrons, online regression trees, fast incremental model trees and adaptive model rules with Link Quality (LQ), Neighbor Link Quality (NLQ) and ETX metrics to estimate the quality of a link, whereas Demetri et al. [6] benefit from a seven-class SVM classifier to estimate the LoRa network coverage area, using five input metrics, including multi-spectral aerial images, to train the classifier. More recently, [39] evaluated four different ML models, namely logistic regression, tree-based, ensemble and multilayer perceptron, against each other.

The only reinforcement learning model proposed for link quality estimation appears in [69]. The authors train a greedy algorithm with Packet Error Rate (PER), RSSI and energy consumption input metrics to estimate PRR in view of protocol improvement in mobility scenarios.

Two LQE models using deep learning algorithms have also been proposed. For the first model, Sun et al. [19] introduce Wavelet Neural Network based LQE (WNN-LQE), a new LQE metric for estimating link quality in smart grid environments, where they rely only on SNR to train a wavelet neural network estimator in view of accurately estimating confidence intervals for PRR. In the latter model, Luo et al. [40] incorporate four input metrics, namely SNR, LQI, RSSI and PRR, and train neural networks to distinguish a five-class LQE model.

E. Output of link quality estimator
Regarding the output of link quality estimators, summarized in the sixth column of Tables II and III, we can observe three distinct types of output values.

The first type is a binary or two-class output, which is produced by a classification model. This type of output can be found in [8], [14], [15], [18], [75]. The applications noticed are mainly (binary) decision making [8] and above/below threshold estimation [14], [15], [18], [75].

Fig. 5: Taxonomy of the LQE approaches using ML algorithms and traditional methods.
- Machine Learning
  - Classification: Bayesian (Naive Bayes [17], [18], [38]); Regression (Logistic regression [17], [18], [38], [39], [76]); Kernel methods (SVM [6], [39], [74]); Neural networks (Artificial Neural Networks [17], [18], [38]; Multilayer perceptron [39]; Deep Learning [40]); Trees (Decision trees [39]); Ensemble methods (Random forests [39])
  - Regression: Trees (Regression trees [68], [72]); Kernel methods (SVM [68]); Instance based (k-NN [68]); Filter based (Kalman filter [9]); Regularization (Adaptive regularization of weight factors [70]); Neural networks (Artificial neural networks [72]; Deep learning [19])
  - Fuzzy: Rule learning [13]–[15], [75]
  - Reinforcement learning: ε-Greedy [69]
- Traditional: Statistical [5], [7]–[9], [11]; Rule or threshold based [10], [12], [16]

The second type is a multi-class output value. Similar to the first type, it is also produced by a classification model. Multi-class output values are utilized in [6], [12], [16], [40], [74], [76], where [16], [39], [76] use a three-class, [12] a four-class, [40], [74] a five-class, and [6] a seven-class output. The applications observed are the categorization and estimation of the future LQE state, which is expressed through labels/classes.

It is not clear from the analyzed work how the authors selected the number of classes in the case of multi-class output LQE models. The three-class output models seem to be justified by the three regions of wireless links [2]. The seven-class output model [6] justifies its classes based on seven types of geographical tiles. For the rest of the work, it is not clear what the justification and advantage of a four- or five-class LQE model is. Generally, by adding more classes, the granularity of the estimation can be increased, while the computing time, memory size and processing power increase.

The third type is the continuous-valued output. In contrast to the first two types, it is produced by a regression model, which is considered by [5], [7], [9]–[11], [17], [19], [68]–[70], [72]. The value is typically limited only by numerical precision. The applications observed are the direct estimation of a metric [5], [7], [9], [19], [68]–[70], [72], of a probability value [11], [17] and of a proposed scoring metric [10], which are later used for comparative analysis.

Some of the proposed or identified applications require continuous-valued LQE estimation, for instance, a network congestion controller (TCP Reno) [7], communication handover [70], and routing table managers [10], [17], [19], [68], [69], [72].
For other routing table managers and applications, a discrete-valued LQE suffices according to the surveyed work. Note that any continuous estimator can subsequently be converted into a discrete-valued one.
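The conversion of a continuous estimate into a discrete-valued one can be as simple as thresholding; the boundaries below are illustrative placeholders rather than values taken from any surveyed work:

```python
def discretize_prr(prr_estimate, low=0.1, high=0.9):
    """Map a continuous PRR estimate onto three link quality regions.

    The 0.1/0.9 boundaries are illustrative; each surveyed work chooses
    its own thresholds for the good/intermediate/bad regions.
    """
    if prr_estimate >= high:
        return "good"
    if prr_estimate <= low:
        return "bad"
    return "intermediate"
```

With finer threshold grids, the same idea yields four-, five- or seven-class outputs, at the cost of the increased granularity discussed above.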
F. Evaluation of the proposed models
We analyze the way the works in Tables II and III evaluate the proposed LQE models along several dimensions. The evaluation metric analysis of the surveyed literature is presented in Table V.

TABLE V: A survey of the comparison for LQE models and their respective evaluation metrics considering the research papers comprehensively surveyed in Tables II and III.

ID | Evaluation metrics | The proposed LQE models | Link quality estimators that the proposed LQE models are compared to
15 | LQE sensitivity, LQE stability, CDF | [13], [75] | ETX [77], WMEWMA [8], RNP [78], 4B [10]
16 | Number of downloads | [7] | –
17 | PRR, number of parent changes | [8] | –
18 | Total number of transmissions, average tree depth, delivery rate (PSR) | [10] | ETX [77], Collection Tree Protocol (CTP) [83], MultiHopLQI
19 | PSR | [9] | ETX [77], RNP [78]
20 | Throughput | [11] | –
21 | Channel rank estimation, energy consumption, channel switching delay, stability | [76] | –
22 | Average packet loss, num. of control packets, energy consumption | [69] | –

The second column of the table lists the metrics used to evaluate the LQE model by the research papers listed in the third column. The fourth column identifies which other existing link quality estimators were utilized and compared against the ones proposed in the papers outlined in the third column.
1) Evaluation from the purpose of the LQE perspective:
Firstly, we analyze the evaluation of the models through the lens of the purpose of the LQE, as discussed in Section II-B. We identify direct evaluation, where the paper directly quantifies the performance of the proposed LQE model, versus indirect evaluation, where the improvement of the protocol or the application as a result of the LQE metric is quantified.
Direct evaluations of LQE models typically evaluate the predicted or estimated value against a measured or simulated ground truth. The metrics used for evaluation depend on the output of the proposed LQE model, as discussed in Section II-E.

When the outputs are categorical values, it is possible to use metrics based on the predicted label count versus the label count of the ground truth. Confusion matrices are used by [12], [16], [18], [38]–[40], as seen in rows 1, 2, 3 and 4 of Table V; classification accuracy is used by [6], [17], [18], [38], [40], [74], as observed in rows 3, 5, 6 and 7; and recall is used in combination with accuracy and a confusion matrix by [40], as illustrated in the fourth row of the table. Only more recently, [39] used the combined confusion matrix, precision, recall and F1 to provide more detailed insights into the performance of their classifier. Well-known evaluation metrics, such as classification precision, classification sensitivity, F1 and the Receiver Operating Characteristic (ROC) curve, are used seldom or not at all in the surveyed classification work. However, they can be computed for some of the metrics based on the provided confusion matrices.

The LQE metrics listed in rows 1-3 of Table V can be compared to each other in terms of performance by mapping the 5- and 7-class estimators to 2- or 3-class estimators. This results in a number of comparable 2- or 3-dimensional confusion matrices that can be analyzed. However, as the metrics are developed and evaluated on different datasets, the comparison would not be exactly fair, and it would not be clear which design decision led one to be superior to another. The same discussion also holds for other rows of the table that share common evaluation metrics. High-level comparisons that abstract such details are provided later in Sections III and IV for selected ML works that reported their results in sufficient detail.

When the output is continuous, each predicted value is compared against each measured or simulated value using a distance metric. For instance, the authors of [14], [19], [70] use the Root-Mean-Square Error (RMSE) as a distance metric, as shown in rows 8-10 of Table V, whereas the authors of [68], [72] use the mean absolute error (MAE), as in rows 11 and 12 of the table. Some other research papers, as in [5], [13], [15], [75], use the CDF, as illustrated in rows 13-15, while the authors of [5] leverage R in row 14 of Table V.

Fig. 6: Visualization of relationships for cross-comparison of the research papers with their corresponding evaluation metrics outlined in Table V. (Estimators appearing in the figure: WMEWMA [8], ETX [77], RNP [78], STLE [81], 4B [10], Kalman [9], F-LQE [75], 4C [17], FLI [14], TALENT [18], Opt-FLQE [15], WNN [19], SVM [74], SAE [40].)

Indirect evaluations of LQE models evaluate against application-specific metrics. The papers evaluate the performance of their objective functions based on the presence of link quality estimators. For example, the studies conducted in [5]–[8], [11], [12], [16], [69], [70], [72], [76] consider their respective objective functions for the particular applications and demonstrate better results when estimators are used compared to the cases without them. While these research papers are likely to be leading in the respective use cases of LQE models, owing to being first attempts in their specific application domains, their results and design decisions are still difficult to compare against each other. Various application-specific evaluation metrics, such as the number of downloads [7], the number of parent changes [8] and throughput [11], can also be found, as listed in rows 16-22 of Table V.
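The label-count-based and distance-based evaluation metrics discussed above can be sketched in a few lines of plain Python; the labels and values below are invented purely for illustration:

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows index the ground-truth label, columns the predicted label."""
    idx = {lab: i for i, lab in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        cm[idx[t]][idx[p]] += 1
    return cm

def mae(y_true, y_pred):
    """Mean absolute error, for continuous-valued LQE outputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented ground truth and predictions for a 3-class LQE classifier.
labels = ["bad", "intermediate", "good"]
y_true = ["good", "good", "intermediate", "bad", "bad", "good"]
y_pred = ["good", "intermediate", "intermediate", "bad", "good", "good"]
cm = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)
```

Precision, recall and F1 per class follow directly from the rows and columns of `cm`, which is why a reported confusion matrix lets later readers recompute metrics the original paper omitted.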
2) Evaluation from cross-comparison perspective:
Secondly, we distinguish papers that evaluate their outcomes against other estimators existing at the time of writing from papers that are somewhat stand-alone. For instance, in row 3 of Table V, TALENT [38] is evaluated against ETX, STLE, 4B, WMEWMA and 4C. For more clarity, this is represented visually in Fig. 6 with directed arrows exiting from TALENT and entering the boxes of the respective metrics, which explicitly depicts the relationship between the last two columns of Table V. Such comparisons are informative, as demonstrated by [75]. Among their findings, they showed that PRR, WMEWMA and ETX, which are PRR-based link quality estimators, overestimate the link quality, while RNP and 4B underestimate it. F-LQE performed better estimation than the other compared estimators.

However, the metrics of the surveyed papers [6], [16], [69], [70], [76] are not evaluated against other existing estimators, due to their unique approach (application) and/or being among the first to tackle a certain aspect of estimation. For instance, the authors of [76] evaluate the estimated ranking/classification of a subset of wireless channels, and the authors of [69] evaluate the impact on networking performance of an estimator-assisted routing algorithm against the vanilla (m)RPL protocol, while the authors of [70] evaluate the estimated and real throughput degradation when the line of sight is blocked by an object. Besides, the authors of [16] evaluate data-driven bidirectional link properties, and [6] evaluates estimated vs. ground truth signal fading, which is influenced by the ML algorithm's ability to classify geographical tiles.
3) Evaluation from infrastructure perspective:
Thirdly, we categorize papers into those that perform evaluation and validation on real testbeds [5]–[10], [12]–[14], [17], [18], [38], [69], [70], [75], shown in rows 1, 3, 6, 8, 9, 14-19 and 22, those that perform evaluation in simulation, such as [15], [69], [76] in rows 13, 21 and 22, and the rest, which perform only numerical evaluation. The papers in the first category, which perform evaluation and validation on testbeds, are better at presenting how the estimator will actually influence the network. The papers in the second category, which perform simulation, can provide a good foundation for further examination and potential implementation. Finally, the papers in the third category, which only perform numerical evaluation, can unveil possible improvements through statistical relationships.
4) Evaluation from convergence perspective:
Fourthly, during our analysis it emerged that a number of papers reflect on and quantify the convergence of their model. For instance, in [11], they concluded that their model starts to converge at approximately 40,000 packets. In [9], the authors demonstrated that link degradation could be detected even with a single received packet. The metric proposed in [12] required approximately 10 packets to provide the estimation in either a static or mobile scenario. In [17], they suggested that data gathered from 4-7 nodes for approximately 10 minutes should be sufficient to train their models offline. Although these papers indicate a convergence rate/size, a community-wide systematic investigation of LQE model convergence is missing.

At this point, we can conclude that the research community in general has shown remarkable improvements, use cases and skills toward better estimators. However, despite the aforementioned evaluations of the proposed estimators, providing a completely fair comparison of LQE models is not feasible considering the diverse evaluation metrics outlined in Table V.
G. Reproducibility
Reproducibility of the results is recognized as an important step in the scientific process [65]–[67] and is important for replication as well as for reporting explicit improvements over baseline models. When researchers publicly share the data, simulation setups and the relevant code, it becomes easy for others to pick up, replicate and improve upon the work, thus speeding up adoption and improvement. For instance, when a new LQE model is proposed, it can be run on the same data or testbed as a set of existing models, provided the data and models are publicly accessible to the community. The existing models can also be re-evaluated in the same setup, thus replicating the existing results, or they can be used as baselines in new scenarios. With this approach, the performance of the new LQE model can be directly compared to the existing models with relatively low effort.

With respect to the reproducibility of the results in the surveyed publications, we notice that only [11], [18], [39] are easily reproducible, because they rely on publicly available trace-sets. Studies reported in [5], [7], [8], [10], [16], [17], [75] use open testbeds that, in principle, could be used to collect data so that the results can be reproduced. However, it is not clear whether some of these testbeds are still operational, given that 10-20 years have passed since the publication of the corresponding research. We were not able to find any evidence that the results in [9], [12], [14], [15], [19] could be reproduced, as they strictly rely on an internal one-time deployment and data collection.

Fig. 7: Classification of the works by considering the purpose for which the ML LQE model was developed.
- New & improved LQE — prediction/estimation of: link quality [6], [19], [39], [40], [68], [72], [76]; stability [76]; throughput [6], [70]
- Protocol improvement — maximize: throughput [17], [18], [38], [74]; reliability [14], [15]; reactivity [15], [76]; minimize: topology changes [14]; depth of routing tree [14]; probing overhead [69]; traffic congestion [74]

III. APPLICATION PERSPECTIVE OF ML-BASED LQES

In this section, we provide an analysis of ML-based LQEs from an application perspective. We identify what is important from an application perspective and how that affects the ML methods utilized for LQE modeling. We first focus on the purpose of the LQE model development, followed by analyses of the application quality aspects.
A. LQE design purpose
In Section II-B, we reflected on the purpose for which an LQE model was developed and, as depicted in Fig. 7, we found that about half of the ML-based LQE studies developed an estimator with the goal of improving an existing protocol, while the other half aimed for a new and superior LQE model. Fig. 7 shows that the "protocol improvement" group attempts to minimize or maximize a particular objective, such as traffic congestion, probing overhead or topology changes, to name a few. Most of the studies that fall into the "new & improved LQE" group aim only to improve the prediction or estimation of the quality of a link.

The body of work considering "protocol improvements" is intricate to quantitatively compare, since numerical details of the LQE models are not explicitly provided in the respective works, as previously discussed in Section II-F. Similar difficulties also arise for a large part of the body of work related to "new & improved LQE" models, since they do not utilize consistent evaluation metrics. For instance, for LQE models formulated as a classification problem, only a subset of the works leverages accuracy as a metric, while other subsets use a confusion matrix or specifically defined metrics, which indeed renders them impractical to quantitatively compare against each other, as outlined in Table V and discussed in Section II-F. Attaining a fair comparison is even more difficult for the works that formulate the LQE problem as a regression. Later, in Section VI-C, we provide guidelines with regard to this aspect.

Fig. 8 presents a high-level comparison of the selected works that use ML for LQE model development [17], [18], [38]–[40] and one that does not [12]. All the considered works formulated the LQE model as a classification problem, and it is possible to extract the approximate per-class performance from the reported results of those respective works.
Notice that they differ in terms of: i) the input features used to train and evaluate the models (more details in Section II-C), ii) the number of classes used for the model (more details in Section II-E), and iii) the considered ML algorithm (more details in Section II-D).

Fig. 8: Comparison of the wireless link quality classification performances throughout the surveyed papers. (Compared models: Naive Bayes, logistic regression and ANN [17], 2011; TALENT [38], 2012; online TALENT [18], 2014; TRIANGLE [12], 2010; autoencoder on the Corridor, Groove, Parking and Road scenarios [40], 2019; SVM with RBF and polynomial kernels [74], 2017; logistic regression, decision tree, random forest, SVM and multi-layer perceptron [39], 2020.)

On the x-axis, Fig. 8 presents five link quality classes, from (very) bad to (very) good, while on the y-axis it presents the percentage of correctly classified links. The comparison reveals that the autoencoder [40], a type of deep learning method, on average performs best, with above 95% correctly classified very bad, bad and very good links and about 87% correctly classified intermediate quality links. Autoencoders are outperformed by the non-ML baseline [12] and by the SVM with RBF kernel [74] by about 4 percentage points on the very good link quality class, by over 30 percentage points on the good class and by about 12 percentage points on the intermediate class. As autoencoders are known to be powerful methods, we speculate that such a high performance difference on those three classes might be due to insufficient training data or other experimental artifacts.

Tree-based methods and SVM [39], as well as the customized online learning algorithm TALENT [38], follow the performance of the autoencoders very closely, with a tiny margin on the very bad, very good and intermediate link quality classes. Next, the offline version of TALENT [18] exhibits very similar performance to tree-based methods and SVM on the intermediate class and is about 17 percentage points worse on the very good class. Moreover, traditional artificial neural networks, logistic regression and Naive Bayes [17] follow with an almost 20 percentage point difference compared to autoencoders on the very good and very bad link quality classes and almost 30 percentage points on the intermediate class.
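The per-class percentages in Fig. 8 can be read off a confusion matrix: the fraction of correctly classified links for a class is its diagonal entry divided by its row total, i.e., the per-class recall. A minimal sketch with a purely illustrative three-class matrix (the numbers are not taken from any surveyed paper):

```python
# Per-class "correctly classified links [%]" from a confusion matrix.
# Rows = true class, columns = predicted class. Values are illustrative
# only, not taken from any surveyed paper.
confusion = {
    "bad":          {"bad": 90, "intermediate": 8,  "good": 2},
    "intermediate": {"bad": 10, "intermediate": 70, "good": 20},
    "good":         {"bad": 1,  "intermediate": 4,  "good": 95},
}

def per_class_accuracy(cm):
    """Diagonal entry divided by row sum, expressed as a percentage."""
    return {
        cls: 100.0 * row[cls] / sum(row.values())
        for cls, row in cm.items()
    }

print(per_class_accuracy(confusion))
# {'bad': 90.0, 'intermediate': 70.0, 'good': 95.0}
```

Reporting these per-class values, rather than a single overall accuracy, is what makes cross-paper comparisons such as Fig. 8 possible.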
The relative performance difference of the work reported in [17] might be due to poor data pre-processing practices, such as the lack of interpolation, which can significantly influence the final model performance, as discussed later in Section IV. To summarize, the analysis of Fig. 8 reveals that autoencoders, tree-based methods and SVM tend to consistently perform better than logistic regression, naive Bayes and ANNs, while the non-ML TRIANGLE estimator performs very well on two of the classes, namely the very good and good link quality classes.

Discussion:
The observations from Fig. 8 also conform to general performance intuitions regarding ML approaches. Namely, fuzzy logic and Naive Bayes are generally comparable, with the latter being far more practical and popular. Neither of the two is known to exhibit better relative performance than logistic or linear regression. As shown in [17], [38], Naive Bayes tends to exhibit reduced performance compared to logistic regression, whereas ANNs are usually superior. Fuzzy logic, Naive Bayes, linear and logistic regression are relatively simple and require modest computational load and memory consumption. Therefore, these ML methods can be suitable for implementation in embedded devices, especially for small-dimensional feature spaces. Besides, ANNs can be designed to optimize computational load and memory consumption, particularly by simplifying their topologies, which in turn comes at a cost to their performance.

For classification in constrained embedded devices, the authors of [17], [38] selected logistic regression for its simplicity among the other three candidates. The selection was based on practical considerations, but their experiments showed that ANNs were superior to the other LQE models. The reason is that logistic and linear regressions are linear models that tend to be more suitable for approximating linear phenomena. Since link quality does not follow a linear model, the ANN-based model outperformed its counterpart LQE models in [17], [38].

SVMs, part of the so-called kernel methods, were popular and frequently used at the beginning of the century, before the significant breakthroughs brought by deep learning (deep neural networks, DNNs). SVMs often exhibit at least similar performance to ANNs and also to decision/regression trees [68]. However, there is only a paucity of contributions on adapting them for embedded devices [84].
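To make the kernel discussion concrete: the RBF kernel used by some of the SVM-based estimators, K(x, x') = exp(-gamma * ||x - x'||^2), scores the similarity of two feature vectors, which is what lets an SVM separate classes that are not linearly separable. A minimal sketch with illustrative toy feature vectors (the feature choice and gamma value are assumptions, not taken from the surveyed works):

```python
import math

def rbf_kernel(x, y, gamma=5.0):
    """K(x, y) = exp(-gamma * ||x - y||^2): similarity of two feature
    vectors, equal to 1.0 for identical inputs and approaching 0 as
    they diverge."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Toy link-feature vectors (e.g. normalized RSSI and PRR); illustrative.
link_a = [0.9, 0.95]   # strong link
link_b = [0.85, 0.9]   # similar strong link
link_c = [0.1, 0.2]    # weak link

print(rbf_kernel(link_a, link_b))  # near 1: very similar links
print(rbf_kernel(link_a, link_c))  # near 0: dissimilar links
```

Tuning gamma (and the SVM's regularization constant) is exactly the hyperparameter search discussed above: a larger gamma makes the similarity more local and the decision boundary more flexible.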
In [72], the authors performed an in-depth comparison of ML algorithms including SVM, decision trees and k-nearest neighbors (k-NN) from several perspectives, such as accuracy, computational load and training time. Their results showed that SVMs are consistently superior in accuracy to k-NN and regression trees, at the expense of significant resource consumption. While many traditional ML methods, including decision/regression trees and k-NN, typically require an explicit, often manual feature engineering step, SVMs are able to automatically weight the features according to their importance, automating part of the effort allocated to manual feature engineering. SVMs are known to be highly customizable through hyperparameter tuning, which is a dedicated research area within the ML community. Through appropriate selection of the kernel and parameter space [85], they are able to perform very well on both linear and non-linear problems. Therefore, from this particular perspective, SVMs and the broader kernel methods are indeed favorable choices for developing LQE models.

Deep learning, represented by DNNs, is a new class of ML algorithms currently under intense investigation in various research communities, penetrating also wireless networking and LQE [40]. These algorithms are very powerful and accurate for approximating both linear and non-linear problems, albeit at high memory and computational cost. Such models are prohibitive for embedding in constrained devices. However, there are a number of research efforts [86] invested in employing transfer learning approaches [87]. When LQE-based data processing occurs on a non-constrained device, as is the case in [6], DNNs can show outstanding performance. While the authors of [6] proposed a novel and visionary approach for the development of an LQE model and accomplished robust results using SVMs, employing DNNs might assist in surpassing those existing results.

B. Application quality aspects
Following the analyses from Sections II-B and II-F, we have identified five important link quality aspects to consider when choosing or designing an LQE model (estimator). These aspects are often used to indirectly evaluate the performance of LQE models, by evaluating the behavior of the application that relies on LQE versus the one that does not rely on it.

1) Reliability - The LQE model should produce estimations that are as close as possible to the observed values. More explicitly, LQE models should maintain high accuracy.

2) Adaptivity/Reactivity - The LQE model should react and adapt to persistent link quality changes. This means that when a link changes its quality for a longer period of time, the LQE model should be able to capture these changes and adjust its estimations accordingly. Changes in estimation subsequently trigger routing topology changes.

3) Stability - The LQE model should be immune to transient link quality changes. This immunity ensures a relatively stable topology, leading to reduced routing overhead.

4) Computational cost - The computational complexity of LQE models should be considerate of the target devices, where the computational load can be appropriately apportioned among constrained and powerful devices.

5) Probing overhead - LQE models consider a diverse set of metrics to estimate the link quality, as discussed in Section II-C, which are gathered through probing. LQE models should be designed in an optimal way so that the probing overhead is minimized.

A comprehensive classification of the ML-based LQE studies according to the aforementioned five application quality aspects is exhibited in Fig. 9, which reveals that most of the LQE studies explicitly consider the computational cost and reliability aspects in their evaluations, whilst only a paucity of the studies considers probing overhead, adaptability and stability. With respect to computational cost, it can be readily observed from the figure that tree- and neural network-based methods tend to have higher computational cost, whereas online logistic regression has medium cost, and Naive Bayes, fuzzy logic and offline logistic regression have relatively low computational cost. With regard to the probing overhead for trace-set collection, it is perceived from Fig. 9 that some LQE models are designed to incur zero overhead, one incurs both asynchronous and synchronous (async. & sync.) probing, whereas another is devised to use an adaptive probing rate. As far as reliability is concerned, some LQE studies focus on the reliability of the routing tree topology and on link prediction/estimation, whereas others put the emphasis on the traffic. Adaptability is explicitly taken into consideration mostly in studies employing online learning algorithms, while stability is considered in those studies focusing on offline learning algorithms.
Discussion:

To support a more in-depth understanding, Table VI presents an aggregated and elaborated view of the papers that are systematically categorized in Figs. 7 and 9. The first column of the table shows the purpose for which the LQEs have been developed, the second column lists the problem being solved using ML-based LQE models, the third provides the relevant research papers solving those respective problems, column four includes the ML type and method, while the last five columns correspond to the link quality aspects previously enumerated in this section. The last five columns are filled in if those quality aspects are given consideration in the respective research papers and left empty otherwise.

Fig. 9: Classification of the surveyed LQE papers by taking into consideration the identified application quality aspects. (Mind map branches: Reliability - traffic [14], [70], [74]; link [6], [19], [39], [40], [68], [72], [76]; topology/routing [14], [70], [74]. Adaptivity - online learning: Naive Bayes [17], logistic regression [18], [38], ANN [18], [38], [72]. Stability - offline learning: fuzzy methods [14], [15], custom algorithm [15]. Probing overhead - trace-set collection: adaptive probing rate [69], async. & sync. probing [69], zero overhead [6], [70]. Computational cost - high: regression trees [68], online perceptrons [72], ANN [17], [18], [38]; medium: online logistic regression [18], [38]; low: logistic regression [17], fuzzy rules [14], Naive Bayes [17], [18], [38].)

The first line of Table VI indicates that the problem solved by [17], [18], [38] is to reduce the cost of packet delivery with a well-known multi-hop protocol, the so-called collection tree protocol (CTP). In their first approach, [17] achieve this by developing three batch ML models that, according to their evaluation, perform better than 4BIT. However, the ML models are trained in batch mode and remain static after training; therefore the estimator is not adaptive to persistent changes in the link. Batch or offline training of ML algorithms [88] means that the model is trained, optimized and evaluated once on the available training and testing sets, and has to be completely re-trained later in order to adapt to possible changes in the distribution of the updated data. In practice, this corresponds to sporadic updates, e.g., once in a few hours or once per day, depending on how the overall system is engineered. In the case of embedded devices, the device has to be fully or partially reprogrammed [89]. In the specific case of [17], the coefficients of the linear regression model learned during training are hard-coded on the target device and reprogramming is required to obtain updates.

When the behavior of the links changes significantly, especially in wireless networks with mobility, the offline model is expected to decrease in performance, since those link changes may not be recognized by the ML model residing on the devices. In [18], [38], the authors improve their previously proposed offline modeling by introducing adaptivity to their models, thus developing online versions of the learning algorithms. Online ML algorithms are capable of updating their model [88] as new data points arrive during regular operation. The authors of [18], [38] also address the reliability and computational cost aspects in their evaluation, as can be readily seen in the respective columns of Table VI. Realizing the shortcomings of the offline models [68] for estimating link quality in community networks and subsequently developing online [72] models can also be noticed in the sixth line of Table VI.
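The batch-versus-online distinction above comes down to when the model's coefficients are updated. As an illustration (a toy sketch, not the implementation of [18], [38]), an online logistic regression performs one stochastic gradient step per incoming (features, label) observation, so the estimator keeps adapting during regular operation without full retraining:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class OnlineLogReg:
    """Toy online logistic regression: one SGD step per new sample,
    so the estimator adapts without being completely re-trained."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, x, y):  # y in {0 (bad link), 1 (good link)}
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Stream of (normalized RSSI, PRR) samples; values are illustrative only.
model = OnlineLogReg(n_features=2)
stream = [([0.9, 0.95], 1), ([0.1, 0.2], 0)] * 200
for x, y in stream:
    model.update(x, y)

print(model.predict_proba([0.85, 0.9]))  # high probability: good link
print(model.predict_proba([0.15, 0.1]))  # low probability: bad link
```

A batch-trained counterpart would instead freeze `w` and `b` after training, which is precisely why the estimators in the first row of Table VI are not adaptive.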
This research problem is formulated as a regression problem, while the previous one, addressed in [17], [18], [38], is formulated as a classification one. Both approaches are suitable for the purpose and both need to implement a threshold- or class-based decision on whether to use the link or not. The ML methods used in [68] and [72] target WiFi devices (routers) and are thus more expensive in terms of memory and computational cost than those that target constrained devices (sensors), as outlined in the first line of Table VI. Generally speaking, ML algorithms such as SVM and k-NN, used in [68], [72] and outlined at line six of Table VI, are computationally more expensive than the naive Bayes and logistic regression utilized in [17], [18], [38] and outlined at the first line of Table VI.

In addition to the adaptivity trade-offs noticed in the research papers at the first and sixth rows of Table VI, reactivity trade-offs can be perceived from the research papers outlined in the second, third and seventh rows. More explicitly, in the second row, the LQE model is used to improve network reliability by reducing topology changes and the depth of the routing tree [14], while still maintaining high reliability, and in the third and seventh rows, [15] and [76] enhance reliability, stability and reactivity, respectively. The application requirements of these studies seem to favor reliable and cost-effective routing with minimal routing topology changes. To sum up, the LQE model has to be as accurate as possible, update itself on significant link changes and remain immune to short-term variations for the sake of a stable topology. To achieve such a goal, online learning algorithms have to be tuned to ensure a good stability vs. adaptivity trade-off. The computation of LQE models involves probing overhead to collect relevant metrics, as discussed in Section II-C and Table IV.

TABLE VI: Overview of the applications of the ML-based LQE models for the relevant papers surveyed in Tables II and III.

Purpose: LQE for protocol performance
1. Reduce the cost of delivering a packet in multihop networks (CTP protocol) - [17]: classification (Naive Bayes, logistic regression, artificial neural networks); adaptivity: no (offline); computational cost: low. [18], [38]: same methods; reliability: yes; adaptivity: yes (online); computational cost: medium.
2. Improve network reliability, reduce topology changes and routing depth - [14]: regression (fuzzy logic, 2 inference rules, defuzzification); reliability: yes; stability: yes; computational cost: low.
3. Improve reliability and reactivity in an application-specific network - [15]: classification (custom algorithm based on fuzzy logic); reliability: yes; adaptivity: yes; stability: yes.
4. Minimize the overhead caused by active probing operations - [69]: regression (reinforcement learning); probing overhead: yes.
5. Select links that maximize the delivery rate and minimize traffic congestion for routing - [74]: classification (SVM); reliability: yes.

Purpose: New or improved LQE
6. Predict the quality of links in a community network (WiFi) - [68]: regression (SVM, regression trees, k-nearest neighbor, Gaussian process for regression); reliability: yes; adaptivity: no (offline); computational cost: high. [72]: regression (perceptron, regression trees, incremental model trees with drift detection and adaptive model rules); reliability: yes; adaptivity: yes (online); computational cost: high.
7. Link prediction quality, stability and reactivity - [76]: classification (custom algorithm + 2 extensions); reliability: yes; stability: yes.
8. Reliable link quality estimation using probability-guaranteed estimation results - [19]: regression (wavelet neural networks); reliability: yes.
9. Improved LQE - [40]: classification (deep learning, autoencoders); reliability: yes.
10. No-overhead throughput estimation in mmWaves using RGB imaging - [70]: regression (adaptive regularization of weight vectors); reliability: yes; probing overhead: yes.
11. Accurate estimation of LoRa transmissions using multispectral imaging - [6]: classification (SVMs with radial basis function (RBF) kernel); reliability: yes; probing overhead: yes.
12. On designing a machine learning based wireless link quality classifier - [39]: classification (logistic regression, decision trees, random forest, SVM, multi-layer perceptron); reliability: yes.
Minimizing the probing overhead has also been a major concern for a number of research papers [6], [69] and [70], as can be readily observed from rows four, ten and eleven of Table VI. In row four, the probing overhead is reduced by using reinforcement learning to guide the probing process [69], while in [6] and [70], network-related information obtained via probing is replaced with external non-networking sources based on imaging. Replacing the probing overhead with additional hardware components that involve learning from image data, image capturing and processing, consequently leads to an increased computational complexity of the system.

The remaining research papers [19], [40] and [74], outlined at lines five, eight and nine of Table VI, address the aspects of developing more accurate estimators against pre-determined baseline models. Additionally, the LQE model proposed by [19] provides probability-guaranteed estimation using the packet reception ratio to satisfy the reliability requirements of smart grid communication standards.

IV. DESIGN PROCESS PERSPECTIVE OF ML-BASED LQES

For the development of any ML model, researchers have to follow some very precise steps that are well established in the community and defined in the Knowledge Discovery Process (KDP) [63], [90], namely data pre-processing, model building and model evaluation. The data pre-processing stage is known to be the most time-consuming process, tends to have a major influence on the final performance of the model and is applied on the training and evaluation data collected based on the input metrics discussed in Section II-C. This stage includes several steps, such as data cleaning and interpolation, feature selection and resampling. The model building and selection steps usually take a set of ML methods, train them using the available data and evaluate their results, as discussed in Section II-F.

Analyzing the existing works from the perspective of the design process is equally important and complements the analysis from the application perspective performed in Section III. Fig. 10 classifies the studies based on the reported design decisions taken while developing ML-based LQE models, namely cleaning and interpolation, feature selection, resampling strategy and ML model selection. Fig. 11 compares the reported influence of the respective steps on the final model considering accuracy as the metric, while Fig. 12 depicts the trade-offs of the process considering the F1 score and the precision and recall metrics.

Fig. 10: Overview of the design decisions taken during the development of the ML-based LQE models for the relevant papers surveyed in Tables II and III. (Mind map branches: Type of ML - regression: fuzzy logic [14], SVM [68], regression trees [68], [72], Gaussian process [68], kNN [68], NN/WNN [19], reinforcement learning [69], ARN [70], online perceptron [72]; classification: Naive Bayes [17], [18], [38], logistic regression [17], [18], [38], [39], fuzzy [15], SVM [6], [39], [74], DL/ANN [17], [18], [38]-[40], decision trees [39], custom [76]. Evaluation metrics - standard: accuracy [6], [17], [18], [38]-[40], [74], recall [40], confusion matrix [18], [38]-[40], MSE/RMSE [6], [14], [19], [69], [70], MAE [68], [72]; application-specific: CDF [15], throughput [70], topology changes [14], stability [15], [76], delivery cost [17]. Cleaning & interpolation - missing values [14], [18], [38]-[40], [70], averaging [39], [70], [72], scaling [40]. Resampling - ML-based, random [39]. Feature selection - available [6], [15], [17]-[19], [38]-[40], [69], [70], [74], [76]; synthetic [6], [14], [15], [17], [18], [38]-[40], [68], [69], [72], [74], [76].)

A. Cleaning & interpolation steps
From the Cleaning & Interpolation branch of the mind map depicted in Fig. 10, it can be seen that only seven of the ML-based LQE models give explicit consideration to the cleaning and interpolation step. While in general ML practice, which uses real-world datasets, the cleaning step is very difficult to avoid, LQE research papers mostly leverage carefully collected datasets, often generated in-house on existing testbeds, as discussed in Section II-A. For instance, Okamoto et al. [70] perform cleaning on the image data they selected to use as part of the model training.

With respect to interpolation, however, several works [14], [18], [38], [40] fill in missing values with zeros. This design decision can also be referred to as interpolation using domain knowledge, as they replace missing RSSI values with 0, which represents a poor-quality link with no received signal, yielding a PRR equal to 0. It is not clear how [72] handle missing data; however, they drop measurement data if there is not enough variation in their values.

Fig. 11: Accuracy performance analyses for various steps of the design process as an exemplifying three-class LQE classification problem with unbalanced training data. (Panels: (a) interpolation methods: none, Gaussian, padding [39]; (b) input features: LQI, PRR, LQI+PRR [17] and RSSI_i, RSSI_AVG, RSSI_SD and their combination [39]; (c) resampling strategy: none, undersample, oversample [39]; (d) ML models: Naive Bayes, logistic regression, ANN [17] and LR, random forest, MLP [39].)

Explicitly mentioning the design decision with respect to cleaning and interpolation is important for reproducibility (discussed in Section II-G) as well as for its potential influence on the final performance of the ML model. For instance, it can be readily seen from Fig. 11a that, with all the other settings kept the same, domain-knowledge interpolation, denoted by "padding", can increase the accuracy of a classifier on the good class from 0.88 to 0.95, while also increasing the performance on the minority classes from 0.49 to 0.87 for intermediate and from nearly 0 to 0.98 for bad, which can also be perceived from the findings of [39]. Going beyond accuracy as an evaluation metric, Fig. 12 shows a significant performance increase, measured with the F1 score, if the type of interpolation used is optimized for a particular scenario. The F1 score is the harmonic mean of precision and recall: F1 = 2 * precision * recall / (precision + recall), where precision = true positives / (true positives + false positives) and recall = true positives / (true positives + false negatives). More specifically, the F1 score for no interpolation is about 0.43 in the lower-left part of the figure, then increases to 0.80 with Gaussian interpolation, finally reaching 0.94 with constant interpolation (denoted by "padding" in Fig. 11a) that utilizes domain knowledge.
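The "padding" interpolation described above can be sketched in a few lines. The sentinel value 0 follows the convention reported in [14], [18], [38], [40] (no received signal, PRR = 0); the trace values themselves are illustrative, on a normalized link-quality scale:

```python
def pad_missing(trace, fill_value=0):
    """Domain-knowledge interpolation as reported in [14], [18], [38],
    [40]: a missing reading (None) is treated as 'no received signal'
    and replaced by fill_value, which also implies PRR = 0."""
    return [fill_value if sample is None else sample for sample in trace]

# Illustrative RSSI-derived link quality trace with gaps at lost packets.
trace = [0.7, 0.72, None, 0.71, None, 0.3]
print(pad_missing(trace))
# [0.7, 0.72, 0, 0.71, 0, 0.3]
```

Gaussian interpolation, the middle option in Fig. 11a, would instead draw replacement values from a distribution fitted to the observed samples; the comparison above suggests the domain-informed constant is the better choice for this problem.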
B. Feature selection
According to the feature selection branch of the mind map depicted in Fig. 10, all research papers provide details on their feature selection. Often, all the features directly collected from the testbed or simulator are used, as discussed in Section IV-B. Part of the literature, i.e., [17], [18] and [38], also considers the performance of the final model as a function of the input features as part of its analysis, while the others only report a fixed set of features that are then used to develop and evaluate the models. This may be because the authors implicitly performed the feature selection step and solely reported the final features selected for their models to keep their papers concise. In such cases, the influence of other features or synthetic features [91] cannot be readily assessed in the related works surveyed.

Based on an extensive comparative evaluation in [39] and on another study that explicitly quantifies the impact of feature selection on an LQE classification problem [17], we summarize the reported performances with respect to the feature selection step in Fig. 11b. While the works in the aforementioned figure leverage different datasets and distinct ML approaches and therefore cannot be fairly benchmarked against each other, it is clear that feature engineering can significantly increase the accuracy of a classifier within the same work, keeping all the other settings the same. Liu [17] reports up to a 9 percentage point classification improvement in all classes when using LQI+PRR compared to the scenarios using LQI only and PRR only, while Cerar [39] reports an average classification performance increase from 0.89 to 0.95, with the performance on the minority class increasing from 0.38 to 0.87. Furthermore, according to Fig. 12, the classification performance ranges from 0.61 to 0.93 in F1 score, from 0.62 to 0.93 in precision and from 0.63 to 0.93 in recall.
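Synthetic features such as the windowed RSSI average (RSSI_AVG) and standard deviation (RSSI_SD) used alongside the instantaneous RSSI_i in [39] can be derived directly from the raw RSSI series; a minimal sketch (the window size and values are illustrative):

```python
import statistics

def rssi_features(window):
    """Derive the synthetic features used alongside the instantaneous
    RSSI_i in [39]: windowed mean (RSSI_AVG) and standard deviation
    (RSSI_SD). `window` is a sliding window of recent RSSI readings."""
    return {
        "rssi_i": window[-1],                  # most recent reading
        "rssi_avg": statistics.fmean(window),  # RSSI_AVG
        "rssi_sd": statistics.stdev(window),   # RSSI_SD
    }

# Illustrative sliding window of RSSI readings in dBm.
window = [-71, -70, -72, -69, -73]
print(rssi_features(window))
```

The standard deviation captures short-term link variability that the instantaneous value alone cannot, which is one plausible explanation for the accuracy gain reported for the combined feature set in Fig. 11b.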
C. Resampling strategy
Resampling is used in the ML community when the available input data is imbalanced [92], [93]. For instance, assume a classification problem where the aim is to classify links into good, bad and intermediate classes, similar to the problem approached in [16], [76]. If the good class represented 75% of the examples in the training dataset, bad represented 20% and intermediate the remaining 5%, then an ML model would likely be well trained to recognize the good class, as it has been exposed to many such instances. However, it might be difficult for the model to recognize the other two classes, as their instances are scarce in the dataset.

Fig. 12: Precision vs. recall performance trade-off for various design decisions including interpolation, feature selection, resampling and model selection, where the figure situated at the top-right corner is a zoomed-in portion of the region of the main figure closest to F1=1.

According to the resampling branch of the mind map in Fig. 10, only one very recent research paper elaborates on its resampling strategy. In the other works, it is often not clear whether a resampling strategy was employed in the case of imbalanced datasets. For instance, the performance of the predictor on two of the five classes is modest in [40]. It would be interesting to understand whether employing a resampling strategy would provide a better discrimination of the considered classes. Resampling could also improve other surveyed estimators in [6], [18], [38], [74].

From Fig. 11c, it can be seen that, with all the other settings being the same, performing resampling can slightly decrease the accuracy of a classifier on the two majority classes from 0.97 to 0.95, albeit yielding a dramatic increase in the classification performance of the minority intermediate class, with an accuracy rise from 0.61 to 0.88, which can also be worked out from the findings of [39]. Going beyond accuracy as an evaluation metric, Fig. 12 exhibits a significant precision, recall and F1 score increase for the minority class when a resampling strategy is leveraged. More specifically, an LQE model without resampling yields an F1 score of about 0.87, which then increases to about 0.93 with undersampling and remains at 0.93 when oversampling is considered.
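Random resampling as discussed above needs no ML library: undersampling discards majority-class examples, while oversampling duplicates minority-class ones until the classes are balanced. A minimal oversampling sketch (the toy dataset is illustrative):

```python
import random

def oversample(dataset, seed=0):
    """Randomly duplicate minority-class samples until every class
    matches the size of the largest one. dataset: (features, label)."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in dataset:
        by_class.setdefault(label, []).append((features, label))
    target = max(len(samples) for samples in by_class.values())
    balanced = []
    for samples in by_class.values():
        balanced.extend(samples)
        # Duplicate randomly chosen samples to reach the target size.
        balanced.extend(rng.choices(samples, k=target - len(samples)))
    return balanced

# Imbalanced toy set: four 'good' links, one 'intermediate' link.
data = [([0.9], "good")] * 4 + [([0.5], "intermediate")]
balanced = oversample(data)
print([label for _, label in balanced].count("intermediate"))  # 4
```

Undersampling is the mirror image (randomly dropping majority samples down to the minority size); the trade-off in Fig. 11c, a slight loss on majority classes for a large gain on the minority class, applies to both.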
D. Machine learning method
According to the ML method branch of the mind map shown in Fig. 10, seven of the works estimate the link quality in terms of discrete values and therefore perform classification, while the remaining seven estimate it as continuous values, hence employing regression. The preferred ML method is chosen according to the specific application considered. It can be seen from this branch that the same type of algorithm can be adopted for both classification and regression. For example, SVMs are exploited for regression in [68] and for classification in [6], [74]. Besides, every ML algorithm can be adapted to work in an online mode by retraining the model with every new incoming value during its operation. As discussed in Section III, online learning is particularly suitable for LQE models that also optimize for adaptivity, as in [18], [70], [72].

For classification, the most frequently used ML algorithms are naive Bayes, logistic regression, artificial neural networks (ANNs) and SVMs. The first three are used in [17], [18], [38], while SVMs are used in [6], [74]. The ML algorithms used for regression are more diverse, ranging from fuzzy logic to reinforcement learning. While the performance of classification algorithms is often evaluated in the ML community according to precision/recall and F1 scores, potentially via complementary confusion matrices, the performance of regression is evaluated using distance metrics, such as RMSE and MAE.

Fig. 11d shows that, with all the other settings being the same, the selection of the ML method for a given classification problem has a relatively small impact on the accuracy of a classifier compared to the other steps of the design process. As reported in both [17] and [39], the accuracy changes by up to 3 percentage points between the considered models. The zoomed portion of Fig. 12 exhibits the negligible impact of the model selection on the F1 score, which is up to around 0.02.

V. OVERVIEW OF MEASUREMENT DATA SOURCES
To complement the survey of the LQE models developed using data, we survey the publicly available trace-sets that have already been used or could be used for LQE. The data collected for a limited period of time on a given radio link is referred to as a trace in this section. When a set of these traces is recorded over more links and/or periods in several rounds of tests on a given testbed, we refer to it as a trace-set. Traces and trace-sets are, in general, prone to irregularities and missing values that need to be preprocessed, especially when fed into ML algorithms. In this paper, we refer to a trace-set that has been preprocessed as a dataset. Ideally, a trace-set should include all the available information that is directly or indirectly related to the packets' trip.

To support our analysis, Tables VII and VIII summarize the publicly available trace-sets and the features available in each trace-set, respectively. Our survey only analyzes publicly available trace-sets for LQE research that we were able to look into; however, we also mention other applicable trace-sets that are not publicly available. Table VII reviews the source of the trace-sets and the estimated year of creation, along with the hardware and technology used for gathering the trace-sets. Additionally, the data that each trace relies on, the size of the trace/trace-set, the type of communication used in the measurement campaign, and additional notes on the specification and characteristics of the trace-sets can also be found in Table VII. Table VIII lists the trace-sets in the first column, while the remaining columns refer to the various metrics contained within each trace-set. This table maps the available metrics, also referred to as features, to the analyzed trace-sets.

To summarize the important points of these trace-sets, they were collected by research teams at various universities worldwide using their own testbeds [94], [96], [100] or via one-time deployments [97]-[99], [101], [103].
This confirms that the trace-sets were likely generated on testbeds developed and maintained in universities, which is also consistent with our findings in Section II-A. According to the second column of Table VII, four of the trace-sets are based on IEEE 802.11, three utilize IEEE 802.15.4, one is based on IEEE 802.15.1, and one operates on a proprietary radio technology. According to the fourth column of the table, the number of entries, i.e., data points, ranges from only 6 thousand up to 21 million, whereas the number of measured data per entry ranges from one to about fifteen. The third column of the table lists the measurements available in each trace-set. For more clarity, the measurements are summarized in Table VIII for each trace-set, and their meaning and importance for LQE is summarized as follows:

• A sequence number holds key information on the consecutive order of the received packets and/or frames. With the aid of the sequence number, reconstruction of time series is enabled, and thus it inherently provides information on packet loss and duplicated packets. It is already part of the frame headers owing to standardization efforts. Sequence numbers can be processed to provide PRR and its counterpart PER, which are useful inputs for an LQE model.

• A time-stamp, which can be relative or absolute, is a suitable addition to the aforementioned sequence number. It reveals the amount of time elapsed between measurements. Therefore, it can help in deciding whether a previous data point is still relevant, thus improving LQE in a dynamic environment. If a high-precision timer and dedicated radio hardware are available, time-stamps can also empower localization.

• Measurement points indicating the quality of the received signal on the links are mainly described by SNR, RSSI and LQI. SNR represents the ratio between the signal strength and the background noise strength. Compared to all other features, it allows the most clear-cut observation of the radio environment.
However, some hardware, especially constrained devices, might not support direct SNR observation. In contrast to SNR, RSSI is the most widely used measurement and can be accessed on the majority of radio hardware. It shows high correlation with SNR, since it is obtained in a similar way. Researchers may argue about its inaccuracy due to its low precision, i.e., quantization is around 3 dB on most hardware. As opposed to SNR and RSSI, LQI is a score-based measurement mostly found in radios of ZigBee-like (IEEE 802.15.4) technologies, which provides an indication of the quality of a communication channel for the transmission and the flawless reception of signals. However, the drawback of LQI is the lack of a strict definition, leaving it to the vendor to decide its implementation, which may make cross-hardware comparison across vendors difficult.

• For the more dynamic environments of wireless networks, where nodes are mainly mobile, information regarding the physical (geographical) locations can be beneficial.

• Additionally, there are other software-related measurements, including queue size, queue length and frame length, just to name a few. If we refer to domain knowledge¹, longer frames tend to be more prone to errors,

¹Domain knowledge is the knowledge relating to the associated environment in which the target system performs, where the knowledge concerning the environment of a particular application plays a significant role in facilitating the process of learning in the context of ML algorithms.

TABLE VII: Publicly available trace-sets for the analysis of LQE.
MIT, Roofnet [94], [95], 2002
  HW & technology: Cisco Aironet 350, IEEE 802.11b, mesh, custom Roofnet protocol
  Measurements: source, destination, sequence, time, signal, noise and so on
  Data points: 21 258 359 (1725 links, 4 bitrates)
  Type: 1-to-N
  Notes: Which packets were lost on a link is not provided.

Rutgers University, ORBIT testbed [96], 2007
  HW & technology: 29x PC + Atheros 5212, IEEE 802.11abg
  Measurements: seq. number, RSSI
  Data points: 611 632 (406 links, 300 packets/link, 1 packet/100 ms, 5 levels of noise)
  Type: 1-to-N
  Notes: Minor preprocessing is involved.

"Packet-metadata" [97], 2015
  HW & technology: 2x TelosB, IEEE 802.15.4
  Measurements: RSSI, LQI, noise floor, packet size, no. of retries, energy, Tx power, ACK, queue size and so on
  Data points: 14 515 200 (300 packets per 80 646 runs per 6 distances)
  Type: 1-to-1
  Notes: It requires minor preprocessing.

Colorado [98], 2009
  HW & technology: 5x listeners, IEEE 802.11
  Measurements: signal strength, data rate, channel, time-stamp and so on
  Data points: 29 000 (500 packets per 58 locations)
  Type: 1-to-1
  Notes: It requires preprocessing.

University of Michigan [99], 2006
  HW & technology: 14x Mica2, proprietary protocol, sub-GHz ISM
  Measurements: RSSI
  Data points: 580 762 (1 packet/0.5 s, 30 min/device, 3191 records/link)
  Type: 1-to-N
  Notes: MATLAB's binary format is used and inconsistent data is observed (leading zeros and no units). Source and destination nodes are not clearly identified.

EVARILOS, UGent [100], 2015
  HW & technology: 6 nodes, Bluetooth
  Measurements: RSSI, time-stamp
  Data points: 5 938 (< 35 000 records/link)
  Type: 1-to-N
  Notes: Hospital environment is considered in the absence of interference.

University of Colorado [101], [102], 2009
  HW & technology: 6x PC with omni-directional antennas, 1x distinctly configured omni-directional antenna for the transmitter, IEEE 802.11
  Measurements: seq. number, coordinates, direction, Tx power, 5x RSSI values per log
  Data points: 5x 623 207 (500 packets per 180 positions per 4 directions per 11 Tx levels per 5 nodes)
  Type: 1-to-N
  Notes: The experiment is composed of nodes equipped with antennas capable of serving 4 different directions. Tx power is variable and extensive documentation is available.

Brussels University [103], 2007
  HW & technology: 19x Tmote Sky, IEEE 802.15.4
  Measurements: seq. number, RSSI, LQI, time-stamp
  Data points: 112 793

TABLE VIII: Available features of the trace-sets surveyed in Table VII for the sake of LQE.
Roofnet [94], [95]: seq. numbers; time-stamp (implicit); SNR (signal and noise); HW specs.
Rutgers [96]: seq. numbers; time-stamp; RSSI; SNR (noise only, no signal); location; HW specs.
"Packet-metadata" [97]: seq. numbers; time-stamp; RSSI; LQI; SNR (signal and noise); location; queue (size and length); frame size; HW specs.
Colorado [98]: seq. numbers; time-stamp; RSSI; location; frame size; HW specs.
University of Michigan [99]: time-stamp; RSSI; HW specs.
EVARILOS [100]: seq. numbers; time-stamp; RSSI; location; HW specs.
Colorado [101], [102]: seq. numbers; time-stamp; RSSI; location; HW specs.
Brussels [103]: seq. numbers; time-stamp; RSSI; LQI; HW specs.

while queuing statistics can reveal information concerning buffer congestion.

• For the interpretation of the technical research outcomes, revealing which hardware was utilized during data collection is important, helping to diagnose potentially erratic behavior of some hardware, including sensitivity degradation over time.

As can be seen from Table VIII, no single metric appears in all trace-sets; however, sequence numbers, time-stamps, RSSI, location and hardware specifications are available in the majority.

Roofnet [94] is a well-known WiFi-based trace-set built by MIT. It contains the largest number of data points among the trace-sets listed in Table VII. However, it is difficult to obtain the exact Roofnet setup/configuration used during the collection of the measurement data, since the testbed has evolved with other contributions. One particular drawback of Roofnet is that PRR, as a potential LQE candidate, can only be computed as an aggregate value per link, without knowledge of how the link quality varied over time. Table VIII shows that this particular trace-set strictly depends on SNR values for the analysis of LQE.

The Rutgers trace-set [96] was gathered in the ORBIT testbed. It is large enough for ML models, requires only moderate preprocessing and is appropriately formed for data-driven LQE. It exhibits an overall packet loss of 36.5%. The meta-data contains information regarding physical positions, timestamps and the hardware used. The trace-set for each node contains raw RSSI values along with sequence numbers, as depicted in Table VIII. From the surveyed papers, [18] relies on both Rutgers and Colorado, while [11] considers only Rutgers.

The "packet-metadata" trace-set [97] comes with a plethora of features convenient for LQE research, as indicated in Table VIII.
In addition to the typical LQI and RSSI, it provides information about the noise floor, transmission power, dissipated energy as well as several network stack and buffer related parameters. One of the major characteristics of this trace-set is that it enables observation of the packet queue. Packet loss can only be observed in rare cases with a very small packet queue length.

Upon closer investigation of the remaining six trace-sets listed in Table VII, they are not primarily targeted at data-driven LQE research. The trace-set from the University of Michigan [99] is somewhat incomplete and suffers from an inconsistent data format with missing units, missing sequence numbers and inadequate documentation. The two EVARILOS trace-sets [100] are mostly well formatted, but each contains fewer than 2,000 entries, and thus both are ill suited for data-driven LQE research. In the Colorado trace-set [101], diversity of link performance is missing, as all links seem to exhibit less than 1% packet loss. Finally, the trace-set of Brussels University [103] is, at the time of writing, inadequate for data-driven LQE analysis, suffering from an inconsistent data structure and deficient documentation.

After careful evaluation of the candidate trace-sets, we can conclude that the most suitable candidate for data-driven analysis of LQE is the Rutgers trace-set. Roughly speaking, all the other candidates lack sufficient size, are structured in an improper format, contain negligible packet loss hindering practical LQE investigation and/or rely on deficient documentation. However, these are the main characteristics required for ML-based LQE investigation, where classification primarily depends on PRR. Even though we concluded that the Rutgers trace-set is the most suitable one for data-driven LQE research, it also lacks some aspects critical for near-perfect data-driven LQE research, including explicit time-stamps and non-artificial noise sources, just to name a few.
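As the discussion of sequence numbers earlier in this section notes, PRR and its counterpart PER can be reconstructed from the sequence numbers of received packets, since losses appear as gaps and duplicates as repeats. The following is a minimal, illustrative sketch in plain Python; it assumes the observed window starts and ends with a received packet, which slightly overestimates PRR on real traces:

```python
def prr(received_seq):
    """Packet Reception Ratio from received sequence numbers: gaps in the
    sequence count as lost packets, duplicates are counted only once."""
    if not received_seq:
        return 0.0
    # Number of packets the transmitter must have sent in this span.
    expected = max(received_seq) - min(received_seq) + 1
    return len(set(received_seq)) / expected

def per(received_seq):
    """Packet Error Ratio, the counterpart of PRR."""
    return 1.0 - prr(received_seq)
```

For a trace-set such as Rutgers, such a function would be applied per link (and, there, per artificial noise level) after preprocessing.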
We take this conclusion into account later in Section VI-C, where we suggest to the industry and research community a design guideline on how a good trace-set should be collected.

VI. FINDINGS
In this section, we present our findings resulting from the comprehensive survey of data-driven LQE models, publicly available trace-sets and the design of ML-based LQE models. First, we elaborate on the lessons learned from the aforementioned survey of the literature; then we suggest to the industry and research community design guidelines for developing ML-based LQE models based on application quality aspects, and for generic trace-set collection.
A. Lessons Learned
Having surveyed the comprehensive literature on LQE models using ML algorithms in Section II, we now outline the lessons we have learned from that survey.

• While traditionally most LQE models were developed to eventually be used by a routing protocol, researchers have recently also identified their potential application in single-hop networks, particularly with the intention of reducing network planning costs via automation [6].

• Recently, new sources of information or input metrics, such as topological- and imaging-based ones, have been considered for the development of LQE models, as noted in Section II-C.

• From Sections II-D and III, it can be concluded that reinforcement learning is a relatively less popular ML method for LQE research.

• A number of LQE models provide a categorization (grade) of link quality rather than continuous values. The analysis in Section II-E shows that the number of categories or classes (link quality grades) varies between 2 and 7.

• There is no standardized and easy way of evaluating and benchmarking LQE models against each other, as is evident from the analysis in Section II-F.

• Only a small number of research papers provide all the details and datasets needed for the results to be readily reproduced by the research community, improved upon, and utilized as a baseline/benchmark for comparative analysis, as discussed in Section II-G.

We highlight the following lessons learned from the application-perspective analysis of the ML-based LQE models performed in Section III.

• From the applications that use LQE, such as multi-hop routing protocols, we were able to identify five application quality metrics that are indispensable for the development of an ML-based LQE model: reliability, adaptivity/reactivity, stability, computational cost and probing overhead. These application quality metrics are outlined and explained in Section III and distilled from the extensive survey in Section II.
These metrics are sometimes used to evaluate the performance of the application with and without using LQE.

• Only a paucity of contributions explicitly consider adaptivity, stability, computational cost and probing overhead in their evaluation of the performance of an LQE model, as perceived from the analysis in Section III. No research paper considers all five aspects together.

• To develop LQE models for wireless networks with dynamic topology, adaptivity can be enabled with the aid of online learning algorithms. Important link changes are difficult to capture with offline models, resulting in a degradation of the performance of the LQE model, as the up-to-date link state is unknown to the intended devices.

The lessons learned from the design decisions taken in developing existing ML-based LQE models, as analyzed in Section IV, can be summarized as follows.

• Training data for ML models often miss data points; for example, no records can be found for the lost packets. The approach adopted for compensating for the missing data, such as interpolation, may have a significant impact on the final performance of the LQE model, and explicitly describing the process is important for enabling reproducibility.

• The feature sets utilized for LQE research are not always explicitly reported, nor are they identical among different LQE models, which hinders fair comparative analysis across diverse parameter settings.

• Training data for ML models can be highly imbalanced. Classification-wise, for example, the training dataset can be dominated by one type of link quality class (grade), which consequently leads to a highly biased LQE model that is unable to recognize minority classes. To counter this artifact, resampling has to be employed for highly imbalanced datasets. No research papers explicitly state their resampling strategy, as readily observed in Fig. 10 of Section IV-C.

• Logistic and linear regressions are linear models that tend to be more suitable for approximating linear phenomena.
In practical scenarios, LQE models do not obey linearity, and therefore ANN-based models outperform linear models. However, ANN- and DNN-based models usually require substantial memory and computational resources, which is unfavorable for constrained devices, albeit they may be tuned to necessitate fewer resources at the expense of proportionally lower performance.

From the overview of measurement data sources in Section V, we have learned the following lessons.

• Only a limited number of publicly available datasets record overlapping/identical metrics, which could indeed empower fair comparative analyses between diverse LQE models.

• Measurement points indicating the quality of the received signal on links are commonly defined by SNR, RSSI and LQI.
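The resampling lesson above can be illustrated with a minimal random-oversampling sketch in plain Python; the class labels and counts are hypothetical, and real pipelines may prefer more elaborate schemes such as SMOTE:

```python
import random

def oversample_minority(samples, labels, seed=0):
    """Random oversampling: duplicate minority-class examples until every
    link-quality class is as frequent as the majority class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        out_x.extend(xs)
        out_y.extend([y] * len(xs))
        # Draw random duplicates until this class reaches the target count.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(extra)
        out_y.extend([y] * len(extra))
    return out_x, out_y
```

On a trace dominated by "good" links, this balances the training set before model fitting, at the cost of repeating minority examples.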
B. Design Guidelines for ML-based LQE Model
Due to the very large decision space for developing an ML-based LQE model, it is challenging to provide a universal decision diagram or methodology. However, showing how application requirements affect design decisions and, conversely, how certain design decisions can favor some application requirements can be invaluable for the development of ML-based LQE models. In this section, we provide design guidelines for developing an ML-based LQE model, starting from the five application quality aspects identified in Section III and their implications on decisions during the design steps of the ML process discussed in Section IV. The relationship between the application quality aspects and the design decisions for developing LQE models is illustrated in Fig. 13.
1) Reliability:
When reliability is the only application quality aspect to be optimized for developing an ML-based LQE model, trace-set collection, data pre-processing and ML method selection should be carefully considered, as depicted in the Reliability branch of the mind map in Fig. 13.

Trace-set collection:
The trace-set collection, and the subsequent probing mechanism utilized during the actual operation of an LQE model, can collect all the input metrics listed in Table VIII and perhaps even other inventive metrics that have not been used to date in the existing literature.
Data pre-processing:
During data pre-processing, high-dimensional feature vectors using recorded input metrics as well as synthetically generated ones (see Section IV-B) can be used, as there are no constraints on the memory use or computational power of the machine used to train the subsequent model.
ML method selection:
During ML method selection, more computationally expensive methods, such as DNNs, SVMs with non-linear kernels, and ensemble methods such as random forests, can be considered. These methods are able to train on high-dimensional feature vectors, producing accurate models with very good reliability. However, they will also require many training data points, possibly hours or days of measurements. While DNNs are known to be very powerful, they are also excessively data hungry; their performance can be significantly diminished if the data points are insufficient.
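As a concrete illustration of the pre-processing step above, the short Python sketch below expands raw per-packet metrics into a higher-dimensional feature vector. The particular statistics chosen are only one possible example of the synthetic features discussed in Section IV-B, not a prescription:

```python
from statistics import mean, stdev

def synthetic_features(rssi_window, lqi_window):
    """Expand raw per-packet RSSI and LQI readings over a window into a
    higher-dimensional feature vector (illustrative synthetic features)."""
    feats = []
    for series in (rssi_window, lqi_window):
        feats += [
            mean(series),               # central tendency
            stdev(series),              # link variability
            min(series),
            max(series),
            series[-1] - series[0],     # coarse trend over the window
        ]
    return feats
```

A reliability-oriented design would feed many such windows, possibly with additional metrics, into an expressive model such as a random forest.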
2) Adaptivity:
When adaptivity is the only application quality aspect to be optimized for developing an ML-based LQE model, data pre-processing and ML method selection are the two aspects to be examined, as illustrated in the Adaptivity branch of the mind map in Fig. 13.

Data pre-processing:
Adaptivity requires the LQE model to capture non-transient link fluctuations; therefore, it has to monitor the temporal aspects of the link. This is usually realized by introducing time windows on which the pre-processing is done. As opposed to pre-processing all available data in bulk for subsequent offline development, as employed for the reliability aspect, each window is pre-processed separately for adaptivity. The size of the window then influences the adaptivity of the model, where a smaller window size yields a more adaptive model.
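The window-size trade-off can be sketched as follows (plain Python, illustrative data): each output value is the PRR over the most recent window, so a small window tracks link changes quickly, favoring adaptivity, while a large window smooths transients, favoring stability:

```python
def windowed_prr(received_flags, window_size):
    """Sliding-window PRR: 1 marks a received packet, 0 a lost one.
    Returns one PRR value per fully populated window position."""
    out = []
    for i in range(window_size, len(received_flags) + 1):
        window = received_flags[i - window_size:i]
        out.append(sum(window) / window_size)
    return out
```

With `window_size=2` the estimate drops to 0.0 immediately after two losses; with a window covering the whole trace it would barely move.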
ML method selection:
During the ML method selection, online versions of ML methods or reinforcement learning are more suitable for capturing changes over time. Generally, the online version of an offline ML method may be slightly more expensive computationally and its performance may be slightly reduced. Reinforcement learning is a class of ML algorithms that learn from experience, and these are inherently designed to adapt to changes. The higher the required adaptivity, the faster the model has to change, leading to more reactive ML (method) parameter tuning.
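As an illustration of an online ML method, the sketch below implements a minimal logistic-regression link classifier trained by stochastic gradient descent, updated with every new labeled observation. The class name, features and learning rate are hypothetical, not drawn from any surveyed model:

```python
import math

class OnlineLogisticLQE:
    """Minimal online logistic-regression link classifier: the model is
    updated incrementally with each new (features, good/bad) observation."""
    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        """Probability that the link is 'good' (label 1)."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """One SGD step on the log-loss; y is 0 (bad) or 1 (good)."""
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err
```

Because each update touches only the current sample, such a model keeps adapting as the link evolves, at constant memory cost.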
3) Stability:
When stability is the only application quality aspect to be optimized for developing an ML-based LQE model, the same ML design steps are affected as outlined for the Adaptivity aspect, namely data pre-processing and ML method selection, as portrayed in the Stability branch of the mind map in Fig. 13. However, they are affected in the opposite direction when compared to the adaptivity aspect.
Data pre-processing:
Stability requires the LQE model to be immune to transient link behavior. While it may assume changes over time, it encourages only relevant changes. The size of the window chosen in this case typically represents a compromise between the batch approach mentioned for reliability and the relatively small reactive window that maximizes adaptivity.

ML method selection:
During the ML method selection, online versions of ML methods or reinforcement learning are more suitable for capturing changes over time; however, they need to be optimized to detect persistent link changes while remaining immune to transient ones.

Fig. 13: Mind map representation of design guidelines for LQE model development. [The map branches into Reliability, Computational Cost, Stability, Adaptability and Probing Overhead; each branch points to the affected design steps (trace-set collection, preprocessing, method selection) with recommendations such as longer observation times, more metrics, synthetic features and DNN/SVM/ensemble methods for reliability; minimized feature sets, reduced dimensionality and precision, and less expensive or online algorithms for computational cost; larger windows and immunity to transient effects for stability; smaller windows, online algorithms and reinforcement learning for adaptability; and passive probing with a minimal feature set for probing overhead.]
4) Computational Cost:
When computational cost is the only application quality aspect to be optimized for developing an ML-based LQE model, data pre-processing and ML method selection should be carefully contemplated, as outlined in the Computational Cost branch of the mind map in Fig. 13.

Data pre-processing:
Computational cost optimization requires reducing the memory, energy consumption and processing requirements of the LQE model development. For offline or batch processing, the size of the feature vectors should be kept to a minimum; therefore, they should include only the most relevant real or synthetic features. Alternatively, projecting large feature vectors into a lower-dimensional space might help with training. Additionally, for online processing, smaller time windows that minimize RAM consumption are favored.
ML method selection:
During ML method selection, less computationally intensive methods, such as naive Bayes or linear/logistic regression, are preferred. When online versions of the ML methods are utilized, their configurations should be appropriately adjusted so that resource usage is kept to a minimum. For instance, transfer learning [87] approaches enable stripped-down versions of a complete model, previously learned on a powerful machine, to be deployed to the production environment. Transfer learning is becoming a relatively popular way of deploying DNN-based models, for instance on flying drones [87].
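The projection to a lower-dimensional space mentioned under data pre-processing can be sketched with a simple random projection, a cheap dimensionality-reduction technique; the input and output dimensions here are illustrative:

```python
import random

def random_projection(x, k, seed=0):
    """Project a feature vector x down to k dimensions using a fixed random
    Gaussian matrix, a low-cost way to shrink per-sample footprint before
    training a model for a constrained device."""
    rng = random.Random(seed)
    # One row of Gaussian weights per output dimension, scaled by 1/sqrt(k).
    proj = [[rng.gauss(0, 1) / k ** 0.5 for _ in x] for _ in range(k)]
    return [sum(p * xi for p, xi in zip(row, x)) for row in proj]
```

Because the matrix is generated from a fixed seed, the same projection can be reproduced on the training machine and on the deployed device without storing the matrix itself.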
5) Probing overhead:
When probing overhead is the only application quality aspect to be optimized for developing an ML-based LQE model, trace-set collection is the only design process that requires careful attention, as illustrated in the Probing Overhead branch of Fig. 13.

Trace-set collection:
The trace-set collection, and the subsequent probing mechanism utilized during the actual operation of the LQE model, should collect only the few most important metrics from the ones listed in Table VIII. Ideally, the LQE model can be engineered to work with passive probing, so that it only uses the metrics that the transmitter captures.
6) Practical scenarios:
A practical application using LQE will likely require optimizing more than one of the five identified application quality aspects. As a result, the guideline and its illustration for such cases would be more sophisticated and interconnected than in Fig. 13. However, the proposed guideline provides an overview of the measures to be taken and presents the invaluable trade-offs between these application quality aspects that require careful attention for the development of an ML-based LQE model.

For example, when the application requires high reliability and adaptivity, large feature spaces can be used with powerful online algorithms on appropriately identified time windows. However, if computational cost is appended to the requirements, the feature space should be limited and the algorithm parameters should be optimized. If the LQE model is still computationally expensive, transfer learning or other out-of-the-box ML methods should be employed. When probing overhead is also appended to the previously mentioned application quality aspects, then the feature set should only include locally available data (passive probing) and a limited number of metrics (possibly none) involving active probing, as discussed in Section II-C. In brief, this guideline can be used as a reference for the development of an ML-based LQE model depending on the combination of quality aspects relevant for the application.
C. Design Guidelines for Trace-Set Collection
We now attempt to provide a generic guideline on how to design and collect an LQE trace-set, as portrayed in Fig. 14. It is worth noting that this design guideline comprises plausible and reasonable observations gleaned from this survey of LQE models and trace-sets, and from the analysis of the ML methods reviewed for the sake of LQE models. Our recommendations on how to design and collect an LQE trace-set can be summarized as follows, and can also be followed in Fig. 14.
1. Core components of a trace-set: Deciding on the data collection strategy, the application and the environment is a crucial stage, since the development of an LQE model is strictly dependent on the trace-set environment, including industrial, outdoor, indoor and "clean" laboratory environments. The state of the radio spectrum and the interference level are important metrics to be taken into account before collecting a trace-set. For example, for an LQE model to work efficiently in a particular environment that is exposed to interference, the LQE model has to be developed and trained on this kind of trace-set. More explicitly, one cannot expect an ML-based LQE model to perform well in an interference-exposed environment without having it implemented and tested on a trace-set containing interference measurement data, which leads us to the data collection strategy and the application.
2. Availability and documentation: Making a trace-set publicly available is another important stage, which can indeed empower better cross-testbed comparisons and provide good support from the research community to conduct and disseminate research on LQE models. There are numerous ways to make trace-sets publicly available. One well-known repository for wireless trace-sets is CRAWDAD, although researchers can also take advantage of other methods such as public version control systems, e.g., GitHub, GitLab and BitBucket, just to name a few. Moreover, a systematic description of how the trace-set was collected is also required for the research community to understand, test and improve upon it. This will indeed help in capacity building between research groups.
3. Essential measurements data: Plausible logic dictates that a generic trace-set that can be utilized for any kind of LQE research is infeasible, considering the numerous features induced by wireless communication parameters. Interpreting the overall observations gleaned from this survey, some of the most important measurements or features recommended for effective LQE research are already included in the design guideline of Fig. 14, with the notice that other application-dependent features may be required for a strong analysis of the LQE model. The elaborated details of these essential measurements can be found in Section V. There may be other application-dependent metrics and features (measurements data) related to the set of wireless communication parameters that could be taken into account for a sound investigation of a particular LQE model. We observe from the outcomes of this survey that each application can have unique characteristics and requirements for maintaining reliability, for satisfying a certain QoS and, more generally, for accomplishing a target objective, such as in smart grids, wireless sensor networks, mobile cellular communication, air-to-air communication, air-to-ground communication, traditional terrestrial communication, underwater communication and other wireless networks. Explicitly, for each application of these networks, determining a suitable evaluation metric is vitally important for the sake of maintaining reliable and adequate communication.
Therefore, trace-sets have to be designed and collected based not only on the applications but also on the evaluation metrics, considering diverse environments, settings and technologies, in order to be able to derive properly effective metrics for the efficient development of link quality estimation models.

(CRAWDAD is a repository for archiving wireless data at Dartmouth: https://crawdad.org.)

Fig. 14: Design guidelines recommended for the industry and research community to follow in order to design and collect trace-sets for the sake of LQE research. [The guideline covers the core components of a trace-set (measurement environment, data collection strategy, target application, time-varying channel, mobility of nodes); availability and documentation (publicly available trace-sets, systematic documentation, cross-testbed comparison, support from industry and the research community); and essential measurements data (sequence numbers enabling channel quality metrics such as PRR, PER and duplicates; time-stamps; SNR, RSSI and LQI; background noise; location information empowering localization; queue size and length; frame length; packet information; and hardware specifications revealing erratic behaviors and hardware aging). More application-dependent features may be required for a potent analysis of the LQE model.]

Nonetheless, from the perspective of innovative data sources, a trace-set can be built without on-site measurements and before embarking on hardware deployments, in order to provide a good estimate of the link quality for the sake of maintaining reliable communications. To achieve such a goal, Demetri et al. [6] exploited readily available multi-spectral images from remote sensing, which are utilized to quantify the attenuation of the deployment environment based on the classification of landscape characteristics. This particular research demonstrates that the quantification and classification of links can be conducted by relying solely on an image-based data source rather than on traditional on-site measurement data.

For urban area applications, the aforementioned technique can also be leveraged to maintain the link quality up to a certain degree, but only given the stationarity of the deployment environment. This is mainly because the spectral images obtained via remote sensing represent a stationary instance of the landscape, and thus this technique would dramatically fail, since an LQE model developed using remote sensing would not be able to cope with high mobility in scenarios with moving vehicles, slowly-fading pedestrian channels, mobile UAVs and so on.

Besides, 3D models of large buildings can also be leveraged for the optimal indoor deployment of access points and wireless devices in order to supply adequate connectivity and coverage. The trace-set built from such an indoor deployment can be utilized for other large and similar indoor buildings, along with an indoor-generic LQE model, to understand the characteristics of indoor links and to provide high-quality link performance. Similarly, the same strategy can be implemented for a particular city to understand the link behavior under different weather conditions.
One study for such a scenario was conducted at high frequencies [104], [105], where the impact of rainfall on wireless links was researched. The authors utilized rain gauges; their models were demonstrated to contain large bias and rainfall predictions were underestimated, which indicates that long-lasting and realistic measurement conditions, along with a plethora of measurement data, are required before developing a sound LQE model.

Finally, recording hardware-related metrics in a trace-set could also help in diagnosing potential problems during model development. This would indeed require commercial radio chips that are capable of reporting chip errors or chip-related issues in order to pinpoint problems that may be encountered at the time of measurement data collection [106].

VII. SUMMARY
Having outlined the lessons learned, along with a comprehensive design guideline derived for ML-based LQE model development and trace-set collection, we now provide our concluding remarks and future research directions, together with challenging open problems.
A. Conclusions
Data-driven approaches were adopted in the study of LQE long ago. However, with the adoption of ML algorithms, the field has recently gained new momentum, stimulating a broader and deeper understanding of the impact of communication parameters on the overall link quality. In this treatise, we first provide an in-depth survey of the existing literature on LQE models built from data traces, which reveals the expanding use of ML algorithms. We then analyze ML-based LQE models using performance data, both from the perspective of application requirements and from that of the design process commonly followed in the ML research community. We complement our survey with a review of publicly available datasets relevant for LQE research. The findings from these analyses are summarized, and design guidelines are provided to further consolidate this area of research.
B. Future Research Directions
Finally, we conclude the paper with a discussion of the open challenges, followed by several directions for future research, regarding (i) the data sources utilized for developing LQE models, (ii) the applicability of LQE models to heterogeneous networks incorporating multi-technology nodes, and (iii) a broader and deeper understanding of the link quality in various environments.

It is highly likely that commercial markets will leverage either pre-built LQE models for a particular application or entire training datasets to develop models from scratch. Potential "model stores" and "dataset stores" could follow a path similar to conventional application stores, distributing models for diverse applications, with competition gradually maturing over time. However, data-driven models are still in their infancy, and several critical open challenges concerning LQE models remain, outlined as follows.

1) A significant challenge is to directly compare different wireless link quality estimators. As discussed in Section II-F, there is no standardized approach to evaluating the performance of the estimators, and only a very small subset of estimators is compared directly in existing works. Establishing a uniform way of benchmarking new LQE models against existing ones, using standard datasets and standard ML evaluation metrics as practiced in various ML communities, would greatly improve the ability to reproduce and compare innovative ML-based LQE models.

2) The performance of existing classifier-based LQE models is evaluated solely on the accuracy metric, possibly in addition to another application-specific metric, as discussed in Section II-F. However, it is well known in the ML communities that accuracy is a misleading performance evaluation metric, especially for imbalanced datasets [107].
Adopting standardized metrics for classification, e.g., precision, recall, F1 score and, where necessary, the detailed confusion matrix, would lead to a more in-depth understanding of the actual performance and behavior of LQE models across all the target classes. The same challenge applies to LQE models solving a regression problem.

3) Another challenge is to encourage researchers and industry to share trace-sets collected from real networks. More suitable public trace-sets would allow algorithms and machine learning models to be properly evaluated across different networks and scenarios, considering the important metrics discussed in Section V. Indeed, trace-sets collected in an industrial environment could better represent a realistic communication network, potentially with a broad set of parameters.

4) A further challenge is to go beyond one-to-one trace-sets. The research community needs to extend the scope to more realistic measurement setups, e.g., considering multi-hop, non-static networks comprising several wireless technologies. Such trace-sets are scarce due to the exhausting effort required to monitor and record a packet's travel through a particular communication network.

5) Another challenge is that certain types of trace-sets are very expensive and time-consuming to gather. One way to overcome this is to synthesize artificial data using generative adversarial neural networks, as pointed out in [108].
Roughly speaking, this remains a formidable open challenge, since such synthesis could introduce unwanted bias into existing data, even though a number of suitable examples of this method for specific applications can be found in the literature, such as wireless channel modeling [109], [110].

6) The traditional approach to measuring interference mainly relies on SNR or RSSI measurement data collected at certain intervals, where communication from other nodes is treated as background noise for the sake of simplicity. The aim of interference measurement, as part of this challenge, is to develop LQE models that are aware of the ongoing communication within a heterogeneous communication environment. None of the trace-set layouts surveyed in Section V is designed for such asynchronous information. Therefore, the research community and industry need to pay attention to collecting such realistic trace-sets in order to develop robust, agile and flexible LQE models that can readily adapt to dynamic, realistic communication environments.

7) The wireless link abstraction, comprising the channel, the physical layer and the link layer, represents a complex system affected by a multitude of parameters, yet most LQE datasets and research leverage only a small number of observed parameters. While image-based and topology-based contextual information has recently been incorporated into LQE models, future large-scale multi-parameter measurement campaigns should also capture the type of antenna, the modulation and coding utilized, the transceiver manufacturer and the firmware version, to name a few.
Such efforts would lead to a more in-depth understanding of real-world operational networks, and the findings could inform the design of next-generation wireless systems, even beyond ML-based LQE model development.

To move beyond simple decision making, i.e., towards channel and radio behavior modeling, hand-tuning of communication parameters within transceivers must be avoided. It is anticipated that transceivers' internal components will gradually be replaced by software-based counterparts. Therefore, an inevitable convergence of software-defined radio (SDR), FPGAs and link quality estimators is expected, intelligently handling parameters and operations through self-contained smart components. Such joint LQE models can be designed in a manner similar to [111], particularly for heterogeneous networks involving 5G and beyond communications.

Recent advances in data-driven approaches, in the form of machine learning and deep learning, have already proven successful in communication network applications. For example, neural network-based autoencoders for channel decoding provide promising solutions [112], which can also be adopted for data-driven LQE investigation, as discussed in [40].

The performance of a link quality estimator is constrained by the dynamic network topology. Network topology changes can be tracked using the replay-buffer-based deep Q-learning algorithm developed in [113], where the authors control the position of UAVs, acting as relays, to compensate for deteriorated communication links.

Additionally, LQE models involved in optimization problems may become very large, and thus algorithms that reduce complexity have to be developed to tackle the scale of the problem.
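To make the evaluation-metric concern of challenge 2 concrete, the following minimal sketch (using hypothetical binary link-quality labels and only NumPy, not any estimator from the surveyed works) shows how a trivial estimator that always predicts "good" attains high accuracy on an imbalanced trace while completely missing the minority "bad" class:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical imbalanced trace: 95% "good" links (1), 5% "bad" links (0).
y_true = rng.choice([0, 1], size=1000, p=[0.05, 0.95])
y_pred = np.ones_like(y_true)  # trivial estimator: always predict "good"

# Confusion-matrix cells for the positive ("good") class.
tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

accuracy   = (tp + tn) / y_true.size
precision  = tp / (tp + fp)
recall     = tp / (tp + fn)
f1         = 2 * precision * recall / (precision + recall)
bad_recall = tn / (tn + fp)  # recall of the minority "bad" class

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f} bad-class recall={bad_recall:.3f}")
```

Accuracy lands near 0.95, yet the recall of the "bad" class is exactly zero, i.e., the estimator never detects a degraded link; per-class metrics expose what accuracy hides.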
For instance, a deep learning approach similar to [114] can be adopted to reduce complexity by eliminating from the optimization problem the links that are not utilized for transmission.

Referring back to Section II-F, we discussed the convergence rate of LQE models. While some contributions [9], [11], [12], [17] focus their attention on the convergence of their LQE model, the majority of papers tend to neglect it. Motivated by this, we suggest that the research community pay particular attention to LQE model convergence in order to demonstrate the validity of their proposed models.

In addition to finding new sources of data, a challenging task would be to analyze a large set of measurements taken in various environments and settings, from a large number of manufacturers, in order to understand how measurements vary across different technologies and across implementations of the same technology, and to derive truly effective metrics for the efficient development of link quality estimation models.

ACKNOWLEDGMENT
This work was funded in part by the Slovenian Research Agency under Grants P2-0016 and J2-9232, and in part by the European Community H2020 NRG-5 project under Grant 762013. The authors would also like to thank Timotej Gale and Matjaž Depolli for their valuable insights.

ACRONYMS

4B Four-Bit
4C Foresee
AI Artificial Intelligence
BER Bit Error Rate
CDF Cumulative Distribution Function
ETX Expected Transmission count
F-LQE Fuzzy-logic based LQE
FLI Fuzzy-logic Link Indicator
KDP Knowledge Discovery Process
LQ Link Quality
LQE Link Quality Estimation
LQI Link Quality Indicator
MAE Mean Absolute Error
ML Machine Learning
NLQ Neighbor Link Quality
PER Packet Error Rate
PRR Packet Reception Ratio
PSR Packet Success Ratio
RMSE Root-Mean-Square Error
RNP Required Number of Packets
ROC Receiver Operating Characteristic
RSS Received Signal Strength
RSSI Received Signal Strength Indicator
SGD Stochastic Gradient Descent
SNR Signal-to-Noise Ratio
SVM Support Vector Machine
TCP Transmission Control Protocol
WMEWMA Window Mean with an Exponentially Weighted Moving Average
WNN-LQE Wavelet Neural Network based LQE