Client-Based Intelligence for Resource Efficient Vehicular Big Data Transfer in Future 6G Networks

Abstract

Vehicular big data is anticipated to become the "new oil" of the automotive industry which fuels the development of novel crowdsensing-enabled services. However, the tremendous amount of transmitted vehicular sensor data represents a massive challenge for the cellular network. A promising way to achieve relief and to use the existing network resources more efficiently is to employ intelligence on the end-edge-cloud devices. Through machine learning-based identification and exploitation of highly resource-efficient data transmission opportunities, the client devices are able to participate in the overall network resource optimization process. In this work, we present a novel client-based opportunistic data transmission method for delay-tolerant applications which is based on a hybrid machine learning approach: Supervised learning is applied to forecast the currently achievable data rate which serves as the metric for the reinforcement learning-based data transfer scheduling process. In addition, unsupervised learning is applied to uncover geospatially-dependent uncertainties within the prediction model. In a comprehensive real-world evaluation in the public cellular networks of three German Mobile Network Operators (MNOs), we show that the average data rate can be improved by up to 223 % while simultaneously reducing the amount of occupied network resources by up to 89 %. As a side effect of preferring more robust network conditions for the data transfer, the transmission-related power consumption is reduced by up to 73 %. The price to pay is an increased Age of Information (AoI) of the sensor data.



Benjamin Sliwa, Student Member, IEEE, Rick Adam, and Christian Wietfeld, Senior Member, IEEE

Accepted for publication in: IEEE Transactions on Vehicular Technology. © 2021 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected].

I. INTRODUCTION

The various sensing and communication capabilities of modern vehicles have brought up vehicular crowdsensing [1], [2] as a novel method for acquiring various kinds of measurement data. Hereby, the mobility behavior of the vehicles is exploited to dynamically cover large areas with sensing capabilities. It is expected that the vehicle-as-a-sensor approach will catalyze the development of data-driven applications such as distributed creation of High Definition (HD) environmental maps, traffic monitoring, predictive maintenance, road roughness detection, and distributed weather sensing [3].

As pointed out by [4], a large share of these target applications (in particular, mapping services) can be characterized as delay-tolerant. Hereby, the applications do not require immediate data delivery but specify soft deadlines within which the received information is considered meaningful. In their empirical analysis, the authors of [5] analyzed the properties of 32 existing crowdsensing systems, of which 23 were found to be compatible with store-and-forward data delivery mechanisms. As an example, the Automotive Edge Computing Consortium (AECC) has analyzed the requirements for distributed construction of HD environmental maps for automated driving in a recent white paper [6]. For permanent and transient static objects (e.g., road network, surrounding buildings, road work), an update interval in the range of multiple hours is proposed. Even for reporting dynamic obstacles such as other traffic participants, periodic data transfer with an interval of 15 s is considered sufficient.

The authors are with the Communication Networks Institute, TU Dortmund University, 44227 Dortmund, Germany (e-mail: {Benjamin.Sliwa, Rick Adam, Christian.Wietfeld}@tu-dortmund.de)

Fig. 1: Overview of applications, challenges, and enabling methods for vehicular big data in cellular communication networks.

The rise of vehicular big data will confront the cellular network with tremendous resource requirements for vehicular massive Machine-type Communication (mMTC). Since the provision of additional spectrum resources through densification of the network infrastructure is highly cost-intense, it would be preferable to utilize the existing resources in a more efficient way through application of machine learning-enabled network intelligence. An overview of the corresponding applications, challenges, and solution approaches for vehicular big data transfer in cellular networks, which is further described in the following paragraphs, is shown in Fig. 1. Within the scope of this work, we apply a pragmatic approach which utilizes existing methods from the machine learning domain.
However, it is remarked that these enabling methods are themselves subject to active developments in their corresponding research communities. Therefore, it can be expected that future advancements within the neighboring fields can be utilized for further improving the resource efficiency of vehicular big data transfer.

While the current deployments and research efforts for the emerging 5G networks focus on network-side intelligence (e.g., the Network Data Analytics Function (NWDAF) allows machine learning-based load analysis of network slices [7]), researchers agree that pervasive intelligence will be one of the key drivers for future 6G networks which are expected to be deployed around 2030 [8], [9]. As a consequence, this will catalyze the development of non-cellular-centric networking mechanisms such as end-edge-cloud orchestrated intelligence [10] where locally applied machine learning mechanisms allow the client devices to participate in network functions and contribute to the overall network optimization.

Fig. 2: Example for the dynamics of the vehicular radio channel (y-axis: SINR [dB]; annotated regions: connectivity hotspots and connectivity valleys). For optimizing the achievable resource efficiency, client-based intelligence is used to exploit connectivity hotspots and avoid transmissions during connectivity valleys.

An important observation which motivates our contribution is that regular fixed-interval data transmission schemes experience a large variance of the network quality (see Fig. 2). In order to avoid packet errors and retransmissions, the mobile User Equipments (UEs) dynamically adjust the Modulation and Coding Scheme (MCS) to achieve a better robustness in challenging channel situations. However, since lower MCSs reduce the transmission efficiency and increase the occupation time of the Physical Resource Blocks (PRBs), this method results in a wastage of network resources and has a negative impact on the intra-cell coexistence.

In this work, we exploit the delay-tolerant nature of many vehicular crowdsensing applications as well as the mobility of the vehicles for improving the cellular resource efficiency. Client-based intelligence is applied in order to autonomously schedule the data transfer with respect to the anticipated transmission efficiency. Our proposed method brings together and extends the results of previous work for reinforcement learning-enabled data transfer in vehicular scenarios [11], [12]. The contributions provided by this paper are summarized as follows:
• Presentation of Black Spot-aware Contextual Bandit (BS-CB) as a novel hybrid machine learning approach for opportunistic data transfer for mobile and vehicular networks.
• Comprehensive real world performance analysis and comparison to existing data transfer methods.
• Proof-of-concept evaluation for compensating concept drift situations of the data rate prediction through online learning.
• The raw results and the developed measurement software are provided in an open source way: https://github.com/BenSliwa/rawData opportunistic data transfer

The remainder of the paper is structured as follows. After discussing the related work in Sec. II and giving an overview of the different evolution stages of the novel method in Sec. III, we present the reinforcement learning-based solution approach in Sec. IV. Afterwards, the methodological setup is introduced in Sec. V and the achieved results are presented and discussed in Sec. VI. Based on the resulting insights, we derive recommendations for future 6G networks which are summarized in Sec. VII.

II. RELATED WORK

Machine learning has received tremendous attention within the wireless research community due to its inherent capability of implicitly considering hidden interdependencies between measurable indicators which are too complex to model analytically. Different summary papers [13]–[16] provide comprehensive information about using machine learning methods for optimizing wireless networks. Three major machine learning disciplines are distinguished:
• Supervised learning learns a model $f_{ML}$ on features $X$ with labeled data $Y$ such that $f: X \to Y$. After the training phase, the model can be utilized to make predictions $\tilde{y}$ on novel unlabeled data $x$ such that $\tilde{y} = f(x)$. For this purpose, popular model classes are (deep) Artificial Neural Networks (ANNs) [17], Classification and Regression Trees (CARTs)-based methods such as Random Forests (RFs) [18], and Bayesian models such as Gaussian Process Regression (GPR) [19].
• Unsupervised learning is applied to cluster measurements based on patterns in non-labeled data sets. A popular method for this category is the k-means [20] algorithm.
• Reinforcement learning [21], [22] teaches an agent to autonomously perform favorable actions in a defined environment by learning from the observed rewards of previously taken actions. Q-Learning [23] represents the foundation for most of the more complex methods such as deep reinforcement learning.

Within commercial deployments of emerging 5G networks, the implementation of machine learning-based intelligence mainly focuses on the network infrastructure side. NWDAF [7], [24] is a novel machine learning-enabled network function which is used by the MNOs to determine and predict the network load. Different use cases that could exploit this information (e.g., traffic routing, mobility management, load balancing, and handover optimization) are motivated in [25]. Among others, the white paper of [9] envisions pervasive machine learning as one of the fundamental enabling methods for future 6G networks which are expected to be deployed around 2030. As a consequence of the trend of bringing intelligence closer towards the client devices, resource-aware machine learning has become an emerging research topic. A comprehensive summary about resource aspects for edge-based intelligence is provided by Park et al. in [26].

The recent advancements in machine learning-based data analysis have also led to the rise of the end-to-end modeling paradigm for wireless communication systems [27] and have catalyzed the development of novel data-driven performance evaluation methods. Data-driven Network Simulation (DDNS) [28], [29] analyzes the performance of wireless communication systems by replaying empirically acquired context traces. The end-to-end behavior of the observed target Key Performance Indicator (KPI) is then derived by a combination of deterministic and probabilistic machine learning models which mimics the statistical derivations of the real world measurements. In comparison to conventional system-level network simulation [30], this method is able to achieve a better modeling accuracy of radio propagation effects in concrete real world evaluation scenarios and achieves a massively higher computational efficiency. Another advantage is a reduction of the simulation setup complexity since the end-to-end analysis approach solely relies on the acquired data and does not require parameterizing the communicating entities.

Fig. 3: Continuity of context-aware approaches for opportunistic data transmissions in vehicular networks. Stages shown: Periodic (fixed transmission interval, no consideration of radio channel conditions); CAT (heuristic approach, channel-aware data transfer based on SINR measurements); ML-CAT (prediction of the data rate based on nine context features); RL-CAT (autonomous decision making using Q-Learning); BS-CB (uncertainty of the prediction model taken into account, transmissions avoided in black spot regions).

Anticipatory mobile networking [31] is a novel wireless communications paradigm which aims to optimize decision processes in communication systems through explicit consideration of context information. Since mobile and vehicular networks are inherently impacted by the interdependency of mobility and radio channel dynamics [32], machine learning-enabled anticipatory networking is a promising approach for system optimization in this domain. As an example, Dalgkitsis et al. [33] utilize mobility prediction jointly with deep learning for improving the service orchestration process in 5G vehicular networks.

Non-cellular-centric networking [34] integrates the network clients as part of the network fabric and allows them to contribute explicitly or implicitly to network management functions. This approach makes it possible to exploit the capability of the clients to sense their environments for opportunistically scheduling data transmissions for delay-tolerant applications [35] in a context-aware manner. In [36], Shi et al. point out that network congestion has a large short-term variance and that traffic peaks can be compensated by delaying transmissions. Therefore, the authors propose the Collaborative Application-Aware Scheduling of Last Mile Cellular Traffic (CoAST) system which applies a collaborative infrastructure-assisted optimization approach based on dynamic pricing. Hereby, the announced traffic demands of the UEs are used by a central entity which computes and broadcasts the projected data transfer prices for a given future time window. This information is then used by the UEs to schedule their transmissions with respect to the trade-off between price and additional delay. Peek-n-sneak [37] and Client-side Adaptive Scheduler That minimizes Load and Energy (CASTLE) [38] are distributed transmission scheduling approaches which rely on a threshold decision for performing or delaying the data transfer. Both approaches use different network quality indicators (Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), and Signal-to-interference-plus-noise Ratio (SINR)) for predicting the current network load based on a Radial Basis Function (RBF) Support Vector Machine (SVM).

Data rate prediction can serve as a metric for anticipatory decision making such as opportunistic data transfer [39] and dynamic Radio Access Technology (RAT) selection. The predictions can either be performed actively or passively. Active prediction methods apply time series analysis (e.g., based on Long Short-term Memory (LSTM) methods as considered in [40], [41]) and monitor the behavior of ongoing data transmissions. Since the need to continuously transmit data is opposed to the considered opportunistic medium access strategy, this work focuses on passive prediction approaches which have been investigated by different authors. The key insights are summarized as follows:
• Radio channel indicators (e.g., defined according to 3GPP TS 36.213 [42]) are highly correlated to the observed data rate and can serve as meaningful information for predicting the latter [43]–[45].
• Due to the curse of dimensionality [46], complex models such as ANN-based deep learning approaches require a significantly higher amount of training data than simpler methods such as CARTs. As typical data sets in the wireless communication domain are comparably small [9], less complex methods often achieve a higher prediction accuracy [29], [47].
• For the derivation of generalizable prediction models, it is important to integrate application-layer knowledge about the payload size of the data packet to be transmitted [48]. This way, the prediction is able to implicitly account for the interdependency between transmission duration and channel coherence time as well as payload-overhead-ratio and protocol-specific aspects such as the slow start mechanism of the Transmission Control Protocol (TCP).
• A low data aggregation granularity should be preferred: Few models with large data sets (e.g., a single prediction model per MNO) achieve a better average prediction performance than a large number of highly specific models (e.g., dedicated prediction models for each evolved Node B (eNB)) [48], [49].
• Although temporal effects have a significant impact on the network load, the time of day is negligible if load-dependent network quality indicators such as RSRQ are considered in the data set [48], [50].

In addition to these purely client-based approaches, the authors of [51] have analyzed a possible implementation for cooperative data rate prediction in future 6G networks where the network infrastructure actively announces network load information to the mobile clients. In an initial feasibility study, it is shown that the cooperative approach is able to reduce the Root Mean Squared Error (RMSE) by 25 % in uplink and 30 % in downlink direction.

III. TOWARDS REINFORCEMENT LEARNING-ENABLED OPPORTUNISTIC DATA TRANSFER

Different opportunistic data transfer methods have built the foundation for the proposed BS-CB method. The different evolution stages are shown in Fig. 3.

Periodic data transfer represents the regular approach for transmitting Machine-type Communication (MTC) data. The medium access is based on a fixed timer interval $\Delta t$ which transmits the data regardless of the radio channel conditions.

Channel-aware Transmission (CAT) [52] is a probabilistic opportunistic data transfer method which schedules the medium access based on measurements of the SINR. Data is buffered locally until a transmission decision is made for the whole buffer. The transmission probability $p_{TX}(t)$ is computed as

$$ p_{TX}(t) = \begin{cases} 0 & \Delta t < \Delta t_{\min} \\ 1 & \Delta t > \Delta t_{\max} \\ \left( \dfrac{\Phi(t) - \Phi_{\min}}{\Phi_{\max} - \Phi_{\min}} \right)^{\alpha} & \text{else} \end{cases} \tag{1} $$

with $\Phi$ being the transmission metric (here the measured $\mathrm{SINR}(t)$) with a defined value range $\{\Phi_{\min}, \Phi_{\max}\}$. $\Delta t$ represents the time since the last transmission has been performed. $\Delta t_{\min}$ is used to guarantee a minimum packet size and $\Delta t_{\max}$ ensures that the AoI does not exceed the requirements of the target application. The exponent $\alpha$ controls the preference for high metric values within the data transfer process.

Machine Learning CAT (ML-CAT) [39], [53] is a machine learning-based extension to CAT. Due to the short-term fluctuations of the SINR, the transmission decision is performed based on data rate predictions which are obtained from an RF model (see Sec. IV-A). While the actual transmission is still performed based on Eq. 1, the considered metric $\Phi$ corresponds to the predicted data rate $\tilde{S}(t)$.

Reinforcement Learning CAT (RL-CAT) [11] is a first reinforcement learning-based variant of the ML-CAT method which replaces the probabilistic medium access with a Q-learning approach aiming to maximize the data rate of the individual sensor data transmissions. The predicted data rate and the elapsed buffering time form the context tuple $c_t = (\tilde{S}(t), \Delta t)$, which is used to look up the action (IDLE or TX) with the highest Q-value from a Q-table.
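The probabilistic medium access of Eq. 1 can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical, the numeric values in the usage note are illustrative only, and clamping the metric to its value range before normalization is an assumption that the text does not state explicitly.

```python
import random


def transmission_probability(metric, dt, dt_min, dt_max,
                             metric_min, metric_max, alpha):
    """Transmission probability in the spirit of Eq. 1.

    metric: current value of the transmission metric (SINR or predicted data rate)
    dt:     time since the last transmission
    """
    if dt < dt_min:
        return 0.0  # keep buffering to guarantee a minimum packet size
    if dt > dt_max:
        return 1.0  # deadline reached: transmit to bound the AoI
    # Assumption: metric values outside the defined range are clamped.
    clamped = min(max(metric, metric_min), metric_max)
    normalized = (clamped - metric_min) / (metric_max - metric_min)
    return normalized ** alpha


def should_transmit(p_tx, rng=random):
    """Draw the probabilistic transmission decision for one time step."""
    return rng.random() < p_tx
```

With a hypothetical metric range of 0 to 30 and $\alpha = 2$, a metric value of 15 yields a transmission probability of 0.25, which shows how larger exponents shift transmissions towards high metric values.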
The Q-table is trained as

$$ Q(c_t, a) = (1 - \alpha) \cdot Q(c_t, a) + \alpha \left[ r_a + \lambda \cdot \max_a Q(c_{t+1}, a) \right] \tag{2} $$

where $\alpha$ corresponds to the learning rate, $r_a$ is the reward of the action $a$, $\lambda$ represents the discount factor, and the term involving $c_{t+1}$ is an estimation for the Q-value after $a$ has been executed. In classical Q-Learning, it is assumed that the decision making of the agent causes a sequential improvement of its state within the environment and ultimately leads to reaching an "optimal" target state. However, as further discussed in Sec. IV, in the considered opportunistic data transfer use case, the agent-related impact on the state of the environment is negligible due to the dominance of external influences such as the channel and network load dynamics: Even if the agent was capable of performing hypothetical "optimal" actions, its state (represented by the context tuple $c_t$) would still be determined by the impact of the non-controllable influence factors. Therefore, $\lambda$ is set to 0 which results in a simplified Q-Learning variant

$$ Q(c_t, a) = (1 - \alpha) \cdot Q(c_t, a) + \alpha \cdot r_a \tag{3} $$

that implements a myopic approach focusing on optimizing the immediate reward of the taken actions.

IV. REINFORCEMENT LEARNING-BASED OPPORTUNISTIC DATA TRANSFER WITH BS-CB

In this section, we present the novel BS-CB method. According to the classification scheme for edge intelligence provided by [54], the proposed data transfer method represents a level 3: on-device inference edge intelligence implementation where the model is trained in the cloud/offline and inference is run completely locally.

A schematic overview of the interaction between the different logical entities is shown in Fig. 4.
• The actual opportunistic data transfer is modeled as a reinforcement learning agent which senses its environment, performs actions and observes the resulting rewards.
• Hereby, the environment is represented by the real world cellular network. Classical reinforcement learning assumes that the actions taken by the agent change the state of the environment. However, in the considered vehicular scenarios, the properties of the environment are highly time-variant due to the dynamically changing radio channel conditions mainly related to the mobility behavior of the mobile UE. In addition, other users of the cellular network consume network resources which leads to the conclusion that the state of the environment mainly depends on the external influences.
• The sensing of the environment is performed through the hardware platform which observes context indicators. In order to reduce the dimensionality of the reinforcement learning problem, data rate prediction is applied.

Fig. 4: Interaction between the different logical entities within a reinforcement learning setup for opportunistic data transfer.

Fig. 5: Overall system architecture model of the proposed BS-CB method. Machine learning features: RSRP, RSRQ, SINR, CQI, TA, frequency, speed, cell id, payload; label: data rate; further measurement: position; system parameters: target data rate, AoI deadline, trade-off factor; components: data rate prediction (Sec. IV-A), black spot clustering (Sec. IV-B), contextual bandit (Sec. IV-C).

The overall system model of the novel BS-CB is shown in Fig. 5. BS-CB implements a hybrid approach which brings together all major machine learning disciplines. Supervised learning is applied to predict the achievable data rate based on measured context indicators. Unsupervised learning is then utilized to detect geospatially-dependent uncertainties of the prediction model. Finally, the reinforcement learning-based autonomous data transfer uses the acquired information for optimizing the resource efficiency of vehicular data transmissions. In the following paragraphs, the three main components of the proposed method are introduced in further detail.
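Because the state of the environment is dominated by external influences, the myopic Q-table update of Eq. 3 (used by the RL-CAT predecessor of BS-CB) only tracks the immediate reward of each context-action pair. The following Python sketch illustrates this under stated assumptions: the class and function names are hypothetical, and the discretization of the continuous context into bins (including the bin widths) is an assumption, since the text does not specify how the Q-table is indexed.

```python
from collections import defaultdict

ACTIONS = ("IDLE", "TX")


def discretize(predicted_rate, dt, rate_step=1.0, time_step=10.0):
    """Map the continuous context (predicted data rate, buffering time) to a
    Q-table key. The bin widths are assumptions made for this sketch."""
    return (int(predicted_rate // rate_step), int(dt // time_step))


class MyopicQTable:
    """Q-table with the discount factor set to 0, following Eq. 3."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.q = defaultdict(float)  # (context, action) -> Q-value, starts at 0

    def update(self, context, action, reward):
        key = (context, action)
        # Eq. 3: blend the old estimate with the immediate reward only.
        self.q[key] = ((1 - self.learning_rate) * self.q[key]
                       + self.learning_rate * reward)

    def best_action(self, context):
        # Ties resolve to the first action in ACTIONS (IDLE).
        return max(ACTIONS, key=lambda a: self.q[(context, a)])
```

Since no successor state enters the update, the table can be trained from logged context traces in any order, which fits the replay-based exploration hinted at by the DDNS-enabled virtual exploration process in Fig. 4.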

A. Supervised Learning: Data Rate Prediction

The overall feature set $x$ is composed of nine different features from multiple context domains:
• Network context $x_{net}$: RSRP, RSRQ, SINR, Channel Quality Indicator (CQI), Timing Advance (TA)
• Mobility context $x_{mob}$: Velocity of the vehicle, cell id of the connected eNB
• Application context $x_{app}$: Payload size of the data packet to be transmitted

The data rate is then predicted based on a regression model $f_{ML}$ as $\tilde{S}(t) = f_{ML}(x)$. As a preprocessing step, we compare the prediction performance of different machine learning models whereas the parameterization of each model has been optimized based on grid search:
• Artificial Neural Network (ANN) [17] with two hidden layers with 10 neurons per hidden layer and sigmoid activation function, momentum α = 0 . , learning rate η = 0 . , and 500 training epochs.
• CART methods M5 Regression Tree (M5) and Random Forest (RF) [18] with 100 random trees and maximum depth 15.
• Support Vector Machine (SVM) with RBF kernel and Sequential Minimal Optimization (SMO) training.

Fig. 6: Resulting data rate prediction performance for different regression models on the MNO A data set (y-axis: RMSE [MBit/s]; panels: uplink and downlink). ANN: Artificial Neural Network, M5: M5 Regression Tree, RF: Random Forest, SVM: Support Vector Machine.

The resulting RMSE of the data rate prediction models on the MNO A data set of [48] is shown in Fig. 6. In both evaluations, the lowest prediction error is achieved by the RF model. In uplink direction, different context indicators have specific regions of application: As discussed in [48], RSRQ is an important indicator for the data rate in cell edge regions and SINR has a higher impact on the latter in the center of the cell; both can be distinguished through considering the RSRP. These interval-wise scope regions match well with the condition-based model architecture of the RF model. However, in downlink transmission direction, the differences between the considered prediction models are less significant. This observation can be explained through consideration of the findings of [31]: In downlink direction, the resulting data rate is mostly related to the cell load which is partially represented by the RSRQ. The presence of this dominant feature results in a less complex learning task. Since the RSRQ is only an implicit indicator for the current network load, the resulting RMSE is relatively high.

Fig. 7: Steps of the black spot clustering process (panels: raw measurements, black spot clustering, fitted ellipses; axes: longitude and latitude).

Due to these observations, we apply the RF model for performing the context-based data rate predictions in the remainder of this paper.
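The model comparison above boils down to computing the RMSE of each candidate's predictions on held-out measurements and keeping the model with the lowest error. The Python sketch below shows only this selection step; it does not train any of the regression models, and the function names and the dictionary-based interface are assumptions made for illustration.

```python
import math


def rmse(predicted, measured):
    """Root Mean Squared Error between predicted and measured data rates."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(predicted))


def select_model(predictions_by_model, measured):
    """Return the name of the model with the lowest RMSE and all errors.

    predictions_by_model: hypothetical mapping of model name -> predictions
    on the same test samples as `measured`.
    """
    errors = {name: rmse(pred, measured)
              for name, pred in predictions_by_model.items()}
    best = min(errors, key=errors.get)
    return best, errors
```

Applied separately to the uplink and downlink test sets, such a comparison would yield one error value per model and direction, which is the quantity plotted in Fig. 6.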

B. Unsupervised Learning: Black Spot Clustering

An important observation of previous work [11] is that the resulting data rate prediction accuracy in vehicular scenarios has a geospatial dependency: Large outliers often occur cluster-wise due to local effects such as eNB handovers, cell switches, and environment-dependent sporadic link loss. Although the knowledge about these mechanisms does not explicitly allow us to compensate the undesired effects, it can be exploited within the opportunistic data transmission processes as a measure of the uncertainty of the prediction model: Transmissions should be avoided if the prediction model is currently in an unreliable state and does not allow a precise statement about the achievable end-to-end performance. We call these areas black spots based on the usage of the term in traffic safety where it refers to regions with a significantly increased probability for collisions of vehicles.

The proposed black spot-aware networking approach is divided into the unsupervised learning-based offline data analysis and the online application. The offline data analysis consists of multiple steps which are visualized in Fig. 7.
1) Geo-clustering: Unsupervised learning based on k-means [20] is applied in order to cluster the transmission locations into a total number of $N_c$ clusters.
2) Black spot detection: For each cluster $c$, the RMSE (see Sec. V) of the data rate prediction results is computed and compared to a threshold value $\mathrm{RMSE}_{\max}$. All clusters that exceed the given upper limit are labeled as black spot clusters.
3) Ellipse fitting: All detected black spot clusters are fitted to rotated ellipses in order to allow their later online consideration within the opportunistic data transmission process. Hereby, the length $a$ of the ellipse is calculated based on the dominant intra-cluster distance vector.

The impact of considering information about black spot regions within the prediction model is shown in Fig. 8. While Fig. 8 (a) shows the resulting prediction performance of the overall data set which consists of black spot and non-black spot regions, the separation of the prediction model makes it possible to improve the prediction accuracy for the non-black spot regions as shown in Fig. 8 (b). In the following, we will use this variant for predicting the data rate as the metric of the opportunistic data transmission process.

For the online application, the vehicle's position $P$ is compared against all black spot ellipses with corresponding ellipse centroid $P_i$ based on an intersection test for $\alpha$-rotated ellipses. The vehicle is within the considered elliptic region if the following condition is fulfilled:

$$ \frac{(c \cdot v.x + s \cdot v.y)^2}{a^2} + \frac{(s \cdot v.x - c \cdot v.y)^2}{b^2} \leq 1 \tag{4} $$

with $v = P - P_i$, $c = \cos\alpha$, $s = \sin\alpha$, and $\alpha$ being the ellipse rotation. An overview of the detected black spot regions for MNO A in uplink direction is shown in Fig. 9.
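The black spot detection step and the online ellipse test of Eq. 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the clustering itself (k-means) and the ellipse fitting are not shown, positions are assumed to be given in a planar coordinate system (geographic coordinates would first need a projection), and all function names and the tuple layout of an ellipse are hypothetical.

```python
import math


def black_spot_clusters(cluster_errors, rmse_max):
    """Return the ids of clusters whose prediction RMSE exceeds rmse_max.

    cluster_errors: mapping of cluster id -> list of prediction errors
    (predicted minus measured data rate) of the transmissions in that cluster.
    """
    flagged = []
    for cluster_id, errors in cluster_errors.items():
        if not errors:
            continue  # an empty cluster carries no evidence
        cluster_rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
        if cluster_rmse > rmse_max:
            flagged.append(cluster_id)
    return flagged


def in_ellipse(position, center, a, b, rotation):
    """Intersection test of Eq. 4 for an ellipse rotated by `rotation` (rad)."""
    vx = position[0] - center[0]
    vy = position[1] - center[1]
    c, s = math.cos(rotation), math.sin(rotation)
    return ((c * vx + s * vy) ** 2) / (a * a) \
        + ((s * vx - c * vy) ** 2) / (b * b) <= 1.0


def in_black_spot(position, ellipses):
    """True if the position lies inside any (center, a, b, rotation) ellipse."""
    return any(in_ellipse(position, *ellipse) for ellipse in ellipses)
```

During operation, only `in_black_spot` has to be evaluated per decision step, which keeps the online cost at a few multiplications per stored ellipse.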

C. Reinforcement Learning: Contextual Bandit-based Data Transfer

The actual opportunistic data transfer is modeled as a Linear Upper Confidence Bound (LinUCB) [55] contextual bandit whereas the arms of the bandit correspond to the possible actions:

• a_IDLE leads to a local buffering of the newly acquired data as the current network quality is not considered appropriate for allowing resource efficient data transfer. It is assumed that due to the mobility behavior of the vehicle, the mobile UE will encounter a more suitable transmission opportunity in the future.
• a_TX causes the transmission of the whole buffered data.

The context-aware arm selection process is performed based on a sequence of matrix-vector multiplications as

$$ a = \arg\max_{a \in A} \Big( \underbrace{\hat{\theta}_a^T c}_{\text{Estimated reward}} + \underbrace{\alpha \sqrt{c^T A_a^{-1} c}}_{\text{UCB}} \Big) \tag{5} $$

Hereby, the estimated reward is derived by ridge regression whereas \hat{\theta}_a represents the regression coefficients of arm a which are updated during the reinforcement learning process and c = (\tilde{S}(t), Δt) is the d-dimensional context tuple consisting of the predicted data rate \tilde{S}(t) and the current buffering time Δt. A_a is computed as A_a = D_a^T D_a + I_a with I_a being a d-dimensional identity matrix and D_a being the m × d matrix which contains the m previously observed context tuples. The constant exploration parameter α controls the greediness of the algorithm and is computed as

$$ \alpha = 1 + \sqrt{\frac{\ln(2/\delta)}{2}} \tag{6} $$

based on the only system parameter δ. The smaller the value of α, the more greedy the algorithm behaves, meaning that it will more likely exploit actions that currently seem to be optimal.

After each performed action, the regression coefficients are updated based on the observed reward r_a as

$$ \hat{\theta}_a \leftarrow A_a^{-1} b_a \tag{7} $$

with

$$ b_a \leftarrow b_a + r_a \cdot c \tag{8} $$

Hereby, b_a is initialized as a d-dimensional zero vector.
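The arm selection and update rules of Eq. (5) to Eq. (8) can be illustrated with a small dependency-free sketch. The class and function names are hypothetical, the linear systems are solved by plain Gauss-Jordan elimination instead of an explicit matrix inverse, and the context values in the example are arbitrary and unscaled.

```python
import math


class LinUCBArm:
    """State of one bandit arm: A_a (d x d matrix) and b_a (d-vector)."""

    def __init__(self, d):
        # With no observations, D_a is empty, so A_a is the identity matrix.
        self.A = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
        self.b = [0.0] * d

    def _solve(self, rhs):
        """Solve A_a x = rhs (A_a is symmetric positive definite)."""
        d = len(rhs)
        m = [row[:] + [rhs[i]] for i, row in enumerate(self.A)]
        for col in range(d):
            pivot = m[col][col]
            m[col] = [v / pivot for v in m[col]]
            for r in range(d):
                if r != col:
                    factor = m[r][col]
                    m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
        return [m[i][d] for i in range(d)]

    def ucb(self, c, alpha):
        theta = self._solve(self.b)      # Eq. (7): theta_a = A_a^-1 b_a
        a_inv_c = self._solve(c)         # A_a^-1 c
        estimate = sum(t * x for t, x in zip(theta, c))
        width = math.sqrt(sum(x * y for x, y in zip(c, a_inv_c)))
        return estimate + alpha * width  # bracketed term of Eq. (5)

    def update(self, c, reward):
        for i in range(len(c)):
            for j in range(len(c)):
                self.A[i][j] += c[i] * c[j]  # appends c as a new row of D_a
            self.b[i] += reward * c[i]       # Eq. (8)


def exploration_parameter(delta):
    """Eq. (6)."""
    return 1.0 + math.sqrt(math.log(2.0 / delta) / 2.0)


def select_arm(arms, c, alpha):
    """Eq. (5): return the name of the arm with the highest upper bound."""
    return max(arms, key=lambda name: arms[name].ucb(c, alpha))


# Hypothetical usage with a two-dimensional context (predicted rate, buffering time).
arms = {"IDLE": LinUCBArm(2), "TX": LinUCBArm(2)}
alpha = exploration_parameter(0.1)
context = [1.0, 0.5]
chosen = select_arm(arms, context, alpha)
```

Adding the outer product of c to A_a after each observation is equivalent to recomputing D_a^T D_a + I with one more row in D_a, which avoids storing the full context history.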
The reward functions are computed in an action-specific way. For the TX action, the reward is derived as:

$$ r_{TX}(S, \Delta t) = w \cdot \frac{\tilde{S} - S^{*}}{S_{max}} + \Delta t \cdot \frac{1 - w}{\Delta t_{max}} \tag{9} $$

whereas the trade-off factor w controls the fundamental trade-off between data rate optimization and AoI optimization. S^* represents a target data rate which should be approached and S_max is the empirically observed maximum data rate of the network. Δt_max is an application-specific deadline for the tolerable AoI.

The reward of the IDLE action is computed as:

$$ r_{IDLE}(\Delta t) = \begin{cases} \Omega & \Delta t \geq \Delta t_{max} \\ 0 & \text{else} \end{cases} \tag{10} $$

whereas Ω is chosen as a negative number which ensures that the estimated reward of the TX action is superior to the reward of the IDLE action if Δt exceeds the AoI deadline Δt_max. As a result, the data is transferred immediately regardless of the radio channel conditions.

Fig. 8: The overall prediction model is separated into a more precise model for non-black spot regions and a less precise model for black spot regions. The gray area shows the behavior of a 0.95-confidence area derived by applying a GPR model on the results of the prediction model. Panels: (a) Overall prediction model, (b) Non-black spot prediction model, (c) Only black spot prediction model. Axis: Measured Data Rate [MBit/s].

Fig. 9: Resulting black spot regions along the evaluation track for MNO A in uplink direction (Map: ©OpenStreetMap contributors, CC BY-SA).

After the contextual bandit has made a transmission decision, the information about the black spot regions is leveraged: If the vehicle is currently within a black spot region, the data transfer is postponed since the prediction model cannot be trusted. As a result of this approach, there exists a trade-off between the achievable improvement of the data rate prediction accuracy and a reduction of the usable percentage of the track for performing data transmissions. Fig. 10 shows the resulting R^2 and RMSE values with respect to the tolerable percentage of track elimination (the total spread of the black spot regions over the overall track length) for the MNO A uplink data set of [48]. It can be seen that the reduction of transmission opportunities significantly improves the performance of the prediction model. Where the curves converge, the model only considers highly reliable connectivity hotspots appropriate for the data transfer. In the following, we allow a maximum track reduction of %.

Fig. 10: Trade-off between performance improvement of the data rate prediction and tolerable reduction of the transmission opportunities (MNO A uplink). Axes: Coefficient of Determination, RMSE [MBit/s].

V. METHODOLOGY

In this section, an overview of the research methods, tools, and performance metrics is provided. A summary of relevant parameters of the novel transmission scheme is given in Tab. I.

Fig. 11: Network model of the real world performance evaluation. Labels: Client, Mobile UE, Application, Sensor, Network, eNB, Cloud Server, End-to-End Data Rate Prediction, Uplink Transfer, Downlink Transfer, Downlink Sensing.

TABLE I: Default parameters of the evaluation setup

Parameter                                 Value
Maximum buffering time Δt_max             120 s
Trade-off factor w
Ω                                         -1
Exploration parameter δ
N_c
max                                       3, 2.25, 2.5
Periodic data transfer interval Δt        10 s
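With the defaults of Tab. I, the reward functions of Eq. (9) and Eq. (10) from Sec. IV-C and the subsequent black spot check can be sketched as below. The function names are hypothetical. The values used for w, S^* and S_max in the example are placeholders and not taken from the paper, and the IDLE reward before the deadline is assumed to be 0.

```python
DT_MAX = 120.0   # maximum buffering time in seconds (Tab. I)
OMEGA = -1.0     # IDLE penalty after the deadline (Tab. I)


def reward_tx(predicted_rate, buffering_time, w, target_rate, max_rate, dt_max=DT_MAX):
    """Eq. (9): weighted sum of a data rate term and a buffering time term."""
    rate_term = w * (predicted_rate - target_rate) / max_rate
    age_term = (1.0 - w) * buffering_time / dt_max
    return rate_term + age_term


def reward_idle(buffering_time, dt_max=DT_MAX, omega=OMEGA):
    """Eq. (10): penalize idling once the AoI deadline is reached.

    Assumption: the reward before the deadline is 0.
    """
    return omega if buffering_time >= dt_max else 0.0


def final_action(bandit_action, in_black_spot):
    """Postpone a transmission decided by the bandit inside a black spot."""
    return "IDLE" if in_black_spot else bandit_action


# Placeholder values: w = 0.5, S* = 10 MBit/s, S_max = 40 MBit/s.
example_reward = reward_tx(20.0, 60.0, w=0.5, target_rate=10.0, max_rate=40.0)
```

The rate term becomes negative whenever the predicted data rate is below the target S^*, so transmitting under poor conditions only pays off once the buffering time term has grown large enough.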

A. Real World Data Acquisition

For the empirical performance comparison, five test drives are performed in the real world for each of the transmission schemes. Fig. 11 shows the network model of the evaluation. A virtual sensor application generates 50 kB of sensor data per second. Data transmissions are performed from a moving vehicle through the public Long Term Evolution (LTE) networks of three German MNOs in uplink and downlink direction via TCP. The evaluations are carried out along a 25 km long evaluation track (Fig. 9) which contains highway and suburban regions with varying building densities and speed limitations. In total, 8563 transmissions (13.61 GB of transmitted data) are performed. The passive measurement of the context indicators as well as the active data transmission are performed on an Android-based UE (Galaxy S5 Neo, Model SM-G903F) based on a novel application. The latter is provided in an open source way.

B. Performance Indicators

Within the real world performance comparison in Sec. VI, multiple KPIs are considered which are obtained as follows.

End-to-end data rate: The evaluation of the achieved data rate is performed at the application level and represents the transmission efficiency of the considered transmission schemes. The actual measurements are performed at a cloud server.

AoI: Due to the local buffering process implied by the opportunistic data transfer approach, each transmitted data packet consists of multiple sensor packets. In order to analyze the freshness of the received sensor information, the generation time of the oldest sensor packet within the received overall data is considered.

Source code available at https://github.com/BenSliwa/MTCApp

Network resources: For estimating the number of PRBs of performed transmissions in a postprocessing step, we reverse the procedure described in [56]. Hereby, the CQI measurements are utilized to determine the MCS and Transport Block Size (TBS) indices from the 3GPP TS 36.213 lookup tables. Based on this information and the measured data rate, the number of PRBs is inferred.

Power consumption: The resulting power consumption of a mobile UE is mainly determined by the applied transmission power P_TX which controls the stage of the power amplifiers. Unfortunately, Android-based UEs do not expose this information to the user space. However, the analysis in [57] has shown that P_TX can be inferred from radio signal measurements since it is highly correlated to distance-dependent indicators such as RSRP. Therefore, we apply the proposed machine learning-based prediction toolchain of [57] to estimate P_TX and determine the transmission-related power consumption based on laboratory measurements of the device-specific power consumption behavior. Additional details about the applied procedure are presented in [39]. We remark that the power consumption is not a major limiting factor for vehicular crowdsensing. Yet, the usage of battery-powered robotic vehicles such as Unmanned Aerial Vehicles (UAVs) for data acquisition in future Intelligent Transportation Systems (ITSs) is being widely discussed. In addition, the proposed approach might also be applied in intelligent container systems in smart logistics scenarios.

C. Data-driven Network Simulation

The inherently huge effort of performing real world test drives makes this method inappropriate for carrying out large scale parameter studies. Therefore, we exploit the computational efficiency of data-driven analysis methods and implement a DDNS setup according to [28] for the initial parameter tuning phase.

In contrast to classical network simulation methods, which simulate the behavior of actual communicating entities and their corresponding protocol stacks, DDNS relies on replaying previously acquired empirical context traces of the targeted deployment scenario. Hereby, the vehicle is virtually moved on its trajectory and the corresponding context information is looked up from the measurements. For this purpose, we utilize the available open data set of [48]. The simulation of the end-to-end behavior of the transmission schemes is then performed by a combination of machine learning models:

• Based on the available a priori data set, a deterministic data rate prediction model (equal to the RF method described in Sec. IV-A) is learned and utilized by the agent to opportunistically schedule the data transmissions. However, due to its deterministic nature, identical feature sets will always result in the same prediction results. In contrast, in the real world, the predictions will most likely differ from the ground truth measurements due to imperfections of the prediction model.
• For representing this aspect within the simulation process, a probabilistic deviation model is utilized. Through applying GPR on the results of the RF model (for a visual representation of the different models, see Fig. 8), a statistical description of the deviations between predictions and measurements is derived. Furthermore, the Bayesian nature of this model class allows sample values to be drawn from the learned confidence interval. Within the DDNS simulation, each deterministic prediction \tilde{S}(t) is converted to a sampled virtual ground truth value \hat{S}(\tilde{S}(t)) which represents the actual resulting data rate of the corresponding data transmission. Further details about this method are presented in [28].

Fig. 12: Trade-off between data rate and AoI optimization for MNO A in uplink direction. Axes: Trade-off Factor, Normalized Efficiency Indicator. Curves: Target Data Rate Margin, AoI Deadline Margin.

D. Data Analysis

For training the prediction models, we utilize the Lightweight Machine Learning for IoT Systems (LIMITS) framework [58] which allows to automate low-level machine analysis in Waikato Environment for Knowledge Analysis (WEKA) [59] and provides automated export of C/C++ code of the trained models. In order to generate the GPR models for the DDNS setup and for performing the k-means black spot clustering, the Statistics and Machine Learning Toolbox of MATLAB is applied.

For analyzing the performance of the machine learning methods, multiple statistical metrics are applied. The coefficient of determination R^2 is a statistical metric for the goodness of fit of the resulting regression model. It is calculated as

$$ R^2 = 1 - \frac{\sum_{i=1}^{N} (\tilde{y}_i - y_i)^2}{\sum_{i=1}^{N} (\bar{y} - y_i)^2} \tag{11} $$

with N as the number of measurements, \tilde{y}_i being the current prediction, y_i being the current measurement, and \bar{y} being the mean value of the measurements.

In addition, we consider Mean Absolute Error (MAE) and RMSE which are calculated as

$$ \mathrm{MAE} = \frac{\sum_{i=1}^{N} |\tilde{y}_i - y_i|}{N}, \qquad \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (\tilde{y}_i - y_i)^2}{N}}. $$
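The three metrics follow directly from their definitions. The sketch below uses hypothetical function names and arbitrary example values, and is not tied to the WEKA or MATLAB tooling mentioned above.

```python
import math


def r_squared(predictions, measurements):
    """Coefficient of determination according to Eq. (11)."""
    mean = sum(measurements) / len(measurements)
    ss_res = sum((p - m) ** 2 for p, m in zip(predictions, measurements))
    ss_tot = sum((mean - m) ** 2 for m in measurements)
    return 1.0 - ss_res / ss_tot


def mae(predictions, measurements):
    """Mean absolute error."""
    return sum(abs(p - m) for p, m in zip(predictions, measurements)) / len(measurements)


def rmse(predictions, measurements):
    """Root mean square error."""
    return math.sqrt(
        sum((p - m) ** 2 for p, m in zip(predictions, measurements)) / len(measurements)
    )


# Arbitrary example: three exact predictions and one outlier.
pred = [1.0, 2.0, 3.0, 4.0]
meas = [1.0, 2.0, 3.0, 6.0]
metrics = (r_squared(pred, meas), mae(pred, meas), rmse(pred, meas))
```

Because the errors are squared before averaging, the single outlier in the example dominates the RMSE (1.0) much more than the MAE (0.5), which is why both metrics are typically reported together.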

VI. RESULTS

In this section, the results for the DDNS-based optimization phase as well as for the real world performance analysis are presented and discussed. Within the latter, the novel BS-CB method is compared to the existing transmission schemes discussed in Sec. III.

Fig. 13: Convergence behavior of the reinforcement learning-enabled transmission schemes. Each epoch corresponds to a virtual test drive evaluation in the DDNS. Axis: Data Rate [MBit/s]. Curves: Proposed Contextual Bandit Approach, RL-CAT with Q-Learning [8], RL-CAT Variant with Deep Reinforcement Learning [8], Periodic, ML-CAT [35].

A. DDNS-based Parameter Optimization

As discussed in Sec. IV, opportunistic data transfer is subject to a fundamental trade-off between data rate and AoI optimization: In order to improve the end-to-end data rate, the transmission schemes will rather prefer larger packets which are then transmitted within connectivity hotspots. As a result of the local buffering, the AoI is increased. For the further analysis of this effect, two efficiency indicators are defined:

• The data rate efficiency E_s = \bar{S} / S^* is used to analyze how well the average data rate \bar{S} approaches the target data rate S^*.
• The AoI efficiency E_AoI = 1 − \bar{Δt} / Δt_max represents a measure for the margin between the average AoI and the application-specific deadline Δt_max of the age of the sensor data.

The fundamental trade-off between data rate optimization and AoI optimization which is controlled via the trade-off factor w is shown in Fig. 12. It can be seen that the resulting data rate can be improved by transmitting larger data packets based on a larger value of w. However, this is achieved through a higher buffering time of the acquired sensor data packets, which increases the AoI of the data packets. In the following, we focus on data rate optimization and apply w = 0 . within all considered evaluations.

Although the reinforcement learning mechanisms can theoretically be learned online in the field, we apply an offline training approach based on DDNS in order to ensure that the real world evaluations are performed with a converged system. Hereby, we replay previously acquired empirical context traces, which are referred to as epochs, and apply the novel reinforcement learning-based transmission schemes. The resulting data rate behavior is shown in Fig. 13. As references, we consider the Q-learning-based RL-CAT and a deep reinforcement learning variant of the latter which applies an ANN configuration according to Sec. IV-A for the data rate prediction. It can be seen that the contextual bandit-based method achieves the highest absolute data rate and reaches a converged system state early after 200 epochs. The remaining error floor is caused by the imperfections of the data rate prediction model. For RL-CAT, both variants achieve a similar performance level of the converged methods (about . MBit/s less than BS-CB). However, it can be seen that the deep reinforcement learning variant achieves a faster convergence behavior than the simple Q-learning approach.

Fig. 14: Comparison of the resulting real world data rate in uplink and downlink direction for the considered transmission schemes and MNOs. Axis: Data Rate [MBit/s]. Schemes: Periodic, CAT, ML-CAT, RL-CAT, BS-CB. Annotations for MNO A, MNO B, MNO C: (a) Uplink +195 %, +139 %, +125 %; (b) Downlink +223 %, +152 %, +173 %.

Fig. 15: Comparison of the resulting real world resource efficiency in uplink and downlink direction for the considered transmission schemes and MNOs. Axis: Number of PRBs per MB. Annotations for MNO A, MNO B, MNO C: (a) Uplink -84 %, -86 %, -89 %; (b) Downlink -85 %, -85 %, -87 %.

Fig. 16: Transmission-related real world uplink power consumption of the mobile UE. Axis: Power Consumption per MB [J]. Annotations for MNO A, MNO B, MNO C: -64 %, -73 %, -53 %.
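The two efficiency indicators defined in this subsection can be computed from per-transmission logs as sketched below. The function names and the example values are hypothetical.

```python
def data_rate_efficiency(data_rates, target_rate):
    """E_s: ratio of the average data rate to the target data rate S*."""
    return (sum(data_rates) / len(data_rates)) / target_rate


def aoi_efficiency(ages, dt_max):
    """E_AoI: remaining margin between the average AoI and the deadline."""
    return 1.0 - (sum(ages) / len(ages)) / dt_max


# Hypothetical log of two transmissions (data rates in MBit/s, ages in s).
e_s = data_rate_efficiency([8.0, 12.0], target_rate=20.0)
e_aoi = aoi_efficiency([30.0, 90.0], dt_max=120.0)
```

Both indicators are normalized, which makes it possible to plot them on a common axis over the trade-off factor w as in Fig. 12.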

B. Real World Performance Comparison

The configured and converged transmission schemes are now applied in a real world evaluation and compared to existing transmission approaches.

The resulting data rate of the different transmission schemes is shown in Fig. 14 for uplink and downlink direction. A clear trend of continuous improvement over the different evolution stages can be observed: Although the SINR-based CAT method already achieves significant improvements in comparison to the periodic data transfer approach, the introduction of the machine learning-based data rate prediction metric by ML-CAT leads to a significant boost which is the result of a more reliable way of assessing the channel behavior. Finally, it can be seen that the reinforcement learning-based decision making outperforms the previously considered heuristic approaches. Hereby, data rate improvements of up to 195 % in uplink and up to 223 % in downlink direction are achieved by the proposed BS-CB method. In the downlink, the differences between the opportunistic transmission approaches are less distinct since the downlink performance is determined more by the network congestion than by the radio channel conditions [31].

A comparison of the resulting network resource efficiency (represented by the amount of PRBs per transmitted MB) is shown in Fig. 15. It can be seen that all opportunistic data transfer approaches are able to massively reduce the amount of occupied network resources (by 84 % to 89 %) for all MNOs in both transmission directions. One of the main reasons for this behavior is the explicit exploitation of connectivity hotspot situations. Here, the robust channel conditions allow higher MCSs to be applied for the actual data transfer. Again, it can be seen that the more advanced evolution stages of the CAT approach identify these favorable transmission opportunities in a more reliable way. As a conclusion, the apparently selfish goal of data rate optimization contributes to improving the intra-cell coexistence: Since the limited PRBs are only occupied for small amounts of time, they are freed early and are available for being allocated by other cell users.

Fig. 17: Comparison of the resulting real world AoI of the sensor data packets. Axis: Age of Information [s]. Panels: (a) Uplink, (b) Downlink. Schemes: Periodic, CAT, ML-CAT, RL-CAT, BS-CB for MNO A, MNO B, MNO C.

The resulting uplink power consumption of the mobile UE is shown in Fig. 16. Since the opportunistic data transmission schemes aim to exploit connectivity hotspots, they implicitly increase the average RSRP at the transmission time which is highly correlated to the applied transmission power. As discussed in [57], the latter is the major impact factor for the uplink power consumption since it controls the state of the different power amplifiers of the UE. Therefore, the RSRP optimization leads to a massive improvement of the observed power consumption. Here, BS-CB is able to reduce the latter by between 53 % and 73 %. For MNO B, it can be observed that the general level of the uplink power consumption is much higher than for the other MNOs. However, this phenomenon is caused by network planning-related aspects of the operator: In the considered evaluation scenario, the average distance to the eNBs is much higher than for the other MNOs. As a consequence of the resulting RSRP reduction (the average RSRP for MNO B is − . dBm, − . dBm for MNO A, and − . dBm for MNO C), the mobile UE applies a higher transmission power to compensate the path loss effects.

Although the considered opportunistic data transfer approaches are able to achieve massive improvements in data rate, network resource efficiency, and uplink power consumption, the price to pay is a significant increase in the AoI of the sensor data packets. Fig. 17 shows a comparison of the resulting AoI values for the different transmission schemes, MNOs, and transmission directions. The plots show that this effect is more distinct for the machine learning approaches which detect favorable transmission opportunities more reliably through considering the radio channel quality, protocol-related aspects and partially also the network load. In contrast to that, the highly dynamic behavior of the SINR (see Fig. 2) leads to a higher transmission probability for the regular CAT method which results in a comparably low AoI.

However, based on the parameter Δt_max, the tolerable AoI can be configured with respect to the application requirements. The impact of different values of Δt_max on the resulting BS-CB data rate and the AoI of sensor data is shown in Fig. 18. For small values of Δt_max, a quasi-linear dependency to the latter can be observed. In this phase, the behavior of the transmission scheme is dominated by protocol effects such as TCP slow start. However, a saturation of the data rate improvement is reached at Δt_max = 30 s. Afterwards, the actual opportunistic behavior starts which exploits the vehicle's mobility behavior for postponing data transmissions to more robust radio channel conditions where a better resource efficiency can be achieved.

Fig. 18: Impact of the application-specific deadline Δt_max on the resulting data rate and AoI. Axes: Maximum buffering delay [s], AoI [s], Data Rate [MBit/s]. Annotations: Quasi-linear phase due to TCP slow start; Saturation of protocol-related data rate improvement; Exploitation of connectivity hotspots.

As a summary, Fig. 19 shows a spider plot which compares the mean results of all considered performance indicators for the different opportunistic data transfer methods in the network of MNO A. The axis orientations have been chosen such that a larger footprint corresponds to a better performance. It can be seen that all non-periodic approaches focus on optimizing the network and client domain at the expense of the application domain. Although the proposed BS-CB achieves a slightly better overall performance than RL-CAT, the major differences can be observed between different categories and less between actual transmission schemes: The highest gains are achieved by the hybrid machine learning approaches that utilize data rate prediction and reinforcement learning-based autonomous decision making.

Fig. 19: Summary: Comparison of the average behavior of different performance indicators for the opportunistic data transfer methods in the cellular network of MNO A. The axis orientations are chosen such that a large footprint represents a better performance. Axes: Downlink Data Rate [MBit/s], Uplink Data Rate [MBit/s], Uplink Power Consumption [J/MB], Downlink AoI [s], Uplink AoI [s], Downlink Resource Efficiency, Uplink Resource Efficiency. Domains: Network Optimization, Client Optimization, Application Optimization. Schemes: Periodic, CAT, ML-CAT, RL-CAT, BS-CB.

C. Online Learning for Self Adaptation to Concept Drift

The results of the real world performance evaluation have shown that the client-based machine learning-enabled transmission schemes are able to achieve significant improvements in comparison to existing approaches. However, changes in the network (e.g., new resource schedulers in the network infrastructure) might lead to a concept drift [60] situation where the interplay of the considered features experiences a significant change. Although the application of reinforcement learning allows the autonomous decision making to be further optimized during the live evaluations, the data rate prediction model is trained in a static way and might experience a significant reduction of the prediction accuracy. While it is possible to periodically re-train the prediction model, a better approach is the application of online learning in order to enable self-adaptation to the changed environment conditions. With respect to the edge intelligence classification scheme of [54], the integration of online learning would migrate the transmission scheme to level 6: all on-device, where training and inferencing are run completely locally.

Fig. 20: Self-adaptation of the data rate prediction model to concept drift: An ANN model is pre-trained on the uplink data of MNO A and then incrementally updated with measurements of MNO B. Axis: RMSE [MBit/s]. Curves: MNO A, MNO B. Phases: Pre-trained, Concept Drift, Self Adaptation, Convergence.

Since it is not possible for us to cause concept drift in the public cellular network, we virtually create a situation where the network behavior spontaneously changes significantly. For this purpose, we pre-train a prediction model on the uplink data set of one MNO and analyze its online adaptation to the data set of a different operator. Although online learning variants of RFs exist (e.g., Mondrian Forests [61]), we apply an ANN model for this purpose since this model class inherently supports incremental learning. For the proof-of-concept experiment, a data split is applied: 80 % of the MNO-specific data set D is used as the training set D_train and the remaining data forms the test set D_test. Initially, the ANN is pre-trained on the training data of MNO A, and then incrementally updated with the training data of MNO B. For both operators, the RMSE on the corresponding test sets is analyzed.

The ANN is set up according to Sec. IV-A. For the incremental learning, a minibatch of 32 elements is applied. Hereby, the measurements are buffered locally until the buffer size is equal to 32. Afterwards, the weights of the ANN are updated and the buffer is cleared. The resulting RMSE on the test sets of both network operators is shown in Fig. 20. Four different characteristic phases can be identified:

1) Pre-trained model: As the prediction model is initially optimized for being applied in the network of MNO A, the prediction accuracy for MNO A is significantly higher than for MNO B. Still, a certain level of predictability is achieved based on the MNO-independent aspects within the feature set.
2) Concept drift: After the first batches of MNO B measurements arrive, the prediction model experiences a concept drift: Since the weights of the ANN are neither optimized for MNO A nor for MNO B, the prediction accuracy decreases for both operators. Hereby, also the MNO-independent features are affected by the changed model weights. This aspect is more dominant for MNO B for which only a small amount of measurements has been observed.
3) Self adaptation: After seven batch iterations, the ANN weights start to become optimized for the network of MNO B, which results in a steady RMSE improvement for the following iterations.
4) Convergence: After around 23 batch iterations, the prediction model reaches a converged state where the RMSE stays at a nearly constant level. In comparison to the pre-trained phase, it can be seen that the RMSE values of the two MNOs have been switched and that the model has successfully adapted itself to MNO B.

Fig. 21: Compensation of black spot regions through multi-MNO network selection: General solution approach and impact on black spot statistics. Panels: (a) Dynamic network selection between MNO A, MNO B, and MNO C; (b) and (c) ECDFs for MNO A, MNO B, MNO C, and Multi-MNO.

The considered evaluation shows that online learning allows the data rate prediction model to autonomously adapt to changed network conditions which have a significant impact on the interplay of the features of the prediction model. Within the considered evaluation, even the on-device training time (on average 0.4511 ms per 32-element batch) can be considered negligible. However, the considered ANN model does not reach the accuracy level of the statically trained RF predictor (see Fig. 6). Therefore, future extensions should consider the application of more advanced methods for online learning.
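The buffer-then-update pattern of the incremental learning can be sketched as follows. To stay self-contained, the example deliberately swaps the ANN for a linear regressor that is updated by one gradient step per 32-element batch; the class name, the learning rate, and the synthetic data are assumptions made for illustration only.

```python
class OnlinePredictor:
    """Linear stand-in for the ANN with minibatch-based incremental updates."""

    def __init__(self, n_features, batch_size=32, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.batch_size = batch_size
        self.learning_rate = learning_rate
        self.buffer = []

    def predict(self, x):
        return self.bias + sum(w * v for w, v in zip(self.weights, x))

    def observe(self, x, y):
        """Buffer one measurement; update once the buffer is full.

        Returns True if a model update has been performed.
        """
        self.buffer.append((x, y))
        if len(self.buffer) < self.batch_size:
            return False
        n = len(self.buffer)
        grad_w = [0.0] * len(self.weights)
        grad_b = 0.0
        for xi, yi in self.buffer:
            err = self.predict(xi) - yi
            grad_b += err / n
            for k, v in enumerate(xi):
                grad_w[k] += err * v / n
        # One gradient descent step on the mean squared error of the batch.
        self.weights = [w - self.learning_rate * g for w, g in zip(self.weights, grad_w)]
        self.bias -= self.learning_rate * grad_b
        self.buffer.clear()
        return True


# Synthetic measurements following y = 2x, fed one by one.
model = OnlinePredictor(n_features=1)
updates = 0
for i in range(200 * 32):
    x = (i % 32) / 32.0
    if model.observe([x], 2.0 * x):
        updates += 1
```

Clearing the buffer after each update keeps the memory footprint on the device constant, independent of how long the system has been running.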

D. Black Spot Statistics and Multi-MNO Transmission Approach

Although the previous discussion has shown that the black spot-aware data transfer approach is able to improve the data rate prediction accuracy as well as the resulting data rate of the BS-CB method, its usage introduces an additional buffering delay since transmissions are avoided within black spot regions. A possible solution approach for compensating these undesired effects might be the usage of a multi-MNO approach which exploits complementary network infrastructure deployments. Fig. 21 (a) shows a schematic visualization of black spot compensation through application of a multi-MNO approach. If a vehicle encounters a black spot region within its primary network, it dynamically changes the network for performing the sensor data transmissions.

The Empirical Cumulative Distribution Functions (ECDFs) of the times and distances vehicles spend in black spot regions are shown in Fig. 21 (b) and Fig. 21 (c). There are no significant variations between the considered MNOs. In around 50 % of the cases, the black spot regions cover less than 100 m which results in a minor addition to the buffering delay. The usage of a multi-MNO approach leads to massive reductions of both undesired effects. In fact, it is almost able to compensate the black spot-related effects completely.

VII. RECOMMENDATIONS FOR FUTURE 6G NETWORKS

Based on the achieved insights, we summarize our recommendations for using client-based intelligence in future 6G networks as:

• Non-cellular-centric networking approaches such as end-edge-cloud orchestrated intelligence allow the computation and sensing capabilities of the network clients to be exploited for participating in the overall network optimization. This potential should be recognized by the MNOs and actively supported.
• Data rate prediction allows more precise statements about the channel quality than considering raw network quality indicators. Yet, purely client-based prediction methods only have limited insight into the current load of the network. As cooperative data rate prediction [51] is able to significantly reduce the end-to-end prediction error, this approach should be explicitly supported by the network infrastructure through actively sharing knowledge about the network load (e.g., obtained from the NWDAF [7]) using dedicated control channel broadcasts.
• Although machine learning has demonstrated its potential in various applications related to wireless network optimization, the sizes of most existing data sets are far from comparable to the massive data sets used in computer vision by industry giants. Therefore, effort should be taken to acquire data and build up massive open data sets, especially as additional data often leads to larger performance gains than model tuning [49]. A promising initial attempt for sharing data and models is the machine learning marketplace proposed in draft recommendation Y.ML-IMT2020-MP of the International Telecommunication Union (ITU).

VIII. CONCLUSION

In this paper, we proposed BS-CB as a novel method for resource-efficient opportunistic data transmission of vehicular sensor data. BS-CB implements a hybrid machine learning approach which relies on supervised learning for data rate prediction, unsupervised learning for identifying geospatially-dependent uncertainties of the prediction model, and reinforcement learning for autonomously scheduling data transmissions with respect to the anticipated resource efficiency. Within a real world performance evaluation campaign, it was shown that BS-CB is able to achieve massive improvements in comparison to conventional periodic data transmission methods and significantly outperforms existing probabilistic approaches.

In future work, we want to analyze more complex online learning approaches such as Mondrian Forest for the data rate prediction. In addition, our research work will focus on further improving the achievable prediction accuracy, e.g., through application of cooperative approaches.

ACKNOWLEDGMENT

Part of the work on this paper has been supported by Deutsche Forschungsgemeinschaft (DFG) within the Collaborative Research Center SFB 876 “Providing Information by Resource-Constrained Analysis”, projects A4 and B4.

Benjamin Sliwa (S’16) received the M.Sc. degree from TU Dortmund University, Dortmund, Germany, in 2016. He is currently a Research Assistant with the Communication Networks Institute, Faculty of Electrical Engineering and Information Technology, TU Dortmund University. He is working on the Project “Analysis and Communication for Dynamic Traffic Prognosis” of the Collaborative Research Center SFB 876. His research interests include predictive and context-aware optimizations for decision processes in mobile and vehicular communication systems. Benjamin Sliwa has been recognized with a Best Paper Award at IEEE ICC 2020, a Best Student Paper Award at IEEE VTC-Spring 2018, the 2018 IEEE Transportation Electronics Student Fellowship “For Outstanding Student Research Contributions to Machine Learning in Vehicular Communications and Intelligent Transportation Systems”, and a Best Contribution Award at the OMNeT++ Community Summit 2017.

Rick Adam received the B.Sc. degree from TU Dortmund University, Dortmund, Germany, in 2017 and is currently working on his Master’s Thesis at the Communication Networks Institute, Faculty of Electrical Engineering and Information Technology, TU Dortmund University. His research is focused on the application of machine learning algorithms for the optimization of communication networks, especially in the context of vehicular environments. One of the main goals is the development of a resource-efficient sensor data transmission system, which enables better coexistence between different applications in mobile networks.

Christian Wietfeld (M’05–SM’12) received the Dipl.-Ing. and Dr.-Ing. degrees from RWTH Aachen University, Aachen, Germany. He is currently a Full Professor of communication networks and the Head of the Communication Networks Institute, TU Dortmund University, Dortmund, Germany. For more than 20 years, he has been a coordinator of and a contributor to large-scale research projects on Internet-based mobile communication systems in academia (RWTH Aachen ’92–’97, TU Dortmund since ’05) and industry (Siemens AG ’97–’05). His current research interests include the design and performance evaluation of communication networks for cyber–physical systems in energy, transport, robotics, and emergency response. He is the author of over 200 peer-reviewed papers and holds several patents. Dr. Wietfeld is a Co-Founder of the IEEE Global Communications Conference Workshop on Wireless Networking for Unmanned Autonomous Vehicles and member of the Technical Editor Board of the IEEE Wireless Communication Magazine. In addition to several best paper awards, he received an Outstanding Contribution award of ITU-T for his work on the standardization of next-generation mobile network architectures.

REFERENCES

[1] W. Xu, H. Zhou, N. Cheng, F. Lyu, W. Shi, J. Chen, and X. Shen, “Internet of vehicles in big data era,” IEEE/CAA Journal of Automatica Sinica, vol. 5, no. 1, pp. 19–35, Jan 2018.
[2] J. Ren, Y. Zhang, K. Zhang, and X. Shen, “Exploiting mobile crowdsourcing for pervasive cloud services: Challenges and solutions,” IEEE Communications Magazine, vol. 53, no. 3, pp. 98–105, 2015.
[3] B. Sliwa, T. Liebig, T. Vranken, M. Schreckenberg, and C. Wietfeld, “System-of-systems modeling, analysis and optimization of hybrid vehicular traffic,” in , Orlando, Florida, USA, Apr 2019.
[4] G. A. Akpakwu, B. J. Silva, G. P. Hancke, and A. M. Abu-Mahfouz, “A survey on 5G networks for the internet of things: Communication technologies and challenges,” IEEE Access, vol. 6, pp. 3619–3647, 2018.
[5] A. Capponi, C. Fiandrino, B. Kantarci, L. Foschini, D. Kliazovich, and P. Bouvry, “A survey on mobile crowdsensing systems: Challenges, solutions, and opportunities,” IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2419–2465, 2019.
[6] AECC, “White paper: Operational behavior of a high definition map application,” Automotive Edge Computing Consortium, Tech. Rep., May 2020.
[7] 3GPP, “3GPP TS 29.520 - 5G System; Network Data Analytics Services; Stage 3,” 3rd Generation Partnership Project (3GPP), Tech. Rep. 29.520, Mar 2019, version 15.3.0.
[8] P. Yang, Y. Xiao, M. Xiao, and S. Li, “6G wireless communications: Vision and potential techniques,” IEEE Network, vol. 33, no. 4, pp. 70–75, July 2019.
[9] S. Ali, W. Saad, N. Rajatheva, K. Chang, D. Steinbach, B. Sliwa, C. Wietfeld, K. Mei, H. Shiri, H. Zepernick, T. M. C. Chu, I. Ahmad, J. Huusko, J. Suutala, S. Bhadauria, V. Bhatia, R. Mitra, S. Amuru, R. Abbas, B. Shao, M. Capobianco, G. Yu, M. Claes, T. Karvonen, M. Chen, M. Girnyk, and H. Malik, “6G white paper on machine learning in wireless communication networks,” Apr 2020.
[10] J. Ren, D. Zhang, S. He, Y. Zhang, and T. Li, “A survey on end-edge-cloud orchestrated network computing paradigms: Transparent computing, mobile edge computing, fog computing, and cloudlet,” ACM Comput. Surv., vol. 52, no. 6, Oct. 2019.
[11] B. Sliwa and C. Wietfeld, “A reinforcement learning approach for efficient opportunistic vehicle-to-cloud data transfer,” in , Seoul, South Korea, Apr 2020.
[12] B. Sliwa, R. Adam, and C. Wietfeld, “Acting selfish for the good of all: Contextual bandits for resource-efficient transmission of vehicular sensor data,” in Proceedings of the ACM MobiHoc Workshop on Cooperative Data Dissemination in Future Vehicular Networks (D2VNet), Online, Oct 2020.
[13] J. Wang, C. Jiang, H. Zhang, Y. Ren, K. Chen, and L. Hanzo, “Thirty years of machine learning: The road to pareto-optimal wireless networks,” IEEE Communications Surveys & Tutorials, pp. 1–1, 2020.
[14] C. Jiang, H. Zhang, Y. Ren, Z. Han, K. C. Chen, and L. Hanzo, “Machine learning paradigms for next-generation wireless networks,” IEEE Wireless Communications, vol. 24, no. 2, pp. 98–105, April 2017.
[15] H. Ye, L. Liang, G. Y. Li, J. Kim, L. Lu, and M. Wu, “Machine learning for vehicular networks: Recent advances and application examples,” IEEE Vehicular Technology Magazine, vol. 13, no. 2, pp. 94–101, June 2018.
[16] Y. Sun, M. Peng, Y. Zhou, Y. Huang, and S. Mao, “Application of machine learning in wireless networks: Key techniques and open issues,” IEEE Communications Surveys & Tutorials, vol. 21, no. 4, pp. 3072–3108, 2019.
[17] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[18] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001.
[19] C. E. Rasmussen, Gaussian Processes in Machine Learning. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, pp. 63–71.
[20] D. Arthur and S. Vassilvitskii, “k-means++: The advantages of careful seeding,” in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, 2007.
[21] H. Gacanin, “Autonomous wireless systems with artificial intelligence: A knowledge management perspective,” IEEE Vehicular Technology Magazine, pp. 1–1, 2019.
[22] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction, 2nd ed. The MIT Press, 2018.
[23] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3, pp. 279–292, May 1992.
[24] S. Sevgican, M. Turan, K. Gökarslan, H. B. Yilmaz, and T. Tugcu, “Intelligent network data analytics function in 5G cellular networks using machine learning,” Journal of Communications and Networks, vol. 22, no. 3, pp. 269–280, 2020.
[25] 3GPP, “3GPP TR 23.791 - Study of Enablers for Network Automation for 5G,” 3rd Generation Partnership Project (3GPP), Tech. Rep., Jun 2019, v16.2.0.
[26] J. Park, S. Samarakoon, M. Bennis, and M. Debbah, “Wireless network intelligence at the edge,” Proceedings of the IEEE, vol. 107, no. 11, pp. 2204–2239, Nov 2019.
[27] S. Dörner, S. Cammerer, J. Hoydis, and S. t. Brink, “Deep learning based communication over the air,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 132–143, Feb 2018.
[28] B. Sliwa and C. Wietfeld, “Data-driven network simulation for performance analysis of anticipatory vehicular communication systems,” IEEE Access, Nov 2019.
[29] B. Sliwa and C. Wietfeld, “Towards data-driven simulation of end-to-end network performance indicators,” in , Honolulu, Hawaii, USA, Sep 2019.
[30] E. R. Cavalcanti, J. A. R. de Souza, M. A. Spohn, R. C. d. M. Gomes, and A. F. B. F. d. Costa, “VANETs’ research over the past decade: Overview, credibility, and trends,” SIGCOMM Comput. Commun. Rev., vol. 48, no. 2, pp. 31–39, May 2018.
[31] N. Bui, M. Cesana, S. A. Hosseini, Q. Liao, I. Malanchini, and J. Widmer, “A survey of anticipatory mobile networking: Context-based classification, prediction methodologies, and optimization techniques,” IEEE Communications Surveys & Tutorials, 2017.
[32] S. Toufga, S. Abdellatif, P. Owezarski, T. Villemur, and D. Relizani, “Effective prediction of V2I link lifetime and vehicle’s next cell for software defined vehicular networks: A machine learning approach,” in IEEE Vehicular Networking Conference (VNC), Los Angeles, USA, Dec 2019.
[33] A. Dalgkitsis, P. Mekikis, A. Antonopoulos, and C. Verikoukis, “Data driven service orchestration for vehicular networks,” IEEE Transactions on Intelligent Transportation Systems, pp. 1–10, 2020.
[34] B. Coll-Perales, J. Gozalvez, and J. L. Maestre, “5G and beyond: Smart devices as part of the network fabric,” IEEE Network, vol. 33, no. 4, pp. 170–177, July 2019.
[35] S. Ha, S. Sen, C. Joe-Wong, Y. Im, and M. Chiang, “TUBE: Time-dependent pricing for mobile data,” in Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM ’12. New York, NY, USA: Association for Computing Machinery, 2012, pp. 247–258.
[36] C. Shi, K. Joshi, R. K. Panta, M. H. Ammar, and E. W. Zegura, “CoAST: Collaborative application-aware scheduling of last-mile cellular traffic,” in Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services, ser. MobiSys ’14. New York, NY, USA: Association for Computing Machinery, 2014, pp. 245–258.
[37] A. Chakraborty, V. Navda, V. N. Padmanabhan, and R. Ramjee, “Coordinating cellular background transfers using loadsense,” in Proceedings of the 19th Annual International Conference on Mobile Computing & Networking, ser. MobiCom ’13. New York, NY, USA: Association for Computing Machinery, 2013, pp. 63–74.
[38] J. Lee, J. Lee, Y. Im, S. Dhawaskar Sathyanarayana, P. Rahimzadeh, X. Zhang, M. Hollingsworth, C. Joe-Wong, D. Grunwald, and S. Ha, “CASTLE over the air: Distributed scheduling for cellular data transmissions,” in Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services, ser. MobiSys ’19. New York, NY, USA: ACM, 2019, pp. 417–429.
[39] B. Sliwa, R. Falkenberg, T. Liebig, N. Piatkowski, and C. Wietfeld, “Boosting vehicle-to-cloud communication by machine learning-enabled context prediction,” IEEE Transactions on Intelligent Transportation Systems, Jul 2019.
[40] G. Nikolov, M. Kuhn, A. McGibney, and B.-L. Wenning, “Reduced complexity approach for uplink rate trajectory prediction in mobile networks,” in , Jun 2020.
[41] J. Lee, S. Lee, J. Lee, S. D. Sathyanarayana, H. Lim, J. Lee, X. Zhu, S. Ramakrishnan, D. Grunwald, K. Lee, and S. Ha, “PERCEIVE: Deep learning-based cellular uplink prediction using real-time scheduling patterns,” in Proceedings of the 18th International Conference on Mobile Systems, Applications, and Services, ser. MobiSys ’20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 377–390.
[42] 3GPP, , 3rd Generation Partnership Project Technical Specification, Rev. V15.2.0, Oct 2018.
[43] A. Samba, Y. Busnel, A. Blanc, P. Dooze, and G. Simon, “Instantaneous throughput prediction in cellular networks: Which information is needed?” in , May 2017, pp. 624–627.
[44] A. Herrera-Garcia, S. Fortes, E. Baena, J. Mendoza, C. Baena, and R. Barco, “Modeling of key quality indicators for end-to-end network management: Preparing for 5G,” IEEE Vehicular Technology Magazine, vol. 14, no. 4, pp. 76–84, Dec 2019.
[45] J. Riihijarvi and P. Mahonen, “Machine learning for performance prediction in mobile cellular networks,” IEEE Computational Intelligence Magazine, vol. 13, no. 1, pp. 51–60, Feb 2018.
[46] A. Zappone, M. D. Renzo, and M. Debbah, “Wireless networks design in the era of deep learning: Model-based, AI-based, or both?” IEEE Communications Magazine, 2020.
[47] F. Jomrich, A. Herzberger, T. Meuser, B. Richerzhagen, R. Steinmetz, and C. Wille, “Cellular bandwidth prediction for highly automated driving - Evaluation of machine learning approaches based on real-world data,” in Proceedings of the 4th International Conference on Vehicle Technology and Intelligent Transport Systems 2018, no. 4. SCITEPRESS, Mar 2018, pp. 121–131.
[48] B. Sliwa and C. Wietfeld, “Empirical analysis of client-based network quality prediction in vehicular multi-MNO networks,” in , Honolulu, Hawaii, USA, Sep 2019.
[49] P. Domingos, “A few useful things to know about machine learning,” Commun. ACM, vol. 55, no. 10, pp. 78–87, Oct. 2012.
[50] M. Akselrod, N. Becker, M. Fidler, and R. Luebben, “4G LTE on the road - what impacts download speeds most?” in , Sep. 2017, pp. 1–6.
[51] B. Sliwa, R. Falkenberg, and C. Wietfeld, “Towards cooperative data rate prediction for future mobile and vehicular 6G networks,” in , Levi, Finland, Mar 2020.
[52] C. Ide, B. Dusza, and C. Wietfeld, “Client-based control of the interdependence between LTE MTC and human data traffic in vehicular environments,” IEEE Transactions on Vehicular Technology, vol. 64, no. 5, pp. 1856–1871, 2015.
[53] B. Sliwa, T. Liebig, R. Falkenberg, J. Pillmann, and C. Wietfeld, “Efficient machine-type communication using multi-metric context-awareness for cars used as mobile sensors in upcoming 5G networks,” in , Porto, Portugal, Jun 2018, Best student paper award.
[54] Z. Zhou, X. Chen, E. Li, L. Zeng, K. Luo, and J. Zhang, “Edge intelligence: Paving the last mile of artificial intelligence with edge computing,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1738–1762, 2019.
[55] L. Li, W. Chu, J. Langford, and R. E. Schapire, “A contextual-bandit approach to personalized news article recommendation,” in Proceedings of the 19th International Conference on World Wide Web, ser. WWW ’10. New York, NY, USA: Association for Computing Machinery, 2010, pp. 661–670.
[56] K. Satoda, E. Takahashi, T. Onishi, T. Suzuki, D. Ohta, K. Kobayashi, and T. Murase, “Passive method for estimating available throughput for autonomous off-peak data transfer,” Wireless Communications and Mobile Computing, vol. 2020, pp. 1–12, Feb 2020.
[57] R. Falkenberg, B. Sliwa, N. Piatkowski, and C. Wietfeld, “Machine learning based uplink transmission power prediction for LTE and upcoming 5G networks using passive downlink indicators,” in , Chicago, USA, Aug 2018.
[58] B. Sliwa, N. Piatkowski, and C. Wietfeld, “LIMITS: Lightweight machine learning for IoT systems with resource limitations,” in , Dublin, Ireland, Jun 2020, Best paper award.
[59] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: An update,” SIGKDD Explorations, vol. 11, no. 1, pp. 10–18, 2009.
[60] J. Gama, I. Žliobaitė, A. Bifet, M. Pechenizkiy, and A. Bouchachia, “A survey on concept drift adaptation,” ACM Comput. Surv., vol. 46, no. 4, Mar. 2014.
[61] B. Lakshminarayanan, D. M. Roy, and Y. W. Teh, “Mondrian forests: Efficient online random forests,” in