Acting Selfish for the Good of All: Contextual Bandits for Resource-Efficient Transmission of Vehicular Sensor Data

Benjamin Sliwa

Communication Networks Institute, TU Dortmund University

Rick Adam

Communication Networks Institute, TU Dortmund University

Christian Wietfeld

Communication Networks Institute, TU Dortmund University

ABSTRACT

In this work, we present Black Spot-aware Contextual Bandit (BS-CB) as a novel client-based method for resource-efficient opportunistic transmission of delay-tolerant vehicular sensor data. BS-CB applies a hybrid approach which brings together all major machine learning disciplines (supervised, unsupervised, and reinforcement learning) in order to autonomously schedule vehicular sensor data transmissions with respect to the expected resource efficiency. Within a comprehensive real world performance evaluation in the public cellular networks of three Mobile Network Operators (MNOs), it is found that 1) the average uplink data rate is improved by 125%-195%, 2) the apparently selfish goal of data rate optimization reduces the amount of occupied cell resources by 84%-89%, 3) the average transmission-related power consumption can be reduced by 53%-75%, and 4) the price to pay is an additional buffering delay due to the opportunistic medium access strategy.

CCS CONCEPTS

• Networks → Network resources allocation; Network performance modeling; Network measurement; Mobile networks; • Computing methodologies → Mobile agents; Machine learning; Reinforcement learning; Classification and regression trees;

Accepted for presentation in: Proceedings of the ACM MobiHoc Workshop on Cooperative Data Dissemination in Future Vehicular Networks (D2VNet)

Vehicular crowdsensing [22] is an emerging data acquisition paradigm which utilizes the various sensing and communication capabilities of modern vehicles and exploits their mobility for achieving dynamic sensor coverage of large regions. While it is expected that vehicular big data will stimulate the development of a multitude of novel data-driven services [23], the increase in massive Machine-type Communication (mMTC) represents a major challenge for the cellular network, where different users compete for the available cell resources. An important observation which motivated our work is the high variance of the resource efficiency of data transmissions along the vehicular trajectories. On the one hand, vehicles encounter periods of high network quality, also referred to as connectivity hotspots, where data transmissions are performed highly resource efficiently. On the other hand, they are also subject to low channel quality periods and encounter network congestion. Here, the mobile User Equipment (UE) applies a low Modulation and Coding Scheme (MCS) in order to avoid packet errors and retransmissions. Moreover, the power consumption is often considerably increased, as the mobile UE needs to apply a high transmission power to compensate for challenging path loss situations.

Figure 1: Overview of Applications, Challenges, and Solution Approaches for Vehicular Crowdsensing (panels: Mobile Crowdsensing, Wireless Networking Challenges, Enabling Methods; the machine learning part pairs Reinforcement Learning with Client-based Medium Access, Supervised Learning with Data Rate Prediction, and Unsupervised Learning with Black Spot Clustering)

Since conventional data transfer methods access the radio medium periodically, without considering the channel conditions, a large amount of resources is spent on improving the reliability of the data transfer.

Non-cellular-centric networking is an emerging research field where client devices become part of the network fabric and participate explicitly or implicitly in network management functions [7]. Client-based opportunistic data transfer for delay-tolerant applications schedules vehicular sensor data transmissions with respect to the expected resource efficiency: Acquired data is buffered locally until the mobility-dependent channel quality is considered sufficient. Due to the buffering-related delay of the data transfer, this approach cannot be applied to safety-critical data such as cooperative awareness messaging. However, since many vehicle-as-a-sensor applications, such as updates of High Definition (HD) environmental maps and traffic measurements, allow soft Age of Information (AoI) deadlines, opportunistic medium access is a promising candidate for utilizing the existing network resources in a more efficient way. Fig. 1 summarizes the applications, challenges, and solution approaches for vehicular crowdsensing in cellular networks.

In this work, we present a novel client-based opportunistic data transmission scheme that relies on a combination of multiple learning models. The contributions are summarized as follows:
• BS-CB is a novel hybrid machine learning-enabled transmission scheme for resource-efficient transfer of vehicular sensor data.
• Black spot-aware networking: Exploitation of knowledge about the geospatially-dependent uncertainties of the prediction model.
• Real world performance evaluation and comparison of the novel approach to existing methods.

The remainder of the paper is structured as follows. After discussing the related work in Sec. 2, we present the proposed BS-CB in Sec. 3. Afterwards, an overview of the methodological aspects is given in Sec. 4. Finally, detailed results of real world experiments and data-driven simulations are provided in Sec. 5.

Anticipatory networking [6] is a novel communications paradigm which aims to optimize decision processes within mobile communication systems through proactive consideration of context information. Due to the inherent interdependency of mobility and radio propagation dynamics, highly mobile systems such as vehicular networks are expected to benefit significantly from this form of network optimization. As pointed out by a recent report of the 5G Automotive Association (5GAA) [2], predictive Quality of Service (QoS) along the vehicular trajectories will be a key enabler for future connected and automated driving.

Machine learning makes it possible to expose hidden interdependencies between measurable variables and represents a key enabler for anticipatory networking. Machine learning models can be grouped into three major categories:

Supervised learning techniques train a model $f$ on a training data set $X$ with labeled data $Y$ such that $f : X \rightarrow Y$. Afterwards, the trained model can be utilized to make predictions on unlabeled data sets. Unsupervised learning is applied to detect patterns in unlabeled data sets. This makes it possible to cluster data points with similar characteristics, e.g., through application of the popular k-means [4] method.

Reinforcement learning is an important step towards zero touch optimization of wireless communication systems. Here, agents learn autonomous decision making by performing actions within an environment and observing the resulting rewards.

A detailed summary of models and applications related to research questions in the wireless communication domain is given by the authors of [21]. Within the emerging 5G networks, the integration of machine learning methods mainly focuses on the network infrastructure side. Manifestations of this development can be seen in the Network Data Analytics Function (NWDAF) [1] for network load assessment (e.g., for dynamic slicing) and in the architectural framework defined by the International Telecommunication Union (ITU) [12] for utilizing machine learning-based network management. It is expected that the trend of replacing mathematical models by machine learning functions will continue further and ultimately lead to pervasive machine learning in future networks such as 6G [3].

Different research works (e.g., [10, 19]) have analyzed client-based data rate prediction for mobile networks based on network indicator measurements. An important observation is that Classification and Regression Tree (CART)-based methods such as Random Forests (RFs) [5] often achieve a better prediction accuracy than more complex methods such as deep learning, which require a significantly higher amount of training data in order to overcome the curse of dimensionality [24].

The advancements in machine learning-enabled networking have also catalyzed the emergence of novel performance analysis methods that focus on end-to-end modeling of wireless communication systems. In this work, we apply a corresponding setup for training and parameterizing the reinforcement learning-based transmission scheme (see Sec. 4): Data-driven Network Simulation (DDNS) [18] is a novel machine learning-enabled simulation method which provides fast and accurate modeling of end-to-end performance indicators in concrete evaluation scenarios by replaying empirical context traces. Here, multiple prediction models are applied jointly in order to learn the end-to-end behavior of a target performance indicator as well as the statistical deviations between prediction model and ground truth measurements.

Figure 2: Overall System Architecture Model (blocks: machine learning features RSRP, RSRQ, SINR, CQI, TA, frequency, speed, cell id, and payload; measurements of data rate and position; Data Rate Prediction (Sec. 3.1); Black Spot Clustering (Sec. 3.2); system parameters target data rate, AoI deadline, and trade-off factor; Contextual Bandit (Sec. 3.3); decision "Vehicle in BS region?" leading to IDLE if true)

The overall system architecture model of the proposed solution approach is shown in Fig. 2. Instead of using a multi-dimensional feature vector of raw context measurements for the autonomous decision making, we use an intermediate supervised learning step to forecast the currently achievable data rate in order to reduce the dimensionality of the learning problem. Moreover, knowledge about the geospatial dependency of the prediction errors is utilized to improve the opportunistic data transfer process. In the following, the different modules are explained in further detail.

The overall feature set $x$ is composed of measurements from different context domains:
• Network features $x_{\mathrm{net}}$: Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), Signal-to-interference-plus-noise Ratio (SINR), Channel Quality Indicator (CQI), Timing Advance (TA), and carrier frequency
• Mobility features $x_{\mathrm{mob}}$: Speed of the vehicle and cell id of the connected evolved Node B (eNB)
• Application features $x_{\mathrm{app}}$: Payload size of the data packet to be transmitted

Based on the findings of the in-depth comparison of different data rate prediction models in [18], we apply an RF model for predicting the currently achievable data rate as $\tilde{S} = f_{\mathrm{RF}}(x)$.

3.2 Unsupervised Learning for Black Spot Clustering

In previous work [20], we have pointed out that the achievable accuracy of prediction models has a geospatial dependency: Artifacts in the observed prediction performance often occur cluster-wise and are mostly related to effects which are not covered by the feature set (e.g., handovers, short term link loss). Although this knowledge does not allow us to compensate for the undesired effects, it can be utilized as a measure of trust in the prediction model in order to strengthen the robustness of the context-aware data transfer. In traffic safety, the term black spot refers to a geographical region with an increased probability of collisions. We transfer the term to the wireless communications domain and use it to describe geographical regions with exceptionally high prediction uncertainty. The black spot-aware approach is divided into two phases:

Offline data analysis: At first, k-means [4] is applied to perform a geospatial clustering of the data points into a total amount of $N_c$ clusters. For each cluster $c$ with $N$ cluster points, the Root Mean Squared Error (RMSE) is calculated based on the difference between predictions $\tilde{S}$ and measurements $S$ as

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( \tilde{S}_i - S_i \right)^2}{N}}. \tag{1} \]

If the computed value exceeds a defined threshold $\mathrm{RMSE}_{\max}$, the cluster $c$ is considered a black spot cluster. Finally, all black spot clusters are fitted to ellipses based on the dominant intra-cluster distance vector. Fig. 3 summarizes the different steps of the black spot cluster determination.

Online application: For the later exploitation of the derived knowledge by the reinforcement learning-based data transmission, a vehicle needs to know if it is currently within a black spot region. For a given cartesian point $P$, an intersection test for an $\alpha$-rotated ellipse centered at $P_0$ is performed as

\[ \frac{(c \cdot v.x + s \cdot v.y)^2}{a^2} + \frac{(s \cdot v.x - c \cdot v.y)^2}{b^2} \leq 1 \tag{2} \]

with $v = P - P_0$, $c = \cos \alpha$, and $s = \sin \alpha$. An example for the black spot regions for MNO A on the considered evaluation track is shown in Fig. 4.
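The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `Ellipse`, its field names, and the helper function names are hypothetical, the clustering step itself (k-means) is omitted, and $a$ and $b$ are assumed to be the semi-axis lengths of the fitted ellipse. The sketch only shows the RMSE threshold check of Eq. (1) and the rotated-ellipse test of Eq. (2).

```python
import math
from dataclasses import dataclass


@dataclass
class Ellipse:
    """Hypothetical container for one fitted black spot region."""
    cx: float     # x coordinate of the ellipse center
    cy: float     # y coordinate of the ellipse center
    a: float      # semi-axis length along the rotated x direction (assumed meaning)
    b: float      # semi-axis length along the rotated y direction (assumed meaning)
    alpha: float  # rotation angle in radians


def rmse(predicted, measured):
    """Root mean squared error between predictions and measurements, Eq. (1)."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


def is_black_spot_cluster(predicted, measured, rmse_max):
    """A cluster counts as a black spot if its RMSE exceeds the threshold."""
    return rmse(predicted, measured) > rmse_max


def in_ellipse(px, py, e):
    """Intersection test for a point and an alpha-rotated ellipse, Eq. (2)."""
    vx, vy = px - e.cx, py - e.cy
    c, s = math.cos(e.alpha), math.sin(e.alpha)
    return ((c * vx + s * vy) ** 2) / e.a ** 2 + ((s * vx - c * vy) ** 2) / e.b ** 2 <= 1.0


def in_any_black_spot(px, py, ellipses):
    """Online check: is the vehicle position inside any black spot region?"""
    return any(in_ellipse(px, py, e) for e in ellipses)
```

In this sketch, the offline phase would call `is_black_spot_cluster` once per cluster, and the online phase would call `in_any_black_spot` with the current vehicle position in the same cartesian coordinate system that was used for fitting the ellipses.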

Figure 3: Steps for the Determination of Black Spot Regions (panels: Raw Measurements, Clustered Black Spot Measurements, Fitted Ellipses; axes: longitude and latitude)

Figure 4: Resulting Black Spot Regions for MNO A on the Evaluation Track (Map: OpenStreetMap contributors, CC BY-SA)

The actual opportunistic data transfer process is represented by a Linear Upper Confidence Bound (LinUCB) [13] contextual bandit with two arms which correspond to the possible actions:
• $a_{\mathrm{IDLE}}$ delays the data transfer in favor of an expected resource efficiency improvement in the future. Acquired sensor data is buffered locally.
• $a_{\mathrm{TX}}$ transmits the whole data buffer.

The context-aware arm selection process is modeled as

\[ a_t = \arg\max_{a \in A_t} \Big( \underbrace{\hat{\theta}_a^T x_{t,a}}_{\text{Estimated reward}} + \underbrace{\alpha \sqrt{x_{t,a}^T A_a^{-1} x_{t,a}}}_{\text{UCB } C_a} \Big) \tag{3} \]

where the estimated arm reward is derived through ridge regression with $\hat{\theta}_a$ being the regression coefficients and $x_{t,a} = \{\tilde{S}(t), \Delta t\}$ being the $d$-dimensional feature vector for arm $a$ in time step $t$. The parameter $\alpha = 1 + \sqrt{\ln(2/\delta)/2}$ controls the degree of exploration based on the only system parameter $\delta$. For the Upper Confidence Bound (UCB) part, $A_a = D_a^T D_a + I_a$ consists of a $d$-dimensional identity matrix $I_a$ and $D_a$ as an $m \times d$ matrix that contains the $m$ rows of training inputs.

After performing either the TX or the IDLE action, a real-valued reward $r_t$ is observed and the regression coefficients are updated as

\[ \hat{\theta}_a \leftarrow A_a^{-1} b_a \tag{4} \]

with

\[ b_{a_t} \leftarrow b_{a_t} + r_t \, x_{t,a_t} \tag{5} \]

where $b_{a_t}$ is set to a $d$-dimensional zero vector upon first initialization. The reward is calculated action-specifically based on the corresponding reward functions:

\[ r_{\mathrm{TX}}(S, \Delta t) = w \cdot \frac{\tilde{S} - S^*}{S_{\max}} + \frac{\Delta t \cdot (1 - w)}{\Delta t_{\max}} \tag{6} \]

\[ r_{\mathrm{IDLE}}(\Delta t) = \begin{cases} \Omega & \Delta t \geq \Delta t_{\max} \\ 0 & \text{else} \end{cases} \tag{7} \]

where $S^*$ represents an MNO-specific target data rate and $\Delta t_{\max}$ corresponds to an application-specific upper bound for the tolerable AoI. $w$ is a trade-off parameter for controlling the focus on either data rate optimization or AoI optimization. $\Omega$ is a negative number which is used as a deadline violation punishment in order to ensure that the TX action is performed immediately if the deadline is violated.

A two-stage methodological approach is applied: At first, a DDNS setup (see [18]) is utilized to train the reinforcement learning mechanism. Afterwards, we perform a real world measurement study for comparing the novel approach with different existing methods:
• Periodic transfer represents the typical Machine-type Communication (MTC) approach where data is transmitted at a fixed interval $\Delta t$ without considering the current channel quality.
• Channel-aware Transmission (CAT) [11] is a probabilistic data transmission scheme which uses the measured SINR for client-side scheduling of sensor data transmissions.
• Machine Learning CAT (ML-CAT) [15] is a machine learning-based extension to CAT. Instead of only using a single network quality indicator for the opportunistic medium access, ML-CAT uses the predicted data rate (similar to Sec. 3).
• Reinforcement Learning CAT (RL-CAT) [20] is a first reinforcement learning-enabled data transfer method which replaces the probabilistic medium access with Q-learning-based decision making.

For the real world evaluation, we consider a 25 km long evaluation track which consists of highway and suburban parts. For each transmission scheme, five drive tests are performed where sensor data is transmitted via Transmission Control Protocol (TCP) in the uplink through the cellular networks of three different German MNOs. All transmissions are performed with an Android-based UE (Samsung Galaxy S5 Neo, Model SM-G903F). The applied BS-CB parameters are summarized in Tab. 1.
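To make the decision process of Sec. 3.3 more concrete, the following Python sketch shows how a two-arm LinUCB bandit with the arm selection of Eq. (3), the updates of Eqs. (4) and (5), and the reward functions of Eqs. (6) and (7) could be written. It is a minimal illustration under several assumptions and not the authors' implementation: all class, function, and parameter names are hypothetical, the matrix $A_a$ is maintained incrementally as $A_a \leftarrow A_a + x x^T$ starting from the identity (which is equivalent to $D_a^T D_a + I$), the feature vector is fixed to $d = 2$ so that a closed-form inverse can be used, the IDLE reward is assumed to be zero before the deadline, and the black spot check is modeled as a simple override that forces the IDLE action.

```python
import math

IDLE, TX = 0, 1  # arm indices (hypothetical naming)


def alpha_from_delta(delta):
    """Exploration parameter alpha = 1 + sqrt(ln(2 / delta) / 2)."""
    return 1.0 + math.sqrt(math.log(2.0 / delta) / 2.0)


class LinUCBArm:
    """One arm with a 2-dimensional context x = (predicted data rate, buffer time)."""

    def __init__(self):
        self.A = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix, grows by x x^T
        self.b = [0.0, 0.0]                # zero vector upon initialization

    def _A_inv(self):
        # Closed-form inverse of a 2x2 matrix; A stays positive definite.
        (p, q), (r, s) = self.A
        det = p * s - q * r
        return [[s / det, -q / det], [-r / det, p / det]]

    def ucb(self, x, alpha):
        """Estimated reward plus confidence width, as in Eq. (3)."""
        inv = self._A_inv()
        theta = [inv[0][0] * self.b[0] + inv[0][1] * self.b[1],
                 inv[1][0] * self.b[0] + inv[1][1] * self.b[1]]  # Eq. (4)
        estimate = theta[0] * x[0] + theta[1] * x[1]
        ix = [inv[0][0] * x[0] + inv[0][1] * x[1],
              inv[1][0] * x[0] + inv[1][1] * x[1]]
        width = math.sqrt(x[0] * ix[0] + x[1] * ix[1])
        return estimate + alpha * width

    def update(self, x, reward):
        """Add the observation to A and b (Eq. (5) for b)."""
        for i in range(2):
            for j in range(2):
                self.A[i][j] += x[i] * x[j]
            self.b[i] += reward * x[i]


def select_arm(arms, x, alpha, in_black_spot=False):
    """Pick the arm with the highest UCB; stay IDLE inside a black spot region."""
    if in_black_spot:
        return IDLE
    scores = [arm.ucb(x, alpha) for arm in arms]
    return max(range(len(arms)), key=lambda a: scores[a])


def reward_tx(s_pred, dt, w, s_target, s_max, dt_max):
    """Reward of the TX action, Eq. (6)."""
    return w * (s_pred - s_target) / s_max + dt * (1.0 - w) / dt_max


def reward_idle(dt, dt_max, omega):
    """Reward of the IDLE action, Eq. (7); zero before the deadline is an assumption."""
    return omega if dt >= dt_max else 0.0
```

A training loop in the DDNS would repeatedly build the context from the predicted data rate and the elapsed buffering time, call `select_arm`, compute the reward of the chosen action, and pass it to `update` of that arm. Scaling both features to a comparable range before feeding them to the bandit is a design choice that this sketch leaves open.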

Table 1: Default parameters of the evaluation setup

Parameter | Value
Maximum buffering time $\Delta t_{\max}$ | 120 s
Trade-off factor $w$ |
$\Omega$ | -1
Exploration parameter $\delta$ |
$N_c$ |
max | 3, 2.25, 2.5

The prediction models are learned with the Waikato Environment for Knowledge Analysis (WEKA)-based [9] Lightweight Machine Learning for IoT Systems (LIMITS) [17] framework, which provides automatic generation of C/C++ code for the trained models. For unsupervised learning and the Gaussian Process Regression (GPR) models required for the DDNS setup, the Statistics and Machine Learning Toolbox of MATLAB is utilized.

For analyzing the communication-related power consumption of the UE, the most important indicator is the applied transmission power $P_{\mathrm{TX}}$. Although Android-based UEs do not expose this information to the user space, it can be inferred from radio signal measurements due to a significant correlation with distance-dependent indicators such as RSRP [8]. In order to determine the power consumption as a function of the applied transmission power, we utilize laboratory measurements of the device-specific power consumption behavior. A deeper discussion of the applied method can be found in [15].

For calculating the network resource efficiency of the transmission schemes in the post processing, we reverse the table lookup procedure described in [14]. Based on the CQI measurements, the required MCS and Transport Block Size (TBS) indices are obtained from a lookup table.

Figure 5: Controllable Trade-off Between Data Rate and AoI Optimization (x-axis: trade-off factor; y-axis: normalized efficiency indicator; curves: target data rate margin, AoI deadline margin)

Figure 6: Convergence of the Reinforcement Learning Process (y-axis: data rate [MBit/s]; curves: proposed contextual bandit approach, RL-CAT with Q-learning [20], RL-CAT variant with deep reinforcement learning [20], Periodic, ML-CAT)

In this section, the results for the DDNS-based system optimization as well as for the real world performance evaluation are presented.

As discussed in Sec. 3.3, opportunistic data transfer is subject to a fundamental trade-off between data rate and AoI optimization, which can be controlled via the trade-off factor $w$. For the purpose of comparing the performance in both dimensions, we define two efficiency indicators:
• The data rate efficiency $E_S = \bar{S} / S^*$ measures how well the average data rate $\bar{S}$ approaches the target data rate $S^*$.
• The AoI efficiency $E_{\mathrm{AoI}} = 1 - \overline{\Delta t} / \Delta t_{\max}$ is a measure for the margin between the average AoI and the deadline $\Delta t_{\max}$.

Figure 7: Performance Comparison of Opportunistic Transmission Schemes for Multiple MNOs (panels: data rate [MBit/s], number of PRBs per MB, power consumption per MB [J], Age of Information [s]; each for MNO A, MNO B, and MNO C with the schemes Periodic, CAT, ML-CAT, RL-CAT, and BS-CB; annotated changes: data rate +195 %, +140 %, +125 %; PRBs per MB -84 %, -86 %, -89 %; power consumption -74 %, -64 %, -53 %)
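As a small worked illustration of the two efficiency indicators defined above, the following Python sketch computes $E_S$ and $E_{\mathrm{AoI}}$ from average values. The function names are hypothetical and the input values in the usage note are made up for illustration; only the deadline of 120 s is taken from Tab. 1.

```python
def data_rate_efficiency(mean_rate, target_rate):
    """E_S: ratio of the average data rate to the target data rate."""
    return mean_rate / target_rate


def aoi_efficiency(mean_aoi, aoi_deadline):
    """E_AoI: remaining margin between the average AoI and the deadline."""
    return 1.0 - mean_aoi / aoi_deadline
```

For example, an average AoI of 30 s with a deadline of 120 s gives `aoi_efficiency(30.0, 120.0)`, which evaluates to 0.75, so three quarters of the tolerable delay budget remain unused on average.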

Fig. 5 shows the normalized behavior of both indicators for different values of $w$. It can be seen that the data rate benefits from larger packets, which correspond to a lower AoI efficiency, as they achieve a better payload-overhead ratio and a better compensation of the slow start mechanism of TCP. In the following, we focus our analysis on data rate optimization with a fixed trade-off factor $w$.

An epoch represents one virtual drive test on the evaluation track within the DDNS. Fig. 6 shows the resulting data rate of the proposed contextual bandit-based transmission scheme. For reference, the convergence behavior of a Q-learning approach according to [20] and a deep reinforcement learning variant of the latter are shown. Here, the corresponding Artificial Neural Network (ANN) is set up according to [18] with two hidden layers and ten neurons per hidden layer. It can be seen that the proposed contextual bandit-based method achieves the highest absolute data rate and provides an early convergence which is reached after ∼200 epochs. For the considered deep reinforcement learning and Q-learning methods, the final data rate of the converged system is significantly lower. Moreover, the Q-learning-based approach shows a slow convergence behavior.

The performance of the converged transmission schemes is now analyzed in a real world scenario (see Sec. 4). Fig. 7 shows multiple performance indicators for the proposed transmission scheme as well as for the considered references. It can be observed that the resulting data rate is successively improved through the different evolution stages of opportunistic data transfer: While the SINR-aware CAT method already outperforms the periodic approach, the introduction of machine learning-based network quality assessment by ML-CAT leads to a significant performance improvement. Ultimately, reinforcement learning-based autonomous decision making (RL-CAT and BS-CB) achieves the highest data rate values. For MNO A, BS-CB almost triples the resulting data rate. In addition, it can be seen that the apparently selfish goal of data rate optimization results in a significant reduction of MTC-related resource occupation (84% to 89%), which contributes to a better overall coexistence of different resource-consuming entities within the network. As a side effect, the power consumption of the mobile UE is also reduced, as the opportunistic transmission approaches implicitly prefer higher RSRP values which have a strong correlation with the applied transmission power [8]. For MNO B, it can be seen that the general power consumption level is much higher than for the other MNOs. In this scenario, the average distance to the eNBs is significantly higher for MNO B than for the other MNOs. As a result, a significantly higher transmission power is applied, which causes the mobile UE to be in a less power-efficient amplification stage for most of the time [8]. While the previous results have shown that opportunistic sensor data transfer achieves significant improvements on the client and network side, the price to pay is an increased AoI (about nine times the AoI of the periodic approach), which is the result of the buffering delay. However, the proposed method allows an upper limit for the acceptable AoI to be specified via the parameter $\Delta t_{\max}$ (see Sec. 3.3).

Figure 8: Black Spot Statistics (ECDFs over the black spot distance [m] for MNO A, MNO B, MNO C, and Multi-MNO)

Since the black spot-aware data transfer avoids transmissions if the UE is within a black spot region, it causes an additional buffering delay. Therefore, we now investigate the times and distances the vehicles spend within the black spot regions. Fig. 8 shows the corresponding Empirical Cumulative Distribution Functions (ECDFs) for the three MNOs. In addition, the behavior of a potential future multi-MNO extension is shown where the vehicle dynamically changes the network if it is within a black spot region. For all MNOs, 50 % of the black spot regions spread no more than 100 m, which only results in a slight additional delay. However, within the considered scenario, most of the black spots could be compensated through a multi-MNO approach, which massively reduces the side effects of the black spot-aware approach.
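The ECDF used for the black spot statistics can be computed with a few lines of Python. The sketch below is only an illustration of the statistic itself: the function name `ecdf` is hypothetical, and the distances in the usage note are made-up sample values, not measurements from the evaluation.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return F with F(x) = fraction of samples that are less than or equal to x."""
    xs = sorted(samples)
    n = len(xs)

    def F(x):
        # bisect_right counts how many sorted samples are <= x
        return bisect_right(xs, x) / n

    return F
```

With hypothetical black spot extents of 50 m, 80 m, 100 m, and 400 m, `ecdf([50, 80, 100, 400])(100)` evaluates to 0.75, meaning that three of the four regions spread no more than 100 m.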

In this paper, we presented BS-CB as a novel approach for opportunistic data transfer for vehicular sensor data. The proposed method makes use of a hybrid machine learning approach: Reinforcement learning is applied to autonomously schedule data transmissions with respect to the network quality based on data rate predictions. In addition, knowledge about geographically clustered black spot regions is utilized for avoiding transmissions with high prediction uncertainties. In a comprehensive real world evaluation, it was shown that the novel method not only achieves significant improvements for the uplink data rate and power consumption of the mobile UE, but also contributes to optimizing the resource efficiency of delay-tolerant MTC applications. In future work, we want to extend BS-CB with a multi-MNO strategy which allows dynamic network selection for compensating black spot regions. In addition, we plan to further analyze cooperative approaches, where the network infrastructure actively distributes network load information to the mobile clients [16], for data rate prediction in order to optimize the resulting accuracy. Moreover, we aim to move another step forward towards zero touch optimization through integration of online learning mechanisms for the data rate prediction. This would then allow the system to self-adapt to the concept drift caused by significant changes within the cellular network.

ACKNOWLEDGMENT

This work has been supported by the German Research Foundation (DFG) within the Collaborative Research Center SFB 876 “Providing Information by Resource-Constrained Analysis”, project B4.

REFERENCES

[1] 3GPP. 2019. Technical Report 29.520. 3rd Generation Partnership Project (3GPP).
[2] 5GAA. 2020. White paper: Making 5G proactive and predictive for the automotive industry. Technical Report. 5G Automotive Association.
[3] S. Ali, W. Saad, N. Rajatheva, K. Chang, D. Steinbach, B. Sliwa, C. Wietfeld, K. Mei, H. Shiri, H. Zepernick, T. M. C. Chu, I. Ahmad, J. Huusko, J. Suutala, S. Bhadauria, V. Bhatia, R. Mitra, S. Amuru, R. Abbas, B. Shao, M. Capobianco, G. Yu, M. Claes, T. Karvonen, M. Chen, M. Girnyk, and H. Malik. 2020. 6G white paper on machine learning in wireless communication networks.
[4] D. Arthur and S. Vassilvitskii. 2007. k-means++: The advantages of careful seeding. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms.
[5] L. Breiman. 2001. Random forests. Mach. Learn. 45, 1 (Oct 2001), 5–32.
[6] N. Bui, M. Cesana, S. A. Hosseini, Q. Liao, I. Malanchini, and J. Widmer. 2017. A survey of anticipatory mobile networking: Context-based classification, prediction methodologies, and optimization techniques. IEEE Communications Surveys & Tutorials (2017).
[7] B. Coll-Perales, J. Gozalvez, and J. L. Maestre. 2019. 5G and beyond: Smart devices as part of the network fabric. IEEE Network 33, 4 (July 2019), 170–177.
[8] R. Falkenberg, B. Sliwa, N. Piatkowski, and C. Wietfeld. 2018. Machine learning based uplink transmission power prediction for LTE and upcoming 5G networks using passive downlink indicators. Chicago, USA.
[9] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten. 2009. The WEKA data mining software: An update. SIGKDD Explorations 11, 1 (2009), 10–18.
[10] A. Herrera-Garcia, S. Fortes, E. Baena, J. Mendoza, C. Baena, and R. Barco. 2019. Modeling of key quality indicators for end-to-end network management: Preparing for 5G. IEEE Vehicular Technology Magazine 14, 4 (Dec 2019), 76–84.
[11] C. Ide, B. Dusza, and C. Wietfeld. 2015. Client-based control of the interdependence between LTE MTC and human data traffic in vehicular environments. IEEE Transactions on Vehicular Technology 64, 5 (2015), 1856–1871.
[12] ITU-T. 2019. Architectural framework for machine learning in future networks including IMT-2020. Recommendation ITU-T Y.3172. International Telecommunication Union.
[13] L. Li, W. Chu, J. Langford, and R. E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web (WWW '10). Association for Computing Machinery, New York, NY, USA, 661–670.
[14] K. Satoda, E. Takahashi, T. Onishi, T. Suzuki, D. Ohta, K. Kobayashi, and T. Murase. 2020. Passive method for estimating available throughput for autonomous off-peak data transfer. Wireless Communications and Mobile Computing.
[15] IEEE Transactions on Intelligent Transportation Systems (Jul 2019).
[16] B. Sliwa, R. Falkenberg, and C. Wietfeld. 2020. Towards cooperative data rate prediction for future mobile and vehicular 6G networks. Levi, Finland.
[17] B. Sliwa, N. Piatkowski, and C. Wietfeld. 2020. LIMITS: Lightweight machine learning for IoT systems with resource limitations. Dublin, Ireland. Best paper award.
[18] B. Sliwa and C. Wietfeld. 2019. Data-driven network simulation for performance analysis of anticipatory vehicular communication systems. IEEE Access (Nov 2019).
[19] B. Sliwa and C. Wietfeld. 2019. Empirical analysis of client-based network quality prediction in vehicular multi-MNO networks. Honolulu, Hawaii, USA.
[20] B. Sliwa and C. Wietfeld. 2020. A reinforcement learning approach for efficient opportunistic vehicle-to-cloud data transfer. Seoul, South Korea.
[21] J. Wang, C. Jiang, H. Zhang, Y. Ren, K. Chen, and L. Hanzo. 2020. Thirty years of machine learning: The road to pareto-optimal wireless networks. IEEE Communications Surveys & Tutorials (2020), 1–1.
[22] T.-Y. Yu, X. Zhu, and M. Maheswaran. 2018. Vehicular crowdsensing for smart cities. Springer International Publishing, Cham, 175–204.
[23] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi. 2014. Internet of things for smart cities. IEEE Internet of Things Journal 1, 1 (2014), 22–32.
[24] A. Zappone, M. D. Renzo, and M. Debbah. 2019. Wireless networks design in the era of deep learning: Model-based, AI-based, or both? IEEE Transactions on Communications 67, 10 (2019), 7331–7376.