Transfer Learning and Organic Computing for Autonomous Vehicles
Christofer Fellicious
University of Passau
Passau, [email protected]
Abstract: Autonomous Vehicles (AV) are one of the brightest promises of the future, which would help cut down fatalities and improve travel time while working in harmony. Autonomous vehicles will face challenging situations and experiences not seen before. These experiences should be converted to knowledge and help the vehicle prepare better in the future. Online Transfer Learning will help transfer prior knowledge to a new task and also keep the knowledge updated as the task evolves. This paper presents the different methods of transfer learning, online transfer learning and organic computing that could be adapted to the domain of autonomous vehicles.
Index Terms: online transfer learning, organic computing, autonomous driving
I. INTRODUCTION
Autonomous Vehicles (AV), or driverless cars, are one of the most widely discussed emerging technologies of the present day. An autonomous vehicle can be described as a vehicle that adapts to its surroundings and can navigate by sensing its environment with minimal human input. The advantage of autonomous vehicles is that they will be able to reduce the number of injuries, or even motor accidents in general, and also reduce transit times. Companies ranging from automobile manufacturers such as Volkswagen, Audi and BMW to technology companies such as Google, nVidia and Tesla are engaged in research on autonomous vehicles. This suggests that autonomous vehicles are the future of transportation.

The Society of Automotive Engineers (SAE) classifies autonomous vehicles into six levels based on their capabilities, from level 0, which offers only simple warnings to the driver, to level 5, where even the steering column is not required in the vehicle. While progressing through these levels, the cars accumulate years of shared experience of different types and scenarios that differ from region to region or even season to season. All this information from the sensors and the driver input should be processed into knowledge that is used to improve the autonomous functionalities of the vehicle. It is time consuming and inefficient to create models for specific tasks from scratch, whereas it is easier if the model adapts itself and applies the existing knowledge to a new task with minimal input. It is also impractical to rebuild a model every time an autonomous vehicle encounters a new scenario. Another issue for any machine learning task is the lack of large annotated datasets of adequate quality for building highly precise models. Transfer Learning is the method by which a preexisting predictor is adapted for newer tasks without completely retraining the predictor.
There are many real-world examples that can be seen among humans: the ability to ice skate can be transferred to learning to in-line skate, and learning to ride a bicycle helps in learning to ride a motorcycle. “Organic computing aims at mastering complexity in technical systems by equipping technical systems with ‘life-like’ properties, i.e. by means of characteristics observed in natural systems” [1] such as self-learning and self-organization. Online Transfer Learning is a combination of transfer learning and the self-learning aspect of organic computing. The prior knowledge from an existing domain is transferred to a new domain to generate a new model, and then the model is continuously refined based on the self-learning ability of organic computing. Online Transfer Learning (OTL) builds on the concept of transfer learning for settings where the predictor observes only a few features at a time. The advantage of online transfer learning is that it enables the model to be updated continuously as new data arrives.

Transfer Learning and Organic Computing are relevant in helping a vehicle learn, assimilate knowledge and extrapolate existing knowledge to different tasks. These methods are needed because it is virtually impossible to foresee all the situations that an autonomous vehicle will encounter during its use. The experiences could range from a tricky crossing to getting stuck in the sand. These methods will not only help solve new tasks but also keep fine-tuning the model.

The contribution made by this paper is a survey of
• the different methods of transfer learning and online transfer learning that could be used for autonomous vehicles
• the recent approaches in organic computing and how self-learning and self-organizing could be used in the domain of autonomous vehicles
• how the different methods relate and complement each other to make autonomous vehicles more efficient

II. RELATED WORK
A lot of research in recent years has focused on the autonomous driving domain, with nVidia releasing dedicated cards like the nVidia PX2 for the purpose. Zhou et al. [2] have described an interesting method that shows how semantic features learned by Convolutional Neural Networks (CNN) can be transferred. They were motivated by the absence of perfectly annotated large datasets for the autonomous driving problem. Pan et al. [3] have shown that reinforcement learning and transfer learning can be used to train models for autonomous vehicles.

A comprehensive paper by Janai et al. [4] reviews the problems faced by autonomous driving, the available datasets and the state-of-the-art methods. The paper provides an in-depth analysis of different methods used in autonomous vehicles, such as pedestrian detection, optical flow, 3-D reconstruction, object recognition and segmentation. The paper highlights problems such as the limited availability of high-quality optical flow datasets, which directly affects the quality of the trained models.

Semantic segmentation plays a very important role in understanding and interpreting a scene. Nigam et al. [5] consider an ensemble model incorporating knowledge transfer for the semantic segmentation of aerial images captured by drones.

The Passive Aggressive (PA) algorithm is popular for the online updating of models. Crammer et al. [6] discuss a family of algorithms for online learning predictions. The algorithms can be used for predictors ranging from binary to multi-class. The algorithm uses linear kernels, but with the application of the kernel trick it can be made to predict highly non-linear functions.

III. METHODS OF LEARNING
This section explains the different methods in transfer learning and organic computing.
A. Transfer Learning
Most machine learning algorithms assume that the training and testing data come from the same feature space. In some situations, there might not be enough training data for a particular scenario. Consider a vehicle that needs to learn to drive in mud. Instead of learning from scratch, the vehicle can learn how to drive on the road using a large dataset and then fine-tune that training using a smaller dataset of driving in mud. In short, the car does not need to learn to drive in mud from scratch; instead, it can learn to drive and then adapt that knowledge to mud. Transfer learning relies on the fact that there might be basic similarities between the tasks that can be carried over to the new one. This difference between traditional learning and transfer learning can be seen in Fig. 1.

Fig. 1. Comparison between traditional and transfer learning

Transfer learning can be classified into two categories based on the source domain and the target domain:
• Homogeneous Transfer Learning
• Heterogeneous Transfer Learning

Homogeneous transfer learning takes place when the source and the target domains are present in the same feature space, while in heterogeneous transfer learning they are in different feature spaces. Heterogeneous transfer learning is more difficult than the homogeneous mode because of the difference in feature spaces. “Most methods for heterogeneous transfer learning aim to learn a common feature representation based on some correspondences between domains such that both source and target domain data can be represented by homogeneous features” [7].
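The road-then-mud idea described in this subsection can be illustrated with a very small sketch. It is not taken from any of the surveyed papers: the linear model, the synthetic "road" and "mud" data, and the names `sgd_fit` and `mse` are assumptions made purely for illustration. The model is first fitted on a large source dataset and then fine-tuned for a few passes on a handful of target samples, and the result is compared with training from scratch on the same few samples.

```python
import random

def sgd_fit(data, w=None, epochs=50, lr=0.05):
    """Fit y ~ w[0] * x + w[1] by stochastic gradient descent on squared error."""
    w = list(w) if w is not None else [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            err = (w[0] * x + w[1]) - y
            w[0] -= lr * err * x
            w[1] -= lr * err
    return w

def mse(w, data):
    """Mean squared error of the linear model w on a dataset."""
    return sum(((w[0] * x + w[1]) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)
# Hypothetical source task ("road"): plenty of data.
source = [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(200))]
# Hypothetical target task ("mud"): related but shifted, only five training samples.
target_train = [(x, 2.2 * x + 0.7) for x in (rng.uniform(-1, 1) for _ in range(5))]
target_test = [(x, 2.2 * x + 0.7) for x in (rng.uniform(-1, 1) for _ in range(100))]

w_source = sgd_fit(source)                              # learn the source task
w_finetuned = sgd_fit(target_train, w=w_source, epochs=3)  # adapt to the target
w_scratch = sgd_fit(target_train, epochs=3)             # same budget, no transfer

print("fine-tuned error:", mse(w_finetuned, target_test))
print("from-scratch error:", mse(w_scratch, target_test))
```

With the same small training budget on the target task, the fine-tuned model starts close to a good solution, which is the intuition behind homogeneous transfer learning when the two tasks share a feature space.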
1) Attention Mechanism based Transfer Learning:
Moon and Carbonell [8] propose a novel approach in which they assume that not all knowledge gained in the source domain can be transferred to the target domain. Their approach selectively transfers knowledge from the source domain to the target domain by using only a subset of the source knowledge and suppressing the rest, which may have an adverse effect on learning in the target domain. In this approach, a component known as the attention mechanism learns a set of parameters that contribute to a weight vector over a discrete subset of data. This weight vector describes the relevance of a particular feature vector for transferring knowledge.

Their method first projects the source features and target features into a joint latent space via a linear transformation. These features are then mapped into an embedded label space using a shared transformation. The authors use an autoencoder for generating the learned model. Since the source and target vectors then share the same feature space, a parameter µ is used to penalize the attention mechanism and the joint space learning function if they learn only for the source domain. The advantage of this model is that, by extracting only those parts of the source data that are beneficial for transfer, the model becomes better streamlined.
2) Simultaneous multi task transfer learning:
Wong et al. [9] propose an automated transfer learning method in which the algorithm can learn several models suitable for different tasks. The authors assume that most training tasks in neural networks have common design decisions such as network depth, training iterations and learning rate. The controller, which is a Recurrent Neural Network (RNN), is capable of simultaneous multi-task training using two key features: (i) learned task representations and (ii) task-specific baseline and normalization. The first feature consists of mapping each task to a unique embedding. The generation of network configurations is controlled by feeding the controller the task embedding at every time step, along with the action embedding of the previous step. The controller then decides the appropriate action and passes it onwards to the neural network layers. The tasks are trained using policy gradient methods. Each task is also able to define a performance metric to be used as a reward. The reward affects the gradients applied to update the controller’s policy for that specific task. The controller has to keep track of multiple tasks and their rewards simultaneously, so it is necessary to ensure that the reward distributions have the same scale and variance. The authors claim that when a new task is given, the exploration can be significantly sped up by leveraging the learned biases about which combinations of parameter choices worked well together. The controller could learn an embedding for a new task and learn a representation that biases towards actions that performed well on other similar tasks. The advantage of this method is the ability to train multiple tasks simultaneously and the ability to establish a connection between a new task and a previously learned task. The downside of this algorithm is its memory-intensive nature and that it will not perform well unless a suitably large number of tasks have been trained on it.
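The requirement that reward distributions of different tasks share the same scale and variance can be sketched as follows. This is only an illustration under assumptions: the class name `TaskRewardNormalizer`, the use of a running mean as the task-specific baseline, and division by the running standard deviation are choices made here and are not claimed to be the exact scheme of Wong et al. [9].

```python
import math
from collections import defaultdict

class TaskRewardNormalizer:
    """Tracks a running mean and variance of rewards per task (Welford's algorithm)."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = defaultdict(float)
        self.m2 = defaultdict(float)   # running sum of squared deviations

    def update(self, task, reward):
        self.count[task] += 1
        delta = reward - self.mean[task]
        self.mean[task] += delta / self.count[task]
        self.m2[task] += delta * (reward - self.mean[task])

    def advantage(self, task, reward, eps=1e-8):
        """Reward minus the task baseline, divided by the task's standard deviation."""
        n = self.count[task]
        std = math.sqrt(self.m2[task] / n) if n > 1 else 1.0
        return (reward - self.mean[task]) / (std + eps)

norm = TaskRewardNormalizer()
for r in (0.90, 0.92, 0.94):      # hypothetical task A: accuracy-like rewards
    norm.update("A", r)
for r in (10.0, 20.0, 30.0):      # hypothetical task B: rewards on a larger scale
    norm.update("B", r)

adv_a = norm.advantage("A", 0.94)
adv_b = norm.advantage("B", 30.0)
print(adv_a, adv_b)
```

After normalization, the best reward of each task yields a comparable signal even though the raw rewards differ by orders of magnitude, so no single task dominates the policy gradient update of a shared controller.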
B. Organic Computing
“Organic Computing is a research field emerging around the conviction that problems of organization in complex systems in computer science, telecommunications, neurobiology, molecular biology, ethology, and possibly even sociology can be tackled scientifically in a unified way. From the computer science point of view, the apparent ease in which living systems solve computationally difficult problems makes it inevitable to adopt strategies observed in nature for creating information processing machinery” [10]. In other words, through organic computing researchers attempt to bring strategies inherent in nature, such as self-learning and self-organizing, to machines.

Organic computing machines form a sort of ensemble whose members could cooperate with each other and evolve over time. In such systems, each individual unit could be autonomous, but when viewed as a whole, they could be seen as a self-organizing entity. An organic computing system will respond dynamically to changes in the environment and will also have sufficient freedom to do so. When we consider connected autonomous vehicles as the basic entities, the application of organic computing methods and its benefits become clear. When considering autonomous vehicles of the future, it is expected that all vehicles will be interconnected and will communicate with each other to optimize travel times. The entire system can be seen as a sort of swarm, with each vehicle having a distinct function for that particular time, i.e., traveling from Point A to Point B. Such a swarm needs to self-organize dynamically, based on the inputs from individual components. This swarm can be seen as more than the sum of the individual elements. Suppose there is heavy traffic on a particular path; the other vehicles should then be rerouted. The whole system should self-organize in such a way that the adverse effects of any unintended scenario are minimized.
Organic computing helps in this area of autonomous vehicles by turning the whole network into a self-organizing mechanism.

Another application of organic computing is the self-learning paradigm. Once an entity, in this case an autonomous vehicle, has learned a new skill or experience, the knowledge can be transferred to the other vehicles. When an autonomous vehicle is confronted with an entirely new task, such as an unfamiliar terrain, the knowledge from a similar situation could be transferred to this particular task and then improved upon using organic computing. An example is navigating a complex intersection where vehicles arrive at varying speeds and densities.
1) Self Learning with Dynamic Navigation Maps:
Lu et al. [11] demonstrate the self-learning capabilities of an Unmanned Aerial Vehicle (UAV) taxiing in an aerodrome. They consider the UAV to be present in a highly dynamic environment where there could be foreign objects, airport vehicles moving around or even maintenance work taking place. To handle such problems, they propose the concept of a “dynamic navigation map”, which is a collection of two maps, the aerodrome map and the obstacle map. The aerodrome map provides visual clues to what the camera expects to see, and the obstacle map is used to store the previously learned obstacle distribution. The obstacle map can be combined with the images from the sensor array to prepare a much more robust obstacle map. The self-learning part is handled by a Bayesian approach. Bayesian inference is applied to the aerodrome map and the obstacle map separately. In this method, the probability of obstacles is estimated from the camera images and the level of uncertainty is calculated. As the uncertainty increases, more relevance is placed on the aerodrome map, which means that in such an uncertain situation more weight is given to prior knowledge. Using this method, if an object repeatedly appears in the images, it will be confirmed and learned with increasing confidence. The confidence measure is an advantage of the Bayesian method, as such a measure cannot be obtained through normal computer vision algorithms.
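The statement that a repeatedly observed object is learned with increasing confidence can be illustrated with a single-cell Bayesian update. This is a generic binary Bayes filter and not the formulation of Lu et al. [11]; the function name `bayes_update`, the prior of 0.1 and the sensor rates `p_hit` and `p_false` are assumptions chosen for illustration.

```python
def bayes_update(prior, detected, p_hit=0.8, p_false=0.2):
    """Posterior probability that a map cell holds an obstacle after one observation.

    p_hit   = P(detection | obstacle present)   (assumed sensor model)
    p_false = P(detection | no obstacle)        (assumed sensor model)
    """
    like_obstacle = p_hit if detected else 1.0 - p_hit
    like_free = p_false if detected else 1.0 - p_false
    evidence = like_obstacle * prior + like_free * (1.0 - prior)
    return like_obstacle * prior / evidence

belief = 0.1          # prior from the stored obstacle map (hypothetical value)
history = []
for _ in range(4):    # the same object is detected in four consecutive images
    belief = bayes_update(belief, detected=True)
    history.append(belief)

print(history)
```

Each consistent detection raises the belief, while a single missed detection lowers it again, so the stored map reflects both the prior knowledge and the accumulated evidence.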
2) Optimizing vehicle behavior through reinforcement input:
Müller-Schloer and Tomforde [1] illustrate how order can be obtained as an effect of reinforcement. The authors explain how the University of Michigan chose to lay concrete walkways along the most optimized paths between buildings. The University initially planted grass all over the campus and let the students choose their own paths. After some time, the most frequently used trails emerged where the grass was almost non-existent. In this example, even with multiple options, students would choose the most optimal path from building to building, and once the trails emerged they were reinforced as students continued to choose the most optimal trail. In the beginning, there may be multiple trails emerging, but only the most convenient are favored and reinforced. This could be used to solve issues with autonomous vehicles as they progress through the different levels of automation. Initially, the vehicle would be free to make any choice, but when an unfavorable choice is made the driver would intervene and rectify the mistake, or the driver might take control in a previously unseen situation. In this context, we consider that the driver has prior knowledge and that the choice made by the driver is optimal. This intervention can be reinforced over time, changing the choice of the autonomous vehicle [12].
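The reinforcement of driver interventions can be sketched as a table of preferences that is strengthened along the "trail" the driver chooses. The class `InterventionLearner`, the situation label and the action names are hypothetical, and the fixed additive step is an assumption; this is a toy model of the idea, not a method from [1] or [12].

```python
from collections import defaultdict

class InterventionLearner:
    """Keeps a preference score per (situation, action) and reinforces driver corrections."""

    def __init__(self, actions, step=1.0):
        self.actions = list(actions)
        self.step = step
        self.preference = defaultdict(lambda: {a: 0.0 for a in self.actions})

    def choose(self, situation):
        prefs = self.preference[situation]
        # On ties, the first listed action is returned.
        return max(self.actions, key=lambda a: prefs[a])

    def observe(self, situation, chosen, driver_action=None):
        """Reinforce the driver's correction if there was one, else the vehicle's own choice."""
        final = driver_action if driver_action is not None else chosen
        self.preference[situation][final] += self.step
        if driver_action is not None and driver_action != chosen:
            # The overridden choice is treated as unfavorable.
            self.preference[situation][chosen] -= self.step

learner = InterventionLearner(actions=["proceed", "yield"])
first_choice = learner.choose("blind_junction")
learner.observe("blind_junction", first_choice, driver_action="yield")  # driver intervenes
second_choice = learner.choose("blind_junction")
print(first_choice, "->", second_choice)
```

As with the campus trails, choices that are repeatedly confirmed accumulate preference, while a choice the driver overrides loses it.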
C. Online Transfer Learning
Online Transfer Learning is an extension of the Transfer Learning framework and Organic Computing methods in which the problem is addressed using an online learning framework. In an online learning framework, the algorithm observes instances in a sequential manner. After observing an instance, the algorithm makes a prediction based on its current knowledge. The algorithm might then receive feedback indicating the correct output. Using this feedback, the algorithm may improve itself so that future classifications have better accuracy.
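The observe, predict, receive feedback, update cycle described above can be sketched with the Passive Aggressive algorithm of Crammer et al. [6] for binary labels. The sketch below uses the PA-I variant with a linear model; the toy data stream, the aggressiveness parameter `C = 1.0` and the helper names `predict` and `pa_update` are assumptions made for illustration.

```python
def predict(w, x):
    """Sign of the linear score; a score of zero is treated as +1."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= 0 else -1

def pa_update(w, x, y, C=1.0):
    """One Passive-Aggressive (PA-I) step for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)            # hinge loss
    sq_norm = sum(xi * xi for xi in x)
    if loss == 0.0 or sq_norm == 0.0:
        return w, loss                       # passive: margin already satisfied
    tau = min(C, loss / sq_norm)             # aggressive step size, capped by C
    return [wi + tau * y * xi for wi, xi in zip(w, x)], loss

# Hypothetical stream of (features, label) pairs arriving one at a time.
stream = [((1.0, 2.0), 1), ((-1.5, -0.5), -1), ((2.0, 0.5), 1), ((-0.5, -2.0), -1)]

w = [0.0, 0.0]
mistakes = 0
for x, y in stream * 2:                      # two passes over the stream
    if predict(w, x) != y:                   # predict before seeing the label
        mistakes += 1
    w, _ = pa_update(w, x, y)                # then learn from the feedback

print("weights:", w, "mistakes:", mistakes)
```

The model is left unchanged when an instance is already classified with sufficient margin and is corrected just enough otherwise, which makes the update cheap enough to run on every incoming instance.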
1) Ensemble based Online Transfer Learning:
Zhao and Hoi [13] propose an ensemble-based Online Transfer Learning (OTL) framework. The authors experiment with both homogeneous and heterogeneous data. In the homogeneous online transfer learning scenario, the authors first create a model based only on the target data. Later, an ensemble model is created which is a mixture of both the source and target data. A problem associated with homogeneous transfer learning is that the target variable to be predicted might change over time during training; this phenomenon is known as concept drift. In order to mitigate the problem of concept drift, the ensemble model features predictors from both the source and target domains. The predictions of both functions are combined using weights. The framework uses a two-step update. First, the prediction function (f) is updated using an online learning method, namely the Passive Aggressive algorithm [6]. Second, the prediction weights are updated dynamically based on the current weights and a function of the square loss of the prediction.

In the heterogeneous online transfer learning experiment, Zhao and Hoi assume that the source feature space is a proper subset of the target feature space. The authors propose a multi-view approach for tackling the heterogeneous data problem, as the source domain and target domain are very different. The authors also assume that the first m dimensions of the target dataset features represent the source dataset features. Each data instance is split into two parts, where the first part represents the source domain and the second part represents the new target domain. This helps the ensemble classifiers to classify the newly observed data sample correctly and forces the multi-view method not to deviate too much from the previous classifiers.
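The second update step, in which the combination weights depend on the current weights and the square loss of each predictor, can be sketched as below. The mapping of scores to [0, 1], the exponential form of the reweighting and the value `eta = 0.5` are assumptions in the spirit of the homogeneous OTL setting of [13]; the function names are hypothetical.

```python
import math

def clip01(score):
    """Map a real-valued score (or a label in {-1, +1}) to the interval [0, 1]."""
    return max(0.0, min(1.0, (score + 1.0) / 2.0))

def ensemble_predict(alpha_src, alpha_tgt, src_score, tgt_score):
    """Weighted combination of the source and target predictors."""
    combined = alpha_src * clip01(src_score) + alpha_tgt * clip01(tgt_score)
    return 1 if combined >= 0.5 else -1

def update_weights(alpha_src, alpha_tgt, src_score, tgt_score, y, eta=0.5):
    """Reweight both predictors by exp(-eta * square loss), then renormalize."""
    target = clip01(y)
    s_src = math.exp(-eta * (clip01(src_score) - target) ** 2)
    s_tgt = math.exp(-eta * (clip01(tgt_score) - target) ** 2)
    norm = alpha_src * s_src + alpha_tgt * s_tgt
    return alpha_src * s_src / norm, alpha_tgt * s_tgt / norm

# Hypothetical drift scenario: the source predictor is now always wrong,
# while the online target predictor is right.
a_src, a_tgt = 0.5, 0.5
for _ in range(10):
    a_src, a_tgt = update_weights(a_src, a_tgt, src_score=-1.0, tgt_score=1.0, y=1)

print(a_src, a_tgt)
```

When the source predictor stops matching the target concept, its weight decays quickly and the ensemble relies on the target predictor, which is how the weighting limits the damage from concept drift.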
2) Learning from multiple source domains:
Wu et al. [14] describe an online transfer learning method in which the knowledge from multiple source domains is considered and transferred to a single target domain. The authors assert that by building a model from multiple but related source domains for homogeneous transfer of knowledge, the final model becomes more refined and learns to identify the core elements of knowledge that are to be transferred to the target domain. This approach is valid for heterogeneous transfer learning too, where the authors assume that if the target domain feature space is split into two sets, the first set would contain homogeneous features which are shared by both the source and target domains, while the second set would contain heterogeneous features. An image classification example stated by the authors is to obtain the subset of features of the target images from multiple source domains and then transfer the source knowledge to a new model. The authors combine multiple classifiers created from the source domains to form an ensemble classifier for the target domain. The source domain data is given in advance, so for each source domain a classifier is built in an offline learning paradigm. The target data is acquired in an online fashion, and the Passive Aggressive algorithm is used for learning the representation of the target data. The loss for the decision is calculated by a hinge loss function and is added to the prior learned samples.

In the scenario of heterogeneous transfer learning, the feature space is divided into two. The first part is assumed to be homogeneous with the source domain, while the second part is heterogeneous with the source domain. A set of three base classifiers is learned for each source domain: the source domain classifier $f^{s_i}$ and two target domain classifiers of the form $f^{T}_{i,t}$, one for each section of the target feature space. The two target domain classifiers are learned “by combining the first section and the second section in target domain with the source domain, respectively” [14].
In the next step, the weights pertaining to each classifier are learned. The combination of the base classifiers and learned weights produces a robust classifier that can perform well in the target domain.
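How learned weights let several base classifiers form one robust target classifier can be sketched with a weighted vote. The multiplicative decay used here (shrinking the weight of every classifier that errs by a factor `beta`) is a swapped-in, Hedge-style rule chosen for brevity and is an assumption, not necessarily the weighting scheme of [14]; the function names are hypothetical.

```python
def weighted_vote(weights, votes):
    """Combine base classifier votes in {-1, +1} by a weighted sum."""
    total = sum(w * v for w, v in zip(weights, votes))
    return 1 if total >= 0 else -1

def decay_weights(weights, votes, y, beta=0.5):
    """Shrink the weight of every base classifier that voted wrongly, then renormalize."""
    raw = [w * (beta if v != y else 1.0) for w, v in zip(weights, votes)]
    norm = sum(raw)
    return [r / norm for r in raw]

# Hypothetical setting: two source classifiers disagree with the target concept,
# one classifier agrees with it. The true label is +1 in every round.
votes, y = [-1, -1, 1], 1
weights = [1 / 3, 1 / 3, 1 / 3]

before = weighted_vote(weights, votes)
for _ in range(3):
    weights = decay_weights(weights, votes, y)
after = weighted_vote(weights, votes)

print(before, "->", after, weights)
```

Although the majority of base classifiers is wrong at the start, the ensemble prediction flips to the correct label once the weights have adapted, which is the benefit of learning the weights online rather than fixing them in advance.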
3) Object tracking with Convolutional Neural Networks:
Chen et al. [15] propose a method, “Online discriminative object tracking via deep convolutional neural network”, in which a neural network learns the discriminative features of an object and tracks the location and size of the object. The learning process consists of an initial transfer learning step, and the tracking is done in an online learning framework. The authors select a deep neural network because of the network’s capability of learning high-level feature representations of targets. The key idea of the authors is to use the layers of the deep neural network as a generic, middle-level image representation. The neural network is trained on the CIFAR-10 dataset. Once the training is completed, the parameters of all layers except the fully connected layer are transferred to the target task, which has only one output neuron. The network is then trained on the dataset of the object tracking task. The object tracking is done via a Particle Filtering Network that implements the Bayesian filter by Monte Carlo sampling. There are two main components to the particle filter: the dynamic model, which generates candidate samples based on prior experience, and the observation model, which computes the similarity between the prediction and the actual value. For training, a manually annotated first image is given to the model, the coordinates of the object are obtained and a patch is extracted. The patch is rescaled to 32x32 pixels (the size of the CIFAR-10 dataset images). Positive and negative samples are generated by this method, and the network is trained on the generated data.

A likelihood value is computed from the previously trained neural network, whose output neuron gives out a score. The likelihood is calculated by the equation

$$p(y_t \mid x_t) = \exp(d_t) \tag{1}$$

While tracking an object, the appearance of the object might change due to its motion.
So the likelihood function needs to adapt over time by fine-tuning the neural network model. A main drawback of the appearance-based model is its susceptibility to drift: the model may slowly start to adapt to non-targeted objects. To alleviate this problem, the authors accept the likelihood value only if it is above a certain threshold, and the neural network is updated only if the likelihood value is above a second, higher threshold.

IV. CONCLUSION AND DISCUSSION
We outline methods in Transfer Learning, Organic Computing and Online Transfer Learning that can be used in the domain of autonomous vehicles. The paper shows how the different methods could work in tandem, resulting in an output that is better than what each individual component achieves. It is shown that online transfer learning is able to address the issues of limited availability of annotated data and dynamically changing environments by self-learning. It is also possible that the models used in natural language processing can be extrapolated to the vision and audio domains easily. We believe online transfer learning would be one of the best options to look into for the development of autonomous vehicles.

REFERENCES
[1] C. Müller-Schloer and S. Tomforde, Organic Computing - Technical Systems for Survival in the Real World. Springer, 2017.
[2] V. Rausch, A. Hansen, E. Solowjow, C. Liu, E. Kreuzer, and J. K. Hedrick, “Learning a deep neural net policy for end-to-end control of autonomous vehicles,” in American Control Conference (ACC), 2017. IEEE, 2017, pp. 4914–4919.
[3] X. Pan, Y. You, Z. Wang, and C. Lu, “Virtual to real reinforcement learning for autonomous driving,” arXiv preprint arXiv:1704.03952, 2017.
[4] J. Janai, F. Güney, A. Behl, and A. Geiger, “Computer vision for autonomous vehicles: Problems, datasets and state-of-the-art,” arXiv preprint arXiv:1704.05519, 2017.
[5] I. Nigam, C. Huang, and D. Ramanan, “Ensemble knowledge transfer for semantic segmentation,” in . IEEE, 2018, pp. 1499–1508.
[6] K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer, “Online passive-aggressive algorithms,” Journal of Machine Learning Research, vol. 7, no. Mar, pp. 551–585, 2006.
[7] J. T. Zhou, S. J. Pan, I. W. Tsang, and Y. Yan, “Hybrid heterogeneous transfer learning through deep learning,” in AAAI, 2014, pp. 2213–2220.
[8] S. Moon and J. Carbonell, “Completely heterogeneous transfer learning with attention - what and what not to transfer,” in Proceedings of the 26th International Joint Conference on Artificial Intelligence. AAAI Press, 2017, pp. 2508–2514.
[9] C. Wong, N. Houlsby, Y. Lu, and A. Gesmundo, “Transfer automatic machine learning,” arXiv preprint arXiv:1803.02780, 2018.
[10] R. P. Würtz, Ed., Organic Computing. Springer-Verlag Berlin Heidelberg, 2008.
[11] B. Lu, M. Coombes, B. Li, and W. H. Chen, “Improved situation awareness for autonomous taxiing through self-learning,” IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 12, pp. 3553–3564, Dec 2016.
[12] Y. Prakash, K. Prabhu, S. Kamtekar, and S. Gadhe, “Incorporation of swarm intelligence in autonomous cars.”
[13] P. Zhao and S. C. Hoi, “OTL: A framework of online transfer learning,” in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 1231–1238.
[14] Q. Wu, H. Wu, X. Zhou, M. Tan, Y. Xu, Y. Yan, and T. Hao, “Online transfer learning with multiple homogeneous or heterogeneous sources,” IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 7, pp. 1494–1507, July 2017.
[15] Y. Chen, X. Yang, B. Zhong, S. Pan, D. Chen, and H. Zhang, “CNNTracker: Online discriminative object tracking via deep convolutional neural network,”