Multi-Sensor Data and Knowledge Fusion -- A Proposal for a Terminology Definition
Silvia Beddar-Wiesing, Maarten Bieshaar
Intelligent Embedded Systems, University of Kassel, Germany {s.beddarwiesing,mbieshaar}@uni-kassel.de
Abstract.
Fusion is a common tool for the analysis and utilization of available datasets and thus an essential part of data mining and machine learning processes. However, a clear definition of the type of fusion is not always provided due to inconsistent literature. In the following, the process of fusion is defined depending on the fusion components and the abstraction level on which the fusion occurs. The focus in the first part of the paper at hand is on a clear definition of the terminology and the development of an appropriate ontology of the fusion components and the fusion levels. In the second part, common fusion techniques are presented.
1 Introduction

The generic term fusion describes the combination of different fusion components that consist of available datasets. A more specific definition of fusion is only possible regarding the context and the purpose of the fusion, i.e., in particular the components to fuse. In general, the assumption behind the application of fusion is that fusing datasets from different sources improves the performance of the subsequent data processing.

Consider the task of tracking a pedestrian at a crossroad with the help of a set of cameras. The aim is to generate a position prediction of the pedestrian at the next timestep. For this purpose, predictions based on the images from different cameras can be combined to obtain a more robust prediction. Fusion at an earlier stage of the data processing can also improve the prediction accuracy: fusing knowledge in the form of models of pedestrian behavior, fusing information about the location and velocity of the pedestrian, or fusing images of the same pedestrian to obtain less noisy images can all contribute to a more precise prediction.

Unfortunately, inconsistent vocabulary often complicates the clear definition and description of fusion algorithms and, as a result, makes the categorization of the datasets and the selection of a corresponding fusion technique difficult. The aim of the paper at hand is to clarify the terms for possible fusion components, the terms used for the different process levels, and the definition of the total process. Several deceptive definitions are discussed, and afterwards an ordered definition of the fusion terms is proposed that combines the common definitions of the fusion components and the fusion levels. In the last section, a selected set of fusion techniques for the different fusion levels is listed.

2 Subdivision of Fusion Techniques
To define a categorization of fusion techniques, it is necessary to first define the terms used for the fusion components. The definition of the fusion components corresponds to the level of abstraction that can be determined by means of the data-information-knowledge-wisdom (DIKW) hierarchy [1]. Furthermore, the constitution of the fusion components per level restrains the possible fusion algorithms to a specific family of techniques. In the following, an extension of the DIKW hierarchy is illustrated and afterwards the fusion levels are specified.
There have been wide studies on the categorization of data. In this section, two popular concepts are examined considering the application of the data as fusion components. One common categorization has been published by Ackoff [1] and divides data into five categories that can be transferred into each other: data, information, knowledge, understanding, and wisdom.

Ackoff describes data as representations of objects or events. Processing the data to improve its usability leads to information that is used in descriptions and answers questions that begin with what, who, where, and how many. The application of data and information generates knowledge that can transform information into instructions and answers questions that begin with how. If relations and patterns in the information are identified, the context has been captured and, as a result, understanding has been reached. Understanding helps with questions that begin with why. Finally, wisdom includes the ability of judgement and the competence of dealing with the value of the data, information, and knowledge. According to the further elaboration of the definitions from
Fig. 1. The adjusted DIKW hierarchy: data, information, knowledge, and wisdom, connected by understanding relations, understanding patterns, and understanding principles.
Ackoff in [4], Bellinger et al. take the view that understanding is not a stage in the hierarchy but the condition for the transition from a lower to an upper level, as schematized in Figure 1. Thus, understanding relations between data leads to information, understanding patterns in information generates knowledge, and understanding the underlying principles of knowledge results in wisdom.

With regard to the following fusion techniques, which originate mostly from a machine learning context, the extended version of the DIKW hierarchy provides a more applicable definition of the fusion components. In Section 3, the relation between the extended DIKW hierarchy and the fusion levels will be clarified. But first, the fusion process and the levels of fusion are defined in the following.

2.2 Definition of Fusion and Fusion Level
Especially in terms of fusion definitions, the literature varies. In general, there exist two kinds of perspectives: on the one hand, there are universal definitions of fusion in the sense of a whole process; on the other hand, fusion is performed on components from different abstraction levels. Due to the inconsistent classification of fusion techniques, the comparison of literature is often difficult. In particular, the definitions of data and information fusion differ. In the following sections, the terminology of different approaches is discussed and, in the conclusion, an ontology of the terms for the fusion components and the fusion levels is proposed in the form of the fusion level rainbow as illustrated in Figure 4.
Fusion.
In [6], several differing previous definitions as well as a new definition of sensor, data, and information fusion are listed. Staying with the attempt to categorize the general definitions, it is noticeable that they focus on three different aspects in particular.

Firstly, the fusion components are a central aspect. In some literature, they are explicitly identified as observations or measurements, respectively raw data [11,29,38]. In most cases, they are referred to as data or information, or it is specified that they originate from multiple sources, but there is no detailed discussion about the representation of the data.

Secondly, some of the definitions also focus on the process behind the fusion. A popular precedent for this is one of the first definitions of fusion published by the Joint Directors of Laboratories (JDL) in the form of a data fusion lexicon [39]. The fusion process is described as the "association, correlation, and combination of data and information from single and multiple sources" [39]. It is additionally mentioned that the fusion has several levels, which are explained separately as listed in the next section. A different description of fusion in [26] interprets the process as the application of prior knowledge to a concrete realization.

Thirdly, most authors focus on the purpose of the fusion or consider the combination of the three aspects presented. The most common goal that is stated includes an improvement of the obtained information due to fusion. Further advantages are reflected in its applicability for many purposes: the resulting understanding of the observed situation [36], smoothed data and reduction in uncertainty [7], an optimal estimate of a hidden state [14], an improved performance of inference [19], improvement in prediction [34] or decision tasks [12], or, in general, richer and more useful information [37,38].
Fusion Level.
Many definitions of the levels of fusion refer to the published data fusion lexicon of the JDL [39]. Here, the level specification originally has been analysed from the point of view of a military application, but it can be extended to several fusion applications. It consists of three interrelated levels, but it is more common to use the extended JDL definition. The latter is illustrated in Figure 2 and includes the three levels from the JDL model (levels 1-3), extended by three more levels as described in [20]. The six levels are defined as follows:
Level 0:
Fusion of raw data in the form of signal refinement to obtain preliminary information about the characteristics of the observed object or situation.
Level 1:
Data is processed to specify the position or identity of an entity, or to classify characteristics of it. This is called the object refinement.
Level 2:
Relationships between objects and events considering the environment lead to a situation refinement by, e.g., analyzing relation structures.
Level 3:
Characterized as the threat refinement. In general, this can be interpreted as a risk estimation by drawing inferences or predictions for application-specific operations.
Level 4:
The performance improvement of the entire fusion process by refining its elements during a suitable type of monitoring.
Level 5:
A process of cognitive refinement via optimizing the interaction of the process with the user; it is dissociated from the previous levels.
Fig. 2. The extended fusion level definition of the JDL includes six different levels that can be passed through during an entire fusion process (sketched here based on [20]). [Diagram: local, distributed, and national sources with source preprocessing feed Level 1 processing (object refinement), Level 2 processing (situation refinement), Level 3 processing (threat refinement), and Level 4 processing (process refinement) inside the data fusion domain, which is connected to a human/computer interface and a database management system with support and fusion databases.]
For a more differentiated categorization of fusion processes, the goal of the paper at hand is to combine the characterization of the fusion components from Section 2.1 and the fusion levels (and eventually the constitution of the fusion components). From this point of view, however, the level definitions of the JDL are too broad, and the type of the fusion components is of interest only in the first four levels.

In contrast to the JDL model, the model proposed by Dasarathy [9], where fusion processes are divided into five levels, is more associated with the DIKW hierarchy from Section 2.1. The reason is that the choice of division is directly related to the fusion components and the fusion emissions. Interpreting the knowledge as a feature of the observed event and wisdom as decisions, which can exist, e.g., as classification decisions, evaluations of a regression model, or predictions, the following model is directly based on the DIKW hierarchy.

In addition, Varshney [36] added a sixth level, so that all possible ascending (relating to the components presented in 2.1) or equal pairs of inputs and outputs of a fusion are considered. The all-encompassing extension of Dasarathy's model is illustrated in [20] and will be briefly listed in the following. For this representation, the shortcuts DAI (input: data), DAO (output: data), FEI (input: feature), FEO (output: feature), DEI (input: decision), and DEO (output: decision) are used.
All pairs of inputs and outputs during a fusion process can be interpreted as follows:

1. DAI/DAO: signal detection (fusion of raw datasets generates data with less noise; a sharper signal can, e.g., be used to detect signal sections)
2. DAI/FEO: feature extraction (fusion of datasets generates data that can be used to extract relevant features)
3. DAI/DEO: Gestalt-based object characterization (fusion of raw datasets can lead to a better characterization of an object or a decision)
4. FEI/DAO: model-based detection and feature extraction (fusion of features leads to refined features from which data can be generated, e.g., the fusion of Gaussians and subsequent sampling)
5. FEI/FEO: feature refinement (fusion of different models that describe the same feature generates a more confident feature model)
6. FEI/DEO: feature-based object characterization (feature level fusion refines the description of an object or a decision)
7. DEI/DAO: model-based detection and estimation (decision fusion sharpens the decision model from which data can be generated, e.g., a mixture of generative experts and subsequent sampling with the Gibbs sampler [5])
8. DEI/FEO: model-based feature extraction (decision fusion can lead to a decision from which a feature can be derived)
9. DEI/DEO: object/decision refinement (e.g., a mixture of experts leads to a better decision)

In fact, however, the combination of data as input and decision as output does not often occur in common tasks. Additionally, cases 4, 7, and 8 are feature respectively decision level fusions from which samples can be drawn or conclusions derived afterwards. Depending on the form of the feature representation, samples can be generated by common sampling methods as described in [5]. Furthermore, there is a set of models that apply an additional backward connection from the top level to the data acquisition process, as in the OODA (Observe, Orient, Decide, Act) loop [3].
As a result, the generation and collection process of input data can be adapted with regard to the conclusive decision evaluation.

One last model has to be mentioned here that is also common, but especially used in image fusion applications, and is similar to the definitions described before. Here, the fusion is divided into the pixel, feature, and decision levels [8]. Analogous to data, feature, and decision fusion, this model is applied to image processing, which is a subset of sensor fusion. An illustration is given in Figure 3.
Fig. 3.
For image data, the data level fusion is known as pixel level fusion and is used for image processing (figure inspired by the depiction from [8]).
In order to combine the delineated fusion level terms, the following ordered ontology describes the relations between the terms:

◦ (Pixel, Sensor) Data Fusion ◦ Information Fusion ◦ Knowledge/Feature Fusion ◦ Decision Fusion

It is illustrated in Figure 4. At the bottom of the entire fusion process, raw data given by sensors or other sources are fused while understanding characteristics and relations of the input. The refined data, with a low information loss compared to the original data, provides an updated representation for further applications. The obtained information is a fundamental basis for the feature extraction on the next level, which proposes an underlying model for the data. This can provide knowledge that can be used to understand patterns in the data and thus create awareness of underlying principles in the source data. On top of the fusion process, the aim is to gain wisdom in the form of performance improvement in decision making and thus in the choice of action. Depending on the impact of the action, the entire fusion process can be adapted at the different stages.
Fig. 4. The Fusion Level Rainbow. Based on the extended Dasarathy model, the fusion within a component leads to a refined element of the corresponding component or to a more abstract emission. [Diagram: images/sensory data/data (pixel, sensor, and data fusion), information (information fusion), knowledge (knowledge/feature fusion), awareness, and wisdom (decision fusion), with abstraction and understanding increasing upwards and an adjustment of the data acquisition leading back down.]
3 Fusion Techniques of the Particular Levels
In this section, selected techniques from the different fusion levels are presented. At the beginning of Section 3.1, a fine structure of sensor fusion techniques is outlined; afterwards, statistical fusion methods are described in more detail. To give a short overview of algorithms used in data and information fusion, the focus in Section 3.2 is on the constitutions of data that require different fusion processes. At the end, in Sections 3.3 and 3.4, different forms of knowledge and decision fusion techniques are characterized.

3.1 Sensor Fusion
In the fusion level rainbow (Fig. 4), sensor fusion is a subset of the lowest level of fusion, the data fusion. Sensor data is a special case of data that can be represented as a data point in a high-dimensional space and is produced by (multiple) sensors. Sensor fusion techniques can again be categorized according to the information flow between the available sensor network and the different sensor configurations.

Firstly, the fusion can be implemented centralized or decentralized [18]. In the centralized architecture, the measurements of all sensors are available during the fusion process, so a batch method is used. In contrast, in decentralized fusion, the measurements of each sensor are fused within a separate fusion model. During the global fusion process, only the model information of each sensor is available and processed sequentially. Decentralized fusion is often preferred since the fusion process is considered to be more robust and reliable [14].

Secondly, sensor fusion can furthermore be divided into three cases depending on the sensor configuration, as listed in [10]:

Competitive Sensor Fusion. (homogeneous) Either data from sensors of the same modality are fused, or the sensors can be transformed to the same baseline previously and are fused afterwards. Data fusion of competitive sensors can be used to reduce noise respectively uncertainty. In connection with the initial example, competitive sensors in the form of cameras used for the pedestrian tracking produce images of the same person at the same time. By fusing the images, which may contain a degree of uncertainty, the resulting images are less noisy and more applicable for the tracking task.
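As a minimal sketch of competitive fusion at the data level (with hypothetical measurement values), redundant readings of the same quantity can be combined by inverse-variance weighting, which never increases the uncertainty beyond that of the best sensor:

```python
import numpy as np

def fuse_competitive(measurements, variances):
    """Inverse-variance weighted fusion of redundant (competitive) sensor
    readings of the same quantity; the fused variance is at most as large
    as the smallest input variance."""
    measurements = np.asarray(measurements, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / weights.sum()
    fused_mean = fused_var * (weights * measurements).sum()
    return fused_mean, fused_var

# Three noisy position readings of the same pedestrian (hypothetical values)
mean, var = fuse_competitive([10.2, 9.8, 10.5], [0.5, 0.5, 1.0])  # -> 10.1, 0.2
```

Here, the fused variance of 0.2 is smaller than any single sensor's variance, illustrating the noise reduction achieved by competitive fusion.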
Complementary Sensor Fusion. (heterogeneous) Sensors observe the same event, and their fusion generates a complemented image of the observation. This means the sensors can measure different and disjunct parts of the same event, and the combination leads to a complete characterization of it. A complementary set of cameras, e.g., can provide an extended picture of a crossroad in contrast to the image of only one camera, which simplifies the subsequent tracking of a pedestrian.

Cooperative Sensor Fusion. A sensor is configured depending on the information from other sensors to generate more useful information. This form of sensor network configuration includes some sort of temporal delay and dependency on a decision of an expert. The tracking task can require the possibility to adapt the camera angles after observing a certain behaviour of the pedestrian.

Sensor fusion techniques also differ in the assumptions about the system under consideration. For processing sensor data with uncertainties, statistical sensor fusion techniques for static and dynamic systems are presented by Fredrik Gustafsson in [18]:
Statistical Sensor Fusion.
The main idea behind statistical models is that the sensors are noisy and the "true" characterization of the event is given by a state vector x. Furthermore, it is expected that, by fusing different sensors, the state vector gets more precise. The assumption here is that N given observations y_n ∈ R^{n_y}, n ∈ [N], stacked in y ∈ R^{N·n_y}, can be described by a model y = h(x) + e. Gustafsson discusses linear models h(x) = Hx with a stacked factor matrix H ∈ R^{N·n_y × n_x}, as well as nonlinear models that relate the observations to the hidden state vector x ∈ R^{n_x}. Most of the time, the stacked error e ∈ R^{N·n_y} is assumed to be Gaussian, but the non-Gaussian case is mentioned too.

In the static case, the hidden state x ∈ R^{n_x} is time-invariant, so that the model describes, e.g., the observation of the same event by means of several sensors or by the same sensor at different timestamps. In the dynamic case, the state x_k ∈ R^{n_x} additionally varies with time according to a sequential update model x_{k+1} = f(x_k) + v_k, with a linear or nonlinear mapping f: R^{n_x} → R^{n_x} and Gaussian or non-Gaussian noise v_k ∈ R^{n_x}.

For providing a new state estimation, Gustafsson gives an insight into a variety of least squares approaches that can be applied when the measurements are independent. In case of correlated measurements in the static case, Gustafsson presents the safe fusion algorithm, which will be discussed in Section 3.2. For dynamic systems, Gustafsson lists popular filtering algorithms, such as several variations of the Kalman filter [23], that infer the state from the observations using dynamic linear or nonlinear models. Numerical methods approximating the nonlinear filter models are also discussed. In grid-based methods, e.g., parts of the calculation during the filtering process are approximated by discretising the state space [21] or replacing integrals by finite sums [27].
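The static linear Gaussian case can be sketched as a weighted least squares problem; the snippet below (with hypothetical state and noise values) fuses two direct observations of the same 2-D state, which under Gaussian noise coincides with the maximum likelihood estimate:

```python
import numpy as np

# Two sensors observe the same 2-D state directly: y = H x + e,
# with stacked H and block-diagonal Gaussian noise covariance R.
x_true = np.array([1.0, 2.0])
H = np.vstack([np.eye(2), np.eye(2)])   # stacked factor matrix
R = np.diag([0.4, 0.4, 0.1, 0.1])       # second sensor is more accurate
rng = np.random.default_rng(0)
y = H @ x_true + rng.multivariate_normal(np.zeros(4), R)

# Weighted least squares estimate (maximum likelihood under Gaussian noise)
W = np.linalg.inv(R)
P = np.linalg.inv(H.T @ W @ H)          # fused covariance, smaller than both sensors'
x_hat = P @ H.T @ W @ y                 # fused state estimate
```

The fused covariance P is smaller than that of either sensor alone, reflecting the expectation that the state vector gets more precise through fusion.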
3.2 Data and Information Fusion

In this section, the cases of data respectively information fusion in the two lowest levels of the fusion level rainbow (Fig. 4) are considered in general. When selecting a data fusion technique, the first aspect to focus on is the constitution of the data. In [25], a detailed overview of fusion techniques for the following types of data quality is given:
Imperfect Data.
The case of imperfect data occurs in nearly all applications of data fusion. To deal with a certain degree of imperfection, the following algorithms, which are described together with former extensions in [25], can be used:

– Probabilistic Fusion:
The main idea is to represent the imperfection of data by means of uncertainty in the form of probability distributions, fusing them according to the Bayesian fusion formula from [5]:

p(X | Z) = p(Z | X) · p(X) / p(Z)    (1)

It determines the posterior probability distribution of the (real, underlying) state X depending on the observations Z = {z_1, ..., z_t}, the likelihood p(Z | X), and the (chosen) prior distribution p(X). Analogously, the Bayesian fusion can be formulated for the dynamic case by using observations up to time t [18]. Based on Bayes' theorem, several fusion techniques have been formulated that are used in algorithms for numerous applications, such as the Kalman filter. For more information see [25].

– Evidential Belief Reasoning:
In the Dempster-Shafer evidential theory (DSET), possible measurement hypotheses obtain corresponding beliefs and plausibilities. Usually, Dempster's rule of combination is used to fuse two belief mass functions over the same frame: the combined mass of a set sums the products of belief masses over all pairs of subsets whose intersection equals that set, normalized by the total non-conflicting mass. Consequently, the DSET provides a fusion technique that can be seen as a generalization of the Bayesian fusion where probability mass functions are used as belief functions [33]. The DSET has been proposed to represent incomplete data, analogously to probability theory, by modeling the membership uncertainty of an element in a well-defined class.

– Fusion and Fuzzy Reasoning:
In contrast, the fuzzy set theory is mainly designed to represent and operate on vague data and to model the fuzzy membership of an element in an ill-defined class [25]. For this purpose, a gradual membership function is introduced that defines a fuzzy set by assigning a membership degree between 0 and 1 to each element of the (discrete) universe. The higher the degree, the more the element belongs to the fuzzy set. Fusing membership degrees can be done in the form of conjunctive and disjunctive fusion rules to obtain a fuzzy fusion output (i.e., the fusion function is bounded from above by the minimum in the former and from below by the maximum in the latter case), as further described in [40].
– Possibilistic Fusion:
Based on the fuzzy set theory, possibility theory is conceived to again represent incomplete data by modeling an uncertain membership of the elements of the universe in well-defined classes with a possibility distribution [25]. Another difference to fuzzy theory is the requirement of normalized membership functions, which can be used to define a possibility and a necessity degree. The former determines the plausibility and the latter the certainty of a subset of the universe. Thus, possibility theory differs from probability theory by the use of these two dual set functions. The fusion rules are identical to the ones for fuzzy fusion.

– Rough Set Based Fusion:
The idea of rough set theory is to approximate a data set by an upper and a lower bound set to obtain a rough representation of the original set. The lower bound set includes subsets of the data that definitely belong to the original set. The difference of the upper and lower bound sets includes subsets that can be classified neither as belonging nor as not belonging to the original set. In the approximation, the granularity of the data can be considered by choosing an appropriate size of the united subsets. Datasets can then be fused by combining the rough sets of their approximations with the aid of classical conjunctive or disjunctive fusion techniques of set theory, such as the union or intersection of sets [25].

– Random Set Theory:
The theory of random sets uses a form of generalized random variables, the random sets, which take sets as random values. In [17], random sets and their characteristics are described in detail. In addition, Goodman et al. discuss applications of random sets in single- and multi-target data fusion, which can be used for the data association problem. The handling of coarse data in general by means of random set theory is described extensively in [30].
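The evidential belief reasoning above can be sketched with a small example of Dempster's rule of combination; the hypotheses and mass values are hypothetical:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over the same
    frame of discernment; keys are frozensets of hypotheses. Masses of
    conflicting (disjoint) pairs are discarded and the rest renormalized."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc
    return {a: m / (1.0 - conflict) for a, m in combined.items()}

A, B = frozenset({"pedestrian"}), frozenset({"cyclist"})
theta = A | B                       # the whole frame (total ignorance)
m1 = {A: 0.6, theta: 0.4}           # sensor 1: fairly sure it is a pedestrian
m2 = {A: 0.5, B: 0.2, theta: 0.3}   # sensor 2: less certain
fused = dempster_combine(m1, m2)    # belief in A increases after fusion
```

The fused masses again sum to one, and the agreement of both sensors on the pedestrian hypothesis raises its combined belief above either individual mass.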
Correlated Data.
As mentioned in Section 3.1, correlated data can be fused by the safe fusion algorithm [18] or evolved algorithms as listed in [25]. Here it is assumed that the correlation is unknown and the data respectively information is represented as a point in a high-dimensional space. The safe fusion algorithm presumes a decentralized network and is based on the covariance intersection (CI) algorithm proposed by Julier and Uhlmann [22]. This means, in addition, it is assumed that the models of the dependent measurements include Gaussian noise and thus possess individual means and covariances. The algorithm determines a covariance that defines the minimal ellipsoid enclosing the intersection of (sequentially entering) covariances by making use of a transformation defined by the sequentially emerging singular value decomposition (SVD, [15]), as illustrated in Figure 5. In case of a known correlation, it can be eliminated before fusing the data [25], e.g., by the principal component analysis [5].

Fig. 5.
Safe fusion algorithm visualized (similar to the illustration in [18]) for the fusion of two covariances. In the first step, both covariances are transformed corresponding to the SVD of one covariance (blue). The SVD of the obtained manipulated second covariance (green) yields the second transformation of both covariances. The final intersection of the covariances is then covered by the target covariance (thick ellipse).
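The covariance intersection algorithm underlying safe fusion can be sketched as follows; the convex weight ω is found here by a simple grid search over the trace of the fused covariance, and the input estimates are hypothetical:

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, n_grid=101):
    """Covariance intersection: a consistent fusion of two estimates with
    unknown cross-correlation. The fused information matrix is a convex
    combination of the input information matrices; omega is chosen by a
    grid search minimizing the trace of the fused covariance."""
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        info = w * np.linalg.inv(P1) + (1.0 - w) * np.linalg.inv(P2)
        P = np.linalg.inv(info)
        if best is None or np.trace(P) < best[0]:
            x = P @ (w * np.linalg.inv(P1) @ x1
                     + (1.0 - w) * np.linalg.inv(P2) @ x2)
            best = (np.trace(P), x, P)
    return best[1], best[2]

# Two estimates of the same state with possibly correlated errors
x1, P1 = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
x2, P2 = np.array([1.0, 1.0]), np.diag([4.0, 1.0])
x_f, P_f = covariance_intersection(x1, P1, x2, P2)
```

Unlike a naive Kalman-style combination, the result remains consistent (it never underestimates the uncertainty) for any unknown correlation between the two inputs.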
Inconsistent Data.
There are several forms of inconsistencies, which must be dealt with in different ways. These are listed in the following:

– Spurious data: Sensor data can be distorted by permanent failures or slowly evolving errors of a sensor. The most common ideas for dealing with spurious data are the identification or prediction of systematic errors. After that, suspicious data can be excluded from the fusion process. A statistical approach for detecting spurious data can be found in [28]. Kumar et al. formulated an extension of the basic Bayesian fusion formula (1) by augmenting a random variable which describes whether the data is spurious or not, depending on the data and the true state.

– Out-of-sequence measurements (OOSM): There are two different aspects that have to be considered in dynamic systems: the validity and the out-of-sequence arrival of data. When a history of measurements is necessary for the fusion process, the data should be updated at appropriate intervals to guarantee a valid fusion result. In addition, an implementation of an entire sensor network can entail delayed data deliveries for the fusion process. A possible approach for these problems has been proposed by Kaugerand et al. in [24], which defines a fusion interval in which incoming data is permitted to be fused. Also in [25], some strategies for dealing with out-of-sequence data are presented.

– Conflicting data: The problem of conflicting data results in misleading conclusions, as discussed in [41]. Extensions of the DSET have been developed to especially address the problem of inconsistent data and are mentioned in [25]. In a statistical environment, in complement to the CI algorithm, the covariance union (CU) algorithm can be applied to deviating measurements for a consistent fusion under the assumption of a Gaussian uncertainty [35]. The union process consists of the computation of a new mean and covariance such that, for each input estimate, the previous covariance plus the deviation of the previous mean from the new mean is dominated by the new covariance, whose size (e.g., its determinant) is minimized.
Disparate Data.
An aspect that has not been considered yet is the fusion of data from sensors and sensor intelligence (hard information) with data from human intelligence, open source intelligence, and communications intelligence (soft information) [31]. Hard information can be represented in a mathematical framework and can therefore be used for the fusion techniques presented above. Soft information, in contrast, is produced by human sources and is therefore available in "context-dependent languages over bandwidth-limited channels" [31]. The research on modeling the uncertainty of such soft information is quite young, but there are some models for linguistic data described in [2].
3.3 Knowledge Fusion

When the abstraction of the fusion components increases along the fusion level rainbow (Fig. 4), the components can be available in the form of models that include knowledge from the observed event. Knowledge fusion itself has two different levels: it can be performed at the model or at the parameter level.
Model Fusion.
Knowledge can be represented in the form of different models. A simple example of such a model is a Gaussian distribution that includes information about the distribution of the data [5]. In addition, artificial data can be generated with it; this kind of model is called a generative model. To fuse knowledge from several models in the form of mixture models, they have to be of the same modality [5], i.e., they consist of the same base model but are trained differently or model different aspects of the underlying data. A high diversity in knowledge is preferred for a fusion, as discussed for the case of ensembles in [32]. In case of fusing models, the training can differ in the training set or the prior parameter setting.

The fused knowledge is then composed of a linear combination of the distinct models. The mixture of Gaussian distributions, e.g., results in a Gaussian mixture model (GMM). It describes the distribution of a data set that is assumed to be more complex than a unimodal Gaussian by a convex combination of a selected number of Gaussians. The concluding mixture model contains more precise knowledge about the overall distribution, and the mixture coefficients imply additional knowledge about the responsibility of a mixture component for generating a given data point.

Further examples of model fusion techniques are Convolutional Neural Networks [5] and Multiple Kernel Learning based Ensemble Methods [16]. The latter approach permits different kernels representing knowledge that are fused by a (non-)linear combination function.
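A minimal sketch of such model fusion (with hypothetical component parameters): two separately obtained Gaussians are combined into a mixture model, from which artificial data can then be generated:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two generative models of the same quantity, e.g. trained separately
# (hypothetical parameters): each is a 1-D Gaussian (mean, std).
components = [(-2.0, 1.0), (3.0, 0.5)]
weights = [0.4, 0.6]   # mixture coefficients of the fused GMM, sum to 1

def sample_gmm(n):
    """Draw n samples from the fused Gaussian mixture model: first choose
    a component by its mixture coefficient, then sample from it."""
    idx = rng.choice(len(components), size=n, p=weights)
    return np.array([rng.normal(*components[i]) for i in idx])

samples = sample_gmm(5000)
```

The mixture coefficients here play exactly the role described above: they encode the responsibility of each component for generating a given data point.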
Parameter Fusion.
Moreover, knowledge given by classification models can be available as classification rules or decisions (the outputs of the classifiers). The combination of classifiers at the component level is equivalent to the mixture models, and the combination at the output level will be presented in the next section.

Parameter fusion is a more complex fusion form and is applied at the parameter level of the classification models. In [12], a knowledge fusion technique for generative classifiers based on mixture models (CMMs) is presented. For the fusion algorithm, two or more probabilistic generative classifiers that describe the same process have to be given. Each one is divided by the number of classes into parts that are mixtures of probability densities conditioned on the class and the mixture component. Additionally, the densities have to be defined on the same input space, i.e., in particular on the same number of continuous and categorical dimensions, and are trained on different training sets.

Before applying the algorithm, conjugate hyperdistributions for all hyperparameters per component are introduced and trained via the variational inference (VI) algorithm as described in [12]. Having defined the hyperdistributions, the first step of the algorithm determines similar hyperdistributions of each component via an appropriate similarity measure. Assuming that a posterior distribution for a class is calculated by the Bayesian formula (1) and the classifiers use the same prior knowledge, the fusion rule for two similar hyperdistributions is determined by the multiplication of the two posteriors divided by the prior.
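The multiply-the-posteriors-and-divide-by-the-prior rule can be illustrated for one conjugate case (a sketch, not the full algorithm of [12], with hypothetical counts): for two Dirichlet posteriors sharing the same prior, the rule again yields a Dirichlet whose parameters follow by adding exponents:

```python
import numpy as np

# Dirichlet density ~ prod(theta_c ** (alpha_c - 1)), so multiplying two
# posteriors and dividing by the shared prior gives a Dirichlet with
# exponents (a1 - 1) + (a2 - 1) - (a0 - 1), i.e. parameters a1 + a2 - a0.
alpha0 = np.array([1.0, 1.0, 1.0])        # shared Dirichlet prior
alpha1 = alpha0 + np.array([10, 2, 3])    # posterior of classifier 1 (counts)
alpha2 = alpha0 + np.array([8, 4, 1])     # posterior of classifier 2 (counts)

alpha_fused = alpha1 + alpha2 - alpha0    # parameters of the fused posterior
```

Intuitively, the fused posterior behaves as if both classifiers' observation counts had been pooled, with the shared prior counted only once.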
The derivation of this rule can be found in [12]. Using conjugate priors, the fused posterior distribution has the same functional form as those of the given classifiers, so the estimates of the fused parameters are deducible from a sequence of mathematical transformations of the fusion rule. Setting the distributions in the continuous dimensions to multivariate Gaussians and the distributions in the categorical ones to multinomial distributions, the corresponding hyperdistributions are normal-Wishart and Dirichlet distributions, respectively. The fusion formulae for the corresponding parameters are listed in [12].

Decision Fusion
The decision fusion of multiple classifiers may consist of the direct combination of decisions or the selection of one suitable classifier for a specific input area.

– Committee/Ensemble: The decision is given as a combination of decisions from distinct models [5]. Boosting is an ensemble technique that trains the models iteratively, and a decision is calculated by, e.g., weighting the models depending on their performance, as in the adaptive boosting (AdaBoost) algorithm [13].
– Decomposition-Based Ensemble Methods:
In the case of time series data, several ensemble methods based on a lossless decomposition of the input signal can be used for forecasting [32], which can be interpreted as decision making. After decomposing the time series into a set of signals that fully represent the original signal, the predictions for the components can be combined into one prediction for the original series.
– Mixture of Experts:
Depending on the input domain, a decision is made by one (hard or soft) selected model from the mixture of experts. The selection can be implemented in form of decision trees. In Bayesian model averaging, by contrast, one model is determined for the entire input space by introducing a prior probability for each model [5].
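A minimal committee-style combination of decisions is a weighted majority vote over class labels. The sketch below uses illustrative weights standing in for, e.g., performance-dependent weights as produced by AdaBoost; the classifiers and labels are hypothetical.

```python
from collections import defaultdict

def weighted_vote(decisions, weights):
    """Combine the class-label decisions of several classifiers by weighted voting."""
    scores = defaultdict(float)
    for label, weight in zip(decisions, weights):
        scores[label] += weight
    # The label with the largest accumulated weight wins.
    return max(scores, key=scores.get)

# Three hypothetical classifiers vote on the class of the same input.
decisions = ["pedestrian", "cyclist", "pedestrian"]
weights = [0.5, 0.8, 0.4]  # e.g., derived from validation performance
fused_decision = weighted_vote(decisions, weights)
```

Here the two weaker classifiers jointly outvote the single stronger one (0.5 + 0.4 against 0.8), which illustrates why diverse committees can correct individual errors.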
An ideal definition should leave no room for interpretation or double allocation of the terminology. The goal of this paper was to provide a clear definition of fusion and its components for applicability in various contexts. After having discussed several definitions, the fusion levels have been defined considering the previously specified fusion components. The resulting fusion level rainbow (Fig. 4) includes the entire range of fusion components and levels and provides a clear definition of fusion for further applications. Finally, an overview of common fusion techniques was given for each of the previously defined levels.

In the presented definitions, decisions are often associated with predictions or classifications. The definitions are analogously valid for regression problems. The concurrent fusion of components from different levels has not been considered in the paper at hand and is left for future work. In addition, hybridizations of the fusion techniques are in common use and have not been addressed here.
Acknowledgment
This work results from the project DeCoInt, supported by the German Research Foundation (DFG) within the priority program SPP 1835: "Kooperativ interagierende Automobile", grant number SI 674/11-1.

References

1. Ackoff, R. L.: From Data to Wisdom. In: Journal of Applied Systems Analysis 16 (1989), No. 1, pp. 3–9.
2. Auger, A.; Roy, J.: Expression of Uncertainty in Linguistic Data. In: Proceedings of the 11th International Conference on Information Fusion. Cologne, Germany, Jul. 2008, pp. 1–8.
3. Bedworth, M.; O'Brien, J.: The Omnibus Model: A New Model of Data Fusion? In: IEEE Aerospace and Electronic Systems Magazine 15 (2000), No. 4, pp. 30–36.
4. Bellinger, G.; Castro, D.; Mills, A.: Data, Information, Knowledge, and Wisdom. 2004.
5. Bishop, C. M.: Pattern Recognition and Machine Learning. Springer Science & Business Media, 2006.
6. Boström, H.; Andler, S. F.; Brohede, M.; Johansson, R.; Karlsson, A.; van Laere, J.; Niklasson, L.; Nilsson, M.; Persson, A.; Ziemke, T.: On the Definition of Information Fusion as a Field of Research. 2007.
7. Challa, S.; Gulrez, T.; Chaczko, Z.; Paranesha, T. N.: Opportunistic Information Fusion: A New Paradigm for Next Generation Networked Sensing Systems. Vol. 2. Philadelphia, PA, USA, Jul. 2005, pp. 720–727.
8. Chang, N.-B.; Bai, K.: Multisensor Data Fusion and Machine Learning for Environmental Remote Sensing. CRC Press, 2018.
9. Dasarathy, B. V.: Sensor Fusion Potential Exploitation - Innovative Architectures and Illustrative Applications. In: Proceedings of the IEEE, Vol. 85, IEEE, 1997, pp. 24–38.
10. Durrant-Whyte, H. F.: Sensor Models and Multisensor Integration. In: The International Journal of Robotics Research.
11. Durrant-Whyte, H. F.: Integration, Coordination and Control of Multi-Sensor Robot Systems. Vol. 36. Springer Science & Business Media, 2012.
12. Fisch, D.; Kalkowski, E.; Sick, B.: Knowledge Fusion for Probabilistic Generative Classifiers with Data Mining Applications. In: IEEE Transactions on Knowledge and Data Engineering 26 (2014), Mar., No. 3, pp. 652–666.
13. Freund, Y.; Schapire, R. E.: A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. In: Journal of Computer and System Sciences 55 (1997), No. 1, pp. 119–139.
14. Gao, S.; Zhong, Y.; Zhang, X.; Shirinzadeh, B.: Multi-Sensor Optimal Data Fusion for INS/GPS/SAR Integrated Navigation System. In: Aerospace Science and Technology 13 (2009), No. 4-5, pp. 232–237.
15. Golub, G. H.; Reinsch, C.: Singular Value Decomposition and Least Squares Solutions. In: Numerische Mathematik 14 (1970), No. 5, pp. 403–420.
16. Gönen, M.; Alpaydın, E.: Multiple Kernel Learning Algorithms. In: Journal of Machine Learning Research 12 (2011), pp. 2211–2268.
17. Goodman, I. R.; Mahler, R. P.; Nguyen, H. T.: Mathematics of Data Fusion. Vol. 37. Springer Science & Business Media, 2013.
18. Gustafsson, F.: Statistical Sensor Fusion. Lund: Studentlitteratur, 2012.
19. Hall, D. L.; Llinas, J.: An Introduction to Multisensor Data Fusion. In: Proceedings of the IEEE 85 (1997), Jan., No. 1, pp. 6–23.
20. Hall, D. L.; Liggins, M. E.; Llinas, J.: Handbook of Multisensor Data Fusion: Theory and Practice. CRC Press, 2009.
21. Jazwinski, A. H.: Stochastic Processes and Filtering Theory. Courier Corporation, 2007.
22. Julier, S. J.; Uhlmann, J. K.: A Non-Divergent Estimation Algorithm in the Presence of Unknown Correlations. In: Proceedings of the 1997 American Control Conference, Vol. 4. Albuquerque, NM, USA, Jun. 1997, pp. 2369–2373.
23. Kalman, R. E.: A New Approach to Linear Filtering and Prediction Problems. In: Journal of Basic Engineering 82 (1960), No. 1, pp. 35–45.
24. Kaugerand, J.; Ehala, J.; Mõtus, L.; Preden, J.-S.: Time-Selective Data Fusion for In-Network Processing in Ad Hoc Wireless Sensor Networks. In: International Journal of Distributed Sensor Networks 14 (2018), No. 11.
25. Khaleghi, B.; Khamis, A.; Karray, F. O.; Razavi, S. N.: Multisensor Data Fusion: A Review of the State-of-the-Art. In: Information Fusion 14 (2013), No. 1, pp. 28–44.
26. Koch, W.: Tracking and Sensor Data Fusion. Springer, Berlin, Heidelberg, 2016.
27. Kramer, S. C.; Sorenson, H. W.: Recursive Bayesian Estimation Using Piece-Wise Constant Approximations. In: Automatica 24 (1988), No. 6, pp. 789–801.
28. Kumar, M.; Garg, D. P.; Zachery, R. A.: A Method for Judicious Fusion of Inconsistent Multiple Sensor Data. In: IEEE Sensors Journal.
29. McKendall, R.; Mintz, M.: Robust Fusion of Location Information. In: Proceedings of the 1988 IEEE International Conference on Robotics and Automation, Vol. 2. Philadelphia, PA, USA, Apr. 1988, pp. 1239–1244.
30. Nguyen, H. T.: An Introduction to Random Sets. Chapman and Hall/CRC, 2006.
31. Pravia, M. A.; Prasanth, R. K.; Arambel, P. O.; Sidner, C.; Chong, C.-Y.: Generation of a Fundamental Data Set for Hard/Soft Information Fusion. Cologne, Germany, Jun. 2008, pp. 1–8.
32. Ren, Y.; Zhang, L.; Suganthan, P. N.: Ensemble Classification and Regression - Recent Developments, Applications and Future Directions. In: IEEE Computational Intelligence Magazine 11 (2016), No. 1, pp. 41–53.
33. Shafer, G.: A Mathematical Theory of Evidence. Vol. 42. Princeton, NJ, USA: Princeton University Press, 1976.
34. Steinberg, A. N.; Bowman, C. L.: Revisions to the JDL Data Fusion Model. In: Handbook of Multisensor Data Fusion. CRC Press, 2008.
35. Uhlmann, J. K.: Covariance Consistency Methods for Fault-Tolerant Distributed Data Fusion. In: Information Fusion.
36. Varshney, P. K.: Multisensor Data Fusion. In: Electronics & Communication Engineering Journal.
37. Wald, L.: Some Terms of Reference in Data Fusion. In: IEEE Transactions on Geoscience and Remote Sensing 37 (1999), May, No. 3, pp. 1190–1193.
38. Waltz, E.; Llinas, J. et al.: Multisensor Data Fusion. Norwood, MA, USA: Artech House, Inc., 1990.
39. White, F. E.: Data Fusion Lexicon. Joint Directors of Laboratories, Washington DC; Naval Ocean Systems Center, Oct. 1986. Research report.
40. Zadeh, L. A.: Fuzzy Sets. In: Information and Control.
41. Zadeh, L. A.: Review of a Mathematical Theory of Evidence.