Multi-Sensor Data and Knowledge Fusion -- A Proposal for a Terminology Definition
Silvia Beddar-Wiesing, Maarten Bieshaar
Intelligent Embedded Systems, University of Kassel, Germany {s.beddarwiesing,mbieshaar}@uni-kassel.de
Abstract.
Fusion is a common tool for the analysis and utilization of available datasets and thus an essential part of data mining and machine learning processes. However, a clear definition of the type of fusion is not always provided due to inconsistent literature. In the following, the process of fusion is defined depending on the fusion components and the abstraction level on which the fusion occurs. The focus in the first part of the paper at hand is on a clear definition of the terminology and the development of an appropriate ontology of the fusion components and the fusion levels. In the second part, common fusion techniques are presented.
1 Introduction

The generic term fusion describes the combination of different fusion components that consist of available datasets. A more specific definition of fusion is only possible regarding the context and the purpose of the fusion, i.e., in particular the components to fuse. In general, the assumption behind the application of fusion is that fusing datasets from different sources improves the performance of the subsequent data processing.

Consider the task of tracking a pedestrian at a crossroad with the help of a set of cameras. The aim is to generate a position prediction of the pedestrian at the next timestep. For this purpose, predictions based on the images from different cameras can be combined to obtain a more robust prediction. Fusion at an earlier stage of the data processing can also improve the prediction accuracy: fusing knowledge in the form of models of pedestrian behavior, fusing information about the location and velocity of the pedestrian, or fusing images of the same pedestrian to obtain less noisy images can all contribute to a more precise prediction.

Unfortunately, inconsistent vocabulary often complicates the clear definition and description of fusion algorithms and, as a result, makes the categorization of the datasets and the selection of a corresponding fusion technique difficult. The aim of the paper at hand is to clarify the terms for possible fusion components, the terms used for the different process levels, and the definition of the total process. Several deceptive definitions are discussed, and afterwards an ordered definition of the fusion terms is proposed that combines the common definitions of the fusion components and the fusion levels. In the last section, a selected set of fusion techniques for the different fusion levels is listed.

2 Subdivision of Fusion Techniques
To define a categorization of fusion techniques, it is necessary to first define the terms used for the fusion components. The definition of the fusion components corresponds to the level of abstraction that can be determined by means of the data-information-knowledge-wisdom (DIKW) hierarchy [1]. Furthermore, the constitution of the fusion components per level restrains the possible fusion algorithms to a specific family of techniques. In the following, an extension of the DIKW hierarchy is illustrated and afterwards the fusion levels are specified.
There have been wide studies on the categorization of data. In this section, two popular concepts are examined considering the application of the data as fusion components. One common categorization has been published by Ackoff [1] and divides data into five categories that can be transferred into each other: data, information, knowledge, understanding, and wisdom.

Ackoff describes data as representations of objects or events. Processing the data to improve its usability leads to information that is used in descriptions and answers questions that begin with what, who, where, and how many. The application of data and information generates knowledge that can transform information into instructions and answers questions that begin with how. If relations and patterns in the information are identified, the context has been captured and, as a result, understanding has been reached. Understanding helps with questions that begin with why. Finally, wisdom includes the ability of judgement and the competence of dealing with the value of the data, information, and knowledge. According to the further elaboration of the definitions from
Fig. 1. The adjusted DIKW hierarchy: data, information, knowledge, and wisdom, connected by understanding relations, understanding patterns, and understanding principles.
Ackoff in [4], Bellinger et al. take the view that understanding is not a stage in the hierarchy but the condition for the transition from a lower to an upper level, as schematized in Figure 1. Thus, understanding relations between data leads to information, understanding patterns in information generates knowledge, and understanding the underlying principles of knowledge results in wisdom.

With regard to the following fusion techniques, which originate mostly from a machine learning context, the extended version of the DIKW hierarchy provides a more applicable definition of the fusion components. In Section 3, the relation between the extended DIKW hierarchy and the fusion levels will be clarified. But first, the fusion process and the levels of fusion are defined in the following.

2.2 Definition of Fusion and Fusion Level
Especially in terms of fusion definitions, the literature varies. In general, there exist two kinds of perspectives: on the one hand, there are universal definitions of fusion in the sense of a whole process; on the other hand, fusion is performed on components from different abstraction levels. Due to the inconsistent classification of fusion techniques, the comparison of literature is often difficult. In particular, the definitions of data and information fusion differ. In the following sections, the terminology of different approaches is discussed and, in the conclusion, an ontology of the terms for the fusion components and the fusion levels is proposed in the form of the fusion level rainbow as illustrated in Figure 4.
Fusion.
In [6], several differing previous definitions as well as a new definition of sensor, data, and information fusion are listed. Staying with the attempt to categorize the general definitions, it is noticeable that they focus on three different aspects in particular.

Firstly, the fusion components are a central aspect. In some literature, they are explicitly identified as observations or measurements, respectively raw data [11,29,38]. In most cases, they are referred to as data or information, or it is specified that they originate from multiple sources, but there is no detailed discussion about the representation of the data.

Secondly, some of the definitions also focus on the process behind the fusion. A popular precedent for this is one of the first definitions of fusion published by the Joint Directors of Laboratories (JDL) in the form of a data fusion lexicon [39]. The fusion process is described as the "association, correlation, and combination of data and information from single and multiple sources" [39]. It is additionally mentioned that the fusion has several levels, which are explained separately as listed in the next section. A different description of fusion in [26] interprets the process as the application of prior knowledge to a concrete realization.

Thirdly, most authors focus on the purpose of the fusion or consider the combination of the three aspects presented. The most common goal that is stated includes an improvement of the obtained information due to fusion. Further advantages are reflected in its applicability for many purposes: the resulting understanding of the observed situation [36], smoothed data and reduction in uncertainty [7], an optimal estimate of a hidden state [14], an improved performance of inference [19], improvement in prediction [34] or decision tasks [12], or, in general, richer and more useful information [37,38].
Fusion Level.
Many definitions of the levels of fusion refer to the published data fusion lexicon of the JDL [39]. Here, the level specification originally has been analysed from the point of view of a military application, but it can be extended to several fusion applications. It consists of three interrelated levels, but it is more common to use the extended JDL definition. The latter is illustrated in Figure 2 and includes the three levels from the JDL model (levels 1-3), extended by three more levels as described in [20]. The six levels are defined as follows:
Level 0:
Fusion of raw data in the form of signal refinement to obtain preliminary information about the characteristics of the observed object or situation.
Level 1:
Data is processed to specify the position or identity of an entity, or to classify characteristics of it. This is called the object refinement.
Level 2:
Relationships between objects and events considering the environment lead to a situation refinement by, e.g., analyzing relation structures.
Level 3:
Characterized as the threat refinement. In general, this can be interpreted as a risk estimation by drawing inferences or predictions for application-specific operations.
Level 4:
The performance improvement of the entire fusion process by refining its elements during a suitable type of monitoring.
Level 5:
A process of cognitive refinement via optimizing the interaction of the process with the user; it is dissociated from the previous levels.
Fig. 2. The extended fusion level definition of the JDL includes six different levels that can be passed through during an entire fusion process (sketched here based on [20]). [Diagram: local, distributed, and national sources with source preprocessing feed Level 1 processing (object refinement), Level 2 processing (situation refinement), Level 3 processing (threat refinement), and Level 4 processing (process refinement) inside the data fusion domain, which is connected to a human/computer interface and a database management system with support and fusion databases.]
For a more differentiated categorization of fusion processes, the goal of the paper at hand is to combine the characterization of the fusion components from Section 2.1 and the fusion levels (and eventually the constitution of the fusion components). From this point of view, however, the level definitions of the JDL are too broad, and the type of the fusion components is of interest only in the first four levels.

In contrast to the JDL model, the model proposed by Dasarathy [9], where fusion processes are divided into five levels, is more associated with the DIKW hierarchy from Section 2.1. The reason is that the choice of division is directly related to the fusion components and the fusion emissions. Interpreting the knowledge as a feature of the observed event and wisdom as decisions, which can exist, e.g., as classification decisions, evaluations of a regression model, or predictions, the following model is directly based on the DIKW hierarchy.

In addition, Varshney [36] added a sixth level, so that all possible ascending (relating to the components presented in 2.1) or equal pairs of inputs and outputs of a fusion are considered. The all-encompassing extension of Dasarathy's model is illustrated in [20] and will be briefly listed in the following. For this representation, the shortcuts DAI (input: data), DAO (output: data), FEI (input: feature), FEO (output: feature), DEI (input: decision), and DEO (output: decision) are used.
All pairs of inputs and outputs during a fusion process can be interpreted as follows:

1. DAI/DAO: signal detection (fusion of raw datasets generates data with less noise; a sharper signal can, e.g., be used to detect signal sections)
2. DAI/FEO: feature extraction (fusion of datasets generates data that can be used to extract relevant features)
3. DAI/DEO: Gestalt-based object characterization (fusion of raw datasets can lead to a better characterization of an object or a decision)
4. FEI/DAO: model-based detection and feature extraction (fusion of features leads to refined features from which data can be generated, e.g., the fusion of Gaussians and subsequent sampling)
5. FEI/FEO: feature refinement (fusion of different models that describe the same feature generates a more confident feature model)
6. FEI/DEO: feature-based object characterization (feature level fusion refines the description of an object or a decision)
7. DEI/DAO: model-based detection and estimation (decision fusion sharpens the decision model from which data can be generated, e.g., a mixture of generative experts and subsequent sampling with the Gibbs sampler [5])
8. DEI/FEO: model-based feature extraction (decision fusion can lead to a decision from which a feature can be derived)
9. DEI/DEO: object/decision refinement (e.g., a mixture of experts leads to a better decision)

In fact, however, the combination of data as input and decision as output does not often occur in common tasks. Additionally, cases 4, 7, and 8 are feature respectively decision level fusions from which samples can be drawn or conclusions derived afterwards. Depending on the form of the feature representation, samples can be generated by common sampling methods as described in [5]. Furthermore, there is a set of models that apply an additional backward connection from the top level to the data acquisition process, as in the OODA (Observe, Orient, Decide, Act) loop [3].
As a result, the generation and collection process of input data can be adapted with regard to the conclusive decision evaluation.

One last model has to be mentioned here that is also common, but especially used in image fusion applications, and is similar to the definitions described before. Here, the fusion is divided into the pixel, feature, and decision levels [8]. Analogous to data, feature, and decision fusion, this model is applied to image processing, which is a subset of sensor fusion. An illustration is given in Figure 3.
Fig. 3.
For image data, the data level fusion is known as pixel level fusion and is used for image processing (figure inspired by the depiction from [8]).
In order to combine the delineated fusion level terms, the following ordered ontology describes the relations between the terms:

◦ (Pixel, Sensor) Data Fusion ◦ Information Fusion ◦ Knowledge/Feature Fusion ◦ Decision Fusion

It is illustrated in Figure 4. At the bottom of the entire fusion process, raw data given by sensors or other sources are fused while understanding characteristics and relations of the input. The refined data, with a low information loss compared to the original data, provides an updated representation for further applications. The obtained information is a fundamental basis for the feature extraction on the next level, which proposes an underlying model for the data. This can provide knowledge that can be used to understand patterns in the data and thus create awareness of underlying principles in the source data. On top of the fusion process, the aim is to gain wisdom in the form of performance improvement in decision making and thus in the choice of action. Depending on the impact of the action, the entire fusion process can be adapted at the different stages.
Fig. 4. The Fusion Level Rainbow. Based on the extended Dasarathy model, the fusion within a component leads to a refined element of the corresponding component or to a more abstract emission. [Diagram: images/sensory data/data (pixel, sensor, and data fusion), information (information fusion), knowledge (knowledge/feature fusion), awareness, and wisdom (decision fusion), with abstraction and understanding increasing upwards and an adjustment of the data acquisition leading back down.]
3 Fusion Techniques of the Particular Levels
In this section, selected techniques from the different fusion levels are presented. At the beginning of Section 3.1, a fine structure of sensor fusion techniques is outlined; afterwards, statistical fusion methods are described in more detail. To give a short overview of algorithms used in data and information fusion, the focus in Section 3.2 is on the constitutions of data that require different fusion processes. At the end, in Sections 3.3 and 3.4, different forms of knowledge and decision fusion techniques are characterized.

3.1 Sensor Fusion
In the fusion level rainbow (Fig. 4), sensor fusion is a subset of the lowest level of fusion, the data fusion. Sensor data is a special case of data that can be represented as a data point in a high-dimensional space and is produced by (multiple) sensors. Sensor fusion techniques can again be categorized according to the information flow between the available sensor network and the different sensor configurations.

Firstly, the fusion can be implemented centralized or decentralized [18]. In the centralized architecture, the measurements of all sensors are available during the fusion process, so a batch method is used. In contrast, in decentralized fusion, the measurements of each sensor are fused within a separate fusion model. During the global fusion process, only the model information of each sensor is available and processed sequentially. Decentralized fusion is often preferred since the fusion process is considered to be more robust and reliable [14].

Secondly, sensor fusion can furthermore be divided into three cases depending on the sensor configuration, as listed in [10]:

Competitive Sensor Fusion. (homogeneous) Either data from sensors of the same modality are fused, or the sensors can be transformed to the same baseline previously and are fused afterwards. Data fusion of competitive sensors can be used to reduce noise respectively uncertainty. In connection with the initial example, competitive sensors in the form of cameras used for the pedestrian tracking produce images of the same person at the same time. By fusing the images, which may contain a degree of uncertainty, the resulting images are less noisy and more applicable for the tracking task.
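As a minimal sketch of competitive fusion at the data level (with hypothetical measurement values), redundant readings of the same quantity can be combined by inverse-variance weighting, which never increases the uncertainty beyond that of the best sensor:

```python
import numpy as np

def fuse_competitive(measurements, variances):
    """Inverse-variance weighted fusion of redundant (competitive) sensor
    readings of the same quantity; the fused variance is at most as large
    as the smallest input variance."""
    measurements = np.asarray(measurements, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / weights.sum()
    fused_mean = fused_var * (weights * measurements).sum()
    return fused_mean, fused_var

# Three noisy position readings of the same pedestrian (hypothetical values)
mean, var = fuse_competitive([10.2, 9.8, 10.5], [0.5, 0.5, 1.0])  # -> 10.1, 0.2
```

Here, the fused variance of 0.2 is smaller than any single sensor's variance, illustrating the noise reduction achieved by competitive fusion.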
Complementary Sensor Fusion. (heterogeneous) Sensors observe the same event, and their fusion generates a complemented image of the observation. This means the sensors can measure different and disjunct parts of the same event, and the combination leads to a complete characterization of it. A complementary set of cameras, e.g., can provide an extended picture of a crossroad in contrast to the image of only one camera, which simplifies the subsequent tracking of a pedestrian.

Cooperative Sensor Fusion. A sensor is configured depending on the information from other sensors to generate more useful information. This form of sensor network configuration includes some sort of temporal delay and dependency on a decision of an expert. The tracking task can require the possibility to adapt the camera angles after observing a certain behaviour of the pedestrian.

Sensor fusion techniques also differ in the assumptions about the system under consideration. For processing sensor data with uncertainties, statistical sensor fusion techniques for static and dynamic systems are presented by Fredrik Gustafsson in [18]:
Statistical Sensor Fusion.
The main idea behind statistical models is that the sensors are noisy and the "true" characterization of the event is given by a state vector x. Furthermore, it is expected that, by fusing different sensors, the state vector gets more precise. The assumption here is that N given observations y_n ∈ R^{n_y}, n ∈ [N], stacked in y ∈ R^{N·n_y}, can be described by a model y = h(x) + e. Gustafsson discusses linear models h(x) = Hx with a stacked factor matrix H ∈ R^{N·n_y × n_x}, as well as nonlinear models that relate the observations to the hidden state vector x ∈ R^{n_x}. Most of the time, the stacked error e ∈ R^{N·n_y} is assumed to be Gaussian, but the non-Gaussian case is mentioned too.

In the static case, the hidden state x ∈ R^{n_x} is time-invariant, so that the model describes, e.g., the observation of the same event by means of several sensors or by the same sensor at different timestamps. In the dynamic case, the state x_k ∈ R^{n_x} additionally varies with time according to a sequential update model x_{k+1} = f(x_k) + v_k, with a linear or nonlinear mapping f: R^{n_x} → R^{n_x} and Gaussian or non-Gaussian noise v_k ∈ R^{n_x}.

For providing a new state estimation, Gustafsson gives an insight into a variety of least squares approaches that can be applied when the measurements are independent. In case of correlated measurements in the static case, Gustafsson presents the safe fusion algorithm, which will be discussed in Section 3.2. For dynamic systems, Gustafsson lists popular filtering algorithms, such as several variations of the Kalman filter [23], that infer the state from the observations using dynamic linear or nonlinear models. Numerical methods approximating the nonlinear filter models are also discussed. In grid-based methods, e.g., parts of the calculation during the filtering process are approximated by discretising the state space [21] or replacing integrals by finite sums [27].
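The static linear Gaussian case can be sketched as a weighted least squares problem; the snippet below (with hypothetical state and noise values) fuses two direct observations of the same 2-D state, which under Gaussian noise coincides with the maximum likelihood estimate:

```python
import numpy as np

# Two sensors observe the same 2-D state directly: y = H x + e,
# with stacked H and block-diagonal Gaussian noise covariance R.
x_true = np.array([1.0, 2.0])
H = np.vstack([np.eye(2), np.eye(2)])   # stacked factor matrix
R = np.diag([0.4, 0.4, 0.1, 0.1])       # second sensor is more accurate
rng = np.random.default_rng(0)
y = H @ x_true + rng.multivariate_normal(np.zeros(4), R)

# Weighted least squares estimate (maximum likelihood under Gaussian noise)
W = np.linalg.inv(R)
P = np.linalg.inv(H.T @ W @ H)          # fused covariance, smaller than both sensors'
x_hat = P @ H.T @ W @ y                 # fused state estimate
```

The fused covariance P is smaller than that of either sensor alone, reflecting the expectation that the state vector gets more precise through fusion.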
3.2 Data and Information Fusion

In this section, the cases of data respectively information fusion in the two lowest levels of the fusion level rainbow (Fig. 4) are considered in general. When selecting a data fusion technique, the first aspect to focus on is the constitution of the data. In [25], a detailed overview of fusion techniques for the following types of data quality is given:
Imperfect Data.
The case of imperfect data occurs in nearly all applications of data fusion. To deal with a certain degree of imperfection, the following algorithms, which are described together with former extensions in [25], can be used:

– Probabilistic Fusion:
The main idea is to represent the imperfection of data by means of uncertainty in the form of probability distributions, fusing them according to the Bayesian fusion formula from [5]:

p(X | Z) = p(Z | X) · p(X) / p(Z)    (1)

It determines the posterior probability distribution of the (real, underlying) state X depending on the observations Z = {z_1, ..., z_t}, the likelihood p(Z | X), and the (chosen) prior distribution p(X). Analogously, the Bayesian fusion can be formulated for the dynamic case by using observations up to time t [18]. Based on Bayes' theorem, several fusion techniques have been formulated that are used in algorithms for numerous applications, such as the Kalman filter. For more information see [25].

– Evidential Belief Reasoning:
In the Dempster-Shafer evidential theory (DSET), possible measurement hypotheses obtain corresponding beliefs and plausibilities. Usually, Dempster's rule of combination is used to fuse two belief mass functions over the same frame: the combined mass of a set sums the products of belief masses over all pairs of subsets whose intersection equals that set, normalized by the total non-conflicting mass. Consequently, the DSET provides a fusion technique that can be seen as a generalization of the Bayesian fusion where probability mass functions are used as belief functions [33]. The DSET has been proposed to represent incomplete data, analogously to probability theory, by modeling the membership uncertainty of an element in a well-defined class.

– Fusion and Fuzzy Reasoning:
In contrast, the fuzzy set theory is mainly designed to represent and operate on vague data and to model the fuzzy membership of an element in an ill-defined class [25]. For this purpose, a gradual membership function is introduced that defines a fuzzy set by assigning a membership degree between 0 and 1 to each element of the (discrete) universe. The higher the degree, the more the element belongs to the fuzzy set. Fusing membership degrees can be done in the form of conjunctive and disjunctive fusion rules to obtain a fuzzy fusion output (i.e., the fusion function is bounded from above by the minimum in the former and from below by the maximum in the latter case), as further described in [40].
– Possibilistic Fusion:
Based on the fuzzy set theory, possibility theory is conceived to again represent incomplete data by modeling an uncertain membership of the elements of the universe in well-defined classes with a possibility distribution [25]. Another difference to fuzzy theory is the requirement of normalized membership functions, which can be used to define a possibility and a necessity degree. The former determines the plausibility and the latter the certainty of a subset of the universe. Thus, possibility theory differs from probability theory by the use of these two dual set functions. The fusion rules are identical to the ones for fuzzy fusion.

– Rough Set Based Fusion:
The idea of rough set theory is to approximate a data set by an upper and a lower bound set to obtain a rough representation of the original set. The lower bound set includes subsets of the data that definitely belong to the original set. The difference of the upper and lower bound sets includes subsets that can be classified neither as belonging nor as not belonging to the original set. In the approximation, the granularity of the data can be considered by choosing an appropriate size of the united subsets. Datasets can then be fused by combining the rough sets of their approximations with the aid of classical conjunctive or disjunctive fusion techniques of set theory, such as the union or intersection of sets [25].

– Random Set Theory:
The theory of random sets uses a form of generalized random variables, the random sets, which take sets as random values. In [17], random sets and their characteristics are described in detail. In addition, Goodman et al. discuss applications of random sets in single- and multi-target data fusion, which can be used for the data association problem. The handling of coarse data in general by means of random set theory is described extensively in [30].
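The evidential belief reasoning above can be sketched with a small example of Dempster's rule of combination; the hypotheses and mass values are hypothetical:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions over the same
    frame of discernment; keys are frozensets of hypotheses. Masses of
    conflicting (disjoint) pairs are discarded and the rest renormalized."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc
    return {a: m / (1.0 - conflict) for a, m in combined.items()}

A, B = frozenset({"pedestrian"}), frozenset({"cyclist"})
theta = A | B                       # the whole frame (total ignorance)
m1 = {A: 0.6, theta: 0.4}           # sensor 1: fairly sure it is a pedestrian
m2 = {A: 0.5, B: 0.2, theta: 0.3}   # sensor 2: less certain
fused = dempster_combine(m1, m2)    # belief in A increases after fusion
```

The fused masses again sum to one, and the agreement of both sensors on the pedestrian hypothesis raises its combined belief above either individual mass.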
Correlated Data.
As mentioned in Section 3.1, correlated data can be fused by the safe fusion algorithm [18] or evolved algorithms as listed in [25]. Here it is assumed that the correlation is unknown and the data respectively information is represented as a point in a high-dimensional space. The safe fusion algorithm presumes a decentralized network and is based on the covariance intersection (CI) algorithm proposed by Julier and Uhlmann [22]. This means, in addition, it is assumed that the models of the dependent measurements include Gaussian noise and thus possess individual means and covariances. The algorithm determines a covariance that defines the minimal ellipsoid enclosing the intersection of (sequentially entering) covariances by making use of a transformation defined by the sequentially emerging singular value decomposition (SVD, [15]), as illustrated in Figure 5. In case of a known correlation, it can be eliminated before fusing the data [25], e.g., by the principal component analysis [5].

Fig. 5.
Safe fusion algorithm visualized (similar to the illustration in [18]) for the fusion of two covariances. In the first step, both covariances are transformed corresponding to the SVD of one covariance (blue). The SVD of the obtained manipulated second covariance (green) yields the second transformation of both covariances. The final intersection of the covariances is then covered by the target covariance (thick ellipse).
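The covariance intersection algorithm underlying safe fusion can be sketched as follows; the convex weight ω is found here by a simple grid search over the trace of the fused covariance, and the input estimates are hypothetical:

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, n_grid=101):
    """Covariance intersection: a consistent fusion of two estimates with
    unknown cross-correlation. The fused information matrix is a convex
    combination of the input information matrices; omega is chosen by a
    grid search minimizing the trace of the fused covariance."""
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        info = w * np.linalg.inv(P1) + (1.0 - w) * np.linalg.inv(P2)
        P = np.linalg.inv(info)
        if best is None or np.trace(P) < best[0]:
            x = P @ (w * np.linalg.inv(P1) @ x1
                     + (1.0 - w) * np.linalg.inv(P2) @ x2)
            best = (np.trace(P), x, P)
    return best[1], best[2]

# Two estimates of the same state with possibly correlated errors
x1, P1 = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
x2, P2 = np.array([1.0, 1.0]), np.diag([4.0, 1.0])
x_f, P_f = covariance_intersection(x1, P1, x2, P2)
```

Unlike a naive Kalman-style combination, the result remains consistent (it never underestimates the uncertainty) for any unknown correlation between the two inputs.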
Inconsistent Data.
There are several forms of inconsistencies, which must be dealt with in different ways. These are listed in the following:

– Spurious data: Sensor data can be distorted by permanent failures or slowly evolving errors of a sensor. The most common ideas for dealing with spurious data are the identification or prediction of systematic errors. After that, suspicious data can be excluded from the fusion process. A statistical approach for detecting spurious data can be found in [28]. Kumar et al. formulated an extension of the basic Bayesian fusion formula (1) by augmenting a random variable which describes whether the data is spurious or not, depending on the data and the true state.

– Out-of-sequence measurements (OOSM): There are two different aspects that have to be considered in dynamic systems: the validity and the out-of-sequence arrival of data. When a history of measurements is necessary for the fusion process, the data should be updated at appropriate intervals to guarantee a valid fusion result. In addition, an implementation of an entire sensor network can entail delayed data deliveries for the fusion process. A possible approach for these problems has been proposed by Kaugerand et al. in [24], which defines a fusion interval in which incoming data is permitted to be fused. Also in [25], some strategies for dealing with out-of-sequence data are presented.

– Conflicting data: The problem of conflicting data results in misleading conclusions, as discussed in [41]. Extensions of the DSET have been developed to especially address the problem of inconsistent data and are mentioned in [25]. In a statistical environment, in complement to the CI algorithm, the covariance union (CU) algorithm can be applied to deviating measurements for a consistent fusion under the assumption of a Gaussian uncertainty [35]. The union process consists of the computation of a new mean and covariance such that, for each input estimate, the previous covariance plus the deviation of the previous mean from the new mean is dominated by the new covariance, whose size (e.g., its determinant) is minimized.
Disparate Data.
An aspect that has not been considered yet is the fusion of data from sensors and sensor intelligence (hard information) with data from human intelligence, open source intelligence, and communications intelligence (soft information) [31]. Hard information can be represented in a mathematical framework and can therefore be used for the fusion techniques presented above. Soft information, in contrast, is produced by human sources and is therefore available in "context-dependent languages over bandwidth-limited channels" [31]. The research on modeling the uncertainty of such soft information is quite young, but there are some models for linguistic data described in [2].
3.3 Knowledge Fusion

When the abstraction of the fusion components increases along the fusion level rainbow (Fig. 4), the components can be available in the form of models that include knowledge from the observed event. Knowledge fusion itself has two different levels: it can be performed at the model or at the parameter level.
Model Fusion.
Knowledge can be represented in the form of different models. A simple example of such a model is a Gaussian distribution that includes information about the distribution of the data [5]. In addition, artificial data can be generated with it; this kind of model is called a generative model. To fuse knowledge from several models in the form of mixture models, they have to be of the same modality [5], i.e., they consist of the same base model but are trained differently or model different aspects of the underlying data. A high diversity in knowledge is preferred for a fusion, as discussed for the case of ensembles in [32]. In case of fusing models, the training can differ in the training set or the prior parameter setting.

The fused knowledge is then composed of a linear combination of the distinct models. The mixture of Gaussian distributions, e.g., results in a Gaussian mixture model (GMM). It describes the distribution of a data set that is assumed to be more complex than a unimodal Gaussian by a convex combination of a selected number of Gaussians. The concluding mixture model contains more precise knowledge about the overall distribution, and the mixture coefficients imply additional knowledge about the responsibility of a mixture component for generating a given data point.

Further examples of model fusion techniques are Convolutional Neural Networks [5] and Multiple Kernel Learning based Ensemble Methods [16]. The latter approach permits different kernels representing knowledge that are fused by a (non-)linear combination function.
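A minimal sketch of such model fusion (with hypothetical component parameters): two separately obtained Gaussians are combined into a mixture model, from which artificial data can then be generated:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two generative models of the same quantity, e.g. trained separately
# (hypothetical parameters): each is a 1-D Gaussian (mean, std).
components = [(-2.0, 1.0), (3.0, 0.5)]
weights = [0.4, 0.6]   # mixture coefficients of the fused GMM, sum to 1

def sample_gmm(n):
    """Draw n samples from the fused Gaussian mixture model: first choose
    a component by its mixture coefficient, then sample from it."""
    idx = rng.choice(len(components), size=n, p=weights)
    return np.array([rng.normal(*components[i]) for i in idx])

samples = sample_gmm(5000)
```

The mixture coefficients here play exactly the role described above: they encode the responsibility of each component for generating a given data point.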
Parameter Fusion.
Moreover, knowledge given by classification models can be available as classification rules or decisions (the outputs of the classifiers). The combination of classifiers at the component level is equivalent to the mixture models, and the combination at the output level will be presented in the next section.

Parameter fusion is a more complex fusion form and is applied at the parameter level of the classification models. In [12], a knowledge fusion technique for generative classifiers based on mixture models (CMMs) is presented. For the fusion algorithm, two or more probabilistic generative classifiers that describe the same process have to be given. Each one is divided by the number of classes into parts that are mixtures of probability densities conditioned on the class and the mixture component. Additionally, the densities have to be defined on the same input space, i.e., in particular on the same number of continuous and categorical dimensions, and are trained on different training sets.

Before applying the algorithm, conjugate hyperdistributions for all hyperparameters per component are introduced and trained via the variational inference (VI) algorithm as described in [12]. Having defined the hyperdistributions, the first step of the algorithm determines similar hyperdistributions of each component via an appropriate similarity measure. Assuming that a posterior distribution for a class is calculated by the Bayesian formula (1) and the classifiers use the same prior knowledge, the fusion rule for two similar hyperdistributions is determined by the multiplication of the two posteriors divided by the prior.
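The multiply-the-posteriors-and-divide-by-the-prior rule can be illustrated for one conjugate case (a sketch, not the full algorithm of [12], with hypothetical counts): for two Dirichlet posteriors sharing the same prior, the rule again yields a Dirichlet whose parameters follow by adding exponents:

```python
import numpy as np

# Dirichlet density ~ prod(theta_c ** (alpha_c - 1)), so multiplying two
# posteriors and dividing by the shared prior gives a Dirichlet with
# exponents (a1 - 1) + (a2 - 1) - (a0 - 1), i.e. parameters a1 + a2 - a0.
alpha0 = np.array([1.0, 1.0, 1.0])        # shared Dirichlet prior
alpha1 = alpha0 + np.array([10, 2, 3])    # posterior of classifier 1 (counts)
alpha2 = alpha0 + np.array([8, 4, 1])     # posterior of classifier 2 (counts)

alpha_fused = alpha1 + alpha2 - alpha0    # parameters of the fused posterior
```

Intuitively, the fused posterior behaves as if both classifiers' observation counts had been pooled, with the shared prior counted only once.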
The derivation of this rule can be found in [12]. Using conjugate priors, the fused posterior distribution has the same functional form as those of the given classifiers, so the estimates of the fused parameters are deducible from a sequence of mathematical transformations of the fusion rule. Setting the distributions in the continuous dimensions to multivariate Gaussians and the distributions in the categorical ones to multinomial distributions, the corresponding hyperdistributions are normal-Wishart and Dirichlet distributions, respectively. The fusion formulae for the corresponding parameters are listed in [12].

Decision Fusion
The decision fusion of multiple classifiers may consist of the direct combination of decisions or the selection of one suitable classifier for a specific input area.

– Committee/Ensemble: The decision is given as a combination of decisions from distinct models [5]. Boosting is an ensemble technique that trains the models iteratively, and a decision is calculated by, e.g., weighting the models depending on their performance, as in the adaptive boosting (AdaBoost) algorithm [13].
– Decomposition-Based Ensemble Methods:
In the case of time series data, several ensemble methods based on a lossless decomposition of the input signal can be used for forecasting [32], which can be interpreted as decision making. After decomposing the time series into a set of signals that fully represent the original signal, the predictions for the components can be combined into one prediction for the original series.
– Mixture of Experts:
Depending on the input domain, a decision is made by one (hard or soft) selected model from the mixture of experts. The selection can be implemented in form of decision trees. In Bayesian model averaging, by contrast, one model is determined for the entire input space by introducing a prior probability for each model [5].
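A minimal committee-style combination of decisions is a weighted majority vote over class labels. The sketch below uses illustrative weights standing in for, e.g., performance-dependent weights as produced by AdaBoost; the classifiers and labels are hypothetical.

```python
from collections import defaultdict

def weighted_vote(decisions, weights):
    """Combine the class-label decisions of several classifiers by weighted voting."""
    scores = defaultdict(float)
    for label, weight in zip(decisions, weights):
        scores[label] += weight
    # The label with the largest accumulated weight wins.
    return max(scores, key=scores.get)

# Three hypothetical classifiers vote on the class of the same input.
decisions = ["pedestrian", "cyclist", "pedestrian"]
weights = [0.5, 0.8, 0.4]  # e.g., derived from validation performance
fused_decision = weighted_vote(decisions, weights)
```

Here the two weaker classifiers jointly outvote the single stronger one (0.5 + 0.4 against 0.8), which illustrates why diverse committees can correct individual errors.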
An ideal definition should leave no room for interpretation or double allocation of the terminology. The goal of this paper was to provide a clear definition of fusion and its components for applicability in various contexts. After having discussed several definitions, the fusion levels have been defined considering the previously specified fusion components. The resulting fusion level rainbow (Fig. 4) includes the entire range of fusion components and levels and provides a clear definition of fusion for further applications. Finally, an overview of common fusion techniques was given for each of the previously defined levels.

In the presented definitions, decisions are often associated with predictions or classifications. The definitions are analogously valid for regression problems. The concurrent fusion of components from different levels has not been considered in the paper at hand and is left for future work. In addition, hybridizations of the fusion techniques are in common use and have not been addressed here.
Acknowledgment
This work results from the project DeCoInt, supported by the German Research Foundation (DFG) within the priority program SPP 1835: "Kooperativ interagierende Automobile", grant number SI 674/11-1.

References

1. Ackoff, R. L.: From Data to Wisdom. In: Journal of Applied Systems Analysis 16 (1989), No. 1, pp. 3–9.
2. Auger, A.; Roy, J.: Expression of Uncertainty in Linguistic Data. In: Proceedings of the 11th International Conference on Information Fusion. Cologne, Germany, Jul. 2008, pp. 1–8.
3. Bedworth, M.; O'Brien, J.: The Omnibus Model: A New Model of Data Fusion? In: IEEE Aerospace and Electronic Systems Magazine 15 (2000), No. 4, pp. 30–36.
4. Bellinger, G.; Castro, D.; Mills, A.: Data, Information, Knowledge, and Wisdom. 2004.
5. Bishop, C. M.: Pattern Recognition and Machine Learning. Springer Science & Business Media, 2006.
6. Boström, H.; Andler, S. F.; Brohede, M.; Johansson, R.; Karlsson, A.; van Laere, J.; Niklasson, L.; Nilsson, M.; Persson, A.; Ziemke, T.: On the Definition of Information Fusion as a Field of Research. 2007.
7. Challa, S.; Gulrez, T.; Chaczko, Z.; Paranesha, T. N.: Opportunistic Information Fusion: A New Paradigm for Next Generation Networked Sensing Systems. Vol. 2. Philadelphia, PA, USA, Jul. 2005, pp. 720–727.
8. Chang, N.-B.; Bai, K.: Multisensor Data Fusion and Machine Learning for Environmental Remote Sensing. CRC Press, 2018.
9. Dasarathy, B. V.: Sensor Fusion Potential Exploitation - Innovative Architectures and Illustrative Applications. In: Proceedings of the IEEE, Vol. 85, IEEE, 1997, pp. 24–38.
10. Durrant-Whyte, H. F.: Sensor Models and Multisensor Integration. In: The International Journal of Robotics Research.
11. Durrant-Whyte, H. F.: Integration, Coordination and Control of Multi-Sensor Robot Systems. Vol. 36. Springer Science & Business Media, 2012.
12. Fisch, D.; Kalkowski, E.; Sick, B.: Knowledge Fusion for Probabilistic Generative Classifiers with Data Mining Applications. In: IEEE Transactions on Knowledge and Data Engineering 26 (2014), Mar., No. 3, pp. 652–666.
13. Freund, Y.; Schapire, R. E.: A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. In: Journal of Computer and System Sciences 55 (1997), No. 1, pp. 119–139.
14. Gao, S.; Zhong, Y.; Zhang, X.; Shirinzadeh, B.: Multi-Sensor Optimal Data Fusion for INS/GPS/SAR Integrated Navigation System. In: Aerospace Science and Technology 13 (2009), No. 4-5, pp. 232–237.
15. Golub, G. H.; Reinsch, C.: Singular Value Decomposition and Least Squares Solutions. In: Numerische Mathematik 14 (1970), No. 5, pp. 403–420.
16. Gönen, M.; Alpaydın, E.: Multiple Kernel Learning Algorithms. In: Journal of Machine Learning Research 12 (2011), pp. 2211–2268.
17. Goodman, I. R.; Mahler, R. P.; Nguyen, H. T.: Mathematics of Data Fusion. Vol. 37. Springer Science & Business Media, 2013.
18. Gustafsson, F.: Statistical Sensor Fusion. Lund: Studentlitteratur, 2012.
19. Hall, D. L.; Llinas, J.: An Introduction to Multisensor Data Fusion. In: Proceedings of the IEEE 85 (1997), Jan., No. 1, pp. 6–23.
20. Hall, D. L.; Liggins, M. E.; Llinas, J.: Handbook of Multisensor Data Fusion: Theory and Practice. CRC Press, 2009.
21. Jazwinski, A. H.: Stochastic Processes and Filtering Theory. Courier Corporation, 2007.
22. Julier, S. J.; Uhlmann, J. K.: A Non-Divergent Estimation Algorithm in the Presence of Unknown Correlations. In: Proceedings of the 1997 American Control Conference, Vol. 4. Albuquerque, NM, USA, Jun. 1997, pp. 2369–2373.
23. Kalman, R. E.: A New Approach to Linear Filtering and Prediction Problems. In: Journal of Basic Engineering 82 (1960), No. 1, pp. 35–45.
24. Kaugerand, J.; Ehala, J.; Mõtus, L.; Preden, J.-S.: Time-Selective Data Fusion for In-Network Processing in Ad Hoc Wireless Sensor Networks. In: International Journal of Distributed Sensor Networks 14 (2018), No. 11.
25. Khaleghi, B.; Khamis, A.; Karray, F. O.; Razavi, S. N.: Multisensor Data Fusion: A Review of the State-of-the-Art. In: Information Fusion 14 (2013), No. 1, pp. 28–44.
26. Koch, W.: Tracking and Sensor Data Fusion. Springer, Berlin, Heidelberg, 2016.
27. Kramer, S. C.; Sorenson, H. W.: Recursive Bayesian Estimation Using Piece-Wise Constant Approximations. In: Automatica 24 (1988), No. 6, pp. 789–801.
28. Kumar, M.; Garg, D. P.; Zachery, R. A.: A Method for Judicious Fusion of Inconsistent Multiple Sensor Data. In: IEEE Sensors Journal.
29. McKendall, R.; Mintz, M.: Robust Fusion of Location Information. In: Proceedings of the 1988 IEEE International Conference on Robotics and Automation, Vol. 2. Philadelphia, PA, USA, Apr. 1988, pp. 1239–1244.
30. Nguyen, H. T.: An Introduction to Random Sets. Chapman and Hall/CRC, 2006.
31. Pravia, M. A.; Prasanth, R. K.; Arambel, P. O.; Sidner, C.; Chong, C.-Y.: Generation of a Fundamental Data Set for Hard/Soft Information Fusion. Cologne, Germany, Jun. 2008, pp. 1–8.
32. Ren, Y.; Zhang, L.; Suganthan, P. N.: Ensemble Classification and Regression - Recent Developments, Applications and Future Directions. In: IEEE Computational Intelligence Magazine 11 (2016), No. 1, pp. 41–53.
33. Shafer, G.: A Mathematical Theory of Evidence. Vol. 42. Princeton, NJ, USA: Princeton University Press, 1976.
34. Steinberg, A. N.; Bowman, C. L.: Revisions to the JDL Data Fusion Model. In: Handbook of Multisensor Data Fusion. CRC Press, 2008.
35. Uhlmann, J. K.: Covariance Consistency Methods for Fault-Tolerant Distributed Data Fusion. In: Information Fusion.
36. Varshney, P. K.: Multisensor Data Fusion. In: Electronics & Communication Engineering Journal.
37. Wald, L.: Some Terms of Reference in Data Fusion. In: IEEE Transactions on Geoscience and Remote Sensing 37 (1999), May, No. 3, pp. 1190–1193.
38. Waltz, E.; Llinas, J. et al.: Multisensor Data Fusion. Norwood, MA, USA: Artech House, Inc., 1990.
39. White, F. E.: Data Fusion Lexicon. Joint Directors of Laboratories, Washington DC; Naval Ocean Systems Center, Oct. 1986. Research report.
40. Zadeh, L. A.: Fuzzy Sets. In: Information and Control.
41. Zadeh, L. A.: Review of a Mathematical Theory of Evidence.