IEEE Transactions on Signal Processing | 2021

Geometric Multimodal Learning Based on Local Signal Expansion for Joint Diagonalization

 
 
 
 
 

Abstract


Multimodal learning, also known as multi-view learning, data integration, or data fusion, is an emerging field in signal processing, machine learning, and pattern recognition. It aims to build models from several related and complementary modalities in order to improve the generalization performance of a predictive model. Multimodal manifold learning extends spectral or diffusion geometry-aware data analysis to multiple modalities, which can be done by defining an undirected graph Laplacian matrix in each modality. However, finding a common eigenbasis of multiple Laplacians is not always a relevant solution for multimodal manifold learning problems: in many real-world problems, the Laplacians of the modalities are not simultaneously diagonalizable because of major differences between the modalities. In this paper, we propose a multimodal manifold learning approach based on the intrinsic local tangent spaces of the underlying data manifolds, in order to discover the local geometrical structure around matching and mismatching samples in different modalities in sparse diagonalization problems. The approach searches for an approximate common eigenbasis of the Laplacian matrices by expanding the signal carried by the limited available information about matching and mismatching samples of different modalities to their on-manifold neighbors. Experiments on synthetic and real-world datasets in supervised, unsupervised, and semi-supervised problems demonstrate the superiority of the proposed approach over existing state-of-the-art related methods.
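
As a small illustration of why an exact common eigenbasis may not exist, recall that two symmetric matrices are simultaneously diagonalizable if and only if they commute. The sketch below is not the method of the paper; it only builds unnormalized Laplacians for two toy graphs on the same four nodes (a path and a star, both chosen here as assumptions to stand in for two modalities) and measures how far they are from commuting. The helper names `graph_laplacian` and `commutator_norm` are hypothetical.

```python
import numpy as np


def graph_laplacian(weights):
    """Unnormalized Laplacian L = D - W of an undirected weighted graph."""
    weights = np.asarray(weights, dtype=float)
    return np.diag(weights.sum(axis=1)) - weights


def commutator_norm(a, b):
    """Frobenius norm of AB - BA.

    For symmetric A and B this is zero exactly when they share an eigenbasis.
    """
    return np.linalg.norm(a @ b - b @ a)


# Toy "modality 1": path graph 0-1-2-3 (assumed example data).
w_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]])

# Toy "modality 2": star graph centered at node 0 (assumed example data).
w_star = np.array([[0, 1, 1, 1],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0]])

lap_path = graph_laplacian(w_path)
lap_star = graph_laplacian(w_star)

# A Laplacian trivially commutes with itself, so this gap is zero.
gap_same = commutator_norm(lap_path, lap_path)
# The two different graphs give a nonzero gap: no exact joint eigenbasis.
gap_diff = commutator_norm(lap_path, lap_star)

print("commutator norm, same graph:      ", gap_same)
print("commutator norm, different graphs:", gap_diff)
```

A nonzero commutator norm is one simple way to see that only an approximate common eigenbasis can be sought in such cases.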

Volume: 69
Pages: 1271-1286
DOI: 10.1109/TSP.2021.3053513
Language: English
Journal: IEEE Transactions on Signal Processing
