IEEE Transactions on Multimedia | 2019
TPCKT: Two-Level Progressive Cross-Media Knowledge Transfer
Abstract
As multimedia data have become the main form of big data, cross-media retrieval has become a research hotspot. It provides a flexible retrieval paradigm across different media types, such as using an image query to retrieve relevant text, video, and audio. An effective model for establishing cross-media correlation is indispensable for retrieval. Existing methods usually rely on labeled data for model training, but collecting and labeling cross-media data is extremely labor-intensive. Transferring knowledge from existing data to new data, and thereby reducing human labor, is therefore a key issue for real applications. However, little attention has been paid to knowledge transfer between two cross-media domains. This paper therefore proposes two-level progressive cross-media knowledge transfer (TPCKT), which transfers knowledge from large-scale cross-media data to boost retrieval accuracy on cross-media data of another domain. Its contributions are as follows. First, a two-level adversarial transfer architecture is proposed, with domain discriminators at the media-specific level and the media-shared level; the discriminators partially share parameters to preserve the cross-media consistency of transfer. This reduces the domain discrepancy between the two cross-media domains at both levels and thereby boosts retrieval accuracy. Second, a progressive semantic transfer mechanism is proposed to iteratively select semantically related categories in the two cross-media domains for transfer. This drives the transfer process with ascending difficulty, which addresses the difficulty arising from different label spaces and ensures the robustness of transfer. In the experiments, the large-scale cross-media dataset PKU XMediaNet serves as the source domain, and three widely used small-scale datasets are adopted as the target domains for retrieval. Experimental results show promising improvements from the proposed TPCKT.
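The idea of transferring categories in order of ascending difficulty can be illustrated with a minimal sketch. The abstract does not specify how semantic relatedness is measured or how the selection is scheduled, so everything below is an assumption made for illustration: categories are represented by hypothetical embedding vectors, relatedness is taken to be cosine similarity, and the function name `progressive_schedule` and the toy category names are invented. It is not the selection procedure of TPCKT itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def progressive_schedule(source_vecs, target_vecs, num_rounds):
    """Return, for each round, the source categories selected for transfer.

    Assumption: a source category is "easier" to transfer the more similar
    it is to its closest target category. Early rounds use only the most
    related categories; later rounds add less related ones.
    """
    relatedness = {
        name: max(cosine(vec, t) for t in target_vecs.values())
        for name, vec in source_vecs.items()
    }
    # Most related (easiest) categories first.
    ranked = sorted(relatedness, key=relatedness.get, reverse=True)
    schedule = []
    for r in range(1, num_rounds + 1):
        # Integer ceiling of len(ranked) * r / num_rounds.
        k = (len(ranked) * r + num_rounds - 1) // num_rounds
        schedule.append(ranked[:k])
    return schedule


# Toy, hypothetical category embeddings for a source and a target domain.
source = {"dog": [1.0, 0.0], "car": [0.0, 1.0], "wolf": [0.9, 0.1]}
target = {"puppy": [1.0, 0.05]}
schedule = progressive_schedule(source, target, num_rounds=3)
for round_index, categories in enumerate(schedule, start=1):
    print(round_index, categories)
```

In this toy setting the schedule starts from the source category closest to the target label space and widens each round until all source categories are included. A fixed, precomputed schedule is a simplification; an iterative mechanism could instead re-estimate relatedness from the model as training proceeds.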