IEEE transactions on pattern analysis and machine intelligence | 2021

Improving Deep Metric Learning by Divide and Conquer


Abstract


Deep metric learning aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from one another. However, while the embedding space learns to mimic the user-provided similarity on the training data, it should also generalize to novel categories not seen during training. Beyond user-provided training labels, many additional visual factors (such as viewpoint changes or shape peculiarities) exist and imply different notions of similarity between objects, affecting generalization to novel images. Existing approaches usually learn a single embedding space directly on all available training data; they struggle to encode all the different types of relationships and do not generalize well. We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts. We successively focus on smaller subsets of the training data, reducing their variance and learning a different embedding subspace for each data subset. Moreover, the subspaces are learned jointly to cover not only the intricacies, but also the breadth of the data. Our approach significantly improves upon the state of the art in image retrieval and clustering on the CUB200-2011, CARS196, SOP, In-shop Clothes, and VehicleID datasets.
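The divide step described above can be sketched as follows: cluster the training samples in the current embedding space, and assign each cluster its own slice of the embedding dimensions so that each "learner" specializes on a lower-variance subset. This is a minimal illustrative sketch using a small hand-rolled k-means; function names and the even dimension split are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means: returns a cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to the nearest center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centers from the assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def divide(embeddings, k):
    """Divide step (sketch): split the data into k clusters and the
    embedding dimensions into k contiguous, non-overlapping slices.
    Learner j would then be trained only on samples in cluster j,
    using only the dimensions in slices[j]."""
    labels = kmeans(embeddings, k)
    d = embeddings.shape[1]
    slices = [slice(j * d // k, (j + 1) * d // k) for j in range(k)]
    return labels, slices
```

In the full method this divide step alternates with training: the sub-learners are optimized jointly, and the clustering is periodically recomputed as the embedding improves.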

Volume PP
DOI 10.1109/TPAMI.2021.3113270
Language English
Journal IEEE transactions on pattern analysis and machine intelligence
