2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) | 2019

Fusing Visual and Textual Information to Determine Content Safety

Abstract

In advertising, determining the content safety of web pages is a significant concern because advertisers do not want their brands associated with threatening content. At the same time, publishers want to maximize the number of web pages on which they can place ads. Content safety classification must therefore strike a fine balance to satisfy both advertisers and publishers. In this paper, we propose a multimodal machine learning framework that fuses visual and textual information from web pages to improve current predictions of content safety. The primary focus is on late fusion, which combines the final outputs of separate per-modality models, such as an image model and a text model, to arrive at a single decision. We present a fully automated framework that performs binary and multilabel classification using late fusion techniques. We also introduce additional work on early fusion, which extracts and fuses intermediate features from the two separate models. Our algorithms are applied to data extracted from relevant web pages in the advertising industry. Both our late and early fusion methods obtain significant improvements over algorithms currently in use.
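The distinction between late and early fusion drawn in the abstract can be illustrated with a minimal sketch. The abstract does not state which combination rule the paper uses, so the weighted average below is only one common late-fusion choice, assumed here for illustration. The function names, the `image_weight` parameter, and the 0.5 threshold are likewise hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def late_fusion(image_probs: Sequence[float],
                text_probs: Sequence[float],
                image_weight: float = 0.5) -> List[float]:
    """Late fusion (assumed rule): weighted average of the per-label
    probabilities produced by two independently trained models."""
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of labels")
    w = image_weight
    return [w * p_img + (1.0 - w) * p_txt
            for p_img, p_txt in zip(image_probs, text_probs)]


def early_fusion(image_features: Sequence[float],
                 text_features: Sequence[float]) -> List[float]:
    """Early fusion (assumed rule): concatenate intermediate features
    into one vector that a single joint classifier would consume."""
    return list(image_features) + list(text_features)


def decide(probs: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Multilabel decision: flag every label whose fused score reaches
    the threshold. With a single label this is binary classification."""
    return [p >= threshold for p in probs]


# Hypothetical scores for two unsafe-content labels on one web page.
fused = late_fusion([0.9, 0.2], [0.5, 0.4])
flags = decide(fused)
joint_input = early_fusion([1.0, 2.0], [3.0])
```

In this sketch, late fusion leaves both models untouched and only merges their scores, whereas early fusion produces a combined feature vector that would still need its own classifier trained on top.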


Pages 2026-2031

DOI 10.1109/ICMLA.2019.00324

Language English

Journal 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
