Neurocomputing | 2019
A numerical analysis of allocation strategies for the multi-armed bandit problem under delayed rewards conditions in digital campaign management
Abstract
In this paper, we analyze the most representative allocation strategies for the multi-armed bandit problem with delayed rewards by means of a numerical study based on discrete event simulation. The scenario we address is campaign management, a digital marketing content recommendation system that marketers use to create digital content that can be issued to, or configured for viewing by, specific population segments according to business variables, user profile, or user behavior. Both batch-mode and online update architectures are considered for processing feedback on the content displayed to users. The results show that possibilistic reward (PR) methods outperform the other allocation strategies in this delayed-reward scenario.
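To make the delayed-reward setting concrete, the following is a minimal sketch, not the simulation used in the paper. It assumes Bernoulli (click / no click) rewards, a constant feedback delay measured in time steps, and Thompson sampling as a stand-in allocation strategy; the PR methods themselves are not reproduced here. The function name `simulate` and the parameters `click_rates`, `delay`, and `batch_size` are hypothetical names chosen for illustration.

```python
import random
from collections import deque


def simulate(click_rates, horizon, delay, batch_size=1, seed=0):
    """Thompson sampling on Bernoulli arms whose rewards arrive `delay` steps late.

    Illustrative assumption: batch_size=1 approximates an online update
    architecture, while larger values approximate batch mode, where feedback
    that has already arrived is applied only every `batch_size` steps.
    """
    rng = random.Random(seed)
    n_arms = len(click_rates)
    successes = [1] * n_arms  # Beta(1, 1) prior for each arm
    failures = [1] * n_arms
    pending = deque()  # (arrival_step, arm, reward); ordered because delay is constant
    arrived = []       # feedback that has arrived but is not yet applied
    total_reward = 0

    for t in range(horizon):
        # Collect feedback whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            arrived.append((arm, reward))

        # Apply the collected feedback only at update points.
        if t % batch_size == 0:
            for arm, reward in arrived:
                successes[arm] += reward
                failures[arm] += 1 - reward
            arrived.clear()

        # Thompson sampling: draw from each posterior and play the best draw.
        draws = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=draws.__getitem__)

        # The reward is generated now but becomes visible only after `delay` steps.
        reward = 1 if rng.random() < click_rates[arm] else 0
        total_reward += reward
        pending.append((t + delay, arm, reward))

    return total_reward, successes, failures
```

Calling, for example, `simulate([0.05, 0.10], horizon=5000, delay=50, batch_size=100)` and comparing the result against `batch_size=1` gives a rough feel for how update frequency interacts with delay under these simplifying assumptions.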