Gábor Bartók
University of Alberta
Publications
Featured research published by Gábor Bartók.
Algorithmic Learning Theory | 2013
Gergely Neu; Gábor Bartók
We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is \(O(m\sqrt{dT\log d})\). As a side result, we also improve the best known regret bounds for FPL in the full-information setting to \(O(m^{3/2}\sqrt{T\log d})\), gaining a factor of \(\sqrt{d/m}\) over previous bounds for this algorithm.
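Because the method only needs an offline optimization oracle, its main loop is short. The sketch below is a simplified illustration under assumed interfaces: the `oracle` callable, the learning rate `eta`, the resampling cap `M`, and the per-coordinate resampling are illustrative choices, not the paper's exact pseudocode.
```python
import numpy as np

def fpl_geometric_resampling(oracle, losses, eta=0.1, M=50, seed=0):
    """Sketch of FPL with Geometric Resampling for combinatorial semi-bandits.

    oracle(w) -- assumed offline optimizer returning a 0/1 NumPy vector in the
                 decision set that minimizes <w, v>
    losses    -- (T, d) array of per-round loss vectors; the simulator knows
                 them all, the learner only observes the coordinates it plays
    eta, M    -- illustrative learning rate and resampling cap
    """
    rng = np.random.default_rng(seed)
    T, d = losses.shape
    loss_hat = np.zeros(d)                    # cumulative loss estimates
    total_loss = 0.0
    for t in range(T):
        # FPL step: perturb the loss estimates and call the offline oracle.
        Z = rng.exponential(size=d)
        action = oracle(loss_hat - Z / eta)
        total_loss += losses[t] @ action

        # Geometric Resampling: for each played coordinate, estimate 1/p_i by
        # counting how many fresh perturbed oracle calls it takes to see that
        # coordinate played again, capped at M to keep the estimate bounded.
        for i in np.flatnonzero(action):
            k = 1
            while k < M and oracle(loss_hat - rng.exponential(size=d) / eta)[i] != 1:
                k += 1
            loss_hat[i] += k * losses[t, i]   # importance-weighted loss estimate
    return total_loss
```
The appeal of this scheme is that both the action choice and the loss estimation reuse the same offline oracle, so the learner is efficient whenever offline combinatorial optimization over the decision set is.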
Mathematics of Operations Research | 2014
Gábor Bartók; Dean P. Foster; Dávid Pál; Alexander Rakhlin; Csaba Szepesvári
In a partial monitoring game, the learner repeatedly chooses an action, the environment responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his regret, which is the difference between his total cumulative loss and the total loss of the best fixed action in hindsight. In this paper we characterize the minimax regret of any partial monitoring game with finitely many actions and outcomes. It turns out that the minimax regret of any such game is either zero or scales as \(T^{1/2}\), \(T^{2/3}\), or \(T\) up to constants and logarithmic factors. We provide computationally efficient learning algorithms that achieve the minimax regret within a logarithmic factor for any game. In addition to the bounds on the minimax regret, if we assume that the outcomes are generated in an i.i.d. fashion, we prove individual upper bounds on the expected regret.
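To make the protocol concrete, here is a minimal sketch of the interaction and of the regret quantity being characterized. The loss matrix L, feedback matrix H, the act/observe learner interface, and the placeholder uniform learner are illustrative assumptions, not the algorithms from the paper.
```python
import numpy as np

class UniformLearner:
    """Placeholder learner that ignores feedback and plays uniformly at random."""
    def __init__(self, n_actions, seed=0):
        self.n, self.rng = n_actions, np.random.default_rng(seed)
    def act(self):
        return int(self.rng.integers(self.n))
    def observe(self, action, feedback):
        pass  # a real partial-monitoring algorithm would update its beliefs here

def run_partial_monitoring(L, H, learner, outcomes):
    """Simulate a finite partial-monitoring game and return the learner's regret.

    L[a, o] -- loss when the learner plays action a and the environment picks outcome o
    H[a, o] -- feedback symbol revealed to the learner (it never sees o or the loss)
    outcomes -- the environment's outcome sequence (array of column indices)
    """
    total_loss = 0.0
    for o in outcomes:
        a = learner.act()
        total_loss += L[a, o]
        learner.observe(a, H[a, o])       # only the feedback symbol is revealed
    # Regret: learner's cumulative loss minus that of the best fixed action in hindsight.
    best_fixed = min(L[a, outcomes].sum() for a in range(L.shape[0]))
    return total_loss - best_fixed
```
The structure of H relative to L is exactly what determines whether the achievable regret growth is zero, \(T^{1/2}\), \(T^{2/3}\), or linear in \(T\).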
PhD thesis | 2012
Gábor Bartók
In a partial-monitoring game, a player has to make decisions in a sequential manner. In each round, the player suffers some loss that depends on his decision and an outcome chosen by an opponent, after which he receives “some” information about the outcome. The goal of the player is to keep the sum of his losses as low as possible. This problem is an instance of online learning: by choosing his actions wisely, the player can figure out important bits about the opponent’s strategy that, in turn, can be used to select actions that will have small losses. Surprisingly, up to now, very little is known about this fundamental online learning problem. In this thesis, we investigate this problem. In particular, we study to what extent the information received influences the best achievable cumulative loss suffered by an optimal player. We present algorithms that have theoretical guarantees for achieving low cumulative loss, and prove their optimality by providing matching, algorithm-independent lower bounds. Our new algorithms represent new ways of handling the exploration-exploitation trade-off, while some of the lower-bound proofs introduce novel proof techniques.
Algorithmic Learning Theory | 2008
Gábor Bartók; Csaba Szepesvári; Sandra Zilles
The question investigated in this paper is to what extent an input representation influences the success of learning, in particular from the point of view of analyzing agents that can interact with their environment. We investigate learning environments that have a group structure. We introduce a learning model in different variants and study under which circumstances group structures can be learned efficiently from experimenting with group generators (actions). Negative results are presented, even without efficiency constraints, for rather general classes of groups, showing that even with group structure, learning an environment from partial information is far from trivial. However, positive results for special subclasses of Abelian groups turn out to be a good starting point for the design of efficient learning algorithms based on structured representations.
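As a toy illustration of the kind of experiment available in this setting, the sketch below probes a single generator action until the start state recurs, recovering that generator's order. The `apply_action` black box is a hypothetical interface, the routine assumes the group acts freely on the states (for example, a group acting on itself), and it is not the learning algorithm from the paper.
```python
def generator_order(apply_action, start_state, max_steps=10_000):
    """Apply one generator action repeatedly until the start state recurs.

    Under a free group action, the number of applications needed is the
    order of the generator in the cyclic subgroup it generates; this is the
    basic experiment a learner can perform on a group-structured environment.

    apply_action(state) -- assumed black-box transition for this generator
    """
    state = apply_action(start_state)
    steps = 1
    while state != start_state and steps < max_steps:
        state = apply_action(state)
        steps += 1
    if state != start_state:
        raise RuntimeError("order not found within max_steps")
    return steps

# A generator of the cyclic group Z_12 acting by +5; its order is 12 since gcd(5, 12) = 1.
print(generator_order(lambda s: (s + 5) % 12, start_state=0))  # prints 12
```
Combining such per-generator probes into a full description of the group is what becomes tractable for the Abelian subclasses treated in the paper.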
International Conference on Machine Learning | 2014
Adish Singla; Ilija Bogunovic; Gábor Bartók; Amin Karbasi; Andreas Krause
Conference on Learning Theory | 2011
Gábor Bartók; Dávid Pál; Csaba Szepesvári
Neural Information Processing Systems | 2013
Navid Zolghadr; Gábor Bartók; Russell Greiner; András György; Csaba Szepesvári
Conference on Learning Theory | 2013
Gábor Bartók
Algorithmic Learning Theory | 2012
Gábor Bartók; Csaba Szepesvári
Neural Information Processing Systems | 2013
Adish Singla; Ilija Bogunovic; Gábor Bartók; Amin Karbasi; Andreas Krause