Gábor Bartók
University of Alberta
Publications
Featured research published by Gábor Bartók.
Algorithmic Learning Theory | 2013
Gergely Neu; Gábor Bartók
We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is \(O(m\sqrt{dT\log d})\). As a side result, we also improve the best known regret bounds for FPL in the full-information setting to \(O(m^{3/2}\sqrt{T\log d})\), gaining a factor of \(\sqrt{d/m}\) over previous bounds for this algorithm.
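Because the method only needs an offline optimization oracle, its main loop is short. The sketch below is a simplified illustration under assumed interfaces: the `oracle` callable, the learning rate `eta`, the resampling cap `M`, and the per-coordinate resampling are illustrative choices, not the paper's exact pseudocode.
```python
import numpy as np

def fpl_geometric_resampling(oracle, losses, eta=0.1, M=50, seed=0):
    """Sketch of FPL with Geometric Resampling for combinatorial semi-bandits.

    oracle(w) -- assumed offline optimizer returning a 0/1 NumPy vector in the
                 decision set that minimizes <w, v>
    losses    -- (T, d) array of per-round loss vectors; the simulator knows
                 them all, the learner only observes the coordinates it plays
    eta, M    -- illustrative learning rate and resampling cap
    """
    rng = np.random.default_rng(seed)
    T, d = losses.shape
    loss_hat = np.zeros(d)                    # cumulative loss estimates
    total_loss = 0.0
    for t in range(T):
        # FPL step: perturb the loss estimates and call the offline oracle.
        Z = rng.exponential(size=d)
        action = oracle(loss_hat - Z / eta)
        total_loss += losses[t] @ action

        # Geometric Resampling: for each played coordinate, estimate 1/p_i by
        # counting how many fresh perturbed oracle calls it takes to see that
        # coordinate played again, capped at M to keep the estimate bounded.
        for i in np.flatnonzero(action):
            k = 1
            while k < M and oracle(loss_hat - rng.exponential(size=d) / eta)[i] != 1:
                k += 1
            loss_hat[i] += k * losses[t, i]   # importance-weighted loss estimate
    return total_loss
```
The appeal of this scheme is that both the action choice and the loss estimation reuse the same offline oracle, so the learner is efficient whenever offline combinatorial optimization over the decision set is.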
Mathematics of Operations Research | 2014
Gábor Bartók; Dean P. Foster; Dávid Pál; Alexander Rakhlin; Csaba Szepesvári
In a partial monitoring game, the learner repeatedly chooses an action, the environment responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his regret, which is the difference between his total cumulative loss and the total loss of the best fixed action in hindsight. In this paper we characterize the minimax regret of any partial monitoring game with finitely many actions and outcomes. It turns out that the minimax regret of any such game is either zero or scales as \(T^{1/2}\), \(T^{2/3}\), or \(T\) up to constants and logarithmic factors. We provide computationally efficient learning algorithms that achieve the minimax regret within a logarithmic factor for any game. In addition to the bounds on the minimax regret, if we assume that the outcomes are generated in an i.i.d. fashion, we prove individual upper bounds on the expected regret.
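To make the protocol concrete, here is a minimal sketch of the interaction and of the regret quantity being characterized. The loss matrix L, feedback matrix H, the act/observe learner interface, and the placeholder uniform learner are illustrative assumptions, not the algorithms from the paper.
```python
import numpy as np

class UniformLearner:
    """Placeholder learner that ignores feedback and plays uniformly at random."""
    def __init__(self, n_actions, seed=0):
        self.n, self.rng = n_actions, np.random.default_rng(seed)
    def act(self):
        return int(self.rng.integers(self.n))
    def observe(self, action, feedback):
        pass  # a real partial-monitoring algorithm would update its beliefs here

def run_partial_monitoring(L, H, learner, outcomes):
    """Simulate a finite partial-monitoring game and return the learner's regret.

    L[a, o] -- loss when the learner plays action a and the environment picks outcome o
    H[a, o] -- feedback symbol revealed to the learner (it never sees o or the loss)
    outcomes -- the environment's outcome sequence (array of column indices)
    """
    total_loss = 0.0
    for o in outcomes:
        a = learner.act()
        total_loss += L[a, o]
        learner.observe(a, H[a, o])       # only the feedback symbol is revealed
    # Regret: learner's cumulative loss minus that of the best fixed action in hindsight.
    best_fixed = min(L[a, outcomes].sum() for a in range(L.shape[0]))
    return total_loss - best_fixed
```
The structure of H relative to L is exactly what determines whether the achievable regret growth is zero, \(T^{1/2}\), \(T^{2/3}\), or linear in \(T\).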
PhD thesis | 2012
Gábor Bartók
In a partial-monitoring game, a player has to make decisions in a sequential manner. In each round, the player suffers some loss that depends on his decision and an outcome chosen by an opponent, after which he receives “some” information about the outcome. The goal of the player is to keep the sum of his losses as low as possible. This problem is an instance of online learning: by choosing his actions wisely, the player can figure out important bits about the opponent’s strategy that, in turn, can be used to select actions that will have small losses. Surprisingly, up to now, very little is known about this fundamental online learning problem. In this thesis, we investigate this problem. In particular, we study to what extent the information received influences the best achievable cumulative loss suffered by an optimal player. We present algorithms that have theoretical guarantees for achieving low cumulative loss, and prove their optimality by providing matching, algorithm-independent lower bounds. Our new algorithms represent new ways of handling the exploration-exploitation trade-off, while some of the lower-bound proofs introduce novel proof techniques.
Algorithmic Learning Theory | 2008
Gábor Bartók; Csaba Szepesvári; Sandra Zilles
The question investigated in this paper is to what extent an input representation influences the success of learning, in particular from the point of view of analyzing agents that can interact with their environment. We investigate learning environments that have a group structure. We introduce a learning model in different variants and study under which circumstances group structures can be learned efficiently from experimenting with group generators (actions). Negative results are presented, even without efficiency constraints, for rather general classes of groups, showing that even with group structure, learning an environment from partial information is far from trivial. However, positive results for special subclasses of Abelian groups turn out to be a good starting point for the design of efficient learning algorithms based on structured representations.
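As a toy illustration of the kind of experiment available in this setting, the sketch below probes a single generator action until the start state recurs, recovering that generator's order. The `apply_action` black box is a hypothetical interface, the routine assumes the group acts freely on the states (for example, a group acting on itself), and it is not the learning algorithm from the paper.
```python
def generator_order(apply_action, start_state, max_steps=10_000):
    """Apply one generator action repeatedly until the start state recurs.

    Under a free group action, the number of applications needed is the
    order of the generator in the cyclic subgroup it generates; this is the
    basic experiment a learner can perform on a group-structured environment.

    apply_action(state) -- assumed black-box transition for this generator
    """
    state = apply_action(start_state)
    steps = 1
    while state != start_state and steps < max_steps:
        state = apply_action(state)
        steps += 1
    if state != start_state:
        raise RuntimeError("order not found within max_steps")
    return steps

# A generator of the cyclic group Z_12 acting by +5; its order is 12 since gcd(5, 12) = 1.
print(generator_order(lambda s: (s + 5) % 12, start_state=0))  # prints 12
```
Combining such per-generator probes into a full description of the group is what becomes tractable for the Abelian subclasses treated in the paper.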
International Conference on Machine Learning | 2014
Adish Singla; Ilija Bogunovic; Gábor Bartók; Amin Karbasi; Andreas Krause
Conference on Learning Theory | 2011
Gábor Bartók; Dávid Pál; Csaba Szepesvári
Neural Information Processing Systems | 2013
Navid Zolghadr; Gábor Bartók; Russell Greiner; András György; Csaba Szepesvári
Conference on Learning Theory | 2013
Gábor Bartók
Algorithmic Learning Theory | 2012
Gábor Bartók; Csaba Szepesvári
Neural Information Processing Systems | 2013
Adish Singla; Ilija Bogunovic; Gábor Bartók; Amin Karbasi; Andreas Krause