BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//project/author//NONSGML v1.0//EN
CALSCALE:GREGORIAN
BEGIN:VEVENT
DTEND:20200103T120000Z
UID:dbf7b78ee1355ec1ed901e595ea5479f-27
DTSTAMP:19700101T120016Z
DESCRIPTION:Ballooning Multi-Armed Bandits
URL;VALUE=URI:https://www.csa.iisc.ac.in/newweb/event/27/ballooning-multi-armed-bandits/
SUMMARY:Many common web-based applications such as online q&amp;A forums and online review portals need a scientific way of identifying high quality answers or opinions and distinguishing them from ordinary ones. To tackle this problem, we introduce a new model which we call &quot;ballooning multi-armed bandits&quot; (B-MAB), a novel extension to the classical stochastic MAB model. In the BMAB model, the set of available arms grows (or balloons) over time. In contrast to the classical MAB setting where the regret is computed with respect to the best arm overall, the regret in a BMAB setting is computed with respect to the best available arm at each time. We first observe that the existing stochastic MAB algorithms are not regret-optimal for the B-MAB model. In fact, if the best arm is equally likely to arrive at any time, a sublinear regret cannot be achieved, irrespective of the arrival of other arms. We next present our main result that if the best arm is more likely to arrive in the early rounds, one can achieve sub-linear regret. Making reasonable assumptions on the arrival distribution of the best arm in terms of the thinness of the distribution’s tail, we prove that the proposed algorithm achieves sub-linear, instance-independent regret. We further quantify explicit dependence of regret on the arrival distribution parameters. Application to online Q&amp;A forums, online review platforms, and many other settings is immediate.(Joint work with Ganesh Ghalme, Swapnil Dhamal, Shweta Jain, and Sujit Gujar)
DTSTART:20200103T120000Z
END:VEVENT
END:VCALENDAR