BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//project/author//NONSGML v1.0//EN
CALSCALE:GREGORIAN
BEGIN:VEVENT
DTEND:20230405T120000Z
UID:0342b8aff8a80efd8f8765539f44b00f-442
DTSTAMP:19700101T120016Z
DESCRIPTION:Sequential learning in a stochastic multi-armed bandit framework
URL;VALUE=URI:https://www.csa.iisc.ac.in/newweb/event/442/sequential-learning-in-a-stochastic-multi-armed-bandit-framework/
SUMMARY:The classic stochastic multi-armed bandit framework involves finitely many unknown probability distributions that can be sequentially sampled to generate independent rewards. In this talk we consider two foundational problems. The first is sampling to minimize the expected regret or, equivalently, to maximize the expected total reward. The second is best arm identification, i.e., identifying the arm with the largest mean (or the best value of some other performance measure) using as few samples as possible while providing explicit probabilistic guarantees of correct selection.

These problems form the bedrock of algorithms used in web design and advertising, recommendation systems, clinical trials and many other exciting applications. In this talk we review some of the popular algorithms used for these problems, emphasizing the intuition underlying the elegant ideas. Technically speaking, these problems have been well studied under the restrictive assumption that arm distributions belong to a single-parameter exponential family, which includes distributions such as Bernoulli and Gaussian with known variance. In these settings, lower bounds on the number of samples needed are developed using ideas from hypothesis testing, and algorithms are proposed that match the lower bounds. We propose optimal algorithms that match the lower bounds, even up to the constant, for general probability distributions under minimal restrictions. We further discuss how the proposed methodology leads to near-optimal confidence intervals for distribution means. We also discuss enhancements in the presence of offline data that needs to be combined with online data. Finally, we propose some new algorithms in the best arm identification setting that, along with minimizing sample complexity, are also computationally efficient.

Speaker Website: https://www.tcs.tifr.res.in/~sandeepj/

Organizers' Note: The talk will be in two halves. The first session will run from 16:00 to 17:00, followed by a short break for snacks, after which we will have the second session.


Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d


We are grateful to the Kirani family for generously supporting the theory seminar series.


Hosts: Rahul Madhavan, Rameesh Paul, Aditya Subramanian and Aditya Abhay Lonkar
DTSTART:20230405T120000Z
END:VEVENT
END:VCALENDAR