SeminarsView all Seminars | Download ICal for this event
Achieving Fairness in the Stochastic Multi-armed Bandit Problem
Series: M.Tech (Research) Thesis Defense
Speaker: Ms. Vishakha Patil M.Tech. (Research) Student Dept. of CSA IISc
Date/Time: Dec 12 15:00:00
Location: CSA Seminar Hall (Room No. 254, First Floor)
Faculty Advisor: Prof. Y Narahari
The classical Stochastic Multi-armed Bandit (MAB) problem provides an abstraction for many real-world decision making problems such as sponsored-search auctions, crowd-sourcing, wireless communication, etc. In this work, we study Fair-MAB, a variant of the MAB problem, where, in addition to the goal of maximizing the sum of expected rewards, the algorithm also has to ensure that each arm is pulled for at least a given fraction of the total number of rounds which imposes an additional fairness constraint on the algorithm. The non-trivial aspect of Fair-MAB arises when the time horizon T is unknown to the algorithm. The treatment of fairness in the MAB problem is either procedural or outcome-based. Procedural notions of fairness require that the decision-making process of the algorithm must satisfy some fairness guarantees. In contrast to this, we consider an outcome-based fairness notion which can be validated based on the outcome of the decisions of the algorithm. Our primary contribution is characterizing a class of Fair-MAB algorithms by two parameters: the unfairness tolerance and the learning algorithm used as a black-box. We define an appropriate notion of fairness and show that our algorithms guarantee fairness independent of the choice of the learning algorithm. We define the notion of fairness-aware regret which naturally extends the conventional notion of regret, and show that the fairness-aware regret of our algorithm matches in order the regret of the black-box learning algorithm in the absence of fairness constraints. Finally, we show via detailed simulations that our algorithm outperforms the best known algorithm for the Fair-MAB problem, both in terms of the regret and in terms of the fairness guarantee that it provides. We also evaluate the cost of fairness in the MAB setting in terms of the conventional notion of regret.