Seminars
Value-based RL with function approximation and ε-greedy exploration: a differential inclusion analysis
Series: Bangalore Probability Seminar - https://www.isibang.ac.in/~d.yogesh/BPS.html (Second Talk)
Speaker: Aditya Gopalan (ECE, IISc, Bengaluru)
Date/Time: Aug 22, 15:00
Location: CSA Seminar Hall (Room No. 254, First Floor)
Abstract:
The value-based method of Q-learning with ε-greedy exploration is one of the most widely used Reinforcement Learning (RL) algorithms. While its tabular form converges to the optimal Q-function under mild conditions, the behavior of its function-approximation variant remains mysterious. Sometimes, combining function approximation with ε-greedy exploration appears to speed up learning; at other times, it causes complex behaviors such as (i) instability, (ii) policy oscillation and chattering, (iii) multiple attractors, and (iv) convergence to the worst policy. Accordingly, a formal framework for explaining these phenomena has been a long-standing open problem (Sutton, 1999). In this talk, we shall provide the first pathway, based on differential inclusions, to systematically identify and explain the range of limiting phenomena that an approximate value-based RL method with ε-greedy exploration can exhibit, thereby answering this open question.
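For readers unfamiliar with the setting, the following is a minimal sketch of the algorithm class the talk studies: semi-gradient Q-learning with linear function approximation and ε-greedy exploration. The environment interface (env_reset, env_step), the feature map phi, and all hyperparameters below are illustrative assumptions, not details from the talk or the paper.

import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action; else a greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def approximate_q_learning(env_reset, env_step, phi, n_actions, d,
                           alpha=0.05, gamma=0.9, epsilon=0.1,
                           episodes=500, seed=0):
    """Sketch of linear semi-gradient Q-learning (hypothetical interface).

    phi(s, a) -> np.ndarray of shape (d,): feature map, so Q(s, a) = w @ phi(s, a).
    env_reset() -> initial state; env_step(s, a) -> (next_state, reward, done).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(d)
    for _ in range(episodes):
        s, done = env_reset(), False
        while not done:
            # ε-greedy action selection from the current approximate Q-values.
            q = np.array([w @ phi(s, a) for a in range(n_actions)])
            a = epsilon_greedy(q, epsilon, rng)
            s_next, r, done = env_step(s, a)
            # Semi-gradient Q-learning update: bootstrap on the greedy next action.
            q_next = 0.0 if done else max(w @ phi(s_next, b) for b in range(n_actions))
            td_error = r + gamma * q_next - w @ phi(s, a)
            w += alpha * td_error * phi(s, a)
            s = s_next
    return w

It is precisely this interaction between the ε-greedy action choice (which changes discontinuously with w) and the parameter update that produces the discontinuous dynamics the talk analyzes via differential inclusions.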
This talk is based on our recent work titled ``Approximate Q-learning and SARSA(0) under the ϵ-greedy Policy: a Differential Inclusion Analysis''.
Host Faculty: Prof. Gugan Thoppe and Prof. Aditya Gopalan.