Seminars

Optimal Sample Complexity of Single-Time-Scale Actor-Critic, and Non-rectangular Robust Reinforcement Learning

Series: Bangalore Theory Seminars

Speaker: Navdeep Kumar, Technion, Haifa, Israel

Date/Time: Feb 09 16:00:00

Location: CSA Auditorium (Room No. 104, Ground Floor)

Abstract:
In this talk, I will discuss our work on single-timescale actor-critic methods, which improves the sample complexity of finding an $\varepsilon$-close globally optimal policy from $O(\varepsilon^{-4})$ to the optimal $O(\varepsilon^{-2})$ using STOchastic Recursive Momentum (STORM). In the second part, I will present our work on robust reinforcement learning, with applications in high-stakes domains such as robotics and healthcare. We introduce a novel value regularizer and, for the first time, develop an efficient algorithm for a class of non-rectangular uncertainty sets, a setting in which the robust problem is NP-hard in general.


Microsoft Teams link:

Link


We are grateful to the Kirani family (Link) and the Walmart Center for Tech Excellence (Link) for generously supporting this seminar series.


Hosts: Rameesh Paul, Debajyoti Kar, KVN Sreenivas, Nirjhar Das, Rahul Madhavan