Seminars
Optimal Sample Complexity of Single-Time-Scale Actor-Critic, and Non-rectangular Robust Reinforcement Learning
Series: Bangalore Theory Seminars
Speaker: Navdeep Kumar, Technion, Haifa, Israel
Date/Time: Feb 09 16:00:00
Location: CSA Auditorium, (Room No. 104, Ground Floor)
Abstract:
In this talk, I will discuss our work on single-timescale actor-critic methods, improving the sample complexity for finding an $\varepsilon$-close globally optimal policy from $O(\varepsilon^{-4})$ to the optimal $O(\varepsilon^{-2})$ using STOchastic Recursive Momentum (STORM). In the second part, I will present our work on robust reinforcement learning, with applications in high-stakes domains such as robotics and healthcare. We introduce a novel value regularizer and, for the first time, develop an efficient algorithm for a class of non-rectangular uncertainty sets, for which the problem is NP-hard in general.
Microsoft Teams link:
Link
We are grateful to the Kirani family and the Walmart Center for Tech Excellence for generously supporting this seminar series.
Hosts: Rameesh Paul, Debajyoti Kar, KVN Sreenivas, Nirjhar Das, Rahul Madhavan
