BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//project/author//NONSGML v1.0//EN
CALSCALE:GREGORIAN
BEGIN:VEVENT
DTEND:20220304T120000Z
UID:d536135b4e9b09a4b5a4b79ab7a7d46f-253
DTSTAMP:19700101T120016Z
DESCRIPTION:Reinforcement Learning Via Sequence Modeling
URL;VALUE=URI:https://www.csa.iisc.ac.in/newweb/event/253/reinforcement-learning-via-sequence-modeling/
SUMMARY:I will introduce a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This allows us to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. I will present Decision Transformer (DT), an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, DT simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, DT can generate future actions that achieve the desired return. I will also present our recent work extending DT to online learning through hindsight learning and entropy-based regularization. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline and online RL baselines on benchmark environments.
DTSTART:20220304T120000Z
END:VEVENT
END:VCALENDAR