Seminars


Preference Learning from Human Response Time 

Series: Bangalore Theory Seminars

Speaker: Ayush Sawarni, Stanford University

Date/Time: Jun 27, 11:00

Location: CSA Auditorium (Room No. 104, Ground Floor)

Abstract:
In this talk, I will present our recent work on enhancing human preference learning by utilizing response-time information, an implicit signal that reveals the strength of a user's choice. Under the EZ diffusion model, we learn the reward function from binary preferences and response times using a Neyman-orthogonal loss function. While standard preference-learning approaches incur errors that scale exponentially with the reward magnitude, incorporating response times with our approach reduces this to polynomial scaling. We provide finite-sample guarantees in rich nonparametric function spaces, including reproducing kernel Hilbert spaces (RKHS). We validate the theory across three experimental settings: synthetic linear rewards, random neural-network rewards, and a semi-synthetic text-to-image preference task, and observe improved sample efficiency in each. I will conclude by discussing some techniques from semiparametric statistics used in our proofs, which may be broadly applicable beyond preference learning. Paper: Link
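
For intuition, here is a minimal illustrative sketch, not taken from the talk or paper, of the EZ diffusion model's two standard moment equations (Wagenmakers et al., 2007). Tying the drift rate to a reward gap `delta_r` is an assumption made here for illustration, and the parameter defaults and function names are likewise illustrative. The sketch shows why response times help: the choice probability saturates exponentially fast in the reward gap, while the mean decision time continues to vary with it.

```python
import numpy as np

# Sketch of the EZ diffusion model's moment equations
# (Wagenmakers et al., 2007). Illustrative assumption: the drift
# rate equals the reward gap delta_r = r(option 1) - r(option 2).

def ez_choice_prob(delta_r, a=1.0, s=1.0):
    """Probability of choosing option 1, given reward gap delta_r,
    boundary separation a, and diffusion noise scale s."""
    return 1.0 / (1.0 + np.exp(-a * delta_r / s**2))

def ez_mean_decision_time(delta_r, a=1.0, s=1.0):
    """Mean decision time (excluding non-decision time). Unlike the
    choice probability, which saturates at 1, this keeps changing as
    the reward gap grows, so it stays informative about strength."""
    y = np.exp(-a * delta_r / s**2)
    return (a / (2.0 * delta_r)) * (1.0 - y) / (1.0 + y)

for gap in [0.5, 2.0, 8.0]:
    p = ez_choice_prob(gap)
    t = ez_mean_decision_time(gap)
    print(f"gap={gap:3.1f}  P(preferred)={p:.4f}  E[decision time]={t:.4f}")
```

As the gap grows, binary choices become nearly deterministic and carry little information about its magnitude, which is the source of the exponential error scaling mentioned above, whereas the mean decision time keeps decreasing with the gap.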


Microsoft Teams link: Link


We are grateful to the Kirani family (Link) and the Walmart Center for Tech Excellence (Link) for generously supporting this seminar series.


Hosts: Rameesh Paul, KVN Sreenivas, Rahul Madhavan, Debajyoti Kar, Nirjhar Das