Seminars
Advancing Safe Multimodal Intelligence
Series: Department Seminar
Speaker: Dr. Pritam Sarkar, Vector Institute & University of British Columbia, Canada
Date/Time: Apr 20 10:00:00
Location: CSA Auditorium, (Room No. 104, Ground Floor)
Abstract:
Building artificial intelligence with meaningful real-world impact requires models that can understand and interact in both the virtual and physical worlds. This talk outlines four key capabilities necessary to achieve this: foundational world knowledge, alignment with human values and expectations, reasoning ability, and the capacity to self-improve. The talk begins with an overview of our past contributions toward these capabilities, with a particular focus on building foundational world knowledge in multimodal models and improving their alignment with human values and expectations. The next part of the talk delves into a fundamental challenge in current alignment approaches: the alignment tax. While existing methods aim to make models safer and more reliable, they often degrade general capabilities, reducing response diversity and making models overly cautious or less useful. To address this, we introduce Refined Regularized Preference Optimization (RRPO), a fine-grained alignment method. By penalizing only specific error tokens rather than entire responses, RRPO mitigates harmful behaviors while simultaneously improving performance across diverse vision tasks. Finally, the talk concludes with an overview of ongoing work on enabling stronger visual reasoning and outlines a research vision for developing machines that can adapt and improve over time, fostering long-term usefulness and human-AI trust.
Speaker Bio:
Pritam Sarkar is a Distinguished Postdoctoral Fellow at the Vector Institute and
a Postdoctoral Research Fellow at the University of British Columbia, where he
works on multimodal AI, especially with video, image, audio, and language. He
completed his PhD in September 2025 at Queen’s University, Canada; during his
PhD, he interned at Google, USA. His research has been recognized at
leading venues including NeurIPS, ICLR, and AAAI, with multiple Oral and
Spotlight presentations. He received the IEEE Research Excellence Award in 2023
for his work on self-supervised learning. He actively serves the research
community as an Area Chair and a Reviewer for leading conferences and journals
such as NeurIPS, CVPR, and PAMI, and is a strong proponent of open-source
research. He is interested in developing safe and generalizable multimodal
intelligence through algorithms that learn effectively with minimal human
supervision. Find more: https://pritamsarkar.com/
Host Faculty: R Govindarajan
