Seminars

Advancing Safe Multimodal Intelligence

Series: Department Seminar

Speaker: Dr. Pritam Sarkar, Vector Institute & University of British Columbia, Canada

Date/Time: Apr 20 10:00:00

Location: CSA Auditorium (Room No. 104, Ground Floor)

Abstract:
Building artificial intelligence with meaningful real-world impact requires models that can understand and interact in both the virtual and physical world. This talk outlines four key capabilities necessary to achieve this: foundational world knowledge, alignment with human values and expectations, reasoning ability, and the capacity to self-improve. The talk begins with an overview of our past contributions toward these capabilities, with a particular focus on building foundational world knowledge in multimodal models and improving their alignment with human values and expectations. The next part of the talk delves into a fundamental challenge in current alignment approaches: the alignment tax. While existing methods aim to make models safer and more reliable, they often degrade general capabilities, reducing response diversity and making models overly cautious or less useful. To address this, we introduce Refined Regularized Preference Optimization (RRPO), a fine-grained alignment method. By penalizing only specific error tokens rather than entire responses, RRPO mitigates harmful behaviors while simultaneously improving performance across diverse vision tasks. Finally, the talk concludes with an overview of ongoing work on enabling stronger visual reasoning and outlines a research vision for developing machines that can adapt and improve over time, fostering long-term usefulness and human-AI trust.

Speaker Bio:
Pritam Sarkar is a Distinguished Postdoctoral Fellow at the Vector Institute and a Postdoctoral Research Fellow at the University of British Columbia, where he works on multimodal AI, especially with video, image, audio, and language. He completed his PhD in September 2025 at Queen's University, Canada, and interned at Google, USA, during that time. His research has been recognized at leading venues including NeurIPS, ICLR, and AAAI, with multiple Oral and Spotlight presentations. He received the IEEE Research Excellence Award in 2023 for his work on self-supervised learning. He actively serves the research community as an Area Chair and a Reviewer for leading conferences and journals such as NeurIPS, CVPR, and PAMI, and is a strong proponent of open-source research. He is interested in developing safe and generalizable multimodal intelligence through algorithms that learn effectively with minimal human supervision. Find more: https://pritamsarkar.com/

Host Faculty: R Govindarajan

Department of Computer Science and Automation
