Seminars
Leveraging Large Language Models for Causal Inference
Series: Bangalore Theory Seminars
Speaker: Rohit Bhattacharya, Williams College
Date/Time: Jul 11, 11:00
Location: CSA Lecture Hall (Room No. 112, Ground Floor)
Abstract:
Recent text-based causal methods attempt to mitigate confounding bias by estimating proxies of confounding variables that are partially or imperfectly measured from unstructured text data. These approaches, however, assume analysts have supervised labels of the confounders given text for a subset of instances, a constraint that is sometimes infeasible due to data privacy or annotation costs. In this work, we leverage the zero-shot capabilities of large language models (LLMs) for causal inference in settings where an important confounding variable is completely unobserved. We propose a new causal inference method that uses multiple instances of pre-treatment text data, infers two proxies from two zero-shot models on the separate instances, and applies these proxies in the so-called proximal g-formula. We prove that our particular text-based proxy design satisfies identification conditions required by this formula while other seemingly reasonable proposals for applying zero-shot models do not. We evaluate our method in synthetic and semi-synthetic settings and find that it produces estimates with low bias. To address untestable assumptions associated with the proximal g-formula, we further propose an odds ratio falsification heuristic. This new combination of proximal causal inference and zero-shot classifiers expands the set of text-specific causal methods available to practitioners.
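The abstract describes inferring two proxies of an unobserved confounder and plugging them into the proximal g-formula. As a rough illustration of the underlying idea (not the speaker's actual method), the sketch below runs proximal two-stage least squares on synthetic data: an unobserved confounder U has two noisy proxies Z and W, standing in for the outputs of two zero-shot classifiers on separate text instances. All variable names and the linear data-generating process are assumptions for this demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
tau = 2.0  # true treatment effect (chosen for this demo)

# Unobserved confounder U with two independent noisy proxies,
# standing in for two zero-shot model outputs on separate text instances.
U = rng.normal(size=n)
Z = U + rng.normal(scale=0.5, size=n)   # treatment-side proxy
W = U + rng.normal(scale=0.5, size=n)   # outcome-side proxy
A = U + rng.normal(size=n)              # treatment, confounded by U
Y = tau * A + 1.5 * U + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients with an intercept column prepended."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Naive regression of Y on A is biased because U is unobserved.
naive = ols(A[:, None], Y)[1]

# Proximal two-stage least squares:
# stage 1 - predict the outcome-side proxy W from (Z, A);
b = ols(np.column_stack([Z, A]), W)
W_hat = b[0] + b[1] * Z + b[2] * A
# stage 2 - regress Y on (W_hat, A); the A coefficient estimates tau.
proximal = ols(np.column_stack([W_hat, A]), Y)[2]

print(f"true: {tau:.2f}  naive: {naive:.2f}  proximal: {proximal:.2f}")
```

In this linear setup the naive estimate absorbs the confounding through U, while the two-proxy estimate recovers the true effect; the talk's contribution concerns which zero-shot proxy constructions satisfy the identification conditions this plug-in relies on.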
Microsoft Teams link: Link
We are grateful to the Kirani family for generously supporting the theory seminar series
Hosts: Rameesh Paul, KVN Sreenivas, Rahul Madhavan, Debajyoti Kar