Dense Retrieval for Tables, Databases, and Quantities

Series: Department Seminar

Speaker: Soumen Chakrabarti, Professor, Department of Computer Science, Indian Institute of Technology Bombay.

Date/Time: Jun 25 16:00:00

Location: CSA Lecture Hall (Room No. 112, Ground Floor)

Abstract:
Starting from sparse inverted indices and vector space models, information retrieval has made major strides in the era of deep learning by evolving to use dense word and passage representations and approximate nearest neighbor search. However, dense retrieval is also needed beyond linear passages, for richer data types. In this talk, we will focus on retrieval from tables and on enhanced representations for quantities and comparisons, strongly motivated by financial and engineering reports as well as e-commerce. In the first part, we will describe TabSegNet, a retrieval system that supports complex text+table question answering. TabSegNet represents tables as a fine-grained, richly featured graph. The query is segmented by an LLM, and the segments influence the parameters of a graph neural network that retrieves nodes from the table-derived graph. Extending the paradigm from textual table retrieval to text-to-SQL, we will show that a hallucinated schema can be effective for retrieving the real schema fragments needed to generate SQL. In the second part, we will discuss DeepQuant, a dense retrieval system with enhanced representations for quantities (numerals and units) and for comparison intent in queries (e.g., "42 in shirt under 500"). Our experiments suggest that monolithic language models can benefit from imposing such specific inductive biases to better represent tabular layout and quantities.

Speaker Bio:
Soumen Chakrabarti is a Professor of Computer Science at IIT Bombay. He works on linking unstructured text to knowledge bases and exploiting these links for better search and ranking. His other interests include link formation and influence propagation in social networks, and personalized proximity search in graphs. He has published extensively in WWW, SIGIR, ACL, EMNLP, NeurIPS, ICML, ICLR, AAAI, IJCAI, SIGKDD, VLDB, ICDE and other conferences. He won the best paper award at WWW 1999 and was a coauthor of the best student paper at ECML 2008. His work on keyword search in databases received the 10-year influential paper award at ICDE 2012. He received his PhD from the University of California, Berkeley, and worked on the Clever Web search and Focused Crawling projects at IBM Almaden Research Center. He has also worked at Carnegie Mellon University and Google. He received the Bhatnagar Prize in 2014 and the Jagadis Bose Fellowship in 2019.

Host Faculty: Prof. Chiranjib Bhattacharyya