Seminars
Fragile Interpretations and Interpretable Models in NLP
Series: M.Tech (Research) Colloquium
Speaker: Ayushi Kohli, M.Tech (Research) student, Dept. of CSA
Date/Time: Feb 17, 12:00
Location: CSA Seminar Hall (Room No. 254, First Floor)
Faculty Advisor: Dr. V. Susheela Devi
Abstract:
Deploying deep learning models in critical domains remains a concern, because the cost of a wrong decision in these settings is very high; as a result, the final decision is still left to humans. Moreover, these models act as black boxes whose internal workings we cannot see, so they must be made explainable. But if a model is explainable, is it then safe to deploy it in high-risk, real-world settings?

Our work centers on the concept of fragile interpretations, considering both the model's robustness and the robustness of its interpretations. We have proposed an algorithm that perturbs the input text and generates adversarial examples that receive the same prediction as the input but have different interpretations. Through our experiments, we provide a detailed analysis of whether these interpretations are reliable and whether to trust the model or the interpretations, and we explain why interpretations are fragile in the NLP setting. Taking this into account, we have proposed two interpretable models: one for a multi-task offensive language detection task and the other for a sentence-pair similarity detection task.
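For concreteness, the sketch below illustrates one way such label-preserving but interpretation-changing perturbations could be searched for. It is a minimal greedy sketch, not the algorithm presented in the talk; `predict_fn` (the classifier), `attribution_fn` (a word-level saliency/attribution method), and the `synonyms` candidate dictionary are hypothetical interfaces assumed for illustration.

```python
# Minimal sketch: greedily swap words with candidate synonyms, keeping only swaps
# that preserve the model's predicted label, and selecting the perturbation whose
# attribution vector diverges most from the original interpretation.
# predict_fn, attribution_fn, and synonyms are assumed interfaces, not a real API.
import numpy as np

def interpretation_divergence(a, b):
    """Cosine distance between two word-attribution vectors of equal length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 1.0 if denom == 0 else 1.0 - float(a @ b) / denom

def fragile_example(tokens, predict_fn, attribution_fn, synonyms, max_swaps=3):
    """Perturb tokens so the predicted label stays the same while the
    attribution vector drifts as far as possible from the original."""
    orig_label = predict_fn(tokens)
    orig_attr = attribution_fn(tokens)
    current, current_div = list(tokens), 0.0
    for _ in range(max_swaps):
        best, best_div = None, current_div
        for i, word in enumerate(current):
            for cand in synonyms.get(word, []):
                trial = current[:i] + [cand] + current[i + 1:]
                if predict_fn(trial) != orig_label:
                    continue  # keep only label-preserving perturbations
                div = interpretation_divergence(orig_attr, attribution_fn(trial))
                if div > best_div:
                    best, best_div = trial, div
        if best is None:
            break  # no single swap increases interpretation divergence further
        current, current_div = best, best_div
    return current, current_div
```

Because each swap replaces a single word one-for-one, the original and perturbed attribution vectors have the same length and can be compared directly with cosine distance; a large returned divergence with an unchanged label is what the abstract refers to as a fragile interpretation.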