Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India

UPCOMING SEMINARS

Series: CSA Golden Jubilee Frontier Lecture Series
Title: 1. The Power of Encounters 2. Algorithmic fairness in online decision-making

  • Speaker: 1. Prof. Peter Druschel
                    Director
                    Max Planck Institute for Software Systems
                   
                   2. Prof. Avrim Blum
                    Professor and Chief Academic Officer
                    Toyota Technological Institute at Chicago (TTIC)
                    Chicago, USA
  • Date and Time: Wednesday, January 29, 2020, 3:00 PM
  • Venue: Faculty Hall, Indian Institute of Science

Abstract
1. A secure encounter is an agreement by two anonymous devices to have met at a given time and place. An associated shared secret enables the devices to subsequently confirm their encounter and communicate securely. In this talk, I will sketch how this simple idea enables fascinating new forms of privacy-preserving, contextual, secure communication among personal and IoT devices, and enables users to produce selective evidence of their personhood and physical whereabouts. Encounters enable powerful forms of secure group communication among devices connected by chains of encounters, subject to spatial, temporal, and causality constraints. Applications range from connecting event attendees and virtual guest books to disseminating targeted health risk warnings, soliciting information and witnesses related to an incident, and tracing missing persons, all while maintaining users’ privacy. Encounters also enable selective proofs of device (co-)location at a given time and place. Finally, encounters can provide evidence of a unique physical trajectory, which suggests a human user and promises a new defense to Sybil attacks.
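The confirmation step can be sketched with standard primitives. This is a hypothetical illustration, not the construction from the talk: assume the two devices already hold a shared 32-byte secret from the encounter (in reality agreed via a key-exchange protocol over a short-range radio link), and bind it to a time and place with an HMAC tag.

```python
import hashlib
import hmac
import secrets

def new_encounter():
    """Model the agreed encounter secret as fresh random bytes."""
    return secrets.token_bytes(32)

def encounter_tag(secret, time, place):
    """A tag binding the shared secret to the encounter's time and place."""
    msg = f"{time}|{place}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()

def confirm(secret, time, place, tag):
    """Later, a device holding the secret can confirm the encounter."""
    return hmac.compare_digest(encounter_tag(secret, time, place), tag)

s = new_encounter()
t = encounter_tag(s, "2020-01-29T15:00", "Faculty Hall")
ok = confirm(s, "2020-01-29T15:00", "Faculty Hall", t)
wrong = confirm(s, "2020-01-29T15:00", "Lecture Hall", t)
```

A device can thus prove it took part in a specific encounter without revealing anything about other encounters, since each secret is independent.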

2. There is growing concern about fairness in algorithmic decision making: Are algorithmic decisions treating different groups fairly? How can we make them fairer? What do we even mean by fair? In this talk I will discuss some of our work on this topic, focusing on the setting of online decision making. For instance, a classic result states that given a collection of predictors, one can adaptively combine them to perform nearly as well as the best of those predictors in hindsight (achieve “no regret”) without any stochastic assumptions. Can one extend this guarantee so that if the predictors are themselves fair (according to a given definition), then the overall combination is fair as well (according to the same definition)? I will discuss this and other issues. This is joint work with Suriya Gunasekar, Thodoris Lykouris, and Nati Srebro.
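The classic "combine predictors with no regret" result mentioned above is usually proved via the multiplicative-weights (Hedge) algorithm. A minimal sketch of plain Hedge follows (not the fairness-preserving variant discussed in the talk):

```python
import math

def hedge(predictor_losses, eta=0.5):
    """Multiplicative weights (Hedge): combine K predictors online.
    predictor_losses: one list per round, each with K losses in [0, 1].
    Returns the algorithm's expected cumulative loss and final weights."""
    K = len(predictor_losses[0])
    w = [1.0] * K
    alg_loss = 0.0
    for losses in predictor_losses:
        total = sum(w)
        p = [wi / total for wi in w]              # play predictor i w.p. p[i]
        alg_loss += sum(pi * li for pi, li in zip(p, losses))
        w = [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]
    return alg_loss, w

# Toy run: predictor 0 is always right, predictor 1 always wrong.
rounds = [[0.0, 1.0]] * 50
loss, w = hedge(rounds)
```

After a few rounds nearly all weight sits on the good predictor, so the cumulative loss stays bounded even over 50 rounds — the "no regret" guarantee relative to the best predictor in hindsight.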

Speaker Bio:
1. Peter Druschel is the founding director of the Max Planck Institute for Software Systems (MPI-SWS) and Associate Chair of the Chemistry, Physics, and Technology Section of the Max Planck Society in Germany. Previously, he was a Professor of Computer Science and Electrical and Computer Engineering at Rice University in Houston, Texas. His research interests include distributed systems, mobile systems, privacy, and compliance. He is the recipient of an NSF CAREER Award, an Alfred P. Sloan Fellowship, the ACM SIGOPS Mark Weiser Award, a Microsoft Research Outstanding Collaborator Award, and the EuroSys Lifetime Achievement Award. Peter is a member of Academia Europaea and the German Academy of Sciences Leopoldina.

2. Avrim Blum received his BS, MS, and PhD from MIT in 1987, 1989, and 1991, respectively. He then served on the faculty of the Computer Science Department at Carnegie Mellon University from 1992 to 2017. In 2017 he joined the Toyota Technological Institute at Chicago as Chief Academic Officer. Prof. Blum's main research interests are in Theoretical Computer Science and Machine Learning, including Machine Learning Theory, Approximation Algorithms, Algorithmic Game Theory, and Database Privacy, as well as connections among them. Some current specific interests include multi-agent learning, multi-task learning, semi-supervised learning, and the design of incentive systems. He is also known for his past work in AI Planning. Prof. Blum has served as Program Chair for the IEEE Symposium on Foundations of Computer Science (FOCS), the Innovations in Theoretical Computer Science Conference (ITCS), and the Conference on Learning Theory (COLT). He has served as Chair of the ACM SIGACT Committee for the Advancement of Theoretical Computer Science and on the SIGACT Executive Committee. Prof. Blum is a recipient of the AI Journal Classic Paper Award, the ICML/COLT 10-Year Best Paper Award, a Sloan Fellowship, an NSF National Young Investigator Award, and the Herbert Simon Teaching Award, and he is a Fellow of the ACM.

Host Faculty: Prof. Shalabh Bhatnagar


 

PAST SEMINARS

Series: M.Tech (Research) Colloquium
Title: Modern Combinatorial Optimization Problems: Balanced Clustering and Fair Knapsack

  • Speaker: Mr. Deval Patel
                   M.Tech (Research)
                   Dept. of CSA
  • Faculty Advisor: Dr. Anand Louis
  • Date and Time: Monday, January 27, 2020, 9:00 AM
  • Venue: CSA Class (Room No. 252, First Floor)

Abstract
In many classical optimization problems, it is desirable to have an output that is equitable in some sense; what equitability means can differ from problem to problem. We study this property in two classical optimization problems: clustering and knapsack. In the clustering problem, we want the cost of the clustering to be evenly distributed among the clusters; we study this problem under the name cost-balanced k-clustering. In the knapsack problem, we want a packing that is fair in some sense; we study this problem under the name fair knapsack. In most clustering objectives, such as k-median or k-center, the cost of assigning a client to a cluster is borne by the client. Algorithms optimizing such objectives may output a solution in which a few clusters have very large cost while others have very small cost. The cost-balanced k-clustering problem aims to obtain a clustering that is cost-balanced. We consider the objective of minimizing the maximum cost over clusters, where the cost of a cluster is the sum of the distances of all points in that cluster to the center of that cluster. We define the notion of γ-stability, for γ > 1, for the problem and give a polynomial-time algorithm for 1.5-stable instances. We also give a hardness result. We further define a more general version of the problem, which we call "lp" cost-balanced k-clustering. Given a black-box algorithm that gives a constant-factor approximation to the "lp" cost k-clustering problem, we show a polynomial-time procedure that gives a bi-criteria approximation to the "lp" cost-balanced k-clustering problem.
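To make the objective concrete, here is a toy evaluation of the cost-balanced objective on one-dimensional points, with cluster centers taken as medians purely for illustration (the talk's setting is a general metric space):

```python
def cluster_cost(points, center):
    """Cost of one cluster: sum of distances of its points to its center."""
    return sum(abs(p - center) for p in points)

def max_cluster_cost(clusters):
    """The cost-balanced k-clustering objective: the maximum, over all
    clusters, of that cluster's cost. Centers are medians here (1-D toy)."""
    costs = []
    for pts in clusters:
        center = sorted(pts)[len(pts) // 2]
        costs.append(cluster_cost(pts, center))
    return max(costs)

balanced = [[0, 1, 2], [10, 11, 12]]          # both clusters cost 2
unbalanced = [[0, 1, 2, 10, 11], [12]]        # one cluster carries cost 20
```

The unbalanced partition may have similar total cost, but its maximum cluster cost is far worse, which is exactly what the objective penalizes.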

In this work, we also consider the notion of group fairness in knapsack problems. In this setting, each item belongs to a category, and our goal is to compute a subset of items such that each category is fairly represented, in addition to the total weight of the subset not exceeding the capacity, and the total value of the subset being maximized. We study various notions of group fairness, such as the number of items from each category, the total value of items from each category, and the total weight of items from each category. We give algorithms and hardness results for these problems.
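One of these fairness notions, requiring a minimum number of items from every category, can be illustrated with a brute-force sketch on a toy instance (illustrative only; the talk gives proper algorithms and hardness results):

```python
from itertools import combinations

def fair_knapsack(items, capacity, min_per_category):
    """Brute force over subsets: maximize value subject to the capacity
    and to having at least min_per_category items from every category.
    items: list of (value, weight, category) triples."""
    cats = {it[2] for it in items}
    best_val, best_set = -1, None
    idx = range(len(items))
    for r in range(len(items) + 1):
        for subset in combinations(idx, r):
            if sum(items[i][1] for i in subset) > capacity:
                continue
            counts = {}
            for i in subset:
                c = items[i][2]
                counts[c] = counts.get(c, 0) + 1
            if any(counts.get(c, 0) < min_per_category for c in cats):
                continue
            v = sum(items[i][0] for i in subset)
            if v > best_val:
                best_val, best_set = v, subset
    return best_val, best_set

items = [(10, 4, "A"), (9, 4, "A"), (3, 2, "B"), (2, 1, "B")]
val, sel = fair_knapsack(items, capacity=8, min_per_category=1)
```

Without the fairness constraint the optimum packs both "A" items (value 19); requiring at least one "B" item forces a different, fair packing of value 15.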


 

Series: M.Tech (Research) Colloquium
Title: Fault Aware Read-Copy-Update

  • Speaker: Mr. Abhishek Dubey
                   M.Tech. (Research) Student
                   Dept. of CSA
  • Faculty Advisor: Prof. K Gopinath
  • Date and Time: Friday, January 24, 2020, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Deferred freeing is the fundamental technique used in the Read-Copy-Update (RCU) synchronization mechanism, where reclamation of resources is deferred until the completion of all active RCU read-side critical sections. We observe that faults inside an RCU read-side critical section can indefinitely block writers that are waiting for the completion of RCU readers, and can also lead to system failures by preventing the reclamation of deferred resources. We show that the impact of such faults in the Linux kernel is global: a fault in one subsystem can propagate and exhaust critical resources in other, unrelated subsystems, opening a window of opportunity for DoS-based attacks. For example, a fault in a filesystem can exhaust the process ulimit, resulting in fork failures. Since guaranteeing the absence of faults is practically impossible, it is imperative to harden RCU to tolerate faults.
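The deferred-freeing idea, and why a faulty reader blocks reclamation, can be sketched in a few lines. This is a toy single-grace-period model, not the Linux kernel implementation:

```python
import threading

class ToyRCU:
    """Toy RCU-style deferred freeing: frees requested while any reader
    is inside a critical section are queued until all readers exit."""
    def __init__(self):
        self.lock = threading.Lock()
        self.active = 0          # readers currently in a critical section
        self.deferred = []       # callbacks awaiting a grace period
        self.freed = []          # record of completed frees

    def read_lock(self):
        with self.lock:
            self.active += 1

    def read_unlock(self):
        with self.lock:
            self.active -= 1
            if self.active == 0:          # grace period ends
                for cb in self.deferred:
                    cb()
                self.deferred.clear()

    def call_rcu(self, cb):
        with self.lock:
            if self.active == 0:
                cb()                      # no readers: free immediately
            else:
                self.deferred.append(cb)  # defer past the grace period

rcu = ToyRCU()
rcu.read_lock()
rcu.call_rcu(lambda: rcu.freed.append("obj1"))   # deferred: a reader is active
pending = list(rcu.freed)                        # still empty at this point
rcu.read_unlock()                                # grace period ends: obj1 freed
```

A reader that faults inside its critical section and never reaches read_unlock would leave the deferred queue growing forever, which is precisely the lockup and resource-exhaustion problem the talk addresses.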

We first study the impact of mitigating lockups by terminating the faulty thread, as thread termination is the standard recovery strategy used by Linux. We demonstrate the impact of faults in RCU read-side critical sections and present RCU recovery techniques that use novel approaches to detect and isolate the effects of such faults. We also discuss system consistency once the fault is handled by our approach. Timely recovery results in a usable system, preserving the user application state and increasing the system's availability. Our evaluation in the Linux kernel shows that our solution can prevent resource exhaustion in the presence of faults, with no additional overhead in the absence of faults.


 

Series: CSA Golden Jubilee Frontier Lecture Series
Title: Law and Algorithms: A Cryptographic Lens

  • Speaker: Professor Ran Canetti
                   IACR Fellow
                   Boston University
  • Date and Time: Thursday, January 23, 2020, 4:00 PM
  • Venue: Faculty Hall, Indian Institute of Science

Abstract
Computer Science has had a complex relationship with Law since its early days. On the one hand, the disciplines are similar in that they both have deep theoretical roots while at the same time being all-encompassing and very applied. On the other hand, one discipline is founded on mathematics, while the other is purely humanistic in nature. Traditionally, the main points of contact between the disciplines centered on intellectual property for algorithms (software and hardware), and on regulating the use and sale of products that include encryption algorithms. Recently, however, many more meeting points have emerged, including (but certainly not limited to): regulating the use of statistical risk-assessment and prediction algorithms; applying traditionally humanistic concepts such as privacy, bias, transparency, or individuality to algorithms; adjudicating and balancing the protection, sharing, and confinement of data; and determining algorithmic intent, awareness, and liability in automated contracts. Furthermore, cryptographic thinking and tools such as computation over encrypted data, multiparty computation, and zero-knowledge proofs are emerging as game-changers in various realistic legal scenarios. This talk is a personal account of the emerging landscape of “Law and Algorithms”, shaped by my interactions with Law scholars and fellow Computer Scientists over the past few years. Three classes co-taught to a mixed audience of CS and Law students, in which we tried to build bridges between the disciplines, were particularly influential, and also provide starting points for exciting new research. The classes were co-taught with Daniela Caruso, Aloni Cohen, Stacey Dogan, Cynthia Dwork, Ahmed Ghappour, Shafi Goldwasser, Martha Minow, Frank Partnoy, and Pat Williams.

Speaker Bio:
Ran Canetti is a professor of Computer Science and the director of the Center for Reliable Information Systems and Cyber Security at Boston University. He is a Fellow of the International Association for Cryptologic Research and a recipient of the 2018 RSA Award for Excellence in Mathematics. Canetti graduated from the Weizmann Institute of Science in 1996, was a researcher at the IBM Watson Research Center, and was a professor of Computer Science at Tel Aviv University. Canetti’s research interests span multiple aspects of cryptography and information security, with emphasis on the design, analysis, and use of cryptographic protocols. Recently he has been interested in finding common ground for Law and Cryptography to collaborate towards enabling a better future for society.

Host Faculty: Prof. Shalabh Bhatnagar


 

Series: Theory Seminar Series
Title: Rethinking the role of optimization in learning

  • Speaker: Dr. Suriya Gunasekar
                   Senior Researcher
                   ML & Optimization Group
                   Microsoft Research
                   Redmond
  • Date and Time: Thursday, January 23, 2020, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In this talk, I will give an overview of our recent progress towards understanding how we learn large-capacity machine learning models. In the modern practice of machine learning, especially deep learning, many successful models have far more trainable parameters than training examples. Consequently, the optimization objective for training such models has multiple minimizers that perfectly fit the training data. More problematically, while some of these minimizers generalize well to new examples, most minimizers simply overfit or memorize the training data and perform poorly on new examples. In practice, though, when such ill-posed objectives are minimized using local search algorithms like (stochastic) gradient descent ((S)GD), the "special" minimizers returned by these algorithms have remarkably good performance on new examples. In this talk, we will explore the role of optimization algorithms like (S)GD in learning overparameterized models in the simpler setting of learning linear predictors.
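The implicit bias of gradient descent toward "special" minimizers shows up already in a two-parameter example: the objective below has a whole line of global minimizers, yet GD started at the origin converges to the minimum-norm one. (A toy illustration, not an example from the talk.)

```python
def gd_min_norm(steps=2000, lr=0.1):
    """Gradient descent on the underdetermined objective (x1 + x2 - 2)^2,
    whose global minimizers form the line x1 + x2 = 2. Started from the
    origin, GD implicitly selects the minimum-norm solution (1, 1)."""
    x1 = x2 = 0.0
    for _ in range(steps):
        g = 2 * (x1 + x2 - 2)    # gradient is identical in both coordinates
        x1 -= lr * g
        x2 -= lr * g
    return x1, x2

x1, x2 = gd_min_norm()
```

Every point on the line x1 + x2 = 2 fits the "data" perfectly; which one the algorithm returns is determined by the optimization dynamics, not the objective — the phenomenon the talk studies for linear predictors.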

Speaker Bio:
Suriya Gunasekar is a senior researcher in the ML & Optimization group at Microsoft Research, Redmond. Previously, she was a research assistant professor at the Toyota Technological Institute at Chicago. Her research interests are broadly driven by statistical, algorithmic, and societal aspects of machine learning including topics of optimization, high dimensional learning, and algorithmic fairness.

Host Faculty: Dr. Anand Louis


 

Series: M.Tech (Research) Colloquium
Title: Model Extraction and Active Learning

  • Speaker: Mr. Aditya Shukla
                   M.Tech (Research) Student
                   Dept. of CSA
  • Faculty Advisor: Prof. Vinod Ganapathy
  • Date and Time: Wednesday, January 22, 2020, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Machine learning models trained on confidential datasets are increasingly being deployed for profit. Machine Learning as a Service (MLaaS) has made such models easily accessible to end-users, who can use a model directly as a black-box module: query it with an input sample and get the corresponding prediction. Prior work has shown that it is possible to extract these models, by developing model extraction attacks that build an approximation of the MLaaS model using only black-box queries to it. However, none of these attacks satisfies all four criteria essential for practical model extraction: (i) the ability to extract deep learning models, (ii) no requirement of domain knowledge, (iii) the ability to work with a limited query budget, and (iv) no requirement of annotations. In this work, we propose a novel model extraction framework that uses existing active learning techniques and unannotated public data to satisfy all four. Using only 30% (30,000 samples) of the unannotated public data, our model extraction framework on average achieves 4.70x the performance of a uniform-noise baseline.
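The active-learning loop behind such attacks can be sketched on a one-dimensional toy victim. Uncertainty sampling stands in for the DFAL/Coreset strategies of the actual work, and all names below are hypothetical:

```python
def extract(victim, pool, budget, fit, margin):
    """Query-limited model extraction sketch: repeatedly query the
    black-box victim at the unlabeled pool point where the current
    substitute model is least confident, then refit the substitute."""
    queried = {pool[0], pool[-1]}                 # two seed queries
    labeled = [(x, victim(x)) for x in sorted(queried)]
    budget -= len(queried)
    theta = fit(labeled)
    while budget > 0:
        x = min((p for p in pool if p not in queried),
                key=lambda p: margin(theta, p))   # most uncertain point
        queried.add(x)
        labeled.append((x, victim(x)))
        theta = fit(labeled)
        budget -= 1
    return theta

victim = lambda x: int(x >= 0.35)                 # hidden decision boundary

def fit_threshold(labeled):
    """Substitute model: a threshold halfway between the observed classes."""
    zeros = [x for x, y in labeled if y == 0]
    ones = [x for x, y in labeled if y == 1]
    return (max(zeros) + min(ones)) / 2 if zeros and ones else 0.5

margin = lambda theta, x: abs(x - theta)          # confidence proxy

pool = [i / 100 for i in range(100)]
theta = extract(victim, pool, 12, fit_threshold, margin)
```

With just 12 queries the substitute's threshold lands next to the victim's hidden boundary at 0.35, illustrating why active learning makes extraction query-efficient.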

We further introduce an ensemble active learning technique by combining two existing state-of-the-art active learning techniques: DeepFool-based Active Learning (DFAL) and Coreset active learning. We empirically show that the ensemble technique generally performs better than DFAL, and it turns out to be the winner in the majority of our experiments.

Finally, we show that our proposed model extraction attack cannot be detected by a state-of-the-art detection method, PRADA, that monitors the distribution of distances between queries for deviation from the normal distribution.
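The idea behind such distance-distribution detectors can be sketched with a crude normality statistic (excess kurtosis). This is a simplified stand-in, not PRADA's actual test:

```python
import random
import statistics

def excess_kurtosis(xs):
    """Excess kurtosis: ~0 for Gaussian samples, far from 0 otherwise."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs) or 1e-12
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3.0

def nn_distances(queries):
    """Distance of each query to its nearest earlier query."""
    return [min(abs(q - p) for p in queries[:i])
            for i, q in enumerate(queries) if i > 0]

random.seed(0)
# Benign clients: inter-query distances modelled as roughly Gaussian.
benign_dists = [random.gauss(5.0, 1.0) for _ in range(2000)]
# Attacker: systematic, evenly spaced probing of the input space.
attack_queries = [i * 0.5 for i in range(200)]

benign_k = excess_kurtosis(benign_dists)
attack_k = excess_kurtosis(nn_distances(attack_queries))
```

The benign distances look Gaussian (statistic near zero), while the attacker's regular probing produces degenerate distances and a statistic far from zero, so thresholding the deviation flags the attack.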


 

Series: M.Tech (Research) Colloquium
Title: Novel Neural Architecture for Multi-Hop Question Answering

  • Speaker: Mr. G P Shrivatsa Bhargav
                   M.Tech. (Research) Student
                   Dept. of CSA
  • Faculty Advisor: Prof. Shirish K Shevade
  • Date and Time: Tuesday, January 21, 2020, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Natural language understanding has been one of the key drivers responsible for advancing the field of AI. To this end, automated Question Answering (QA) has served as an effective way of measuring the language understanding capabilities of AI systems. Our focus in this thesis is on Reading Comprehension style Question Answering (RCQA) task. Reading comprehension is the ability to understand natural language text and answer questions over it. Specifically, we focus on complex questions that require multi-hop reasoning over facts spread across multiple passages. Recently, there has been a surge in the research activities surrounding RCQA task, primarily due to the emergence of large-scale public datasets. For single-hop RCQA datasets, majority of the proposed solutions are based on massively pre-trained Transformer-style models such as BERT. Some of these solutions have exhibited human level performance. Similar solutions have been proposed for the multi-hop RCQA datasets and they have also improved the state-of-the-art. However, we believe that the core challenges involved in the multi-hop RCQA task have not been addressed effectively by existing solutions and hence there is an opportunity to advance the state-of-the-art. We present a novel deep neural architecture, called TAP (Translucent Answer Prediction), to identify answers and evidence (in the form of supporting facts) in an RCQA task requiring multi-hop reasoning. TAP comprises two loosely coupled networks -- Local and Global Interaction eXtractor (LoGIX) and Answer Predictor (AP). LoGIX predicts supporting facts, whereas AP consumes these predicted supporting facts to predict the answer span. The design of LoGIX is inspired by two key design desiderata -- local context and global interaction -- that we identified by analyzing examples of multi-hop RCQA task. The loose coupling between LoGIX and the AP reveals the set of sentences used by the AP in predicting an answer. 
Therefore, answer predictions of TAP can be interpreted in a translucent manner. We conduct extensive evaluations and analyses on the HotpotQA dataset to understand the characteristics of TAP. TAP achieved state-of-the-art performance on the distractor setting of the HotpotQA dataset.


 

Series: M.Tech (Research) Colloquium
Title: Model Extraction Defense using a Modified Variational Autoencoder

  • Speaker: Mr. Yash Gupta
                   M.Tech. (Research) Student
                   Dept. of CSA
  • Faculty Advisor: Prof. Aditya Kanade & Prof. Shirish K Shevade
  • Date and Time: Tuesday, January 21, 2020, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Machine Learning as a Service (MLaaS) exposes machine learning (ML) models trained on confidential datasets to users in the form of an Application Programming Interface (API). Since MLaaS models are deployed for commercial purposes, the API is offered as a pay-per-query service. A malicious user, or attacker, can exploit such an API to extract a close approximation of the MLaaS model by training a substitute model using only black-box query access, in a process called model extraction. Because the service is paid, the attacker must extract the MLaaS model within a limited query budget. A model extraction attack is mounted by firing queries from a substitute dataset, which consists of either (i) Synthetic Non-Problem Domain (SNPD), (ii) Synthetic Problem Domain (SPD), or (iii) Natural Non-Problem Domain (NNPD) data.

In this work, we propose a novel defense framework against model extraction, using a hybrid anomaly detector composed of an encoder and a detector. In particular, we propose a modified Variational Autoencoder, VarDefend, which uses a specially designed loss function to separate the encodings of queries fired by malicious users from those fired by benign users. We consider two scenarios: (i) stateful defense, where the MLaaS provider stores the queries made by each client in order to discover any malicious pattern, and (ii) stateless defense, where individual queries are discarded if they are flagged as out-of-distribution. Treating encoded queries from benign users as normal, one can use outlier detection models to identify encoded queries from malicious users in the stateless approach. For the stateful approach, a statistical test known as Maximum Mean Discrepancy (MMD) is used to compare the distribution of the encodings of a client's queries with that of in-distribution encoded samples. In our experiments, we observed that our stateful defense mechanism can completely block one representative attack for each of the three types of substitute datasets, without raising a single false alarm against queries made by a benign user. The number of queries required to block an attack is much smaller than that required by the current state-of-the-art model extraction defense, PRADA. Further, our proposed approach can block NNPD queries, which PRADA cannot. Our stateless defense mechanism is useful against a group of colluding attackers without significantly impacting benign users. Our experiments demonstrate that, on the MNIST and FashionMNIST datasets, the proposed stateless defense rejects more than 98% of the queries made by an attacker using SNPD, SPD, or NNPD data, while rejecting only about 0.05% of the queries made by a benign user.
Our experiments also demonstrate that the proposed stateless approach makes the MLaaS model significantly more robust to adversarial examples crafted using the substitute model, by blocking transferability.
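The MMD statistic used in the stateful defense can be sketched for one-dimensional encodings with an RBF kernel (illustrative parameter choices, not those of VarDefend):

```python
import math
import random

def rbf(x, y, gamma=0.5):
    """RBF kernel on scalars."""
    return math.exp(-gamma * (x - y) ** 2)

def mmd2(X, Y, gamma=0.5):
    """Biased estimate of squared Maximum Mean Discrepancy between
    samples X and Y under an RBF kernel; near 0 when the underlying
    distributions match, large when they differ."""
    kxx = sum(rbf(a, b, gamma) for a in X for b in X) / (len(X) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in Y for b in Y) / (len(Y) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2 * kxy

random.seed(1)
in_dist = [random.gauss(0, 1) for _ in range(100)]    # reference encodings
benign = [random.gauss(0, 1) for _ in range(100)]     # same distribution
malicious = [random.gauss(4, 1) for _ in range(100)]  # shifted distribution

score_benign = mmd2(in_dist, benign)
score_attack = mmd2(in_dist, malicious)
```

A benign client's encodings yield a near-zero MMD against the reference sample, while out-of-distribution queries yield a clearly larger value, which is what the stateful detector thresholds on.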


 

Series: CSA Golden Jubilee Frontier Lecture Series
Title: Secure Multiparty Computation

  • Speaker: Professor Yuval Ishai
                   Abraham Tulin Professor of Computer Science
                   Technion – Israel Institute of Technology
  • Date and Time: Friday, January 17, 2020, 4:00 PM
  • Venue: Faculty Hall, Indian Institute of Science

Abstract
How can sensitive data be processed without introducing a single point of failure? How can several parties perform a joint computation on their secret inputs, say add them up or compute some other statistics, without revealing any additional information to each other except the desired output? Secure multiparty computation is a powerful cryptographic tool for solving this kind of problem. The talk will give an overview of research in the area, covering classical results, connections with other problems, current research directions, and remaining challenges.
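The "add them up" example can be made concrete with additive secret sharing, one standard building block of secure multiparty computation (a sketch over a toy modulus; real protocols add communication and security against malicious behavior):

```python
import random

MOD = 2 ** 31

def share(secret, n):
    """Split an integer into n additive shares that sum to it mod MOD.
    Any n-1 shares together look uniformly random."""
    shares = [random.randrange(MOD) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

def secure_sum(inputs):
    """Each party splits its input into one share per party; each party
    publishes only the sum of the shares it received. The published
    partial sums reveal nothing beyond the final total."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]
    partial = [sum(all_shares[p][i] for p in range(n)) % MOD
               for i in range(n)]
    return sum(partial) % MOD

random.seed(7)
total = secure_sum([12, 30, 5])
```

No single party ever sees another party's input, yet the total comes out correctly; this is the simplest instance of the general "compute on secret inputs, reveal only the output" guarantee discussed in the talk.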

Speaker Bio:
Yuval Ishai is presently a chaired professor of Computer Science at the Technion, Israel. He received his PhD at the Technion, served as a postdoctoral researcher at DIMACS Center, AT&T Labs Research and Princeton University, and spent two extended sabbaticals at UCLA. Yuval’s research focuses mostly on cryptography and its interactions with computational complexity theory. His works were recognized by the FOCS 2004, Crypto 2007, and Crypto 2016 Best Paper Awards and the SIAM Outstanding Paper Prize. He chaired the Theory of Cryptography and Eurocrypt conferences and was inducted as an IACR Fellow in 2018.

Host Faculty: Prof. Shalabh Bhatnagar


 

Series: Department Seminar
Title: Parameterized Complexity of Network Design Problems

  • Speaker: Pranabendu Misra, MPI, Saarbrucken
  • Date and Time: Friday, January 17, 2020, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Network Design Problems, which concern designing minimum-cost networks that satisfy a given set of ``connectivity constraints'', are very well studied in computer science and combinatorial optimization. Almost all of these problems are NP-hard, and a number of results are known about them in the realm of approximation algorithms. Parameterized Complexity is a framework for dealing with computational intractability, where (typically) we try to optimally solve those problem instances that admit a ``small-cost solution'' or some other nice structural property. In this talk we will look at some recent results on the parameterized complexity of network design problems, and at future directions for research.

Speaker Bio:
Pranabendu Misra is a postdoctoral fellow at the Max Planck Institute for Informatics, Saarbrucken, Germany. He obtained his PhD in Computer Science from the Institute of Mathematical Sciences, Chennai, India, advised by Saket Saurabh. His primary research interests lie in graph theory and algorithms, focusing on Parameterized Complexity. He is also interested in approximation algorithms, matroids, fault-tolerant graphs, matching under preferences, and de-randomization.

Host Faculty: Prof. Matthew Jacob


 

Series: Department Seminar
Title: The Story of Blind Men and the Elephant: Understanding Context Sensitivity

  • Speaker: Prof. Uday Khedker
  • Date and Time: Tuesday, January 14, 2020, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Context-sensitive methods of program analysis increase the precision of interprocedural analysis by achieving the effect of call inlining. These methods have been defined using different formalisms and hence appear as algorithms that are very different from each other. Some methods define contexts explicitly, whereas some do not. The methods place different kinds of restrictions on the data flow frameworks they support and seem to compute different kinds of values. As a consequence, it is difficult to compare the ideas behind these methods, in spite of the fact that they solve essentially the same problem. We believe that these incomparable views are similar to the blind men's views of an elephant called context-sensitivity. We bring out this whole-elephant view of context sensitivity in program analysis by proposing a unified model that provides a clean separation between the computation of contexts and the computation of data flow values. Our model facilitates declarative specifications of context-sensitive methods by capturing the essence of context-sensitivity. We model most of the known context-sensitive methods using our unified model. This modelling uncovers the hidden notion of contexts in some methods, facilitates insightful comparisons between different methods, enables cross-fertilization of ideas, and suggests interesting improvements to the known methods. Further, our unification also extends every modelled method to bidirectional analyses. This is joint work with Swati Jaiswal.

Speaker Bio:
Uday P. Khedker (http://www.cse.iitb.ac.in/~uday) finished B.E. from GEC Jabalpur in 1986, M.Tech. from Pune University in 1989, and Ph.D. from IIT Bombay in 1995. He taught at the Department of Computer Science at Pune University from 1994 to 2001 and since then is with IIT Bombay where he is currently a Professor of Computer Science & Engg. His areas of interest are Programming Languages, Compilers and Program Analysis. He specialises in data flow analysis and its applications to code optimization. He has published papers in leading journals and conferences, has contributed chapters in a Compiler Design Handbook, and has authored a book titled “Data Flow Analysis: Theory and Practice” (http://www.cse.iitb.ac.in/~uday/dfaBook-web) published by Taylor and Francis (CRC Press).

Host Faculty: K V Raghavan


 

Series: Department Seminar
Title: Anomaly Detection in Static Networks using Egonets

  • Speaker: Dr. Srijan Sengupta
                   Virginia Tech
  • Date and Time: Monday, January 13, 2020, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
An anomaly in a network refers to a situation where the networked system, or part of it, shows significant departure from regular or expected behavioral patterns. Anomalies in networks often imply illegal or disruptive activity by the actors in the network. There has been a lot of recent emphasis on developing network monitoring tools that can detect such anomalous activity. Networks can be static, where we have a single snapshot of the system, or dynamic, where we have network snapshots at several points in time. Anomalies can have different meanings in these two scenarios.

In static networks, anomaly typically means a local anomaly, in the form of a small anomalous subgraph which is significantly different from the rest of the network. Local anomalies are difficult to detect using simple network-level metrics since the anomalous subnetwork might be too small to cause significant changes to network-level metrics, e.g., network degree. Instead, such anomalies might be detectable if we monitor sub-network level metrics, e.g., degrees of all subgraphs. However, that option is computationally infeasible, as it involves computing total degrees for all O(2^n) subgraphs of an n-node network.

We propose a novel anomaly detection method using egonet p-values, where the egonet of a node is defined as the sub-network spanned by all neighbors of that node. Since there are exactly n egonets, the number of subgraphs being monitored is n, which is a relatively manageable number. We establish theoretical properties of the egonet method and demonstrate its accuracy through simulation studies involving a broad range of statistical network models. We also illustrate the method on several well-studied network datasets.
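A toy version of the idea, using raw egonet edge counts instead of the p-values from the talk, already flags a planted dense subgraph:

```python
def egonet_edges(adj, v):
    """Number of edges in the egonet of v: the subgraph induced by v
    together with all of v's neighbors."""
    nodes = {v} | adj[v]
    return sum(1 for a in nodes for b in adj[a] if b in nodes and a < b)

# A sparse 6-cycle with a planted triangle on {0, 1, 2} (the "anomaly").
adj = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
for a in (0, 1, 2):
    for b in (0, 1, 2):
        if a != b:
            adj[a].add(b)

scores = {v: egonet_edges(adj, v) for v in adj}
```

Only n egonet statistics are computed, yet the nodes inside the planted dense region (0, 1, 2) score visibly higher than the purely cyclic nodes; the actual method replaces the raw count with a calibrated p-value per egonet.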

Speaker Bio:
Dr. Sengupta received his bachelor's and master's degrees from the Indian Statistical Institute, Kolkata, and his Ph.D. from the University of Illinois at Urbana-Champaign. At present he is an Assistant Professor at Virginia Tech. His research interests are in statistical methodology, network data, community structure in networks, bootstrap and related resampling methods, machine learning, big data and computational statistics, time series and spatial data, and statistical applications in scientific fields.

Host Faculty: Prof. Ambedkar Dukkipati


 

Series: Ph.D. Thesis Defense
Title: Extending program analysis techniques to web applications and distributed systems

  • Speaker: Ms. Snigdha Athaiya
                   Ph.D student
                   Dept. of CSA
  • Faculty Advisor: Prof. K V Raghavan
  • Date and Time: Monday, January 13, 2020, 2:30 PM
  • Venue: CSA Conference Room, (Room No. 328, Second Floor)

Abstract
Web-based applications and distributed message-passing systems are two ubiquitous techniques to distribute the execution of an application across multiple machines. Analysis and verification of such programs poses numerous challenges due to the non-linear flow of control and data within the application. In this thesis, we address program analysis problems in the domains of web applications and message-passing systems, and design effective solutions for these problems.

Performing end-to-end analysis of web applications using program analysis techniques is challenging, for multiple reasons: client-server interaction, user interaction, and the use of different languages to implement the server-side code and the client-side functionality. We propose a technique that translates the entire web application, including server-side and client-side code, into a single-language ``model''. The model is simply a standard (i.e., non-web) executable program in the same language as the server-side code, and it is guaranteed to preserve the essential behavioral aspects of the given web application. The upshot is that off-the-shelf tools can be used to analyze the model, without the need to create web-application-specific analysis tools.

We instantiate our approach in the context of J2EE applications that use JSP to describe pages and Java servlets to implement the server-side code. We have built a tool that translates such applications to Java programs. We evaluate the tool by converting five real-world web applications into corresponding Java programs, and then analyzing these programs using three popular third-party program analysis tools: Wala (static slicing), Java PathFinder (explicit-state and symbolic model checking), and Zoltar (fault localization by dynamic analysis). With each of these tools we obtain precise results in most cases.

Dataflow analysis is a static analysis technique that identifies abstract properties of variables at all points in a program. The second challenge the thesis addresses is extending dataflow analysis to distributed asynchronous message-passing systems. Such systems are commonly used to implement distributed mechanisms, protocols, and workflows in different domains. To obtain good precision, dataflow analysis of message-passing programs needs to skip the traversal of execution paths that read more messages than the number of messages sent so far in the path, as such paths are infeasible at runtime. Current dataflow analysis techniques for message-passing programs either compromise on precision by traversing infeasible paths in addition to feasible ones, or check only simple types of abstract properties (in particular, ones representable using finite lattices).

This thesis proposes an approach for precise dataflow analysis of asynchronous message-passing programs. Our approach traverses only feasible paths, and can check a class of complex abstract properties that need infinite lattice representations. The problem solved by our approach was not previously known to be decidable. Our approach builds on an existing concept in the analysis of parallel systems, namely coverability, in a novel and involved manner. We have implemented our approach as a tool and analyzed its performance on numerous realistic benchmarks. On several of the benchmarks, our approach gave more precise results than a baseline analysis that traverses infeasible paths.
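The feasibility condition at the heart of this approach (no path may read more messages than have been sent along it so far) can be sketched as a simple prefix check; the event encoding below is purely illustrative, not the formalism of the thesis:

```python
def is_feasible(path):
    """Check that along every prefix of the path, the number of
    'recv' events never exceeds the number of 'send' events.
    Paths violating this cannot occur at runtime, so a precise
    dataflow analysis should skip them."""
    pending = 0  # messages sent but not yet consumed
    for event in path:
        if event == "send":
            pending += 1
        elif event == "recv":
            pending -= 1
            if pending < 0:  # reads more than was ever sent
                return False
    return True

# A path that receives before anything was sent is infeasible.
print(is_feasible(["send", "recv", "send", "recv"]))  # True
print(is_feasible(["recv", "send"]))                  # False
```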


 

Series: M.Tech (Research) Thesis Defense
Title: Experiences in Using Reinforcement Learning for Directed Fuzz Testing

  • Speaker: Mr. Subhendu Malakar
                   M.Tech (Research) Student
                   Dept. of CSA
  • Faculty Advisor: Prof. Vinod Ganapathy
  • Date and Time: Monday, January 13, 2020, 2:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Directed testing is a technique to analyze user-specified target locations in a program. It reduces the time and effort of developers by excluding irrelevant parts of the program from testing and focusing on reaching the target location. Existing tools for directed testing employ either symbolic execution with heavyweight program analysis, or fuzz testing combined with fine-tuned heuristics.

In this thesis, we explore the feasibility of a data-driven approach to directed testing. We aim to leverage the data generated by fuzz testing tools: we train an agent on data collected from fuzzers to learn the optimal mutation for each program input. The agent then directs the fuzzer towards the target location by prescribing the optimal action for each program input. We use reinforcement-learning-based algorithms to train the agent. We implemented a prototype of our approach and evaluated it on synthetic as well as real-world programs, and we also evaluate and compare different reward mechanisms for training the agent. Our evaluation shows that a reinforcement learning agent can learn the task for simple programs; however, on real-world programs it does not perform better than fuzzers that have no such learning agent. From our experiments, we conclude that data-driven approaches are feasible and should be pursued. Although in its present state reinforcement learning cannot compete with state-of-the-art fuzzers, we hope that advancements in reinforcement learning will bridge the gap.
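As a rough illustration of the idea (not the thesis's actual implementation), here is a minimal tabular Q-learning sketch in which states are coarse input features, actions are mutation operators, and the reward signals progress toward the target; the state names, action set, and toy reward are all assumptions:

```python
import random

random.seed(0)

MUTATIONS = ["bitflip", "insert", "delete"]  # illustrative action set

def q_learning_choose_mutations(reward_fn, states, episodes=200,
                                alpha=0.5, gamma=0.9, eps=0.1):
    """Learn a Q-table mapping (state, mutation) -> expected reward,
    using epsilon-greedy exploration and single-step bootstrapping
    on the same state (a contextual-bandit-style simplification)."""
    q = {(s, a): 0.0 for s in states for a in MUTATIONS}
    for _ in range(episodes):
        s = random.choice(states)
        if random.random() < eps:                      # explore
            a = random.choice(MUTATIONS)
        else:                                          # exploit
            a = max(MUTATIONS, key=lambda m: q[(s, m)])
        r = reward_fn(s, a)
        best_next = max(q[(s, m)] for m in MUTATIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q

# Toy reward: pretend "bitflip" is the best mutation in state "shallow".
reward = lambda s, a: 1.0 if (s, a) == ("shallow", "bitflip") else 0.0
q = q_learning_choose_mutations(reward, ["shallow", "deep"])
best = max(MUTATIONS, key=lambda m: q[("shallow", m)])
print(best)  # should converge to "bitflip"
```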


 

Series: Department Seminar
Title: Communication Complexity of Byzantine Agreement, Revisited

  • Speaker: Dr. Kartik Nayak
                   Assistant Professor
                   Department of Computer Science
                   Duke University
  • Date and Time: Friday, January 10, 2020, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
As Byzantine Agreement (BA) protocols find application in large-scale decentralized cryptocurrencies, an increasingly important problem is to design BA protocols with improved communication complexity. A few existing works have shown how to achieve subquadratic BA under an adaptive adversary. Intriguingly, they all make a common relaxation about the adaptivity of the attacker: if an honest node sends a message and then gets corrupted in some round, the adversary cannot erase the message that was already sent; henceforth we say that such an adversary cannot perform "after-the-fact removal". By contrast, many (super-)quadratic BA protocols in the literature can tolerate after-the-fact removal. It turns out, as shown in our work, that disallowing after-the-fact removal is necessary for achieving subquadratic-communication BA.

In this talk, I will first present a simple quadratic BA protocol. Next, I will show a new subquadratic binary BA construction (of course, assuming no after-the-fact removal) that achieves near-optimal resilience and expected constant rounds under standard cryptographic assumptions and a public-key infrastructure (PKI). In comparison, all known subquadratic protocols make additional strong assumptions such as random oracles or the ability of honest nodes to erase secrets from memory, and even with these strong assumptions, no prior work can achieve the above properties.
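For intuition on where the quadratic bound comes from, consider a naive round in which every node sends its vote to every other node; the toy count below refers to this illustrative round only, not to the protocols discussed in the talk:

```python
def all_to_all_messages(n):
    """Messages exchanged in one round where each of n nodes
    sends its vote to every other node: n * (n - 1) in total,
    i.e. Theta(n^2) communication for the round."""
    return sum(1 for sender in range(n) for receiver in range(n)
               if sender != receiver)

for n in (4, 8, 16):
    print(n, all_to_all_messages(n))
# Doubling n roughly quadruples the message count.
```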

Speaker Bio:
Kartik Nayak is an assistant professor in the Department of Computer Science at Duke University. He works in the areas of security, applied cryptography, distributed computing, and blockchains. Before joining Duke University, he spent a year as a postdoctoral researcher at VMware Research. Before that, he graduated from the University of Maryland, College Park under the supervision of Professor Jonathan Katz and Professor Elaine Shi. Kartik is a recipient of the 2016 Google Ph.D. fellowship in Security.

Host Faculty: Dr. Arpita Patra


 

Series: Department Seminar
Title: Authenticated Encryption

  • Speaker: Dr Avik Chakraborti, ISI Kolkata
  • Date and Time: Thursday, January 09, 2020, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Authenticated encryption (AE) is a symmetric-key cryptographic primitive that provides both confidentiality and authenticity. With the recent upsurge of communication networks in the era of the so-called Internet of Things, AE is expected to play a key role in securing them. Recognizing the importance of AE, several network protocol suites such as TLS, IPsec, SSH, and IEEE 802.11i have adopted the current AE standards. Aware of the inefficiencies of those standards, some standardization bodies have launched competitions for new AE designs, the CAESAR and NIST LWC competitions being the two most renowned among them. These competitions have given a huge boost to AE research. The presentation addresses both directions: AE design and cryptanalysis.

1) It mainly covers the most popular design aspect of AE research, lightweight cryptography, demonstrated with the design of COFB.

2) The talk also illustrates finding weaknesses in an AE scheme, i.e., cryptanalysis, demonstrated on the design Pi-Cipher.

The presentation concludes with a summary of my other significant and ongoing work, along with possible future research directions. Specifically, the talk will convey the essence of the work done during my research career and highlight my expertise and future plans.
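To make the AE interface concrete (decryption must fail on any tampered ciphertext), here is a stdlib-only encrypt-then-MAC toy. It is not COFB and is not secure for real use; the SHA-256 counter-mode keystream is merely a stand-in for a proper cipher:

```python
import hashlib
import hmac
import os

def _keystream(key, nonce, length):
    """Toy keystream from SHA-256 in counter mode (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def ae_encrypt(enc_key, mac_key, nonce, plaintext):
    """Encrypt-then-MAC: XOR with a keystream, then tag nonce||ciphertext."""
    ks = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(p ^ k for p, k in zip(plaintext, ks))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return ct, tag

def ae_decrypt(enc_key, mac_key, nonce, ct, tag):
    """Verify authenticity first; reject forged or modified ciphertexts."""
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    ks = _keystream(enc_key, nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, ks))

enc_key, mac_key, nonce = os.urandom(32), os.urandom(32), os.urandom(12)
ct, tag = ae_encrypt(enc_key, mac_key, nonce, b"attack at dawn")
assert ae_decrypt(enc_key, mac_key, nonce, ct, tag) == b"attack at dawn"
```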

Speaker Bio:
Avik is currently a Lecturer-cum-Postdoctoral Fellow at the R.C. Bose Centre for Cryptology and Security, ISI Kolkata. He was a Post-Doctoral Fellow at NTT Secure Platform Laboratories, Tokyo, Japan from December 2016 to November 2019. He completed his Ph.D. at ISI Kolkata under the supervision of Dr. Mridul Nandi. Avik's research expertise revolves mainly around real-world aspects of symmetric-key cryptology and its implementations. His current projects are mainly in lightweight cryptography, where he is primarily working on submissions to the NIST Lightweight Cryptography standardization competition.

Host Faculty: Prof. Matthew Jacob


 

Series: CSA Golden Jubilee Frontier Lecture Series
Title: MACHINE LEARNING MODELS: FROM BIRTH TO SERVING THE REAL WORLD

  • Speaker: Prof. Sunita Sarawagi
                   Institute Chair Professor
                   Computer Science and Engineering
                   Indian Institute of Technology, Bombay
  • Date and Time: Wednesday, January 08, 2020, 4:00 PM
  • Venue: Faculty Hall, Indian Institute of Science

Abstract
A Machine Learning (ML) model is born by fitting a function to examples sampled from an unknown distribution, and is designed to generalize to other examples drawn from the same distribution. The machine learning edifice largely rests on this foundation. However, modern ML models trained for challenging tasks like object detection, speech recognition, and language understanding require huge amounts of labeled data and leave a large carbon footprint. In return, a model, once trained, needs to serve many diverse real-world settings that do not perfectly match its birth setting. In this talk, I will discuss current research on propping the ML edifice against such a shifting foundation. We will survey current props such as calibration, out-of-sample detection, domain adaptation, domain generalization, and robustness, and reflect on some futuristic topics.

Speaker Bio:
Prof. Sunita Sarawagi is Institute Chair Professor in Computer Science and Engineering at IIT-Bombay. She received her B.Tech. in Computer Science from the Indian Institute of Technology, Kharagpur in May 1991, and her M.S. and Ph.D. in Computer Science from the University of California at Berkeley, where she studied under Michael Stonebraker. Following her Ph.D., she did stints at IBM Almaden Research Center as a research scholar and at Carnegie Mellon as visiting faculty, and joined IIT-Bombay in 1999. Between July 2014 and July 2016 she was a Visiting Scientist at Google Inc. in Mountain View, where she worked on deep learning models for personalizing and diversifying YouTube and Play recommendations, improving Duo’s conversation assistance engine, and extracting attributes of classes from the Knowledge Graph. Her many awards include the IBM Faculty Award (2003 and 2008), Fellowship of the Indian National Academy of Engineering (INAE) (2013), and the PAKDD Most Influential Paper Award 2014 for the paper “Discriminative Methods for Multi-labeled Classification” (Shantanu Godbole and Sunita Sarawagi, PAKDD 2004). Prof. Sarawagi holds several patents, including one for “Database System and Method Employing Data Cube Operator for Group-By Operations” and one for “Efficient evaluation of queries with mining predicates”. The Infosys Prize 2019 in Engineering and Computer Science was awarded to her for her research in databases, data mining, machine learning and natural language processing, and for important applications of these research techniques; the prize recognizes her pioneering work in developing information extraction techniques for unstructured data.

Host Faculty: Prof. Shalabh Bhatnagar


 

Series: Ph.D. Colloquium
Title: Scalable and Effective Polyhedral Auto-transformation without using Integer Linear Programming

  • Speaker: Mr. Aravind Acharya
                   Ph.D Student
                   Dept. of CSA
  • Faculty Advisor: Prof. Uday Kumar Reddy .B
  • Date and Time: Wednesday, January 08, 2020, 2:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In recent years, polyhedral auto-transformation frameworks have gained significant interest in general-purpose compilation because of their ability to find and compose complex loop transformations that extract high performance from modern architectures. These frameworks automatically find loop transformations that enhance locality, expose parallelism, minimize latency, or achieve a combination of these. Recently, focus has also been shifting to intermediate representations, such as MLIR, in which complex loop transformations and data-layout optimizations can be incorporated efficiently in a single common infrastructure.

Polyhedral auto-transformation frameworks typically rely on complex Integer Linear Programming (ILP) formulations to find affine loop transformations. However, constructing and solving these ILP problems is time-consuming, which increases compilation time significantly. Secondly, the loop fusion heuristics in these frameworks are ad hoc, and modeling loop fusion exactly would further increase compilation time.

In this thesis, we first relax the ILP formulation in the Pluto algorithm. We show that even though LP relaxation reduces the time complexity of the problem, it does not reduce compilation time significantly, because of the complex construction of constraints. We also observe that, due to relaxation, sub-optimal loop transformations that cause significant performance degradation may be obtained. Hence, we propose a new polyhedral auto-transformation framework, called Pluto-lp-dfp, that finds efficient affine loop transformations quickly while relying on Pluto's cost function. The framework decouples auto-transformation into three components: (1) loop fusion and permutation, (2) loop scaling and shifting, and (3) loop skewing. In each phase, we solve a Linear Programming (LP) formulation instead of an ILP, resulting in a polynomial-time affine transformation algorithm. We propose a data structure, called the fusion conflict graph, that allows loop fusion to be modeled in tandem with loop permutation, scaling, and shifting transformations. We describe three greedy fusion heuristics, namely max-fuse, typed-fuse, and hybrid-fuse, of which the hybrid-fuse and typed-fuse models incorporate a parallelism-preserving fusion heuristic without significant compilation-time overhead. We also provide a characterization of time-iterated stencils that admit tile-wise concurrent start, and employ a different fusion heuristic for such programs. In our experiments, we demonstrate that the Pluto-lp-dfp framework not only finds loop transformations quickly, resulting in significant improvements in compilation time, but also outperforms state-of-the-art polyhedral auto-parallelizers in terms of execution time of the transformed program. We observe that Pluto-lp-dfp is faster than PoCC and Pluto by geomean factors of 461x and 2.2x, respectively, in compilation time. On the larger NAS benchmarks, Pluto-lp-dfp was faster than Pluto by 246x; PoCC failed to find a transformation in a reasonable amount of time in these cases. In terms of execution time, the hybrid-fuse variant of Pluto-lp-dfp outperforms PoCC by a geomean factor of 1.8x, with over 3x improvements in some cases. It also outperforms an improved version of Pluto by 7%, with a maximum performance improvement of 2x.
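The precision risk of LP relaxation noted above can be seen on a toy integer program (a generic illustration, unrelated to the actual Pluto formulation): the continuous optimum sits at a fractional vertex and strictly exceeds the best integer point, the kind of gap that can make solutions recovered from a relaxation sub-optimal.

```python
from fractions import Fraction

# Toy problem: maximize x + y subject to 2x + 2y <= 5, x, y >= 0.

def lp_optimum():
    """Continuous (LP) optimum: evaluated at the vertices of the
    feasible region, (0,0), (5/2,0), (0,5/2). The best value, 5/2,
    is attained only at fractional points."""
    vertices = [(Fraction(0), Fraction(0)),
                (Fraction(5, 2), Fraction(0)),
                (Fraction(0), Fraction(5, 2))]
    return max(x + y for x, y in vertices)

def ilp_optimum():
    """Integer (ILP) optimum by brute-force enumeration."""
    best = 0
    for x in range(3):
        for y in range(3):
            if 2 * x + 2 * y <= 5:
                best = max(best, x + y)
    return best

print(lp_optimum(), ilp_optimum())  # 5/2 vs 2: a strict relaxation gap
```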


 

Series: Department Seminar
Title: Revisiting Old Abstractions to Design for The Future

  • Speaker: Dr. Jayneel Gandhi
                   Research scientist, VMware Research
  • Date and Time: Tuesday, January 07, 2020, 2:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Computing systems are required to achieve continued performance growth with a focus on security and privacy while increasing energy efficiency. To achieve this goal, we need to revisit old abstractions that exist between applications, system software, and hardware. These old abstractions have enabled independent development and optimizations within each layer. However, to achieve performance and energy efficiency required by future applications, computing systems need to unlock new opportunities for optimizations.

The approach I am pursuing to unlock these opportunities is creative cross-layer mechanisms. Such mechanisms allow flexibility across various layers to re-design them cohesively and effectively with optimizations not available when these layers were designed in isolation. In this talk, I will use two examples of cross-layer mechanisms in the area of memory virtualization. These two examples will serve as evidence for using such an approach to enable breakthroughs of the future.

Speaker Bio:
Jayneel Gandhi is a research scientist at VMware Research. He is interested in securely virtualizing and easing the adoption of heterogeneous accelerators and emerging memory technologies by introducing useful operating system abstractions and efficient memory system architectures. Three of his research papers have been selected as the IEEE MICRO Top Pick of the year. Additionally, his research led to three open-source projects being used by academia and industry.

Host Faculty: Arkaprava Basu


 

Series: Theory Seminar Series
Title: Beyond trace reconstruction: Population recovery from the deletion channel

  • Speaker: Sandip Sinha
                   Ph.D Student
                   CS Theory Group
                   Columbia University
  • Date and Time: Monday, January 06, 2020, 1:00 PM
  • Venue: CSA Lecture Hall (Room No. 117, Ground Floor)

Abstract
Population recovery is the problem of learning an unknown distribution over an unknown set of n-bit strings, given access to independent draws from the distribution that have been independently corrupted according to some noise channel. Recent work has intensively studied such problems both for the bit-flip and erasure noise channels. We initiate the study of population recovery under the deletion channel, in which each bit is independently deleted with some fixed probability and the surviving bits are concatenated and transmitted, in both worst-case and average-case settings of the strings in the support. This is a generalization of trace reconstruction, a challenging problem that has received much recent attention.
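The deletion channel described above is straightforward to simulate; a minimal sketch (parameter names are illustrative):

```python
import random

def deletion_channel(bits, p, rng=random):
    """Each bit of the input string is independently deleted with
    probability p; the surviving bits are concatenated in order,
    yielding a 'trace' of the original string."""
    return "".join(b for b in bits if rng.random() >= p)

rng = random.Random(42)
x = "1011001110"
traces = [deletion_channel(x, 0.3, rng) for _ in range(3)]
print(traces)  # shorter, order-preserving subsequences of x
```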

For the worst case, we show that for any s = o(log n / log log n), a population of s strings from {0,1}^n can be learned under deletion channel noise using exp(n^{1/2+o(1)}) samples. On the lower bound side, we show that n^{Omega(s)} samples are required to perform population recovery under the deletion channel, for all s <= n^0.49.

For the average case, we give an efficient algorithm for population recovery. The algorithm runs in time poly(n,s,1/eps) and its sample complexity is poly(s, 1/eps, exp(log^{1/3} n)), where eps is the TV distance between the original and output distributions.

This talk is based on joint work with Frank Ban, Xi Chen, Adam Freilich, and Rocco Servedio: https://arxiv.org/abs/1904.05532 and https://arxiv.org/abs/1907.05964

Speaker Bio:
Sandip Sinha is a Ph.D student in the CS theory group at Columbia University, advised by Profs. Rocco Servedio, Alex Andoni, and Cliff Stein. His primary interests are computational learning theory and data structures for various problems on strings; he is also interested in sublinear algorithms. Before joining Columbia, he graduated from the Indian Institute of Science (IISc) with a B.S. degree in Mathematics in 2016.

Host Faculty: Dr. Arindam Khan


Copyright: CSA, IISc 2018      Phone: +91-80-22932368      Fax: +91-80-23602911