Seminars
UPCOMING SEMINARS 


PAST SEMINARS 

Series: M.Tech (Research) Colloquium (ONLINE)
Title: Embedding Networks: Node and Graph Level Representations
Speaker: Ms. Manasvi Aggarwal, M.Tech (Research) Student, Dept. of CSA
Faculty Advisor: Prof. M. Narasimha Murty
Date and Time: Wednesday, June 24, 2020, 1:00 PM
Venue: Microsoft Teams (ONLINE)
Abstract: Network representation learning is important for carrying out various downstream network analysis tasks. Graphs are the most suitable structures to represent relational data such as social networks and molecular structures. In this thesis work, we focus on learning representations of nodes as well as of entire graphs. Graph neural networks (GNNs) have gained significant importance for graph representation, and attention mechanisms on graphs have recently shown promising results for classification tasks. Most attention mechanisms in the graph literature use attention to derive the importance of a node or a pair of nodes for different tasks. But in real-world situations, calculating importance only up to pairs of nodes is not adequate.
To address this problem, we introduce a novel GNN-based approach, subgraph attention, to classify the nodes of a graph. Hierarchical graph pooling, on the other hand, has shown promise in the recent literature, but not all hierarchies of a graph play an equal role in graph classification. Towards this end, we propose an algorithm called SubGattPool to find the important nodes in a hierarchy, and the importance of individual hierarchies in a graph, for embedding and classifying graphs given a collection of graphs. Moreover, existing pooling approaches do not consider both the region-based and the graph-level importance of nodes together. In the next research work, we solve this issue by proposing a novel pooling layer named R2pool, which retains the most informative nodes for the next coarser version of the graph. Further, we integrate R2pool with our branch training strategy to learn coarse-to-fine representations and improve the model's capability for graph classification by exploiting a multi-level prediction strategy. Thorough experimentation on both real-world and synthetic graphs shows the merit of the proposed algorithms over the state-of-the-art.
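The details of SubGattPool and R2pool are in the thesis itself; as a rough illustration of the kind of step such pooling methods build on, here is a minimal top-k pooling sketch, where the node scores stand in for learned attention or importance scores (all names and values here are hypothetical):

```python
def topk_pool(node_features, scores, k):
    """Keep the k highest-scoring nodes (a generic top-k pooling step)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])  # preserve original node order
    return kept, [node_features[i] for i in kept]

# Toy example: 4 nodes with 2-d features and made-up importance scores.
feats = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.9, 0.1]]
scores = [0.2, 0.9, 0.1, 0.7]
kept, pooled = topk_pool(feats, scores, k=2)  # nodes 1 and 3 survive
```

In a real hierarchical pooling layer this step is applied repeatedly, producing successively coarser graphs.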
Microsoft Teams link:
https://teams.microsoft.com/l/team/19%3a35aa9398569340ed9061a35f0589ffe2%40thread.tacv2/conversations?groupId=d9e50dc5360046c18888998889fcedb8&tenantId=6f15cd97f6a741e3b2c5ad4193976476
Series: M.Tech (Research) Thesis Defence (ONLINE)
Title: Privacy Preserving Machine Learning via Multiparty Computation
Speaker: Mr. Chaudhari Harsh Mangesh Suchita, M.Tech (Research) Student, Dept. of CSA
Faculty Advisor: Dr. Arpita Patra
Date and Time: Monday, June 15, 2020, 11:00 AM
Venue: ONLINE
Abstract: Privacy-preserving machine learning (PPML) via secure multiparty computation (MPC) has gained momentum in the recent past. Assuming a minimal network of pairwise private channels, we propose an efficient four-party PPML framework over rings, FLASH, the first of its kind in the PPML regime, that achieves the strongest security notion of Guaranteed Output Delivery (all parties obtain the output irrespective of the adversary's behaviour). State-of-the-art ML frameworks such as ABY3 by Mohassel et al. (ACM CCS'18) and SecureNN by Wagh et al. (PETS'19) operate in the setting of three parties with one malicious corruption, but achieve the weaker security guarantee of abort. We demonstrate PPML with real-time efficiency using the following custom-made tools that overcome the limitations of the aforementioned state-of-the-art: (a) a dot product whose cost is independent of the vector size, unlike ABY3, SecureNN and ASTRA by Chaudhari et al. (ACM CCSW'19), all of which have linear dependence on the vector size; (b) a truncation that is constant-round and free of circuits such as the Ripple Carry Adder (RCA), unlike ABY3, which uses these circuits and has round complexity of the order of their depth. We then exhibit the application of our FLASH framework in the secure server-aided prediction of vital algorithms: linear regression, logistic regression, deep neural networks, and binarized neural networks. We substantiate our theoretical claims through improvements in benchmarks of the aforementioned algorithms when compared with the current best framework, ABY3. All the protocols are implemented over a 64-bit ring in LAN and WAN. Our experiments demonstrate that, for the MNIST dataset, the improvement (in terms of throughput) ranges from 11x to 1390x over Local Area Network (LAN) and Wide Area Network (WAN) together.
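As a toy illustration of the MPC substrate such frameworks build on (not FLASH's actual protocol), the sketch below shows additive secret sharing over a ring. Additions on shares work locally; multiplications, and hence dot products, additionally require correlated randomness, which is exactly where FLASH's vector-size-independent dot-product protocol comes in:

```python
import random

MOD = 2 ** 64  # frameworks like FLASH compute over a ring such as Z_{2^64}

def share(x, n=4):
    """Additively secret-share x among n parties: the shares sum to x mod MOD."""
    shares = [random.randrange(MOD) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    return sum(shares) % MOD

# Linearity: each party adds its shares of two secrets locally, and
# reconstructing the per-party sums yields the sum of the secrets.
su = share(12345)
sv = share(67890)
ssum = [(a + b) % MOD for a, b in zip(su, sv)]
assert reconstruct(ssum) == 12345 + 67890
```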
Online link to join Microsoft meeting:
https://teams.microsoft.com/_#/prejoincalling/19:meeting_MmMxMDZkNmItZWZkOS00ZGJhLTgyYzYtNjlhODZiYjk5NzNj@thread.v2
Series: Seminar
Title: Learning-Based Controlled Concurrency Testing
Speaker: Suvam Mukherjee
Faculty Advisor: Deepak D'Souza
Date and Time: Monday, June 01, 2020, 11:00 AM
Venue: ONLINE
Abstract: Concurrency bugs are notoriously hard to detect and reproduce. Controlled concurrency testing (CCT) techniques aim to offer a solution, where a scheduler explores the space of possible interleavings of a concurrent program looking for bugs. Since the set of possible interleavings is typically very large, these schedulers employ heuristics that prioritize the search towards "interesting" subspaces. However, current heuristics are typically tuned to specific bug patterns, which limits their effectiveness in practice.
In this work, we present QL, a learning-based CCT framework where the likelihood of an action being selected by the scheduler is influenced by earlier explorations. We leverage the classical Q-learning algorithm to explore the space of possible interleavings, allowing the exploration to adapt to the program under test, unlike previous techniques. We have implemented and evaluated QL on a set of microbenchmarks, complex protocols, as well as production cloud services. In our experiments, we found QL to consistently outperform the state-of-the-art in CCT.
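As a minimal sketch of the tabular Q-learning update that such a scheduler builds on (the state abstraction and reward signal QL actually uses are the paper's own; the ones below are hypothetical placeholders):

```python
def q_update(Q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Tabular Q-learning step: Q(s,a) += alpha * (r + gamma*max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

# Toy CCT-style usage: states abstract the program state, actions pick which
# thread to schedule next; a QL-like scheduler might reward rarely seen states.
Q = {}
threads = [0, 1]
q_update(Q, "s0", 1, reward=1.0, next_state="s1", actions=threads)
```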
This is joint work with Pantazis Deligiannis (Microsoft Research), Arpita Biswas (Indian Institute of Science) and Akash Lal (Microsoft Research).
Microsoft Teams Meeting Link:
https://teams.microsoft.com/l/meetupjoin/19%3ameeting_YjIwMzU5NDItM2Q2ZC00Zjg5LTkzYTYtMDVkODg2M2I0OGYw%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97f6a741e3b2c5ad4193976476%22%2c%22Oid%22%3a%224e94f9c8085e46c8b31f468b334d3138%22%7d
Speaker Bio: Suvam Mukherjee is a postdoctoral researcher at Microsoft Research India (MSRI). His research interests lie in the areas of programming languages, verification and model checking. At MSRI, he has been working on several projects in Microsoft Coyote, an open-source framework used by several teams in Microsoft Azure to write reliable and efficient cloud services. Prior to joining MSRI, Suvam obtained his PhD from the Indian Institute of Science, where his thesis focused on developing efficient static analyses for multi-threaded programs. For part of this work, he won the Radhia Cousot Young Researcher Best Paper Award at the 24th Static Analysis Symposium (2017, New York City).
Host Faculty: Deepak D'Souza
Series: Ph.D (Engg.) Thesis Defence (Skype)
Title: Constant-rate Non-malleable Codes and their Applications
Speaker: Ms. Sai Lakshmi Bhavana Obbattu, Ph.D (Engg.), Dept. of CSA
Faculty Advisor: Dr. Bhavana Kanukurthi
Date and Time: Friday, May 29, 2020, 3:30 PM
Venue: Skype
Abstract: Non-malleable codes (NMCs), introduced by Dziembowski, Pietrzak and Wichs in ITCS 2010, provide powerful security guarantees where standard error-correcting codes cannot provide any guarantee: a decoded message is either the same as, or completely independent of, the underlying message. NMCs have found applications to various aspects of cryptography, such as CCA-secure encryption, tamper- and leakage-resilient cryptography, non-malleable commitments, non-malleable secret sharing schemes and so on.
In this talk, we present an application of NMCs to the fascinating problem of privacy amplification. In this problem, two parties, Alice and Bob, who a priori share a weak secret, wish to agree on a uniform secret key in the presence of a computationally unbounded adversary, Eve. Building privacy amplification protocols with constant "entropy loss" and constant "round complexity" had been open since 1988 (and was recently closed in an independent work of Li [CCC '19]). In this talk, we will show how to construct such a privacy amplification protocol given a non-malleable code with certain strong security guarantees.
Next, we will also discuss the first explicit construction of a constant-rate, constant-state non-malleable code.
This talk is based on joint works with Eshan Chattopadhyay, Bhavana Kanukurthi and Sruthi Sekar.
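For intuition, the classical (pre-NMC) approach to a single round of privacy amplification applies a universal hash, per the leftover hash lemma. The toy sketch below illustrates that classical mechanism only, not the NMC-based protocol of the talk, and all parameters and values are illustrative:

```python
import random

# Toy parameters; real protocols choose sizes from the security analysis.
P = (1 << 61) - 1  # a Mersenne prime; hash family h_{a,b}(x) = (a*x + b) mod P

def universal_hash(a, b, x, out_bits=16):
    """Carter-Wegman style universal hash, truncated to out_bits bits."""
    return ((a * x + b) % P) & ((1 << out_bits) - 1)

# Alice samples (a, b) uniformly and sends them in the clear; both parties
# hash the shared weak secret w to derive a (nearly uniform) key.
rng = random.Random(0)
a, b = rng.randrange(1, P), rng.randrange(P)
w = 0xDEADBEEFCAFE  # the pre-shared weak secret (toy value)
key_alice = universal_hash(a, b, w)
key_bob = universal_hash(a, b, w)
```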
References:
[1] Bhavana Kanukurthi, Sai Lakshmi Bhavana Obbattu, and Sruthi Sekar. Four-state non-malleable codes with explicit constant rate. In Theory of Cryptography Conference, TCC 2017. Invited to Journal of Cryptology.
[2] Eshan Chattopadhyay, Bhavana Kanukurthi, Sai Lakshmi Bhavana Obbattu, and Sruthi Sekar. Privacy amplification from non-malleable codes. In Indocrypt 2019.
Please fill this form by 28 May to be added to the conversation:
https://docs.google.com/forms/d/e/1FAIpQLSdYuJUadeb9EU60_G_dm8iy7Y0GjBqzaE8JWrUL8G9KgTqA/viewform?usp=sf_link
Series: Ph.D (Engg.) Colloquium (ONLINE)
Title: Representing Networks: Centrality, Node Embeddings, Community Outliers and Graph Representation
Speaker: Mr. Sambaran Bandyopadhyay, Ph.D (Engg.) ERP Student, Dept. of CSA
Faculty Advisors: Prof. M. Narasimha Murty & Dr. Ramasuri Narayanam (organizational guide, IBM Research)
Date and Time: Friday, May 29, 2020, 2:00 PM
Venue: Microsoft Teams
Abstract: We start the technical work of this thesis by exploring the classical concept of node centrality (also known as influence measure) in information networks. Like clustering, node centrality is an ill-posed problem. There exist several heuristics and algorithms to compute the centrality of a node in a graph, but there is no formal definition of centrality available in the network science literature. Lately, researchers have proposed axiomatic frameworks for the centrality of a node in a network. However, these existing formal frameworks are not generic in terms of characterizing the space of influence measures in complex networks. In this work, we propose a set of six axioms to capture most of the intrinsic properties that any influence measure should ideally satisfy. We also characterize existing measures of centrality with respect to this framework.
Next, we focus on representation learning on networks. Network embedding is required as real-life networks are large, extremely sparse and discrete in nature. We investigate the problem of unsupervised node representation in attributed networks through informative random walks. Edges are also useful for various downstream network mining tasks, but most existing homogeneous network representation learning approaches focus on embedding the nodes of a graph. So, we propose a novel unsupervised algorithm to embed the edges of a network through the application of the classical concept of the line graph of a network. The optimization framework of edge embedding connects to the concept of node centrality in the representation learning framework. Finally, we also conduct research on attributed hypergraphs. We propose a novel graph neural network to represent and classify hypernodes.
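The line-graph construction mentioned above is a standard one; a minimal sketch for simple undirected graphs given as an edge list:

```python
from itertools import combinations

def line_graph(edges):
    """Line graph L(G): each edge of G becomes a node of L(G); two such nodes
    are adjacent iff the original edges share an endpoint."""
    edges = [tuple(sorted(e)) for e in edges]
    lg = {e: set() for e in edges}
    for e1, e2 in combinations(edges, 2):
        if set(e1) & set(e2):
            lg[e1].add(e2)
            lg[e2].add(e1)
    return lg

# Path graph 1-2-3: its two edges share node 2, so L(G) has a single edge.
lg = line_graph([(1, 2), (2, 3)])
```

Node embedding methods applied to L(G) then yield edge embeddings of G.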
Outlier analysis (or anomaly detection) is another important problem for the network science community. All real-world networks contain outlier nodes to some extent. Empirically, we have shown that outliers can affect the quality of network embedding if not handled properly. So, we integrate network embedding and outlier detection into a single framework. In this research thread, we first propose a matrix-factorization-based approach which minimizes the effect of outlier nodes in the framework of attributed network embedding. Next, we propose two neural network architectures, based on L2 regularization and adversarial training respectively, to minimize the effect of outliers on the node embeddings of an attributed network. Further, extending the concept of support vector data description, we propose a novel algorithm which integrates node embedding, community detection and outlier detection into a single optimization framework by exploiting the link structure of a graph.
So far, we have considered only the individual components of a graph, i.e., nodes and edges. In the last part of the thesis, we focus on graph-level representation and tasks. First, we propose a supervised graph neural network based algorithm with a hierarchical pooling strategy to classify a graph from a set of graphs. Next, we propose a novel GNN-based algorithm for the unsupervised representation of a graph from a set of graphs, so that similar graphs are represented closely in the embedding space and dissimilar graphs are well separated.
Microsoft Teams link:
https://teams.microsoft.com/l/team/19%3ad5d87af5b08d4f14b81c06c903932960%40thread.tacv2/conversations?groupId=ef37e04758df4367b37e9bb717bb42bc&tenantId=6f15cd97f6a741e3b2c5ad4193976476
Series: M.Tech (Research) Thesis Defence (ONLINE)
Title: Fault Aware Read-Copy-Update
Speaker: Mr. Abhishek Dubey, M.Tech (Research) Student, Dept. of CSA
Faculty Advisor: Prof. K. Gopinath
Date and Time: Thursday, May 28, 2020, 2:30 PM
Venue: ONLINE
Abstract: Deferred freeing is the fundamental technique used in Read-Copy-Update (RCU) synchronization, where reclamation of resources is deferred until the completion of all active RCU read-side critical sections. We observe that faults inside an RCU read-side critical section can indefinitely block writers that are waiting for the completion of RCU readers, and can also lead to system failures by preventing the reclamation of deferred resources. We show that the impact of such faults in the Linux kernel is global: a fault in one subsystem can propagate and exhaust critical resources in other, unrelated subsystems, opening a window of opportunity for DoS-based attacks. For example, a fault in a filesystem can exhaust the process ulimit, resulting in fork failures. Since guaranteeing the absence of faults is practically impossible, it is imperative to harden RCU to tolerate them. We first study the impact of mitigating lockups by terminating the faulty thread, as thread termination is the standard recovery strategy used by Linux. Another solution is stack-based and does not require terminating the faulty thread. We demonstrate the impact of faults in RCU read-side critical sections and present RCU recovery techniques that use novel approaches to detect and isolate the effects of such faults. We also discuss system consistency once a fault is handled by our approaches. Timely recovery results in a usable system, preserving user application state and increasing the system's availability. Our evaluation in the Linux kernel shows that our solution can prevent resource exhaustion in the presence of faults, with no additional overhead in their absence.
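To see why a fault inside a read-side critical section blocks reclamation, consider the toy deferred-freeing scheme below. Real RCU uses grace periods and per-CPU state rather than a global reader count, so this is only a conceptual sketch of the deferral mechanism:

```python
import threading

class ToyRCU:
    """Toy deferred reclamation in the spirit of RCU: freeing is deferred
    while any reader that might hold a reference is still active."""

    def __init__(self):
        self.lock = threading.Lock()
        self.readers = 0
        self.deferred = []   # objects waiting for readers to finish
        self.freed = []      # objects actually reclaimed

    def read_lock(self):
        with self.lock:
            self.readers += 1

    def read_unlock(self):
        with self.lock:
            self.readers -= 1
            if self.readers == 0:
                self.freed.extend(self.deferred)
                self.deferred.clear()

    def defer_free(self, obj):
        with self.lock:
            if self.readers == 0:
                self.freed.append(obj)
            else:
                self.deferred.append(obj)

rcu = ToyRCU()
rcu.read_lock()
rcu.defer_free("old_node")   # a reader is active: freeing is deferred
stuck = list(rcu.deferred)   # a faulty reader that never calls read_unlock()
rcu.read_unlock()            # would leave this deferred forever
```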
Teams Meeting Link:
https://teams.microsoft.com/dl/launcher/launcher.html?url=%2f_%23%2fl%2fmeetupjoin%2f19%3ameeting_Nzk2YTg2ZDEtOWRhYy00OGYyLThmZTYtZWI5NTZiMGY0YzRl%40thread.v2%2f0%3fcontext%3d%257b%2522Tid%2522%253a%25226f15cd97f6a741e3b2c5ad4193976476%2522%252c%2522Oid%2522%253a%252247d9ed45e13149b49b89ac82d3c3da13%2522%257d%26anon%3dtrue&type=meetupjoin&deeplinkId=d862ca07351345978b62e5bea2346d12&directDl=true&msLaunch=true&enableMobilePage=true&suppressPrompt=true
Series: Ph.D Colloquium (ONLINE)
Title: Online Learning from Relative Subsetwise Preferences
Speaker: Ms. Aadirupa Saha, Ph.D Student, Dept. of CSA
Faculty Advisors: Prof. Chiranjib Bhattacharyya & Prof. Aditya Gopalan
Date and Time: Friday, May 15, 2020, 9:00 AM
Venue: ONLINE
Abstract: The elicitation and aggregation of preferences is often the key to making better decisions. Be it a perfume company wanting to relaunch its 5 most popular fragrances, a movie recommender system trying to rank the most favoured movies, or a pharmaceutical company testing the relative efficacies of a set of drugs, learning from preference feedback is a widely applicable problem. One can model the sequential version of this problem using the classical multi-armed bandit (MAB) framework (e.g., Auer, 2002) by representing each decision choice as a bandit arm, or more appropriately as a Dueling Bandit (DB) problem (Yue and Joachims, 2009). Although DB is similar to MAB in that it is an online decision-making framework, DB specifically models learning from pairwise preferences. In practice, it is often much easier to elicit information through relative preferences, especially when humans are in the loop: 'Item A is better than item B' is easier to elicit than its absolute counterpart, 'Item A is worth 7 and B is worth 4'.
However, instead of pairwise preferences, a more general subsetwise preference model is relevant in various practical scenarios, e.g. recommender systems, search engines, crowdsourcing, e-learning platforms, survey design, and ranking in multiplayer games. Subsetwise preference elicitation is not only more budget-friendly, but also more flexible in conveying several types of feedback. For example, with subsetwise preferences, the learner could elicit the best item, a partial preference over the top 5 items, or even an entire rank ordering of a subset of items, whereas all of these boil down to the same feedback over pairs (subsets of size 2). The problem of how to learn adaptively with subsetwise preferences, however, remains largely unexplored, primarily due to the computational burden of maintaining a combinatorially large, O(n^k)-sized, body of preference information in general.
We take a step in the above direction by proposing "Battling Bandits (BB)", a new online learning framework to learn a set of optimal ('good') items by sequentially and adaptively querying subsets of items of size up to k (k >= 2). The preference feedback from a subset is assumed to arise from an underlying parametric discrete choice model, such as the well-known Plackett-Luce model, or more generally any random utility model (RUM). It is this structure that we leverage to design efficient algorithms for various problems of interest, e.g. identifying the best item, the set of top-k items, the full ranking, etc., in both the PAC and regret-minimization settings. We propose computationally efficient and (near-)optimal algorithms for the above objectives, along with matching lower bound guarantees. Interestingly, this leads us to answers to some basic questions about the value of subsetwise preferences: Does playing a general k-set really help in faster information aggregation, i.e. is there a trade-off between the subset size k and the learning rate? Under what types of feedback models? How do the performance limits (lower bounds) vary over different combinations of feedback and choice models? And above all, what more can we achieve through BB where DB fails?
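As a sketch of the subsetwise feedback model assumed here, the snippet below samples a full ranking from the Plackett-Luce model, repeatedly picking the next item with probability proportional to its weight among the remaining items (the weights are hypothetical item utilities):

```python
import random

def sample_pl_ranking(weights, rng):
    """Sample a ranking from the Plackett-Luce model over items 0..n-1."""
    items = list(range(len(weights)))
    ranking = []
    while items:
        total = sum(weights[i] for i in items)
        r = rng.random() * total
        acc = 0.0
        chosen = items[-1]  # fallback guards against float rounding
        for i in items:
            acc += weights[i]
            if r < acc:
                chosen = i
                break
        ranking.append(chosen)
        items.remove(chosen)
    return ranking

rng = random.Random(7)
ranking = sample_pl_ranking([5.0, 1.0, 1.0], rng)  # item 0 tends to come first
```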
We proceed to analyse the BB problem in the contextual scenario. This is relevant in settings where items have known attributes, and it allows for potentially infinite decision spaces; it is more general and of greater practical interest than the finite-arm case but, naturally, more challenging. Moreover, none of the existing online learning algorithms extends straightforwardly to the continuous case, even for the simplest Dueling Bandit setup (i.e. k = 2). Towards this, we formulate the problem of "Contextual Battling Bandits (CBB)" under utility-based subsetwise-preference feedback, and design provably optimal algorithms for the regret minimization problem. Our regret bounds are accompanied by matching lower bound guarantees showing the optimality of our proposed methods. All our theoretical guarantees are corroborated with empirical evaluations.
Lastly, there are still many open threads to explore based on BB. These include studying different choice-feedback model combinations, different performance objectives, and extending BB to other useful frameworks such as assortment selection, revenue maximization, budget-constrained bandits, etc. Towards the end, we will also discuss some interesting combinations of the BB framework with other well-known problems, e.g. Sleeping/Rotting Bandits, preference-based reinforcement learning, learning on graphs, preferential bandit convex optimization, etc.
Microsoft Teams link:
https://teams.microsoft.com/l/meetupjoin/19%3ameeting_YzFmNTdmODYtYjhhZi00Yjc4LTg3NWItNmEyNzc5NzlkMzQ1%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97f6a741e3b2c5ad4193976476%22%2c%22Oid%22%3a%22620fb6db36c54f95ba456ed8e824aa28%22%7d
Series: Ph.D Colloquium (ONLINE)
Title: Decision Making under Uncertainty: Reinforcement Learning Algorithms and Applications in Cloud Computing, Crowdsourcing and Predictive Analytics
Speaker: Ms. Indu John, Ph.D Student, Dept. of CSA
Faculty Advisor: Prof. Shalabh Bhatnagar
Date and Time: Friday, May 15, 2020, 3:00 PM
Venue: Microsoft Teams meeting: https://teams.microsoft.com/l/meetupjoin/19%3a8dbfbf39c4464817a8333f18b16a8aec%40thread.tacv2/1588649977138?context=%7b%22Tid%22%3a%226f15cd97f6a741e3b2c5ad4193976476%22%2c%22Oid%22%3a%2286976192e64f4e50ae9f8a79b451c8f8%22%7d
Abstract: In this thesis, we study both theoretical and practical aspects of decision making, with a focus on reinforcement learning based methods. Reinforcement learning (RL) is a form of semi-supervised learning in which the agent learns its decision-making strategy by interacting with the environment. We develop novel reinforcement learning algorithms and study decision problems in the domains of cloud computing, crowdsourcing and predictive analytics.
In the first part of the thesis, we develop a model-free reinforcement learning algorithm with faster convergence, named Generalized Speedy Q-learning, and analyze its finite-time performance. This algorithm integrates ideas from the well-known Speedy Q-learning algorithm and the generalized Bellman equation to derive a simple and efficient update rule, such that its finite-time bound is better than that of Speedy Q-learning for MDPs with a special structure. Further, we extend our algorithm to deal with large state and action spaces by using function approximation.
Extending the idea of the above algorithm, we develop a novel deep reinforcement learning algorithm by combining the technique of successive over-relaxation with Deep Q-networks. The new algorithm, named SOR-DQN, uses modified targets in the DQN framework with the aim of accelerating training. We study the application of SOR-DQN to the problem of auto-scaling resources for cloud applications, for which existing algorithms suffer from issues such as slow convergence, poor performance during the training phase and non-scalability.
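An illustrative sketch of blending successive over-relaxation into a tabular Q-learning target is given below. The exact SOR-DQN target follows the thesis; the form here, with relaxation factor w mixing the usual Bellman target with the current state's own max Q-value, is only an assumption based on the general SOR Q-learning idea:

```python
def sor_q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9, w=1.2):
    """Tabular Q-learning step with an SOR-style target: a w-weighted blend of
    the usual Bellman target and max_a' Q(s, a') for the current state."""
    bellman = r + gamma * max(Q.get((s_next, b), 0.0) for b in actions)
    relax = max(Q.get((s, b), 0.0) for b in actions)
    target = w * bellman + (1.0 - w) * relax
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (target - old)

# With an empty table, the target reduces to w * r = 1.2.
Q = {}
sor_q_update(Q, "s0", 0, r=1.0, s_next="s1", actions=[0])
```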
Next, we consider an interesting research problem in the domain of crowdsourcing: that of efficiently allocating a fixed budget among a set of tasks with varying difficulty levels. Further, the assignment of tasks to workers with different skill levels is tackled. This problem is modeled in the RL framework, and an approximate solution is proposed to deal with the exploding state space.
We also study the following problem in predictive analytics: predicting the future values of system parameters well in advance for a large-scale software or industrial system, which is important for avoiding disruptions. An equally challenging and useful exercise is to identify the "important" parameters and optimize them in order to attain good system performance. In addition to devising an end-to-end solution for the problem, we present a case study on a large-scale enterprise system to validate the effectiveness of the proposed approach.
Series: M.Tech (Research) Colloquium
Title: Model Extraction Defense using a Modified Variational Autoencoder
Speaker: Mr. Yash Gupta, M.Tech (Research) Student, Dept. of CSA
Faculty Advisors: Prof. Aditya Kanade and Prof. Shirish K Shevade
Date and Time: Tuesday, May 05, 2020, 09:30 AM
Venue: The defense will be conducted online. Please fill the following form before 4th May, 5 PM, to receive the Microsoft Teams meeting invite: https://forms.gle/Bw4Lj36uFuugZZFY7
Abstract: Machine Learning as a Service (MLaaS) exposes machine learning (ML) models trained on confidential datasets to users in the form of an Application Programming Interface (API). Since MLaaS models are deployed for commercial purposes, the API is available as a pay-per-query service. A malicious user, or attacker, can exploit these APIs to extract a close approximation of the MLaaS model by training a substitute model using only black-box query access to the API, in a process called model extraction. The attacker is restricted to a limited query budget because of the paid service. The model extraction attack is mounted by firing queries from a substitute dataset that is either (i) Synthetic Non-Problem Domain (SNPD), (ii) Synthetic Problem Domain (SPD), or (iii) Natural Non-Problem Domain (NNPD).
In this work, we propose a novel defense framework against model extraction that uses a hybrid anomaly detector composed of an encoder and a detector. In particular, we propose a modified Variational Autoencoder, VarDefend, which uses a specially designed loss function to separate the encodings of queries fired by malicious users from those of benign users. We consider two scenarios: (i) stateful defense, where the MLaaS provider stores the queries made by each client to discover any malicious pattern; and (ii) stateless defense, where individual queries are discarded if they are flagged as out-of-distribution. Treating encoded queries from benign users as normal, one can use outlier detection models to identify encoded queries from malicious users in the stateless approach. For the stateful approach, a statistical test known as Maximum Mean Discrepancy (MMD) is used to compare the distribution of the encodings of a client's queries with that of in-distribution encoded samples. In our experiments, we observed that our stateful defense mechanism can completely block one representative attack for each of the three types of substitute datasets, without raising a single false alarm against queries made by a benign user. The number of queries required to block an attack is much smaller than that required by the current state-of-the-art model extraction defense, PRADA. Further, our proposed approach can block NNPD queries that PRADA cannot. Our stateless defense mechanism is useful against a group of colluding attackers without significantly impacting benign users. Our experiments demonstrate that, for the MNIST and Fashion-MNIST datasets, the proposed stateless defense rejects more than 98% of the queries made by an attacker using SNPD, SPD or NNPD datasets, while rejecting only about 0.05% of all queries made by a benign user.
Our experiments also demonstrate that the proposed stateless approach makes the MLaaS model significantly more robust to adversarial examples crafted using the substitute model, by blocking transferability.
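The MMD statistic used in the stateful defense can be sketched in a few lines. Below is a biased estimator of squared MMD with an RBF kernel over scalar "encodings" (real VarDefend encodings are high-dimensional vectors; the samples here are made up):

```python
import math

def rbf(x, y, sigma=1.0):
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimator of squared Maximum Mean Discrepancy:
    MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    m, n = len(xs), len(ys)
    kxx = sum(rbf(a, b, sigma) for a in xs for b in xs) / (m * m)
    kyy = sum(rbf(a, b, sigma) for a in ys for b in ys) / (n * n)
    kxy = sum(rbf(a, b, sigma) for a in xs for b in ys) / (m * n)
    return kxx + kyy - 2 * kxy

same = mmd2([0.0, 0.1, -0.1], [0.0, 0.1, -0.1])  # matching samples -> ~0
far = mmd2([0.0, 0.1, -0.1], [5.0, 5.1, 4.9])    # shifted samples -> large
```

A defense would flag a client whose query encodings yield a large MMD against the in-distribution reference set.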
Series: Theory Seminar Series
Title: Fair Rent Division
Speaker: Ms. Nidhi Rathi, Int. Ph.D Student, Dept. of Mathematics
Date and Time: Friday, March 13, 2020, 1:00 PM
Venue: CSA Lecture Hall (Room No. 117, Ground Floor)
Abstract: The theory of fair division addresses the fundamental problem of allocating resources among agents with equal entitlements but distinct preferences.
In this talk, I will focus on the classic problem of fair rent division, which entails splitting the rent (money) and allocating the rooms (indivisible resources) of an apartment among roommates (agents) in a fair manner. Here, a distribution of the rent and an accompanying allocation are said to be fair if they are envy-free, i.e., under the imposed rents, no agent has a strictly stronger preference (utility) for any other agent's room. While envy-free solutions are guaranteed to exist under reasonably general utility functions, efficient algorithms for finding them were known only for quasi-linear utilities. Our work addresses this notable gap and develops approximation algorithms for fair rent division with minimal assumptions on the utility functions.
Specifically, we show that if the agents have continuous, monotone decreasing, and piecewise-linear utilities, then the fair rent division problem admits a fully polynomial-time approximation scheme (FPTAS). We complement the algorithmic results by proving that the fair rent division problem (under continuous, monotone decreasing, and piecewise-linear utilities) lies in the intersection of the complexity classes PPAD and PLS.
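While finding a fair outcome is the hard part, checking envy-freeness of a given rent division is straightforward under quasi-linear utilities; a minimal sketch (the utilities and rents below are made-up numbers):

```python
def is_envy_free(assignment, rents, utility):
    """With quasi-linear utilities, agent i values room r at
    utility[i][r] - rents[r]; the outcome is envy-free if no agent strictly
    prefers another room.  assignment[i] is the room allocated to agent i."""
    for i in range(len(assignment)):
        my_val = utility[i][assignment[i]] - rents[assignment[i]]
        for r in range(len(rents)):
            if utility[i][r] - rents[r] > my_val + 1e-9:
                return False
    return True

# Two agents, two rooms, total rent 100.
utility = [[80, 40], [50, 50]]
ok = is_envy_free([0, 1], [60, 40], utility)   # neither agent envies
bad = is_envy_free([0, 1], [90, 10], utility)  # agent 0 now envies room 1
```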
This talk is based on a joint work with Eshwar Arunachaleswaran and Siddharth Barman that appeared in SODA 2019.
Host Faculty: Dr. Anand Louis
Series: M.Tech (Research) Thesis Defense
Title: Optimizing the Linear Fascicle Evaluation Algorithm for Multi-Core and Many-Core Systems
Speaker: Mr. Karan Aggarwal, M.Tech (Research) Student, Dept. of CSA
Faculty Advisor: Prof. Uday Kumar Reddy B
Date and Time: Thursday, March 05, 2020, 2:30 PM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract: Sparse matrix-vector multiplication (SpMV) operations are commonly used in various scientific and engineering applications. The performance of the SpMV operation often depends on exploiting regularity patterns in the matrix. Various new representations and optimization techniques have been proposed to minimize the memory bandwidth bottleneck arising from the irregular memory access pattern. Among recent representation techniques, tensor decomposition is a popular one, used for very large but sparse matrices. Post sparse-tensor decomposition, the new representation involves indirect accesses, making it challenging to optimize for multi-core architectures, and even more demanding for massively parallel architectures such as GPUs.
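A minimal sparse matrix-vector product over the common CSR format makes the irregular access pattern concrete: the `x[col_idx[j]]` reads are the indirect references that make SpMV memory-bandwidth bound (toy data, illustration only):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x with A stored in Compressed Sparse Row (CSR) form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for j in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[j] * x[col_idx[j]]  # indirect, irregular access
        y.append(acc)
    return y

# A = [[2, 0, 1],
#      [0, 3, 0]]
y = spmv_csr([2.0, 1.0, 3.0], [0, 2, 1], [0, 2, 3], [1.0, 1.0, 1.0])
```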
Computational neuroscience algorithms often involve sparse datasets while still performing
compute-intensive operations. The Linear Fascicle Evaluation (LiFE) application is a popular
neuroscience algorithm used for pruning brain connectivity graphs. The datasets employed herein
involve the Sparse Tucker Decomposition (STD), a widely used tensor decomposition method. Using
this decomposition leads to multiple irregular array references, making it very difficult to
optimize for multicores and GPUs. Recent implementations of the LiFE algorithm show that its SpMV
operations are the key bottleneck for performance and scaling. In this work, we first propose
target-independent techniques such as (1) data restructuring techniques to minimize the effects of
irregular accesses, and (2) simple compiler optimizations. Then we apply target-specific optimizations
to exploit the resources provided by the architecture. The CPU-specific optimizations that we
incorporated are loop tiling, loop parallelization and utilizing BLAS calls to exploit data reuse,
coarse-grained parallelism and fine-grained parallelism respectively. We extend the PolyMage
domain-specific language, embedded in Python, to automate the CPU-based optimizations developed for
this algorithm. Next, we propose various GPU-specific optimizations to optimally map threads at the
granularity of warps, thread blocks and grid, and methods to split the computation among thread blocks
to obtain fine-grained parallelism and data reuse. Our highly optimized and parallelized CPU
implementation obtains a reduction in execution time from 225 min to 8.2 min over the original sequential
approach running on a 16-core Intel Xeon Silver (Skylake-based) system. Our optimized GPU implementation
achieves a speedup of 5.2x over a reference optimized GPU code version on NVIDIA's GeForce RTX 2080 Ti GPU,
and a speedup of 9.7x over our highly optimized and parallelized CPU implementation.
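As a toy illustration of why the post-decomposition accesses are hard to optimize, the sketch below (illustrative Python, not the talk's implementation; the sorting-based reordering merely stands in for the data restructuring techniques mentioned) shows an SpMV-style kernel whose gathers and scatters are data-dependent, followed by a simple locality-improving restructuring:

```python
import numpy as np

def spmv_indirect(vals, row_idx, col_idx, x, n_rows):
    """Sparse matrix-vector product over coordinate (COO) data.

    The gather x[col_idx[k]] and the scatter into y[row_idx[k]] are the
    irregular, data-dependent accesses that make such kernels hard to
    optimize on multicores and GPUs."""
    y = np.zeros(n_rows)
    for k in range(len(vals)):
        y[row_idx[k]] += vals[k] * x[col_idx[k]]
    return y

def restructure(vals, row_idx, col_idx):
    """Illustrative data restructuring: sort nonzeros by (row, col) so
    consecutive iterations touch nearby memory, reducing the cost of the
    irregular accesses without changing the computed result."""
    order = np.lexsort((col_idx, row_idx))
    return vals[order], row_idx[order], col_idx[order]
```

The restructured kernel computes the same product; only the iteration order (and hence the memory access pattern) changes.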
 Series: Department Seminar Title: Fairness in Algorithmic Decision Making  Speaker: Dr. Abhijnan Chakraborty, MPI-SWS, Saarbrücken
 Date and Time: Thursday, March 05, 2020, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Algorithmic (data-driven) decision making is increasingly being used to assist or replace human decision making in domains with high societal impact, such as banking (estimating creditworthiness), recruiting (ranking job applicants), judiciary (offender profiling), healthcare (identifying high-risk patients who need additional care) and journalism (recommending news stories). Consequently, in recent times, multiple research works have uncovered the potential for bias (unfairness) in algorithmic decisions in different contexts, and proposed mechanisms to control (mitigate) such biases. However, the emphasis of existing works has largely been on fairness in supervised classification or regression tasks, and fairness issues in other scenarios remain relatively unexplored. In this talk, I will cover our recent works on incorporating fairness in recommendation and matching algorithms in multi-sided platforms, where the algorithms need to fairly consider the preferences of multiple stakeholders. I will discuss the notions of fairness in these contexts and propose techniques to achieve them. I will conclude the talk with a list of open questions and directions for future work.
Speaker Bio: Abhijnan Chakraborty is a Postdoctoral Researcher at the Max Planck Institute for Software Systems (MPI-SWS), Germany. He obtained his PhD from the Indian Institute of Technology (IIT) Kharagpur under the supervision of Prof. Niloy Ganguly (IIT Kharagpur) and Prof. Krishna Gummadi (MPI-SWS). During his PhD, he was awarded the Google India PhD Fellowship and the Prime Minister's Fellowship for Doctoral Research. Prior to his PhD, he spent two years at Microsoft Research India, working in the area of mobile systems. His current research interests fall under the broad theme of Computing and Society, covering the research areas of Social Computing, Information Retrieval and Fairness in Machine Learning. He has authored several papers in top-tier computer science conferences including WWW, KDD, AAAI, CSCW, ICWSM and MobiCom. His research works have won the best paper award at ASONAM'16 and the best poster award at ECIR'19. He is one of the recipients of an internationally competitive research grant from the Data Transparency Lab to advance his research on fairness and transparency in algorithmic systems. More details about him can be found at https://people.mpisws.org/~achakrab
Host Faculty: Prof. Matthew Jacob
 Series: Ph.D. Thesis Defense Title: On Learning and Lower Bound Problems Related to Iterated Matrix Multiplication Polynomial  Speaker: Mr. Vineer Nair
Ph.D. student
Dept. of CSA
IISc  Faculty Advisor: Prof. Chandan Saha
 Date and Time: Wednesday, March 04, 2020, 9:45 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract The iterated matrix multiplication polynomial (IMM) of width w and length d is the 1x1 entry in the product of d square matrices of size w. The w^2d entries in the d matrices are distinct variables. In this thesis, we study certain learning and lower bound problems related to IMM.
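Written out, the definition just given reads:

```latex
% IMM of width w and length d: the (1,1) entry of a product of d
% symbolic w x w matrices, each filled with distinct variables
% (w^2 d variables in total; the polynomial has degree d).
\mathrm{IMM}_{w,d} \;=\; \bigl( X_1 X_2 \cdots X_d \bigr)_{1,1},
\qquad
X_k \;=\; \bigl( x^{(k)}_{i,j} \bigr)_{1 \le i,\, j \le w},
\quad 1 \le k \le d.
```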
Our first work gives a polynomial time randomized algorithm for equivalence testing of IMM. At its core, the equivalence testing algorithm exploits a connection between the irreducible invariant subspaces of the Lie algebra of the group of symmetries of a polynomial f that is equivalent to IMM and the layer spaces of a full-rank algebraic branching program computing f. This connection also helps determine the group of symmetries of IMM and show that IMM is characterized by its group of symmetries.
Our second work is related to learning affine projections of IMM, which is believed to be a very hard problem as it is equivalent to reconstructing a powerful model to compute polynomials called algebraic branching programs (ABP). Equivalence testing for IMM can be viewed as reconstructing ABPs in the average-case, when the width of the ABP is at most (n/d)^0.5, where n and d are the number of variables and the degree of the polynomial computed by the ABP respectively. Our second work improves this by first considering a related problem called 'linear matrix factorization' (LMF), which is a natural generalization of the polynomial factorization problem. We give a polynomial time randomized algorithm for average-case LMF for matrix products of width at most 0.5(n^0.5). In fact, we give a polynomial time randomized algorithm that solves the (worst-case) LMF problem when the input matrix product is non-degenerate or a pure product, a notion we define in this work. Using our algorithm for LMF, we give a non-trivial average-case reconstruction algorithm for ABPs of width at most 0.5(n^0.5), which is interesting in the context of the Ω(n^0.5) width lower bound known for homogeneous ABPs.
Our last work gives lower bounds on interesting restrictions of arithmetic formulas computing IMM. We prove a w^Ω(d) lower bound on the size of multilinear depth three formulas computing IMM of width w and length d. The lower bound is proved by introducing a novel variant of the partial derivatives measure called skewed partial derivatives, which found applications in other subsequent works. Improving this result to a w^Ω(log d) size lower bound on multilinear formulas computing IMM would imply a superpolynomial separation between ABPs and arithmetic formulas. We also show an exponential separation between multilinear depth three and multilinear depth four formulas, which is an improvement over the quasipolynomial separation already known. We also consider a restriction of multilinear formulas, called interval set-multilinear formulas, computing IMM. Proving a superpolynomial size lower bound on interval set-multilinear formulas computing IMM would imply a superpolynomial separation between algebraic branching programs and homogeneous formulas in the non-commutative world. We make progress in this direction by giving a superpolynomial size lower bound on an interesting restriction of the interval set-multilinear formulas computing IMM.
 Series: CSA Faculty Colloquium Title: Zero-cost synchronization: Fact or fiction? RCU and other stories  Speaker: Prof. K Gopinath
Professor
Dept. of CSA
IISc  Date and Time: Friday, February 28, 2020, 4:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract With the proliferation of commodity multicore systems, low-cost synchronization is critical for scaling.
The read-copy-update (RCU) mechanism has been used for this purpose for the last four decades. We analyse
this and a few other related mechanisms, and discuss our work in this context (such as memory management,
virtualization and fault tolerance).
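The core RCU idea can be pictured in a few lines (an illustrative Python analogue of the pattern, not the kernel's RCU API): readers take a plain reference to an immutable snapshot and need no locks, while a writer builds a fresh copy and publishes it by swapping a single reference; old snapshots are reclaimed once no reader can still hold them (the "grace period", here left to Python's garbage collector).

```python
import threading

class RcuCell:
    """Sketch of the read-copy-update pattern: lock-free readers,
    copy-then-publish writers."""

    def __init__(self, value):
        self._snapshot = dict(value)          # treated as immutable by readers
        self._writer_lock = threading.Lock()  # serializes writers only

    def read(self):
        # Read side: a single reference load, no locks taken.
        return self._snapshot

    def update(self, key, val):
        with self._writer_lock:           # writers still coordinate
            new = dict(self._snapshot)    # copy ...
            new[key] = val                # ... modify the copy ...
            self._snapshot = new          # ... publish in one store
```

A reader that grabbed a snapshot before an update keeps seeing a consistent old version; later reads observe the new one.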
Speaker Bio: K. Gopinath is a professor at Indian Institute of Science in the Computer Science and Automation Department. His research interests are primarily in the computer systems area (Operating Systems, Storage Systems, Systems Security).
Host Faculty: Prof. Sunil L Chandran & Prof. Shalabh Bhatnagar
 Series: Department Seminar Title: Towards automated debugging and optimization  Speaker: Dr. Abhilash Jindal, CTO, Mobile Enerlytics, San Francisco
 Date and Time: Friday, February 28, 2020, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Debugging and optimization are largely ad-hoc manual processes, taking up 35-75 percent of programmers' time and costing more than $100B annually. This process is further exacerbated in the modern programming paradigm, where programmers stand on the "shoulders of giant" software frameworks to quickly program complex systems but have a limited understanding of the intricacies of the underlying system.
In the first part of this talk, we will see how new power management APIs on smartphones seriously encumber programming. These APIs, when not used correctly, give rise to a whole new class of energy bugs called sleep disorder bugs. These bugs plague the complete smartphone software stack including apps, framework, kernel, and device drivers. I will present a taxonomy of sleep disorder bugs with precise bug definitions which enabled us to create a suite of automated tools to identify different flavors of sleep disorder bugs. I will then present an automated static analysis tool KLOCK that identified 63 sleep-induced time bugs, a subclass of sleep disorder bugs, in the Android Linux kernel.
In the second part of the talk, we will study how traditional profilers fall short of giving an actionable diagnosis for optimizing programs: even after being presented with performance hotspots, developers still need significant manual inspection and reasoning over the source code to carry out the remaining optimization tasks: 1. Is there a more efficient implementation? 2. How does one arrive at a more efficient implementation? I will present a new optimization methodology called differential profiling that automates answering these two questions. In particular, differential profiling automatically uncovers more efficient API usage by leveraging existing implementations of similar apps, which are bountiful in the app marketplace. Our differential energy profiling tool, DIFFPROF, employs a novel tree-matching algorithm for comparing energy profiler outputs of pairs of similar apps, and found 12 energy improvements in popular Android apps.
Speaker Bio: Abhilash Jindal received his B.Tech. in Electrical Engineering from IIT Kanpur. He received his Ph.D. in Computer Science from Purdue University, where he researched software-based approaches for extending mobile battery life. He has published papers in top systems conferences including OSDI, ATC, EuroSys, MobiSys, SIGMETRICS, and MobiCom. Currently, he is commercializing his research work serving as the CTO of Mobile Enerlytics, a Silicon Valley startup. His research interests include mobile systems, software engineering, and operating systems.
Host Faculty: Prof. Matthew Jacob
 Series: Ph.D. Thesis Defense Title: Deep Learning for Bug-Localization and Program Repair  Speaker: Mr. Rahul Gupta
Ph.D. Student
Dept. of CSA  Faculty Advisor: Prof. Aditya S Kanade
 Date and Time: Thursday, February 27, 2020, 9:30 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract In this thesis, we focus on the problem of program debugging and
present novel deep learning techniques for bug-localization and program
repair. Deep learning techniques have been successfully applied to a
variety of tasks in natural language processing over the years.
Although natural languages and programming languages are similar to
some extent, the latter have procedural interpretation and richer
structure. Applying deep learning techniques to programs presents many
novel challenges which arise due to these differences. We address some
of these challenges in this thesis.
Most of the existing program debugging research is dominated by formal
and theory-first approaches. These approaches fail to take advantage of
the existing codebases available online in the form of open source
software repositories and student assignment submissions to massive
open online courses on programming. Recognizing this, researchers have
begun to replace expert-designed heuristics with models learned from
codebases to improve the performance of the conventional debugging
techniques. This thesis shows that it is possible to solve program
debugging problems directly from raw programs using deep learning
techniques in an end-to-end manner. More specifically, we present three
approaches for bug-localization and program repair which are entirely
data-driven and learn to perform their task instead of following the
steps specified by a domain expert.
We first introduce the notion of common programming errors and present
a deep neural network based end-to-end technique, called DeepFix, that
can fix multiple such errors in a program without relying on any
external tool to locate or fix them. At the heart of DeepFix is a
multi-layered sequence-to-sequence neural network, comprising an encoder
recurrent neural network (RNN) to process the input and a decoder RNN
with an attention mechanism that generates the output. The network is
trained on a labeled dataset to
predict a faulty program location along with the correct program
statement. Multiple errors in a program can be fixed by invoking
DeepFix iteratively. Our experiments demonstrate that DeepFix is
effective and fixes thousands of programs.
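The iterative-invocation idea can be sketched with a stub standing in for the trained network (the toy "model" below and its parenthesis-balancing fix are entirely hypothetical, purely to illustrate the outer loop of predicting one fix per round until no further fix is predicted):

```python
def toy_fix_model(lines):
    """Stub in place of the trained seq2seq network: predicts a
    (line_index, fixed_statement) pair for the first line with
    unbalanced parentheses, or None if no fix is predicted."""
    for i, ln in enumerate(lines):
        missing = ln.count("(") - ln.count(")")
        if missing > 0:
            return i, ln + ")" * missing
    return None

def iterative_repair(lines, max_rounds=10):
    """DeepFix-style outer loop: invoke the model repeatedly, applying
    one predicted fix per round, so multiple errors get fixed across
    successive invocations."""
    for _ in range(max_rounds):
        pred = toy_fix_model(lines)
        if pred is None:
            break
        i, fixed = pred
        lines = lines[:i] + [fixed] + lines[i + 1:]
    return lines
```

Two rounds here repair two broken lines, mirroring how DeepFix fixes multiple errors by iterative invocation.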
While repositories containing erroneous programs are easily available,
the labels of these programs (faulty program location and correct
statement) required by DeepFix are not easily available. Labeling a
large number of erroneous programs is a daunting task. To address this
issue, we propose a novel deep reinforcement learning based technique,
called RLAssist, that does not require labeled training data and still
matches the performance of DeepFix. At the core of RLAssist is a novel
programming language correction framework amenable to reinforcement
learning. The framework allows an agent to mimic human actions for text
navigation and editing. We demonstrate that the agent can be trained
through self-exploration directly from the raw input, that is, the
program text itself, without any prior knowledge of the formal syntax
of the programming language. Reinforcement learning techniques are,
however, usually slow to train. We also show that RLAssist can be
trained much faster with the help of expert demonstrations for as
little as one-tenth of its training data, which also helps it in
achieving better performance than DeepFix.
Finally, we present a deep learning based technique for semantic
bug-localization in programs with respect to failing tests. The
proposed technique works in two phases. In the first phase, a novel
tree convolutional neural network is used to predict whether a program
passes or fails the given test. In the second phase, we query a
state-of-the-art neural prediction attribution technique to find out
which lines of the programs make the network predict the failures, in
order to localize the bugs. This is a static bug-localization technique
and does not require the program instrumentation and multiple executions
that existing dynamic bug-localization techniques need. Our experiments
show that the proposed technique is competitive with two
state-of-the-art program-spectrum based and one syntactic difference
based bug-localization baselines.
All the techniques proposed in this thesis are programming language
agnostic. We believe that the ideas and tools developed in this work
can potentially be a roadmap for future attempts at applying deep
learning techniques to more problems in software engineering research.
 Series: Department Seminar Title: Security and Privacy of Connected Autonomous Vehicles  Speaker: Dr Vireshwar Kumar, Purdue University
 Date and Time: Wednesday, February 26, 2020, 11:30 AM
 Venue: CSA Lecture Hall (Room No. 112, Ground Floor)
Abstract The upcoming smart transportation systems, which consist of connected autonomous
vehicles, are poised to transform our everyday life. The sustainability and growth of these systems
to their full potential will significantly depend on their robustness against security
and privacy threats. Unfortunately, the communication protocols employed in these systems lack
mainstream network security capabilities due to energy constraints of the deployed platforms and
bandwidth constraints of the communication medium. In this talk, I will present the results of my
efforts in anatomizing the two vital communication protocols employed in smart transportation:
(1) the vehicle-to-everything (V2X) communication protocol, which is utilized to facilitate wireless
communication among connected vehicles, and (2) the controller area network (CAN) protocol, which
is utilized within an autonomous vehicle to enable real-time control of critical automotive
components including brakes. For each of these two protocols, I will first describe the inquisitive
approach which led to the discovery of new security vulnerabilities. Then, through
experiments on real-world systems, I will demonstrate how these vulnerabilities can be exploited
to launch malicious attacks which evade the state-of-the-art defense mechanisms employed in
these systems. I will conclude the talk by discussing novel countermeasures which are required
to mitigate these fundamental vulnerabilities and prevent their exploitation.
Speaker Bio: Dr. Vireshwar Kumar is a Postdoctoral Research Associate in the Department of Computer
Science at Purdue University. Vireshwar earned his B.Tech. in Electrical Engineering at Indian
Institute of Technology Delhi in 2009, and Ph.D. degree in Computer Engineering at Virginia Tech
in 2016. He was the recipient of the outstanding Ph.D. student award by the Center for Embedded
Systems for Critical Applications at Virginia Tech. He also had a short stint as a Project Assistant
in the Department of Electrical Communication Engineering at Indian Institute of Science in 2010.
His research interests include discovering and mitigating security vulnerabilities in the
communication protocols employed in cyber-physical systems, e.g., smart home, smart
transportation and smart city. Vireshwar's research work has been featured in top-tier security venues
including ACM Conference on Computer and Communications Security (CCS) and IEEE
Transactions on Information Forensics and Security (TIFS). He has also served on the TPC of
flagship conferences including IEEE Conference on Communications and Network Security
(CNS) and IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN).
Host Faculty: Prof. Matthew Jacob
 Series: Ph.D. Colloquium Title: On the Round Complexity Landscape of Secure Multiparty Computation  Speaker: Ms. Divya Ravi
Ph.D. student
Dept. of CSA  Faculty Advisor: Dr. Arpita Patra
 Date and Time: Wednesday, February 26, 2020, 10:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract In secure multiparty computation (MPC), n parties wish to jointly perform a computation on their private inputs in a secure way, so that no adversary A corrupting a coalition of t among the n parties can learn more information than their outputs (privacy), nor can they affect the outputs of the computation other than by choosing their own inputs (correctness). The round complexity of MPC protocols is a fundamental question in the area of secure computation and its study constitutes a phenomenal body of work in the MPC literature. The research goal of this thesis is to advance the state of the art by expanding this study of round complexity to various adversarial settings and network models. The questions addressed in the thesis are of both theoretical and practical importance.
In this talk, we first present a highlevel overview of our results which involves establishing new lower bounds on the round complexity of MPC under various settings and constructing upper bounds with optimal round complexity. We then elaborate on one of our recent results that focuses on the exact round complexity of fair and robust MPC protocols in a generalized adversarial setting.
Fairness (the adversary obtains the output iff the honest parties do) and Robustness (the adversary cannot prevent honest parties from obtaining the output) are two of the most sought-after properties of MPC protocols. Achieving both, however, brings in the necessary requirement that only up to a minority of the parties can be actively corrupt (where actively corrupt parties are allowed to deviate from the protocol in any arbitrary manner). In a generalized adversarial setting where the adversary is allowed to corrupt both actively and passively (a weaker adversarial model where corrupt parties follow the protocol specification but try to learn more information than they are supposed to know), the necessary bound for an n-party fair or robust protocol turns out to be t_a + t_p < n, where t_a, t_p denote the thresholds for active and passive corruption, with the latter subsuming the former. Subsuming active minority as a boundary special case, this setting, denoted as dynamic corruption, opens up a range of possible corruption scenarios for the adversary. While dynamic corruption includes the entire range of thresholds for (t_a, t_p) starting from (ceil(n/2) – 1, floor(n/2)) to (0, n − 1), the boundary corruption restricts the adversary only to the boundary cases of (ceil(n/2) – 1, floor(n/2)) and (0, n − 1). Notably, both corruption settings empower an adversary to control a majority of the parties, yet ensure the count of active corruptions never goes beyond ceil(n/2) – 1. We target the round complexity of fair and robust MPC tolerating dynamic and boundary adversaries.
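The threshold ranges described in the abstract can be enumerated directly (a small sketch; the pairing t_p = n − 1 − t_a follows the stated endpoints (ceil(n/2) − 1, floor(n/2)) and (0, n − 1)):

```python
import math

def dynamic_thresholds(n):
    """All (t_a, t_p) pairs in the dynamic-corruption setting sketched
    above: active threshold t_a ranging from ceil(n/2)-1 down to 0,
    with t_p = n-1-t_a passive corruptions (t_p subsumes t_a)."""
    hi = math.ceil(n / 2) - 1
    return [(ta, n - 1 - ta) for ta in range(hi, -1, -1)]

def boundary_thresholds(n):
    """Boundary corruption: only the two endpoint cases, i.e. active
    minority and the all-but-one-passive corruption scenario."""
    pairs = dynamic_thresholds(n)
    return [pairs[0], pairs[-1]]
```

For n = 5 this yields the dynamic range (2, 2), (1, 3), (0, 4), each satisfying t_a + t_p < n, with the boundary setting keeping only (2, 2) and (0, 4).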
References:
[1] Arpita Patra, Divya Ravi. On the Power of Hybrid Networks in MultiParty Computation. IEEE Transactions on Information Theory 2018.
[2] Arpita Patra, Divya Ravi. On the Exact Round Complexity of Secure ThreeParty Computation. CRYPTO 2018.
[3] Megha Byali, Arun Joseph, Arpita Patra, Divya Ravi. Fast Secure Computation for Small Population over the Internet. ACM CCS 2018.
[4] Arpita Patra, Divya Ravi, Swati Singla. On the Exact Round Complexity of Best-of-Both-Worlds Multiparty Computation. Under Submission.
[5] Arpita Patra, Divya Ravi. Beyond Honest Majority: The Round Complexity of Fair and Robust Multiparty Computation. ASIACRYPT 2019.
 Series: Ph.D. Colloquium Title: Stochastic Approximation in Optimization, Estimation and Reinforcement Learning  Speaker: Chandramouli Kamanchi
 Faculty Advisor: Prof. Shalabh Bhatnagar
 Date and Time: Monday, February 24, 2020, 4:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Stochastic approximation deals with the problem of finding zeros of a function expressed as an expectation of a random variable. In this thesis we study and analyze the convergence of stochastic approximation algorithms in the context of optimization under uncertainty, statistical estimation, and learning, in particular reinforcement learning. Moreover, we also explore second-order methods in the context of reinforcement learning algorithms.
Stochastic optimization (SO) considers the problem of optimizing an objective function in the presence of noise. Most of the solution techniques in SO estimate gradients from the noise-corrupted observations of the objective and adjust parameters of the objective along the direction of the estimated gradients to obtain locally optimal solutions. Two prominent algorithms in SO, namely Random Direction Kiefer-Wolfowitz (RDKW) and Simultaneous Perturbation Stochastic Approximation (SPSA), obtain noisy gradient estimates by randomly perturbing all the parameters simultaneously. This forces the search direction to be random in these algorithms and causes them to suffer additional noise on top of the noise incurred from the samples of the objective. Owing to this additional noise, the idea of using deterministic perturbations instead of random perturbations for gradient estimation has also been studied. Two specific constructions of the deterministic perturbation sequence, using lexicographical ordering and Hadamard matrices, have been explored and encouraging results have been reported in the literature. In this thesis, we characterise the class of deterministic perturbation sequences that can be utilised in the RDKW algorithm. This class expands the set of known deterministic perturbation sequences available in the literature. Using our characterization we propose a construction of a deterministic perturbation sequence that has the least possible cycle length among all deterministic perturbations. Through simulations we illustrate the performance gain of the proposed deterministic perturbation sequence in the RDKW algorithm over the Hadamard and the random perturbation counterparts. We establish the convergence of the RDKW algorithm for the generalized class of deterministic perturbations utilizing stochastic approximation techniques.
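A minimal sketch of the two-measurement gradient estimate with a deterministic perturbation cycle (here the rows of a 2x2 Hadamard matrix, standing in for the constructions discussed; illustrative only, not the thesis's exact algorithm or step-size schedule):

```python
import numpy as np

def hadamard_perturbations():
    """Cycle deterministically through +-1 perturbation directions taken
    from the rows of a 2x2 Hadamard matrix."""
    H = np.array([[1.0, 1.0], [1.0, -1.0]])
    i = 0
    while True:
        yield H[i % 2]
        i += 1

def rdkw_minimize(f, x0, step=0.1, c=0.01, iters=200):
    """RDKW/SPSA-style iteration: estimate the gradient from two
    function evaluations perturbed along direction delta, then step
    against the estimate. With a deterministic perturbation cycle the
    extra randomness of the search direction is removed."""
    x = np.array(x0, dtype=float)
    perts = hadamard_perturbations()
    for _ in range(iters):
        delta = next(perts)
        # componentwise estimate: (f(x+c*delta) - f(x-c*delta)) / (2*c*delta_i)
        ghat = (f(x + c * delta) - f(x - c * delta)) / (2 * c) / delta
        x -= step * ghat
    return x
```

On a simple quadratic, the two Hadamard directions alternately shrink the (x1 + x2) and (x1 − x2) components, so the iterate contracts toward the minimizer.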
In statistical estimation, one popular measure of central tendency that provides a better representation of, and more interesting insights into, the data than measures like the mean and median is the mode.
If the analytical form of the density function is known, the mode is a maximizer of the density function and one can
apply optimization techniques to find it. In many practical applications, the analytical form of the density is not known and only samples from the distribution are available. Most of the techniques proposed in the literature for estimating the mode from samples assume that all the samples are available beforehand. Moreover, some of the techniques employ computationally expensive operations like sorting. In this thesis we provide a computationally efficient, online iterative algorithm that estimates the mode of a unimodal smooth density given only the samples generated from the density. Asymptotic convergence of the proposed algorithm is established using stochastic approximation techniques. We also prove the stability of the mode estimates by utilizing the concept of regularisation. Experimental results further demonstrate the effectiveness of the proposed algorithm.
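One way to picture such an online, sorting-free estimator is a kernel-weighted stochastic-approximation update (an illustrative sketch of the idea only, not the algorithm proposed in the thesis): each incoming sample nudges the estimate toward itself, weighted by a kernel of the distance, so in expectation the iterate climbs a smoothed version of the density toward its mode.

```python
import numpy as np

def online_mode_estimate(samples, bandwidth=1.0, x0=0.0):
    """Illustrative online mode tracker. The expected update is
    proportional to the gradient of a kernel-smoothed density, so with
    decaying step sizes the iterate drifts to the mode of a unimodal
    density. One pass, O(1) memory, no sorting."""
    x = x0
    for t, s in enumerate(samples, start=1):
        a_t = 2.0 / (100.0 + t)                        # decaying step size
        w = np.exp(-0.5 * ((s - x) / bandwidth) ** 2)  # kernel weight
        x += a_t * w * (s - x)
    return x
```

For samples from a unimodal density such as N(2, 1), the estimate settles near the true mode 2.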
In a discounted reward Markov Decision Process (MDP), the objective is to find the optimal value function, i.e., the value function corresponding to an optimal policy. This problem reduces to solving a functional equation known as the Bellman equation, and a fixed point iteration scheme known as value iteration is utilized to obtain the solution. In the literature, a successive over-relaxation based value iteration scheme has been proposed to speed up the computation of the optimal value function. The speedup is achieved by constructing a modified Bellman equation that ensures faster convergence to the optimal value function. However, in many practical applications, the model information is not known and we resort to Reinforcement Learning (RL) algorithms to obtain the optimal policy and value function. One such popular algorithm is Q-learning. In this work, we propose Successive Over-Relaxation (SOR) Q-learning. We first derive a modified fixed point iteration for SOR Q-values and utilize stochastic approximation to derive a learning algorithm to compute the optimal value function and an optimal policy. We then prove the almost sure convergence of SOR Q-learning to the SOR Q-values. Finally, through numerical experiments, we show that SOR Q-learning is faster compared to the standard Q-learning algorithm.
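A sketch of an SOR-flavoured Q-learning update on a toy MDP with self-loop transitions (the precise operator and conditions in the work may differ; the mixing below is illustrative, and w = 1 recovers the standard Q-learning target):

```python
import numpy as np

def sor_q_learning(w=1.5, gamma=0.5, alpha=0.1, iters=5000, seed=0):
    """Sketch of successive over-relaxation (SOR) Q-learning on a toy
    2-state MDP where every action keeps the agent in place (so the
    self-transition probability used by the SOR construction is
    positive). Action 0 pays reward 1, action 1 pays 0. The target
    mixes the usual bootstrapped value with the current state's own
    max Q-value, scaled by w and (1 - w)."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 2, 2
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        s = rng.integers(n_states)
        a = rng.integers(n_actions)
        r = 1.0 if a == 0 else 0.0
        s_next = s  # self-loop transitions
        target = w * (r + gamma * Q[s_next].max()) + (1 - w) * Q[s].max()
        Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

The SOR fixed point preserves the optimal greedy policy and optimal value (here max_a Q(s, a) = 1/(1 − γ) = 2), even though the non-greedy SOR Q-values differ from the standard ones.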
Value iteration is a fixed point iteration technique utilized to obtain the optimal value function and policy in a discounted reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive at the optimal solution. Value iteration is a first-order method and therefore it may take a large number of iterations to converge to the optimal solution. Successive relaxation is a popular technique that can be applied to solve a fixed point equation. It has been shown in the literature that, under a special structure of the MDP, the successive over-relaxation technique computes the optimal value function faster than standard value iteration. In this work, we propose a second-order value iteration procedure that is obtained by applying the Newton-Raphson method to the successive relaxation value iteration scheme. We prove the global convergence of our algorithm to the optimal solution asymptotically and show its second-order convergence. Through experiments, we demonstrate the effectiveness of our proposed approach.
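The Newton-Raphson view can be sketched on the plain Bellman equation (standing in here for the successive-relaxation variant used in the work): each Newton step on the residual B(V) = max_a (R + γ P V) − V reduces to exactly evaluating the current greedy policy, which is the classical second-order reading of policy iteration.

```python
import numpy as np

def newton_value_iteration(P, R, gamma, iters=10):
    """Second-order scheme sketch: Newton-Raphson applied to the Bellman
    fixed-point residual. Each step solves the linear system
    (I - gamma * P_pi) V = R_pi for the current greedy policy pi, i.e.
    it evaluates that policy exactly.
    P: (A, S, S) transition tensor, R: (S, A) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Qs = R + gamma * np.einsum('ast,t->sa', P, V)  # Q(s, a)
        pi = Qs.argmax(axis=1)                          # greedy policy
        P_pi = P[pi, np.arange(S), :]                   # (S, S) under pi
        R_pi = R[np.arange(S), pi]
        V = np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)
    return V
```

On a tiny deterministic MDP where action 0 always pays 1 and returns to state 0, a single Newton step already lands on the optimal value 1/(1 − γ).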
Speaker Bio: Chandramouli Kamanchi is a senior PhD student in the Department of Computer Science
and Automation, IISc.
 Series: Department Seminar Title: Efficient Distance Approximation for Structured HighDimensional Distributions via Learning  Speaker: Dr. Arnab Bhattacharyya
(NUS Singapore, School of Computing)  Date and Time: Tuesday, February 18, 2020, 11:15 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract We design efficient distance approximation algorithms for several classes of structured
high-dimensional distributions. Specifically, we give algorithms for the following problems:
– Given sample access to two Bayes networks P1 and P2 over known directed acyclic graphs
G1 and G2 having n nodes and bounded in-degree, approximate dTV(P1, P2) to within additive
error ε using poly(n, 1/ε) samples and time
– Given sample access to two ferromagnetic Ising models P1 and P2 on n variables with bounded
width, approximate dTV(P1, P2) to within additive error ε using poly(n, 1/ε) samples and time
– Given sample access to two n-dimensional Gaussians P1 and P2, approximate dTV(P1, P2)
to within additive error ε using poly(n, 1/ε) samples and time
– Given access to observations from two causal models P and Q on n variables that are
defined over known causal graphs, approximate dTV(Pa, Qa) to within additive error ε
using poly(n, 1/ε) samples, where Pa and Qa are the interventional distributions obtained
by the intervention do(A = a) on P and Q respectively for a particular variable A.
Our results are the first efficient distance approximation algorithms for these well-studied
problems. They are derived using a simple and general connection to distribution learning
algorithms. The distance approximation algorithms imply new efficient algorithms for tolerant
testing of closeness of the above-mentioned structured high-dimensional distributions.
(based on a joint work with Sutanu Gayen, Kuldeep S. Meel and N. V. Vinodchandran)
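In the one-dimensional Gaussian special case, the learn-then-compare recipe is easy to sketch (fit each model from its samples, then integrate the pointwise difference of the learned densities numerically; illustrative only, since the paper handles Bayes nets, Ising models and n-dimensional Gaussians):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def tv_via_learning(samples_p, samples_q):
    """Distance approximation via learning, 1-D Gaussian case: fit each
    distribution by maximum likelihood, then compute the total variation
    distance between the two *learned* models numerically as
    0.5 * integral |p - q|."""
    mu_p, sd_p = samples_p.mean(), samples_p.std()
    mu_q, sd_q = samples_q.mean(), samples_q.std()
    grid = np.linspace(-10.0, 12.0, 200001)
    dx = grid[1] - grid[0]
    p = normal_pdf(grid, mu_p, sd_p)
    q = normal_pdf(grid, mu_q, sd_q)
    return 0.5 * np.sum(np.abs(p - q)) * dx
```

For samples from N(0, 1) and N(1, 1) the estimate approaches the closed-form value 2Φ(1/2) − 1 ≈ 0.383.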
Host Faculty: Prof. Sunil L Chandran

