UPCOMING SEMINARS 

Series: Ph.D. Thesis Defense
Title: Concurrency Analysis and Mining Techniques for APIs
Speaker: Mr. Anirudh Santhiar, Ph.D. student, Dept. of CSA
Faculty Advisor: Prof. Aditya S Kanade
Date and Time: Friday, December 22, 2017, 10:30 AM
Venue: CSA Faculty Room (Room No. 101, Ground Floor)
Abstract

Software components expose Application Programming Interfaces (APIs) as a means to access their
functionality and facilitate reuse. Developers use APIs supplied by programming languages to
access the core data structures and algorithms that are part of the language framework. They use the
APIs of third-party libraries for specialized tasks. Thus, APIs play a central role in mediating a
developer's interaction with software, and the interaction between different software components.
However, APIs are often large, complex and hard to navigate. They may have hundreds of classes and
methods, with incomplete or obsolete documentation. They may encapsulate concurrency behaviour that
the developer is unaware of. Finding the right functionality in a large API, using APIs correctly,
and maintaining software that uses a constantly evolving API are challenges that every developer runs
into. In this thesis, we design automated techniques to address two problems pertaining to APIs:
(1) concurrency analysis of APIs, and (2) API mining. Specifically, we consider the concurrency
analysis of asynchronous APIs, and the mining of math APIs to infer the functional behaviour of API methods.
The problem of concurrency bugs such as race conditions and deadlocks has been well studied for
multithreaded programs. However, developers have been eschewing a pure multithreaded programming model in
favour of asynchronous programming models supported by asynchronous APIs. Asynchronous programs and
multithreaded programs have different semantics, due to which existing techniques for analyzing the latter
cannot detect bugs in programs that use asynchronous APIs. This thesis addresses the problem of
concurrency analysis of programs that use asynchronous APIs in an end-to-end fashion. We give
operational semantics for important classes of asynchronous and event-driven systems. The semantics
are designed by carefully studying real software and serve to clarify subtleties in scheduling. We
use the semantics to inform the design of novel algorithms to find races and deadlocks. We implement
the algorithms in tools, and show their effectiveness by finding serious bugs in popular open-source software.
To begin with, we consider APIs for asynchronous event-driven systems supporting programmatic event
loops. Here, event handlers can spin event loops programmatically, in addition to the runtime's default
event loop. This concurrency idiom is supported by important
classes of APIs including GUI, web browser, and OS APIs. Programs that use these APIs are prone to
interference between a handler that is spinning an event loop and another handler that runs inside the
loop. We present the first happens-before based race detection technique for such programs. Next, we
consider the asynchronous programming model of modern languages like C#. In spite of providing primitives
for the disciplined use of asynchrony, C# programs can deadlock because of incorrect use of blocking
APIs along with non-blocking (asynchronous) APIs. We present the first deadlock detection technique for
asynchronous C# programs. We formulate necessary conditions for deadlock using a novel program
representation that captures procedures and continuations, the control flow between them, and the threads
on which they may be scheduled. We design a static analysis to construct the program representation
and use it to identify deadlocks. Our ideas have resulted in research tools with practical impact.
SparseRacer, our tool to detect races, found 13 previously unknown use-after-free bugs in KDE Linux
applications. DeadWait, our deadlock detector, found 43 previously unknown deadlocks in asynchronous
C# libraries. Developers have fixed 43 of these races and deadlocks, indicating that our techniques
are useful in practice to detect bugs that developers consider worth fixing.
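As a toy illustration of the happens-before reasoning that underlies such race detectors, the sketch below checks whether two memory accesses are unordered under textbook vector clocks. The trace, clock values, and dictionary representation are all hypothetical; this is not SparseRacer's algorithm, which additionally models programmatic event loops.

```python
# Minimal vector-clock happens-before check (illustrative sketch only; the
# thesis's technique extends happens-before to programmatic event loops).
def happens_before(vc1, vc2):
    """e1 happens before e2 if e1's vector clock is component-wise <= e2's."""
    return all(a <= b for a, b in zip(vc1, vc2)) and vc1 != vc2

def races(e1, e2):
    """Two accesses race if they are unordered and at least one is a write."""
    unordered = (not happens_before(e1["vc"], e2["vc"])
                 and not happens_before(e2["vc"], e1["vc"]))
    return unordered and "w" in (e1["op"], e2["op"])

# Hypothetical trace: one handler writes x at clock (1, 0) while another
# handler, running inside a spun event loop, reads x at clock (0, 1).
write_x = {"op": "w", "vc": (1, 0)}
read_x  = {"op": "r", "vc": (0, 1)}
print(races(write_x, read_x))  # True: unordered write/read on x
```

A pair ordered by happens-before, e.g. clocks (1, 0) and (1, 1), would not be flagged.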
Using large APIs effectively entails finding the right functionality and correctly calling the methods that
implement it, possibly composing many API elements. Automatically inferring the
information required to do this is a challenge that has attracted the attention of the research
community. In response, the community has introduced many techniques to mine APIs and produce
information ranging from usage examples and patterns, to protocols governing API method-calling
sequences. We show how to mine unit tests to match API methods to their functional behaviour, for the
specific but important class of math APIs.
Math APIs are at the heart of many application domains ranging from machine learning to scientific
computations, and are supplied by many competing libraries. In contrast to obtaining usage examples
or identifying correct call sequences, the challenge in this domain is to infer the API methods required
to perform a particular mathematical computation, and to compose them correctly. We let developers
specify mathematical computations naturally, as math expressions in the notation of interpreted
languages (such as Matlab). Our unit-test mining technique maps subexpressions to math API methods
such that the method's functional behaviour matches the subexpression's executable semantics, as
defined by the interpreter. We apply our technique, called MathFinder, to math API discovery and
migration, and validate it in a user study. Developers who used MathFinder finished their programming
tasks twice as fast as their counterparts who used the usual techniques like web and code search,
and IDE code completion. We also demonstrate the use of MathFinder to assist in the migration of Weka,
a popular machine learning library, to a different linear algebra library.
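The matching step can be pictured as follows: execute candidate API methods and the developer's subexpression on random inputs, and keep the candidates whose outputs agree. This is only a sketch under assumed names; Python's math module stands in for a mined library, whereas MathFinder itself mines the unit tests of math APIs.

```python
import math
import random

random.seed(0)

# Candidate "API" methods to match against (hypothetical candidate set;
# a real miner would harvest these from a library's unit tests).
candidates = {
    "math.hypot": lambda a, b: math.hypot(a, b),
    "math.atan2": lambda a, b: math.atan2(a, b),
    "math.pow":   lambda a, b: math.pow(a, b),
}

def matches(spec, candidate, trials=100, tol=1e-9):
    """Does `candidate` agree with the subexpression's executable semantics
    (as defined by the interpreter) on random inputs?"""
    for _ in range(trials):
        a, b = random.uniform(1, 10), random.uniform(1, 10)
        if abs(spec(a, b) - candidate(a, b)) > tol:
            return False
    return True

# Developer's subexpression in interpreted-language notation: sqrt(a^2 + b^2)
spec = lambda a, b: math.sqrt(a * a + b * b)

hits = [name for name, fn in candidates.items() if matches(spec, fn)]
print(hits)  # only math.hypot matches the subexpression's semantics
```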
Series: CSA Distinguished Lecture
Title: Planar Graph Perfect Matching is in NC
Speaker: Professor Vijay Vazirani, University of California, Irvine, USA
Date and Time: Thursday, December 21, 2017, 4:00 PM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract

Is matching in NC, i.e., is there a deterministic fast parallel
algorithm for it? This has been an outstanding open question in TCS for
over three decades, ever since the discovery of Randomized NC matching
algorithms. Within this question, the case of planar graphs has remained
an enigma: on the one hand, counting the number of perfect matchings is
far harder than finding one (the former is #P-complete and the latter is
in P); on the other, for planar graphs, counting has long been
known to be in NC whereas finding one has resisted a solution!

The case of bipartite planar graphs was solved by Miller and Naor in
1989 via a flow-based algorithm. In 2000, Mahajan and Varadarajan gave
an elegant way of using the counting of matchings to find one, hence giving
a different NC algorithm.

However, non-bipartite planar graphs still didn't yield: the stumbling
block being odd tight cuts. Interestingly enough, these are also a key
to the solution: a balanced odd tight cut leads to a straightforward
divide-and-conquer NC algorithm. The remaining task is to find such a
cut in NC. This requires several algorithmic ideas, such as finding a
point in the interior of the minimum weight face of the perfect matching
polytope and uncrossing odd tight cuts. Joint work with Nima Anari.
Speaker Bio: Dr. Vazirani is a Distinguished Professor of Computer Science in the
Donald Bren School of Information and Computer Sciences at the
University of California, Irvine. He is a leading researcher in
algorithm design and, more generally, in the theory of computing.
Throughout his research career, he has demonstrated a consistent ability
to obtain novel algorithmic ideas, frequently of substantial
mathematical depth, which, while solving the problem at hand, also lay
the groundwork for future contributions by others.

Dr. Vazirani's research career spans over twenty-five years. During the
first ten years, he made seminal contributions to the classical maximum
matching problem, which has historically played a central role in the
development of the theory of algorithms. He discovered, jointly with
other researchers, the fastest known sequential and parallel algorithms
for this problem. Over the next ten years, Vazirani focused on
approximation algorithms for NP-hard problems and had much influence on
this area through work on several of its fundamental problems. In 2001,
he published a book on this topic. He is currently working in
algorithmic game theory, in particular on algorithms for computing
market equilibria.
Host Faculty: Dr. Siddharth Barman
Series: Ph.D. Thesis Defense
Title: Efficient Static Analyses for Concurrent Programs
Speaker: Suvam Mukherjee
Faculty Advisor: Deepak D'Souza
Date and Time: Wednesday, December 20, 2017, 11:00 AM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract

Concurrent programs are pervasive owing to the increasing adoption of
multi-core systems across the entire computing
spectrum. Unfortunately, writing correct and efficient concurrent
programs is a hard endeavour. Automatically reasoning about the
behaviours of such programs is harder still: a very large number of
behaviours are possible due to interactions between the threads. Thus,
concurrent programs provide fertile ground for a large class of
insidious defects, which are difficult to detect and can cause
failures.

Static analysis techniques infer semantic properties of programs
without executing them. They are attractive because they are sound
(they can guarantee the absence of bugs), can execute with a fair
degree of automation, and do not depend on test cases. However,
current static analysis techniques for concurrent programs are either
precise and prohibitively slow, or fast but imprecise. In this
thesis, we partially address this problem by designing novel efficient
static analyses for concurrent programs.
In the first part of the thesis, we provide a framework for designing
and proving the correctness of data flow analyses that target data-race-free
(DRF) programs, and which are in the same spirit as the
"sync-CFG" analysis originally proposed in [De2011]. To achieve this,
we first propose a novel concrete semantics for DRF programs, called
L-DRF, that is thread-local in nature, with each thread operating on
its own copy of the data state. We show that abstractions of our
semantics allow us to reduce the analysis of DRF programs to a
sequential analysis. This aids in rapidly porting existing sequential
analyses to scalable analyses for DRF programs. Next, we parameterize
the semantics with a partitioning of the program variables into
"regions" which are accessed atomically. Abstractions of the
region-parameterized semantics yield more precise analyses for
"region-race-free" concurrent programs. We instantiate these
abstractions to devise efficient relational analyses for race-free
programs, which we have implemented in a prototype tool called
RATCOP. On the benchmarks, RATCOP was able to prove up to 65% of the
assertions, in comparison to 25% proved by a version of the analysis
from [De2011]. In a comparative study with a recent abstract-interpretation-based
concurrent static analyser, RATCOP was up to 5
orders of magnitude faster on a set of loosely-coupled concurrent
programs.
In the second part of the thesis, we provide techniques to detect all
"high-level" data races in an RTOS kernel. A high-level race occurs
when an execution interleaves instructions corresponding to
user-annotated critical accesses to shared memory structures. Such
races are good indicators of atomicity violations. We propose a
technique for detecting all high-level data races in a system library,
like the kernel API of a real-time operating system (RTOS) that relies
on flag-based scheduling and synchronization. Our methodology is based
on model checking, but relies on a meta-argument to bound the number
of task processes needed to orchestrate a race. We describe our
approach in the context of FreeRTOS, a popular RTOS in the embedded
domain. Using our technique, a user is able to soundly disregard 99.8%
of an estimated 41,000 potential high-level races. Our tool detected
38 high-level data races in FreeRTOS, out of which 16 were harmful.
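The notion of a high-level race can be illustrated by brute-force enumeration: a race exists when some interleaving places another task's access to a shared structure strictly between two accesses of one annotated critical block. The task encoding below is entirely hypothetical, and exhaustive enumeration is only a toy stand-in for the thesis's model-checking approach with its meta-argument bound on task processes.

```python
from itertools import combinations

def interleavings(a, b):
    """All interleavings of two instruction sequences (program order kept)."""
    n = len(a) + len(b)
    for pos in combinations(range(n), len(a)):
        pos_set, ai, bi, run = set(pos), iter(a), iter(b), []
        for i in range(n):
            run.append(next(ai) if i in pos_set else next(bi))
        yield run

def has_high_level_race(a, b, struct):
    """Each instruction is (task, structure, critical-block id). A high-level
    race: another task's access to `struct` lands between two accesses of
    the same critical block."""
    for run in interleavings(a, b):
        for i in range(len(run)):
            for j in range(i + 2, len(run)):
                t1, s1, blk1 = run[i]
                t2, s2, blk2 = run[j]
                if t1 == t2 and s1 == s2 == struct and blk1 == blk2:
                    if any(run[k][0] != t1 and run[k][1] == struct
                           for k in range(i + 1, j)):
                        return True
    return False

# Hypothetical annotated tasks: task A touches a queue twice inside one
# critical block; task B touches the same queue once.
taskA = [("A", "queue", 1), ("A", "queue", 1)]
taskB = [("B", "queue", 7)]
print(has_high_level_race(taskA, taskB, "queue"))  # True: B can slip in between
```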
Speaker Bio: Suvam Mukherjee submitted his Ph.D. thesis to the Department of Computer Science and Automation, IISc.
Host Faculty: Deepak D'Souza


PAST SEMINARS 

Series: Ph.D. Colloquium
Title: Algorithms for Multilingual IR in Low Resource Languages using Weakly Aligned Corpora
Speaker: Goutham Tholpadi
Faculty Advisor: Prof. Chiranjib Bhattacharyya & Prof. Shirish K Shevade
Date and Time: Friday, December 15, 2017, 10:00 AM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract

A large section of the world population is multilingual, and a large proportion of the digitized textual content generated today is in non-English languages. As a result, multilingual information retrieval (MLIR) has become a real and pressing need. Most MLIR methods rely on machine translation and other linguistic resources such as dictionaries, taggers, parsers, etc. However, most of these resources are unavailable in low resource languages, e.g. in many Indian languages. For many MLIR problems, an alternative approach that has shown some success is to map all texts to an abstract semantic space of concepts where similarity can be computed uniformly across languages. These concepts are induced using statistical models learned from multilingual "weakly aligned" corpora which are available even in low resource languages. A number of models have been proposed for learning topical correspondence from weakly aligned corpora that are document-aligned. However, in recent times, there is a growing interest in modeling topical correspondence from user generated content (UGC) data from social networks, commenting widgets, etc. A general pattern seen in such data is the presence of a main object (such as an article, post, tweet, image, etc.) and a collection of dependent objects (such as comments, replies, retweets, tags, etc.). However, very little work has been done on topical correspondence models for corpora where the dependent objects are multilingual. This task is made more challenging by the presence of romanization, sparsity, and noise.
Topic models are often used to compute topical affinity between textual units, which is a core task in most MLIR problems. However, the topic models that can be learned from weak alignment are not directly applicable to the computation of all kinds of topical affinities. In particular, the size of the textual units involved in the affinity computation has a significant bearing on the applicability of a model, and on the method of application.
In this thesis, we focus on low resource languages and develop a series of hierarchical Bayesian models for capturing topical correspondence in multilingual news-comment corpora. The final model, called the Multiglyphic Correspondence Topic Model (MCTM), captures topical correspondence between comments and the article, between comments and specific sections of the article, among the comments themselves, and between comments and the other articles. The model handles problems of data sparsity, noise and comment romanization, and is shown to be a better fit for article-comment data when compared to existing models.
From an application perspective, we consider three different MLIR problems corresponding to different levels with respect to the size of the textual units involved in the topical affinity computation step. The first MLIR problem (at the level of single words) involves inducing translation correspondences in arbitrary language pairs using Wikipedia data. We present an approach for translation induction that leverages Wikipedia corpora in auxiliary languages using the notion of translingual themes. We propose extensions that enable the computation of cross-lingual semantic relatedness between words and apply the method to the task of cross-lingual Wikipedia title suggestion. The second MLIR problem (at the level of sentences or paragraphs) involves the detection of topical relevance between specific portions of articles and comments. Using MCTM, we devise methods for topical categorization of comments with respect to the article. In the third MLIR problem (at the level of document collections), the task is to generate keyword summaries for a cluster of documents. We develop an architecture for performing Scatter/Gather on multilingual document collections and a labeling procedure that generates keyword labels on the fly to summarize a multilingual document cluster in any language.
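The shared-semantic-space idea can be pictured in a few lines: once documents in different languages are mapped to topic distributions, affinity is computed uniformly regardless of the source language. The distributions below are made up for illustration; a real system would infer them with a multilingual topic model learned from weakly aligned corpora.

```python
from math import sqrt

# Documents from different languages, mapped to one shared topic space
# (hypothetical 3-topic distributions).
doc_en = [0.7, 0.2, 0.1]   # English article
doc_hi = [0.6, 0.3, 0.1]   # Hindi comment on the same topic
doc_kn = [0.1, 0.1, 0.8]   # unrelated Kannada post

def cosine(p, q):
    """Cosine similarity between two topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q)))

# Affinity is language-independent: the Hindi comment scores far closer
# to the English article than the unrelated post does.
print(cosine(doc_en, doc_hi) > cosine(doc_en, doc_kn))  # True
```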
Series: Department Seminar
Title: Leveraging Plan Recosting for Online Optimization of Parameterized Queries
Speaker: Dr. Anshuman Dutt, Microsoft Research, USA
Date and Time: Wednesday, December 13, 2017, 4:00 PM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract

Optimization of parameterized queries is a well-studied problem. For each
incoming instance of a parameterized query, the challenge is to decide
whether to optimize it afresh or to reuse one of the previously stored plans
for the same parameterized query. For each mistaken decision, we either
pay unnecessary optimization overhead or execute a suboptimal execution
plan. Most existing solutions handle this challenge by identifying and
storing a set of execution plans such that at least one of them is close
to optimal for any future query instance. Later, when a query instance
arrives, they rely on an efficient mechanism to select the most suitable
execution plan. All such approaches fall under the term "Parametric
Query Optimization" (PQO). Initial research efforts for PQO focused on
an offline version of the problem, where the set of all required plans is
identified and stored even before the first instance arrives. Since the
high startup cost of the offline solution is not always acceptable, there have
recently been attempts to handle this challenge in an online fashion,
which is also the focus of this work.
An ideal solution to online PQO would ensure that (a) the suboptimality
value for each query instance is guaranteed to be small; (b) only a small
fraction of query instances are optimized; and (c) the set of stored
plans is small and can adapt to changes in the query distribution. Existing
solutions to online PQO either provide suboptimality guarantees or save
optimization overhead, but not both. Also, none of these techniques provides
a mechanism to control the number of stored plans. In this talk, we start with
an overview of existing solutions to PQO and then present our solution
technique, which performs well on all three metrics. Our solution is based on
"plan recosting", a technique that enables us to evaluate the performance
of an existing plan for a new query instance at least 10x faster
than fresh optimization. We show the effectiveness of our solution with
empirical results on industry-benchmark and real-world query workloads,
using a modified version of the Microsoft SQL Server optimizer.

(Talk based on a research paper published in ACM SIGMOD 2017.
Joint work with Vivek Narasayya and Surajit Chaudhuri at Microsoft Research, Redmond.)
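Why recosting matters can be shown numerically. Suppose two plans have the hypothetical cost functions below over a single query parameter (these functions and numbers are invented for illustration; the real system recosts operator trees inside the optimizer). A cached plan that was optimal for one instance can be badly suboptimal for another, and a cheap recost reveals this without a fresh optimizer call.

```python
# Hypothetical plan cost functions of one query parameter x.
index_seek = lambda x: 1 + 10 * x     # cheap for selective instances
full_scan  = lambda x: 8 + 0.5 * x    # cheap for non-selective instances

cached = index_seek                   # plan stored when instance x = 0.1 arrived
x_new = 9.0                           # a very different instance arrives

# Recosting both plans at x_new is a pair of cheap evaluations:
print(cached(x_new), full_scan(x_new))  # 91.0 12.5: the cached plan is stale
```

At x = 0.1 the cached plan costs 2.0 versus 8.05 for the scan, so caching it was right; at x = 9.0 the recost exposes a 7x gap and justifies optimizing afresh.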
Speaker Bio: Anshuman Dutt is a postdoctoral researcher in the DMX group
at Microsoft Research, Redmond. Previously, he completed his Ph.D. at the
Dept. of Computer Science and Automation, Indian Institute of Science,
Bangalore, in 2016.
Host Faculty: Prof. Jayant Haritsa
Series: Department Seminar
Title: An Introduction to Qualcomm Datacenter Technologies & Centriq 2400 Processor
Speaker: Trey Cain, Luke Yen, Niket Choudhary, Qualcomm Datacenter Technologies
Date and Time: Friday, December 08, 2017, 11:00 AM
Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract

In this talk we will provide an introduction to Qualcomm's Datacenter business unit and present details of the first commercially available ARM-based many-core server chip, Centriq™ 2400, produced by the team, which will compete head to head against Intel's server business. Our team brings a decade of experience delivering high-performance, power-efficient ARM CPU architectures for mobile-platform Qualcomm Snapdragon™ SoCs, and is now using its expertise to disrupt the datacenter market segment by architecting microprocessors targeted at scale-out performance/efficiency for cloud applications. The Centriq™ 2400 focuses on true server-class features and performance with aggressive power management techniques. The talk will provide a basic microarchitecture overview of the Falkor™ CPU, the brain of the Centriq™ 2400, and the overall SoC design.

The talk will also introduce the Architecture team within QDT and how we fit into architecture definition, pathfinding activities, and performance/power modeling. Finally, it will briefly touch upon some of the interesting design challenges that the team is tackling for the next-generation server chip.
Host Faculty: R. Govindarajan
Series: Ph.D. Colloquium
Title: Structured Regularization Through Convex Relaxations of Discrete Penalties
Speaker: Raman Sankaran
Faculty Advisor: Prof. Chiranjib Bhattacharyya
Date and Time: Thursday, December 07, 2017, 9:00 AM
Venue: CSA Lecture Hall (Room No. 117, Ground Floor)
Abstract

Motivation. Empirical risk minimization (ERM) is a popular framework for learning predictive models from data, which has been used in various domains such as computer vision, text processing, bioinformatics, neurobiology, and temporal point processes, to name a few. We consider cases where one has a priori information regarding the model structure, the simplest being sparsity of the model. The desired sparsity structure can be imposed on ERM problems using a regularizer, which is typically a norm on the model vector. Popular choices of regularizers include the $\ell_1$ norm (Lasso), which encourages sparse solutions, and the block-$\ell_1$ norm, which encourages block-level sparsity, among many others which induce more complicated structures. To impose the structural prior, many recent studies have considered combinatorial functions on the model's support as the regularizer. But this leads to an NP-hard problem, which motivates us to consider convex relaxations of the corresponding combinatorial functions, which are tractable.
Existing work and research gaps. The convex relaxations of combinatorial functions have been studied recently in the context of structured sparsity, but they still lead to inefficient computational procedures in general: e.g., even when the combinatorial function is submodular, whose convex relaxations are well studied and easier to work with than the general ones, the resulting problem is computationally expensive (the proximal operator takes $O(d^6)$ time for $d$ variables). The associated expense has limited research interest in these regularizers, even though the submodular functions which generate them are expressive enough to encourage many of the structures mentioned before.

Hence it remains open to design efficient optimization procedures for submodular penalties, and for combinatorial penalties in general. It is also desirable that the optimization algorithms designed be applicable across the possible choices of loss functions arising from various applications. We identify four such problems from these research gaps, and address them through the contributions listed below. We provide the list of publications related to this thesis following this abstract.
Contributions.
First, we propose a novel kernel learning technique termed Variable Sparsity Kernel Learning (VSKL) for support vector machines (SVMs), which is applicable when there is a priori information regarding a grouping among the kernels. In such cases, we propose a novel mixed-norm regularizer, which encourages sparse selection of the kernels within a group while selecting all groups. This regularizer can also be viewed as the convex relaxation of a specific discrete penalty on the model's support. The resulting non-smooth optimization problem is difficult, and off-the-shelf techniques are not applicable. We propose a mirror-descent-based optimization algorithm to solve this problem, which has a guaranteed convergence rate of $O(1/\sqrt{T})$ over $T$ iterations. We demonstrate the efficiency of the proposed algorithm in various scaling experiments, and the applicability of the regularizer in an object recognition task.
Second, we introduce a family of regularizers termed Smoothed Ordered Weighted $\ell_1$ (SOWL) norms, which are derived as the convex relaxation of non-decreasing cardinality-based submodular penalties, an important special class of the general discrete penalties. Considering linear regression, where the presence of correlated predictors causes traditional regularizers such as the Lasso to fail to recover the true support, SOWL has the property of selecting correlated predictors as groups. While Ordered Weighted $\ell_1$ (OWL) norms address similar use cases, they are biased towards undesirable piecewise-constant solutions, a bias SOWL does not share. SOWL is also shown to be equivalent to group-lasso, but without specifying the groups explicitly. We develop efficient proximal operators for SOWL, and illustrate its computational and theoretical benefits through various simulations.
Third, we discuss Hawkes-OWL, an application of OWL regularizers to the setting of multidimensional Hawkes processes. A Hawkes process is a multidimensional point process (a collection of multiple event streams) with self- and mutual influences between the event streams. While the popular $\ell_1$ regularizer fails to recover the true models in the presence of strongly correlated event streams, OWL regularization addresses this issue and groups the correlated predictors. This is the first instance in the literature where OWL norms, which have predominantly been studied with respect to simple loss functions such as the squared loss, are extended to the Hawkes process with similar theoretical and computational guarantees.
In the fourth part, we discuss generic first-order algorithms for learning with subquadratic norms, a special subfamily of convex relaxations of submodular penalties. We consider subquadratic-norm regularization in a very general setting covering all loss functions, and propose different reformulations of the original problem. The reformulations enable us to propose two different primal-dual algorithms, CP-$\eta$ and ADMM-$\eta$, both of which have a guaranteed convergence rate of $O(1/T)$. This study thus provides the first algorithms with efficient convergence rates for learning with subquadratic norms.
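Proximal operators are the computational workhorse in these methods. As a baseline instance (not the SOWL or OWL prox developed in the thesis), the prox of the plain $\ell_1$ norm has the well-known closed form of soft-thresholding:

```python
# Soft-thresholding: the closed-form proximal operator of t * ||.||_1,
# i.e. prox(v) = argmin_x 0.5*||x - v||^2 + t*||x||_1, applied entrywise.
def prox_l1(v, t):
    return [max(abs(x) - t, 0.0) * (1 if x > 0 else -1) for x in v]

# Entries larger than t shrink by t; smaller entries snap to exactly 0,
# which is how the regularizer produces sparse solutions.
print(prox_l1([3.0, -0.5, 1.2], 1.0))
```

More structured regularizers such as OWL and SOWL need more involved prox computations (sorting plus isotonic-regression-style steps), which is where the efficiency questions addressed in the thesis arise.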
Series: Department Seminar
Title: Constrained Counting and Sampling: Bridging the Gap between Theory and Practice
Speaker: Kuldeep Meel
Date and Time: Monday, December 04, 2017, 12:00 PM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract

Constrained counting and sampling are two fundamental problems in
computer science with numerous applications, including network
reliability, privacy, probabilistic reasoning, and constrained-random
verification. In constrained counting, the task is to compute the
total weight, subject to a given weighting function, of the set of
solutions of the given constraints. In constrained sampling, the task
is to sample randomly, subject to a given weighting function, from the
set of solutions to a set of given constraints. In this talk, I will
introduce a novel algorithmic framework for constrained sampling and
counting that combines the classical algorithmic technique of
universal hashing with the dramatic progress made in Boolean reasoning
over the past two decades. This has allowed us to obtain breakthrough
results in constrained sampling and counting, providing a new
algorithmic toolbox for machine learning, probabilistic reasoning,
privacy, and design verification. I will demonstrate the utility of
these techniques in various real applications, including
probabilistic inference, design verification, and our ongoing
collaboration on estimating the reliability of critical infrastructure
networks during natural disasters.
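The universal-hashing idea can be demonstrated on a toy scale: random XOR (parity) constraints split the solution space into roughly equal cells, so (size of one cell) x 2^m estimates the total count. The formula and parameters below are invented for illustration, and brute-force enumeration stands in for the SAT-solver calls the real tools make.

```python
import random

random.seed(1)
n = 10
formula = lambda bits: bits[0] or bits[1]   # hypothetical Boolean constraint

def solutions(extra=()):
    """Brute-force the formula's solutions, filtered by XOR constraints.
    Each extra constraint is (variable indices, parity bit)."""
    sols = []
    for v in range(2 ** n):
        bits = [(v >> i) & 1 for i in range(n)]
        if formula(bits) and all(sum(bits[i] for i in xs) % 2 == b
                                 for xs, b in extra):
            sols.append(bits)
    return sols

exact = len(solutions())        # 768 of the 1024 assignments satisfy formula
m = 3                           # number of random XOR constraints
xors = [([i for i in range(n) if random.random() < 0.5], random.randint(0, 1))
        for _ in range(m)]
estimate = len(solutions(xors)) * 2 ** m
print(exact, estimate)          # the scaled cell count approximates 768
```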
Speaker Bio: Kuldeep Meel is starting as an Assistant Professor at the National
University of Singapore. His research broadly lies at the intersection of
artificial intelligence and formal methods. He is the recipient of a
2016-17 IBM PhD Fellowship, the 2016-17 Lodieska Stockbridge Vaughn
Fellowship, and the 2014 Vienna Center for Logic and Algorithms
International Outstanding Master's Thesis Award.
Host Faculty: Deepak D'Souza
Series: Department Seminar
Title: Complexity measures of Boolean functions
Speaker: Dr. Swagato Sanyal, Postdoctoral Researcher, Center for Quantum Technologies (NUS) & NTU, Singapore
Date and Time: Thursday, November 30, 2017, 4:00 PM
Venue: CSA Lecture Hall (Room No. 117, Ground Floor)
Abstract

In this talk, I shall introduce the query model and various query complexity measures of Boolean functions.
I shall present a few recent results on the role of randomness in the query model, separations among
various query complexity measures of Boolean functions, and a composition theorem for randomized query complexity.
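One such measure, deterministic query complexity D(f), is the depth of the best decision tree computing f. For toy input sizes it can be computed by exhaustive search over query orders, as in the sketch below (exponential time; purely illustrative). The classic example: OR of n bits has D = n, since an adversary can answer 0 until the last variable.

```python
from itertools import product

def D(f, free):
    """Deterministic query complexity by brute force. `f` maps every n-bit
    tuple (consistent with queries so far) to 0/1; `free` holds the indices
    not yet queried."""
    if len(set(f.values())) <= 1:        # f is constant: no queries needed
        return 0
    best = None
    for i in free:                        # try querying each free variable
        cost = 1 + max(D({x: v for x, v in f.items() if x[i] == b},
                         free - {i})
                       for b in (0, 1))   # adversary picks the worse answer
        best = cost if best is None else min(best, cost)
    return best

n = 3
OR = {x: int(any(x)) for x in product((0, 1), repeat=n)}
print(D(OR, set(range(n))))  # 3: OR of n bits needs all n queries
```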
Speaker Bio: Swagato Sanyal is currently a postdoctoral researcher at the Center for Quantum Technologies (NUS) & NTU, Singapore. He finished his PhD at TIFR Mumbai in January 2017, where he won the TIFR Alumni Association-Sasken Award in Computer Science (Best Thesis Award) for the year 2016-17. His research interests are in theoretical computer science in general, and computational complexity theory in particular. He has worked in analysis of Boolean functions, complexity measures of Boolean functions, and communication complexity.
Host Faculty: Dr. Arnab Bhattacharyya
Series: M.Sc. (Engg) Thesis Defense
Title: Towards a characterization of the symmetries of the Nisan-Wigderson polynomial family
Speaker: Mr. Nikhil Gupta, M.Sc. (Engg) student, Dept. of CSA, IISc
Faculty Advisor: Dr. Chandan Saha
Date and Time: Thursday, November 30, 2017, 11:30 AM
Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract

Understanding the structure and complexity of a polynomial family is a fundamental problem
in arithmetic circuit complexity. There are various approaches to this, such as studying lower bounds,
which deal with finding the smallest circuit required to compute a polynomial, or studying the
orbit and stabilizer of a polynomial with respect to invertible transformations.
We have a rich understanding of some well-known polynomial families such as the determinant,
the permanent, and IMM. In this thesis we study some structural properties of the polynomial
family called the Nisan-Wigderson polynomial family. This polynomial family is inspired
by a well-known combinatorial design called the Nisan-Wigderson design, and has recently been used to
prove strong lower bounds for some restricted classes of arithmetic circuits
([Kayal-Limaye-Saha-Srinivasan (2014)], [Kayal-Saha-Saptharishi (2014)], [Kayal-Saha-Tavenas (2016)]).
But unlike the determinant, permanent or IMM, our understanding of the Nisan-Wigderson polynomial family is
inadequate. For example, we do not know whether this polynomial family is in VP, is VNP-complete, or is
VNP-intermediate (assuming VP is not equal to VNP), nor do we have an understanding of the complexity of
its equivalence test. We hope that knowledge of some of the inherent properties of the Nisan-Wigderson
polynomial, such as its group of symmetries and Lie algebra, will provide us some insight in this regard.
An invertible n×n matrix A is called a symmetry of an n-variate polynomial f if
f(A·X) = f(X), where X is the vector of the n variables. The set of symmetries of f forms a subgroup
of the general linear group, known as the group of symmetries of f and denoted G_f.
A vector space is attached to G_f to get a complete understanding of the symmetries of f.
This vector space is known as the Lie algebra of the group of symmetries of f (or the Lie algebra of f),
denoted L_f. The Lie algebra of f contributes some elements of G_f, known as the continuous
symmetries of f. Lie algebras have also been instrumental in designing efficient randomized
equivalence tests for some polynomial families like the determinant, permanent and IMM
([Kayal (2011)], [Kayal-Nair-Saha-Tavenas (2017)]).
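As a toy illustration of the symmetry definition f(A·X) = f(X), one can check numerically that a permutation matrix is a symmetry of a small symmetric polynomial (the polynomial e2 below is a stand-in chosen for brevity, not the Nisan-Wigderson polynomial itself):

```python
import random

# Illustrative polynomial: the elementary symmetric polynomial e2 in three
# variables stands in for the (much larger) Nisan-Wigderson polynomial.
def f(x):
    return x[0]*x[1] + x[0]*x[2] + x[1]*x[2]

def apply(A, x):
    """Matrix-vector product A·x."""
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

# The permutation matrix swapping the first two variables is a symmetry:
# f(A·x) = f(x) at every point x, hence as polynomials.
P = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
for _ in range(100):
    x = [random.uniform(-1, 1) for _ in range(3)]
    assert abs(f(apply(P, x)) - f(x)) < 1e-9
```

For e2 every permutation matrix is a symmetry; part of the thesis's contribution is pinning down which such discrete symmetries survive for the Nisan-Wigderson family.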
In this work we completely characterize the Lie algebra of the Nisan-Wigderson polynomial family.
We show that L_NW contains diagonal matrices of a specific type. The Lie algebra L_NW turns
out to be a proper linear subspace of L_Perm, the Lie algebra of the symbolic permanent
polynomial. The knowledge of L_NW not only helps us completely determine the continuous
symmetries of the Nisan-Wigderson polynomial family, but also gives crucial insights into
its other symmetries (i.e., the discrete symmetries). Thereafter,
using the Hessian matrix of the Nisan-Wigderson polynomial and the concept of evaluation dimension, we
almost completely identify the structure of G_NW. In particular, we prove that any A in G_NW is
a product of diagonal matrices and permutation matrices of a certain kind that we call block-permuted permutation matrices.
Finally, we give explicit examples of nontrivial block-permuted permutation matrices using automorphisms
of the finite field, which establishes the richness of the discrete symmetries of the Nisan-Wigderson polynomial family.
 Series: Department Seminar Title: Scalable Dynamic Analysis of Large Linear Hybrid Systems  Speaker: Sridhar Parasara Duggirala
 Date and Time: Tuesday, November 28, 2017, 12:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Safety-critical Cyber-Physical Systems (CPS), such as
air-traffic control systems, autonomous vehicles, medical devices,
etc., should always satisfy their respective safety
specifications. Violation of such safety specifications can lead to loss
of property or life. One of the widely used techniques for ensuring
that a CPS satisfies its safety requirements is to model the CPS as a
hybrid automaton and apply formal verification techniques. The hybrid
automaton captures the evolution of both the discrete and continuous
components of the CPS. A widely used subclass of hybrid automata is
linear hybrid automata, where the continuous evolution of the system is
given by solutions of linear differential equations. This talk will be
about a new verification technique for linear hybrid automata.
In this talk, I will present a new development in dynamic analysis
techniques for linear hybrid system verification. In this technique,
verification of an n-dimensional system requires a mere n+1
sample simulations, which greatly improves the scalability of
verification. I will present the basic algorithm for
efficient dynamic analysis, its enhancements, and verification
results for systems with dimensions ranging from 10 to 10,000.
Speaker Bio: Sridhar Parasara Duggirala is an Assistant Professor in the
Department of Computer Science and Engineering at the University of
Connecticut (UConn). His main research interests are in Cyber-Physical
Systems, Formal Methods, and Control Theory. His paper titled "Safety
Verification of Linear Control Systems" won the ACM SIGBED best paper
award at the 2013 International Conference on Embedded Software (conducted
as part of ESWeek). He received his PhD (2015) from the University of
Illinois at Urbana-Champaign (UIUC) and his B.Tech. (2009) from the
Indian Institute of Technology Guwahati.
Host Faculty: Deepak D'Souza
 Series: M.Sc. (Engg) Thesis Defense Title: Compilation of Graph Algorithms for Hybrid, Cross-Platform and Distributed Architectures  Speaker: Parita Patel
 Faculty Advisor: Y.N. Srikant
 Date and Time: Monday, November 27, 2017, 10:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Graph algorithms are abundantly used in various disciplines. These algorithms perform poorly due to random memory accesses and negligible spatial locality. To improve performance, the parallelism exhibited by these algorithms can be exploited by leveraging modern high-performance parallel computing resources. Implementing graph algorithms on these parallel architectures requires manual thread management and memory management, which becomes tedious for a programmer.
Large-scale graphs cannot fit into the memory of a single machine. One solution is to partition the graph, either across multiple devices of a single node or across multiple nodes of a distributed network. All the available frameworks for such architectures demand unconventional programming, which is difficult and error-prone.
To address these challenges, we propose a framework for the compilation of graph algorithms written in an intuitive graph domain-specific language, Falcon. The framework targets shared-memory parallel architectures, computational accelerators and distributed architectures (CPU and GPU clusters). First, it analyses the abstract syntax tree (generated by Falcon) and gathers essential information. Subsequently, it generates optimized OpenCL code for shared-memory parallel architectures and computational accelerators, and OpenCL coupled with MPI for distributed architectures. The motivation behind generating OpenCL code is its platform-agnostic and vendor-agnostic behaviour, i.e., it is portable to all kinds of devices. Our framework makes memory management, thread management, message passing, etc., transparent to the user. None of the available domain-specific languages, frameworks or parallel libraries handles portable implementations of graph algorithms.
Experimental evaluations demonstrate that the generated code performs comparably to state-of-the-art non-portable and hand-tuned implementations. The results also show the portability and scalability of our framework.
Speaker Bio: Parita Patel is an M.Sc(Engg.) student in the CSA department.
Host Faculty: Y.N. Srikant
 Series: M.Sc. (Engg) Thesis Defense Title: On the gap between outcomes of voting rules  Speaker: Ms. Anurita Mathur
M.Sc. student
Dept. of CSA  Faculty Advisor: Dr. Arnab Bhattacharyya
 Date and Time: Friday, November 24, 2017, 2:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Through the course of history, a variety of voting rules have been used to
determine election outcomes. The traditional way in social choice theory to
evaluate a voting rule is by checking whether it satisfies axioms deemed to
be desirable for voting rules. Celebrated results in social choice theory
say that even for a small set of seemingly necessary axioms, no voting rule
exists satisfying them. In the face of such impossibility results, it
becomes challenging to justify why certain voting rules work well in
practice. Although in theory these rules may yield drastically different
outcomes, for real-world datasets, behavioural social choice analyses have
found that the rules are often in perfect agreement with each other! This
work attempts to give a mathematical explanation of this phenomenon.
In this work, we formulate a quantitative approach towards comparing voting
rules by viewing them as two players engaged in a zerosum game. If rule A
selects candidate c_1 while rule B selects candidate c_2, the payoff for A
is the number of voters who prefer c_1 to c_2 minus the number who prefer
c_2 to c_1. The optimal voting rule according to this criterion
(corresponding to the optimal randomized strategy for the game) is the
"gametheoretic rule" (GT) while the best deterministic strategy is the
wellknown Minimax voting rule. We investigate rigorously how various
common voting rules fare against each other in terms of the minimum payoff
they receive for arbitrary voting profiles. We also study the case when the
voting profiles are drawn from a mixture of multinomial logit
distributions. This scenario corresponds to a population consisting of a
small number of groups, each voting according to a latent preference
ranking. We supplement the theoretical findings by empirically comparing
the payoff of voting rules when they are applied to user preferences for
movies as determined from the Netflix competition dataset. On this dataset,
we find that the Minimax rule, the Schulze rule, and a deterministic
variant of the GT rule perform the best in our framework.
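The pairwise payoff underlying this game can be sketched in a few lines. The three-candidate profile below is a hypothetical example, not data from the thesis; it shows a case where plurality's winner loses head-to-head, so its payoff is negative:

```python
# Toy profile: each ballot ranks candidates best-to-worst.
profile = [("a", "b", "c")] * 4 + [("b", "c", "a")] * 3 + [("c", "b", "a")] * 2

def payoff(c1, c2, profile):
    """Payoff for a rule selecting c1 against a rule selecting c2:
    voters preferring c1 to c2, minus voters preferring c2 to c1."""
    if c1 == c2:
        return 0
    return sum(1 if b.index(c1) < b.index(c2) else -1 for b in profile)

# Plurality picks "a" (4 first-place votes), but only 4 of the 9 voters
# prefer "a" to "b", so against a rule that picks "b" its payoff is 4 - 5.
print(payoff("a", "b", profile))  # → -1
```

The Minimax rule in the abstract is exactly the deterministic choice maximizing the worst case of this payoff over opponents.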
 Series: Ph.D. Thesis Defense Title: Stochastic Recursive Inclusions: Theory and Applications  Speaker: Mr. Arunselvan R
Ph.D. student
Dept. of CSA  Faculty Advisor: Prof. Shalabh Bhatnagar
 Date and Time: Friday, November 24, 2017, 11:30 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Stochastic approximation algorithms encompass a class of iterative schemes that converge to a sought value
through a series of successive approximations, even with erroneous observations. Such algorithms
arise naturally in reinforcement learning, game theory, optimization, financial mathematics and cyber-physical
systems. The first stochastic approximation algorithm was developed by Robbins and Monro in 1951 to solve
the root-finding problem. In 1977 Ljung showed that the asymptotic behaviour of a stochastic approximation
algorithm can be studied by associating with it a deterministic ODE, called the associated ODE, and studying its
asymptotic behaviour instead. In 1996 Benaim, and Benaim and Hirsch, used the dynamical-systems approach
to develop a framework for analyzing generalized stochastic approximation algorithms. One bottleneck
in deploying this framework was the requirement of stability (almost sure boundedness) of the iterates. In 1999
Borkar and Meyn developed a unified set of assumptions that guaranteed both stability and convergence of
stochastic approximations. However, the aforementioned frameworks did not account for scenarios with
set-valued mean fields. In 2005 Benaim, Hofbauer and Sorin showed that the dynamical-systems approach to
stochastic approximations can be extended to scenarios with set-valued mean fields. Again, stability of the
iterates was assumed.
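The Robbins-Monro scheme mentioned above can be sketched in a few lines; the target function h and the step sizes here are illustrative choices, not ones from the thesis:

```python
import random

random.seed(0)

# Robbins-Monro root finding: locate x* with h(x*) = 0 from noisy
# evaluations only. Toy choice: h(x) = 2*(x - 3), so the root is x* = 3.
def noisy_h(x):
    return 2 * (x - 3) + random.gauss(0, 0.5)

x = 0.0
for n in range(1, 20001):
    a_n = 1.0 / n              # step sizes: sum a_n = inf, sum a_n^2 < inf
    x -= a_n * noisy_h(x)      # successive approximation with noisy data

print(abs(x - 3) < 0.05)       # True: the iterates converge to the root
```

The associated ODE here is dx/dt = -h(x), whose unique stable equilibrium is the root; the frameworks surveyed above generalize exactly this ODE viewpoint.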
The main focus of this talk will be on stochastic approximation methods with setvalued dynamical systems
(stochastic recursive inclusions), with applications to reinforcement learning and optimization. In the first
part of this talk we will discuss an extension of the BorkarMeyn theorem to the scenario of stochastic
recursive inclusions. As an application of this framework we will discuss a solution
to the `approximate drift problem'. In the second part of this talk we will discuss the problem of gradient
methods with errors. We will discuss a framework for the stability and convergence of gradient methods,
even when the errors are `nondiminishing'. Our work extends the results of Bertsekas and Tsitsiklis, and
Mangasarian and Solodov. Using our framework, we present a fast and hasslefree implementation of
Simultaneous Perturbation Stochastic Approximation (SPSA), using constant sensitivity parameters. It is
worth noting that our implementation `decouples' the sensitivity parameters and stepsizes.
We will also present experimental results in support of our theory. In the third part of this talk, we will
discuss stochastic recursive inclusions with two timescales. We use our framework to develop a
two-timescale algorithm to solve the constrained Lagrangian dual optimization problem. Due to the generality of
our framework, the class of objective and constraint functions is enlarged, when
compared to previous literature. In the final part of this talk, we will discuss a framework for stability
and convergence of stochastic approximations with ‘controlled Markov processes’. Using this framework we
present a relaxation of the assumptions for the analysis of temporal difference learning algorithms (an
important policy evaluation scheme in reinforcement learning). It may be noted that our assumptions for
stability (of temporal difference learning) are compatible with previously existing frameworks. Further, the
conditions are weakened twofold: (a) the Markov process is no longer required to evolve in a
finite state space, and (b) the state process is not required to be ergodic under a given stationary policy.
 Series: Department Seminar Title: Concentration Bounds for Stochastic Approximation with Applications to Reinforcement Learning  Speaker: Mr. Gugan Thoppe
 Date and Time: Thursday, November 23, 2017, 3:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Stochastic Approximation (SA) is useful when the aim is to find optimal points, or zeros of a function, given only its noisy estimates. In this talk, we will review our recent advances in techniques for SA analysis. The talk has four major parts. In the first part, we will see a motivating application of SA to network tomography, and we shall discuss the convergence of a novel stochastic Kaczmarz method. In the second part, we shall discuss a novel tool based on Alekseev's formula to obtain the rate of convergence of a nonlinear SA to a specific solution, given that there are multiple locally stable solutions. In the third part, we shall extend the previous tool to the two-timescale but linear SA setting. We shall also discuss how this tool applies to gradient Temporal Difference (TD) methods such as GTD(0), GTD2, and TDC used in reinforcement learning. For the analyses in the second and third parts to hold, the initial step size must be chosen sufficiently small, depending on unknown problem-dependent parameters. This is often impractical. In the fourth part, we shall discuss a trick to obviate this in the context of the one-timescale, linear TD(0) method. We strongly believe that this trick is generalizable. We also provide here a novel expectation bound. We shall end with some future directions.
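For reference, the classical randomized Kaczmarz iteration that the stochastic variant builds on can be sketched as follows; this is a plain noiseless version on synthetic data, whereas the talk's method additionally handles noisy row observations:

```python
import numpy as np

# Randomized Kaczmarz: solve Ax = b by repeatedly projecting the iterate
# onto the hyperplane of one randomly chosen equation a_i · x = b_i.
rng = np.random.default_rng(1)
A = rng.normal(size=(50, 5))       # overdetermined consistent system
x_true = rng.normal(size=5)
b = A @ x_true

x = np.zeros(5)
probs = (A**2).sum(axis=1) / (A**2).sum()   # rows sampled prop. to norm^2
for _ in range(2000):
    i = rng.choice(len(b), p=probs)
    a = A[i]
    x += (b[i] - a @ x) / (a @ a) * a       # orthogonal projection step

print(np.allclose(x, x_true, atol=1e-6))    # True
```

Each projection is itself a stochastic-approximation step with a set of solutions as its target, which is what connects this method to the SA machinery of the talk.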
Speaker Bio: Gugan Thoppe did his Ph.D. in Systems Science under Prof. Vivek Borkar at TIFR, Mumbai, India. His work won the TAA-Sasken best thesis award for 2017. He is also a two-time recipient of the IBM Ph.D. fellowship award (2013-14 and 2014-15). He recently finished a two-year postdoc position with Prof. Robert Adler at the Technion, Haifa, Israel. From December, he will be starting another postdoc with Prof. Sayan Mukherjee at Duke University, North Carolina, USA. His research interests include stochastic approximation and its applications to learning algorithms, and stochastic algebraic topology.
Host Faculty: Prof. Shalabh Bhatnagar
 Series: Department Seminar Title: Accountability for Social Values in DataDriven Systems  Speaker: Prof. Anupam Datta
Carnegie Mellon University
USA  Date and Time: Wednesday, November 22, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Our vision is to enable data-driven systems to be accountable for social values, such as privacy
and fairness. We use the term “accountable” to refer to computational mechanisms that can be
used to “account for” behaviors of systems to support detection of privacy and fairness
violations, as well as explain how they came about. We then leverage this understanding to
repair systems and carefully account for the absence of violations.
As an illustration of this paradigm, I will introduce a notion of proxy non-discrimination and
methods for making machine learning models accountable for this property. A key building block
for these notions is Quantitative Input Influence, a form of explanation for decisions of machine
learning models. This work is motivated by recent discoveries of unfairness or bias in widely
used data-driven systems, including our own work that discovered gender bias in the targeting
of job-related ads in the Google advertising ecosystem.
The talk is based on results with colleagues at CMU published at the 2016 IEEE Symposium on
Security and Privacy and the 2017 ACM Conference on Computer and Communications Security.
Speaker Bio: Anupam Datta is an Associate Professor (with tenure) at Carnegie Mellon University where he holds a primary appointment in the Electrical and Computer Engineering Department and a courtesy appointment in the Computer Science Department. His research area is security and privacy, with an emphasis on bridging theory and practice.
Datta's current focus is on information accountability: foundations and tools that can be used to provide oversight of complex information processing ecosystems (including big data systems) to examine whether they respect privacy and other desirable values in the personal data protection area, such as fairness and transparency. His research group produced the first complete logical specification and audit of all disclosure-related clauses of the HIPAA Privacy Rule for healthcare privacy. His group's work with Microsoft Research produced the first automated privacy compliance analysis of the production code of an Internet-scale system: the big data analytics pipeline for Bing, Microsoft's search engine.
Datta has also made significant contributions to the research area of compositional security. Specifically, his work led to new principles for securely composing cryptographic protocols and their application to several protocol standards, most notably to the IEEE 802.11i standard for wireless authentication and to attestation protocols for trusted computing. Datta has authored a book and over 50 other publications on these topics.
He serves as an Editor-in-Chief of Foundations and Trends in Privacy and Security, an Associate Editor of the Journal of Computer Security and the Journal of Computer and System Sciences, and served as the 2013-14 Program Co-Chair of the IEEE Computer Security Foundations Symposium. Datta obtained his Ph.D. (2005) and M.S. (2002) degrees from Stanford University and a B.Tech. (2000) from IIT Kharagpur, all in Computer Science.
Host Faculty: Prof. Vinod Ganapathy
 Series: M.Sc. (Engg) Colloquium Title: Checking Observational Purity of Procedures  Speaker: Himanshu Arora
 Faculty Advisor: K. V. Raghavan
 Date and Time: Monday, November 20, 2017, 4:00 PM
 Venue: CSA Conference Room (Room No. 225, First Floor)
Abstract We provide two static analyses (using theorem proving) that check whether a given (recursive) procedure
behaves as if it were stateless, even when it maintains state in global variables. In
other words, we check whether the given procedure behaves like a mathematical function. In order to
eliminate the need for manual annotations, we define a syntactic invariant that represents the reachable global states in all runs of the program. This syntactic invariant makes use of function
symbols, which aids automated construction of the invariant. The two static analyses are
an existential checker and an impurity witness checker. The impurity witness checker outputs
a formula whose unsatisfiability implies that the
procedure is observationally pure, whereas the existential checker outputs a formula that constrains the definition of the function that the given procedure may implement. Satisfiability
of the formula generated by the existential checker implies that the given procedure is observationally pure. The impurity witness approach works better (empirically) with SMT solvers,
whereas the existential approach is more precise on paper. We illustrate our work on examples such as
matrix chain multiplication. Examples such as these are not addressable by related techniques in the literature.
The closest work to ours is by Barnett et al.; this work cannot handle procedures
with self-recursion. We prove both our approaches sound. We have implemented the
two static analyses using the Boogie framework and the Z3 SMT solver, and have evaluated our implementation on a number of examples.
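A classic instance of the property being checked is a memoized procedure: it writes to a global cache, yet every call with the same argument returns the same value, so it behaves like a mathematical function. (This is a generic illustration, not one of the thesis benchmarks.)

```python
# A recursive procedure that maintains state in a global variable yet is
# observationally pure: callers can never distinguish it from a pure fib.
_cache = {}

def fib(n):
    if n in _cache:
        return _cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    _cache[n] = result   # impure write to a global, invisible to callers
    return result

print(fib(10))  # → 55, on every call, regardless of the cache's contents
```

Proving this automatically is nontrivial precisely because the procedure is self-recursive and the cache invariant ("_cache maps k to fib(k)") must be discovered, which is the gap the thesis addresses.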
 Series: M.Sc. (Engg) Thesis Defense Title: Feature Selection under Multicollinearity & Causal Inference on Time Series  Speaker: Mr. Indranil Bhattacharya
M.Sc. student
Dept. of CSA  Faculty Advisor: Dr. Arnab Bhattacharyya
 Date and Time: Monday, November 20, 2017, 2:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract In this work, we study and extend algorithms for the Sparse Regression and Causal Inference problems. Both
problems are fundamental in the area of Data Science.
The goal of the regression problem is to find the "best" relationship between an output variable and
input variables, given samples of the input and output values. We consider sparse regression under a
high-dimensional linear model with strongly correlated variables, a situation which cannot be handled well by
many existing model selection algorithms. We study the performance of the popular feature selection
algorithms such as LASSO, Elastic Net, BoLasso, Clustered Lasso as well as Projected Gradient Descent
algorithms under this setting in terms of their running time, stability and consistency in recovering the
true support. We also propose a new feature selection algorithm, BoPGD, which first clusters the features
based on their sample correlation and then does sparse estimation using a bootstrapped variant
of the projected gradient descent method, with projection onto the non-convex L0 ball. We attempt to
characterize the efficiency and consistency of our algorithm by performing a host of experiments on both
synthetic and real world datasets.
Discovering causal relationships, beyond mere correlation, is widely recognized as a fundamental
problem. The Causal Inference problems use observations to infer the underlying causal structure of the
data generating process. The input to these problems is either a multivariate time series or i.i.d. sequences,
and the output is a Feature Causal Graph where the nodes correspond to the variables and edges capture
the direction of causality. For high dimensional datasets, determining the causal relationships becomes a
challenging task because of the curse of dimensionality. Graphical modeling of temporal data based on the
concept of “Granger Causality” has gained much attention in this context. The blend of Granger methods
along with model selection techniques, such as LASSO, enables efficient discovery of a “sparse” subset
of causal variables in high dimensional settings. However, these temporal causal methods use an input
parameter, L, the maximum time lag. This parameter is the maximum gap in time between the occurrence
of the output phenomenon and the causal input stimulus. However, in many situations of interest, the
maximum time lag is not known, and indeed, finding the range of causal effects is an important problem.
In this work, we propose and evaluate a datadriven and computationally efficient method for Granger
causality inference in the Vector Auto Regressive (VAR) model without foreknowledge of the maximum
time lag. We present two algorithms, Lasso Granger++ and Group Lasso Granger++, which not only
construct the hypothesis feature causal graph, but also simultaneously estimate a value of the max-lag (L̂)
for each variable by balancing the trade-off between "goodness of fit" and "model complexity".
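The Granger idea with a fixed max lag L can be sketched as follows: lagged values of x Granger-cause y if they reduce the prediction error for y beyond y's own lags. The data here are synthetic, and unlike the thesis algorithms this sketch takes L as given rather than estimating L̂:

```python
import numpy as np

rng = np.random.default_rng(0)
T, L = 500, 2                       # series length and a fixed max lag
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.9 * x[t - 2] + 0.1 * rng.normal()   # x drives y at lag 2

def resid_var(target, regs):
    """Residual variance of least-squares regression on the given lags."""
    A = np.column_stack(regs)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return ((target - A @ coef) ** 2).mean()

lag = lambda s, k: s[L - k:T - k]   # series shifted by k, aligned to y[L:]
own = [lag(y, 1), lag(y, 2)]
full = own + [lag(x, 1), lag(x, 2)]
v_own = resid_var(y[L:], own)
v_full = resid_var(y[L:], full)
print(v_full < v_own)               # True: lagged x Granger-causes y
```

In high dimensions the least-squares fit above is replaced by a LASSO (or group LASSO) fit, which is what makes the recovered causal graph sparse.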
 Series: M.Sc. (Engg) Thesis Defense Title: Large Scale Graph Processing in a Distributed Environment  Speaker: Mr. Nitesh Chandra Upadhyay
M.Sc. (Engg) Student
Dept of CSA
IISc  Faculty Advisor: Prof. Y.N. Srikant
 Date and Time: Monday, November 20, 2017, 12:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Graph algorithms are used ubiquitously across domains. They exhibit parallelism,
which can be exploited on parallel architectures such as multicore processors
and accelerators. However, real-world graphs are massive in size and cannot fit
into the memory of a single machine. Such large graphs are partitioned and
processed in a distributed cluster environment which consists of multiple GPUs
and CPUs.
Existing frameworks that facilitate large-scale graph processing on distributed
clusters have their own styles of programming and require extensive involvement by
the user in communication and synchronization aspects. Adapting to these
frameworks is an overhead for the programmer. Furthermore, these frameworks
have been developed to target only CPU clusters and lack the ability to harness
GPU architectures.
We provide a backend framework for the graph domain-specific language, Falcon,
for large-scale graph processing on CPU and GPU clusters. The motivation behind
choosing this DSL as a frontend is its shared-memory-based imperative
programmability. Our framework generates Giraph code for CPU clusters; Giraph code runs
on a Hadoop cluster and is known for scalable and fault-tolerant graph processing.
For GPU clusters, our framework applies a set of optimizations to reduce computation
and communication latency, and generates efficient CUDA code coupled with MPI.
Experimental evaluations show the scalability and performance of our framework for
both CPU and GPU clusters. The performance of the framework-generated code is
comparable to manual implementations of various algorithms in distributed
environments.
 Series: Department Seminar Title: Actively Secure Garbled Circuits with Constant Communication Overhead in the Plain Model  Speaker: Dr. Carmit Hazay
Senior Lecturer
Computer Engineering Department
Bar-Ilan University
Israel  Date and Time: Wednesday, November 15, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract We consider the problem of constant-round secure two-party computation in the presence of active
(malicious) adversaries. We present the first protocol that has only a constant
multiplicative communication overhead compared to Yao's protocol for passive adversaries, and can be
implemented in the plain model by making only black-box use of (parallel) oblivious transfer and a
pseudorandom generator. This improves over the polylogarithmic overhead of the previous best protocol. A
similar result could previously be obtained only in an amortized setting, using preprocessing, or by
assuming bit oblivious transfer as an ideal primitive that has a constant cost.
We present two variants of this result, one which is aimed at minimizing the number of oblivious
transfers and another which is aimed at optimizing concrete efficiency. Our protocols are based on a
novel combination of previous techniques together with a new efficient protocol to certify that pairs
of strings transmitted via oblivious transfer satisfy a global relation. The communication complexity
of the second variant of our protocol can beat the best previous protocols even for realistic values of
the circuit size and the security parameter. This variant is particularly attractive in the offline-online
setting, where the online cost is dominated by a single evaluation of an authenticated garbled circuit,
and can also be made non-interactive using the Fiat-Shamir heuristic.
This is a joint work with Yuval Ishai and Muthuramakrishnan Venkitasubramaniam.
Speaker Bio: Carmit Hazay is a Senior Lecturer in the Computer Engineering Department, Bar-Ilan University, Israel. In 2009 she completed her Ph.D. studies at Bar-Ilan University and started her postdoctoral training at the Weizmann Institute of Science and IDC (2009-2010) and at Aarhus University, Denmark (2010-2012). She joined the Faculty of Engineering at Bar-Ilan University in October 2012.
Host Faculty: Dr. Arpita Patra

