Events 

Seminars 

Prof. I. G. Sarma Memorial Lecture Series 

Prof. Priti Shankar Memorial Seminar Series 

Multimedia Archive 



UPCOMING SEMINARS 

Series: Department Seminar Title: Data Driven Precondition Inference for Compiler Optimizations  Speaker: Dr. Santosh Nagarakatte
Assistant Professor
Department of Computer Science, Rutgers University  Date and Time: Thursday, August 03, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Peephole optimizations are a common source of compiler bugs.
Compiler developers typically transform an incorrect peephole
optimization into a valid one by strengthening the precondition. This
process is challenging and tedious. In this talk, I will describe our
new datadriven approach for inferring preconditions for peephole
optimizations expressed in Alive. Our main idea is to generate positive
and negative examples for an optimization in a context without tests,
enumerate predicates on demand, and learn a set of predicates that
separate the positive and negative examples. We repeat this process
until we are able to find a precondition that ensures the validity of
the optimization. Our prototype, ALIVEINFER, reports both a weakest
precondition and a set of succinct partial preconditions to the
developer. Our prototype generates preconditions that are weaker than
LLVM’s preconditions for majority of the optimizations in the Alive
suite. The talk will also demonstrate the applicability of this
technique to generalize optimization patterns generated by Souper, an
LLVM IR–based superoptimizer.
Speaker Bio: Santosh Nagarakatte is an Assistant Professor of Computer Science
at Rutgers University. He obtained his PhD from the University of
Pennsylvania in 2012. His research interests are in HardwareSoftware
Interfaces spanning Programming Languages, Compilers, Software
Engineering, and Computer Architecture. His papers have been selected as
IEEE MICRO TOP Picks papers of computer architecture conferences in 2010
and 2013. He has received the NSF CAREER Award in 2015, ACM SIGPLAN PLDI
2015 Distinguished Paper Award, and ACM SIGSOFT ICSE 2016 Distinguished
Paper Award for his research on LLVM compiler verification. His papers
have been selected as the 2016 SIGPLAN Research Highlights Paper and
2017 Communication of the ACM Research Highlights Paper.
Host Faculty: Prof. Vinod Ganapathy
 Series: Ph.D. Colloquium Title: Program analyses to support memorysaving refactorings in Java programs  Speaker: Girish Maskeri Rama
 Faculty Advisor: K. V. Raghavan
 Date and Time: Friday, July 28, 2017, 11:30 AM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Software commonly consumes unexpectedly high amounts of memory, frequently
due to programming idioms that are used to make software more reliable,
maintainable and understandable. In the case of modern objectoriented
systems this problem is partly due to creation of large numbers of
coexisting *isomorphic* objects. Intuitively, two objects are
isomorphic if they are of the same type, have identical values in
corresponding primitive fields, and are such that corresponding reference
fields themselves point to isomorphic objects. In other words, the portions
of memory rooted at the two objects are isomorphic shapewise as well as
valueswise. A significant reduction in heap usage can therefore be
achieved if the code is refactored to *deduplicate* or *share*
objects whenever possible instead of always creating distinct but possibly
isomorphic objects. Such a refactoring, which employs a cache to keep track
of objects created so far and to share them, is termed as objectsharing
refactoring. In practice, objectsharing refactoring is commonly used, as
it directly impacts the attributes of software that end users care about 
memory utilization and performance. However, manual refactoring is tedious
and error prone.
To support objectsharing refactoring we have (1) designed and implemented
an approach to estimate memorysavings potential due to this refactoring,
(2) espoused the idea of *full initialization points* of objects, and a
static analysis to identify these points, and (3) proposed a scalable
refinementbased pointsto analysis for verifying the safety of
objectsharing refactoring, but which has general applicability to several
other verification tasks as well.
(1) We present a dynamic analysis technique for *estimating* for all the
allocation sites in a program, for a given input, the reduction in heap
memory usage (in bytes, or as a percentage) to be obtained by employing
object sharing at each site. The quantitative estimates produced by our
technique of a userobservable benefit (i.e., actual memory savings) make
it easier for developers to select sites to refactor. Experimentation with
our estimation tool on real Java programs indicate that nearly all
applications have potential for reduction of memory usage by object
sharing, with up to 37% estimated reduction in heap footprint in the best
case.
(2) We define a novel concept termed fullinitialization points (FIPs) to
characterize the points in the program where objects allocated at any
chosen allocation site become fully initialized. This notion supports
objectsharing refactoring as follows  the FIPs are good candidate program
points where the cache lookup can be performed to find isomorphic objects
and share them. Additionally, we argue that FIPs are useful in the context
of other allocationsite refactorings, such as immutability refactoring,
and refactoring to the builder pattern. We present a novel and
conservative static analysis to detect FIPs for a given allocation site. By
introducing code to cache and share objects at the FIPs suggested by our
analysis, objectsharing refactoring was able to obtain a mean memory
savings of 11.4% on a set of real Java benchmarks. Immutability refactoring
guided by our analysis achieved a mean runtime speedup of 1.6X compared to
performing the same refactoring using a baseline approach that did not make
use of FIPs.
(3) A standard pointsto analysis approach, namely, the object sensitivity
approach, uses an equal level of precision to represent symbolic objects
allocated at all allocation sites in a program. This approach does not
scale to large programs unless low levels of precision are used. We show
how to use a user's query of interest to identify a limited set of
allocation sites that need a higher level of precision, and use higher
precision only at these sites. We have implemented our
approach. Experiments using our implementation show that for the task of
checking safety of applying objectsharing refactoring at each site, our
approach gives mean improvement in precision of 4.4% over the previous
approach, on a set of real benchmarks. For another wellknown verification
task, namely downcast safety analysis, our approach results in mean
precision improvement of 11.9% over the previous approach.
Speaker Bio: Girish Rama is employed at Infosys Ltd., and is a PhD student at CSA, IISc. His research interests are in software refactoring, software modularity and metrics, and program analysis techniques.
 Series: Ph.D. Thesis Defense Title: Power Issues in SoCs: Power Aware DFT Architecture and Power Estimation  Speaker: Mr. Jaynarayan Thakurdas Tudu
Ph.D student
Dept. of CSA, IISc  Faculty Advisor: Prof. Matthew Jacob Thazhuthaveetil
 Date and Time: Friday, July 28, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Test power, data volume, and test time have been longstanding problems for sequential scan
based testing of systemonchip (SoC) design. The modern SoCs fabricated at lower technology
nodes are complex in nature, the transistor count is as large as billions of gate for some of
the microprocessors. The design complexity is further projected to increase in the coming years
in accordance with Moore’s law. The larger gate count and integration of multiple functionalities
are the causes for higher test power dissipation, test time and data volume. The dynamic power
dissipation during scan testing, i.e. during scan shift, launch and response capture, are major
concerns for reliable as well as cost effective testing. Excessive average power dissipation leads
to a thermal problem which causes burnout of the chip during testing. Peak power on other hand
causes test failure due to power induced additional delay. The test failure has direct impact on
yield. The test power problem in modern 3D stacked based IC is even a more serious issue. Estimating
the worst case functional power dissipation is yet another great challenge. The worst case functional
power estimation is necessary because it gives an upper bound on the functional power dissipation
which can further be used to determine the safe power zone for the test.
Several solutions in the past have been proposed to address these issues. In this thesis
we have three major contributions: 1) Sequential scan chain reordering, and 2) JScanan
alternative Jointscan DFT architecture to address primarily the test power issues along
with test time and data volume, and 3) an integer linear programming methodology to address
the power estimation problem. In order to reduce test power during shift, we have proposed
a graph theoretic formulation for scan chain reordering and for optimum scan shift operation.
For each formulation a set of algorithms is proposed. The experimental results on ISCAS89
benchmark circuit show a reduction of around 25% and 15% in peak power and scan shift time respectively.
In order to have a holistic DFT architecture which could solve test power, test time,
and data volume problems, a new DFT architecture called Jointscan (JScan) have been
developed. In JScan we have integrated the serial and random access scan architectures
in a systematic way by which the JScan could harness the respective advantages from
each of the architectures. The serial scan architecture suffers from test power,
test time, and data volume problems. However, the serial scan is simple in terms
of its functionality and is cost effective in terms of DFT circuitry. Whereas, the
random access scan architecture is opposite to this; it is power efficient and it
takes lesser time and data volume compared to serial scan. However, the random
access scan occupies larger DFT area and introduces routing congestion. Therefore,
we have proposed a methodology to realize the JScan architecture as an efficient
alternative for standard serial and random access scan. Further, the JScan
architecture is optimized and it resulted into a 2Mode 2MJscan Jointscan
architecture. The proposed architectures are experimentally verified on larger
benchmark circuits and compared with existing state of the art DFT architectures.
The results show a reduction of 50% to 80% in test power and 30% to 50% in test
time and data volume. The proposed architectures are also evaluated for routing
area minimization and we obtained a saving of around 7% to 15% of chip area.
Estimating the worst case functional power being a challenging problem, we have
proposed a binary integer linear programming (BILP) based methodology. Two different
formulations have been proposed considering the different delay models namely
zerodelay and unitdelay. The proposed methodology generates a pair or input vectors
which could toggle the circuit to dissipate worst power. The BILP problems are solved
using CPLEX solver for ISCAS85 combinational benchmark circuits. For some of the
circuits, the proposed methodology provided the worst possible power dissipation i.e.
80 to 100% toggling in nets.
 Series: M.Sc. (Engg) Thesis Defense Title: Efficient Whole Program Path Tracing  Speaker: Mr. Sridhar G
M.Sc. (Engg) Student
Dept. of CSA
IISc  Faculty Advisor: Prof. Matthew Jacob Thazhuthaveetil
 Date and Time: Thursday, July 27, 2017, 10:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Obtaining an accurate whole program path (WPP) that captures a program's runtime behavior in
terms of a controlflow trace has a number of wellknown benefits,including opportunities for
code optimization, bug detection, program analysis refinement, etc. Existing techniques to compute
WPPs perform suboptimal instrumentation resulting in significant space and time overheads. Our goal
in this paper is to minimize these overheads without losing precision.
To do so, we design a novel and scalable whole program analysis to determine instrumentation points
used to obtain WPPs. Our approach is divided into three components: (a) an efficient summarization
technique for interprocedural path reconstruction, (b) specialized data structures called conflict
sets that serve to effectively distinguish between pairs of paths, and (c) an instrumentation
algorithm that computes the minimum number of edges to describe a path based on these
conflict sets. We show that the overall problem is a variant of the minimum hitting
set problem, which is NPhard, and employ
various sound approximation strategies to yield a practical solution.
We have implemented our approach and performed elaborate experimentation on Java programs from
the DaCapo benchmark suite to demonstrate the efficacy of our approach across multiple dimensions.
On average, our approach necessitates instrumenting only 9% of the total number of CFG edges in the program.
The average runtime overhead incurred by our approach to collect WPPs is 1.97x, which is
only 26% greater than the overhead induced by only instrumenting edges guaranteed to
exist in an optimal solution. Furthermore, compared to the stateoftheart, we observe a
reduction in runtime overhead by an average and maximum factor of 2.6 and 5.4, respectively.
Speaker Bio: Sridhar Gopinath is a M.Sc.(Engg.) student in the Department of Computer Science
and Automation since Aug 2015. His research interests are in software engineering
and programming languages. He is a member of the Scalable Software Systems lab
in CSA. He received a B.E in Computer Science from SJCE, Mysore in 2015.


PAST SEMINARS 

Series: Ph.D. Colloquium Title: Falcon: A Graph Manipulation Language for Distributed Heterogeneous Systems  Speaker: Unnikrishnan C.
 Faculty Advisor: Y.N. Srikant
 Date and Time: Tuesday, July 25, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Graphs model relationships across realworld entities in web graphs, social network graphs, and road network graphs. Graph algorithms analyze and transform a graph to discover graph properties or to apply a computation. For instance, a pagerank algorithm computes a rank for each page in a webgraph, and a community detection algorithm discovers likely communities in a social network, while a shortest path algorithm computes the quickest way to reach a place from another, in a road network. In Domains such as social information systems, the number of edges can be in billions or trillions. Such large graphs are processed on distributed computer systems or clusters.
Graph algorithms can be executed on multicoreCPUs, GPUs with thousands of cores, multiGPU devices, and CPU+GPU clusters, depending on the size of the graph object. While programming such algorithms on heterogeneous targets, a programmer is required to deal with parallelism and and also manage explicit data communication between distributed devices. This implies that a programmer is required to learn CUDA, OpenMP, MPI, etc., and also the details of the hardware architecture. Such codes are error prone and difficult to debug. A Domain Specific Language (DSL) which hides all the hardware details and lets the programmer concentrate only the algorithmic logic will be very useful.
With this as the research goal, Falcon, graph DSL and its compiler have been developed. Falcon programs are explicitly parallel and Falcon hides all the hardware details from the programmer. Large graphs that do not fit into the memory of a single device are automatically partitioned by the Falcon compiler. Another feature of Falcon is that it supports mutation of graph objects and thus enables programming dynamic graph algorithms. The Falcon compiler converts a single DSL code to heterogeneous targets such as multicore CPU, GPU, multiGPU device, and CPU+GPU cluster. Falcon compiled codes match or outperform stateoftheart graph frameworks for different target platforms and benchmarks.
Speaker Bio: Unnikrishnan is a Ph.D student in the CSA dpeartment. His areas of interest are compilation for heterogeneous architectures and compiler optimizations.
Host Faculty: Y.N. Srikant
 Series: Department Seminar Title: Learning with Limited Rounds of Adaptivity: Coin Tossing,
MultiArmed Bandits, and Ranking from Pairwise Comparisons  Speaker: Mr. Arpit Agarwal
PhD student
Department of Computer & Information Science
University of Pennsylvania  Date and Time: Monday, July 24, 2017, 2:30 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract In many learning settings, active/adaptive querying is
possible, but the number of rounds of adaptivity is limited. We study
the relationship between query complexity and adaptivity in identifying
the kmost biased coins among a set of n coins with unknown biases. This
problem is a common abstraction of many wellstudied problems, including
the problem of identifying the kbest arms in a stochastic multiarmed
bandit, and the problem of topk ranking from pairwise comparisons.
An rround adaptive algorithm for the k most biased coins problem
specifies in each round the set of coin tosses to be performed based on
the observed outcomes in earlier rounds, and outputs the set of kmost
biased coins at the end of r rounds. When r = 1, the algorithm is known
as nonadaptive; when r is unbounded, the algorithm is known as fully
adaptive. While the power of adaptivity in reducing query complexity is
well known, full adaptivity requires repeated interaction with the coin
tossing (feedback generation) mechanism, and is highly sequential, since
the set of coins to be tossed in each round can only be determined after
we have observed the outcomes of the coin tosses from the previous
round. In contrast, algorithms with only few rounds of adaptivity
require fewer rounds of interaction with the feedback generation
mechanism, and offer the benefits of parallelism in algorithmic
decisionmaking. Motivated by these considerations, we consider the
question of how much adaptivity is needed to realize the optimal worst
case query complexity for identifying the k most biased coins. Given any
positive integer r, we derive essentially matching upper and lower
bounds on the query complexity of rround algorithms. We then show that
Θ(log* n) rounds are both necessary and sufficient for achieving the
optimal worst case query complexity for identifying the kmost biased
coins. In particular, our algorithm achieves the optimal query
complexity in at most log* n rounds, which implies that on any realistic
input, 5 parallel rounds of exploration suffice to achieve the optimal
worstcase sample complexity. The best known algorithm prior to our work
required Θ(log n) rounds to achieve the optimal worst case query
complexity even for the special case of k = 1.
Speaker Bio: Arpit is a PhD student at the Department of Computer & Information
Science, University of Pennsylvania. Previously he completed M.E. from
CSA, IISc for which he received the Computer Society of India (Bangalore
Chapter) medal.
Host Faculty: Dr. Siddharth Barman
 Series: Ph.D. Colloquium Title: Data Structures and Algorithms to Analyze Concurrency in Android Applications  Speaker: Ms. Pallavi Maiya H P
PhD Student  Faculty Advisor: Prof. Aditya Kanade
 Date and Time: Tuesday, July 18, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Android is a popular mobile operating system, providing a rich ecosystem for
the development of applications which run on the Android platform. Entities
such as the device user, network and sensors interact continuously with the
mobile device causing an Android application to face a continuous stream of
input (events). In spite of having to perpetually handle a huge volume of
events, Android applications are expected to be responsive to new events
from the environment. The Android framework achieves this by exposing
applications to a concurrency model which combines multithreading with
asynchronous eventbased dispatch. While this enables development of
efficient applications, unforeseen thread interleavings combined with
nondeterministic ordering of events can lead to subtle concurrency bugs in
Android applications.
In an Android application, threads communicate through shared variables
and by sending events to each others event queues. Even though typically
a thread processes events in its event queue and invokes corresponding event
handlers in a FIFO order, this ordering however can be altered by attributes
such as delays and priorities associated with the events. Existing literature
extensively describe techniques to detect concurrency bugs such as data races,
deadlocks and atomicity violations in multithreaded programs. Recent works
also present techniques to analyze bugs manifested due to singlethreaded
eventdriven concurrency. However, the complex eventdriven semantics of
Android applications combined with the threadbased semantics render a naive
extension of such concurrency analysis techniques either less effective or
unsound for Android applications. This thesis takes the initial steps towards
developing data structures and algorithms to facilitate effective dynamic
concurrency analysis of Android applications.
We firstly formalize the concurrency behaviour of Android applications by
giving concurrency semantics. Using this semantics we derive a set of rules
to reason about the happensbefore ordering between operations in an Android
execution trace. These rules induce a partial order called a
happensbefore (HB) relation on the operations of an execution trace. Our HB
relation generalizes the so far independently studied HB relations for
multithreaded programs and singlethreaded eventdriven programs. We have
developed a happensbefore based dynamic race detector for Android
applications, called DroidRacer, which detects data races across threads as
well as race conditions between memory accesses from different event handlers
on the same thread. DroidRacer has detected several races in various Android
applications.
We identify a major bottleneck in the computation of the HB relation for
Android applications, that of discovering HB order between event handlers
executed on the same thread. HB order between events in Android is
characterized by complex HB rules which order a pair of event handlers by
inspecting, for example, the order between operations which enqueued the
corresponding events. Android applications usually receive a large number of
events even within a short span of time, which increases the cost of evaluating
such HB rules. As a result HB computation using standard data structures such
as vector clocks alone becomes inefficient in case of Android applications.
We propose a novel data structure, called "event graph", that maintains a subset
of the HB relation to efficiently infer order between any pair of events. We
present an algorithm, called EventTrack, which improves efficiency of vector
clock based HB computation for eventdriven programs using event graphs.
Evaluation of EventTrack against a stateoftheart HB computation technique
for Android applications, yielded significant speedup.
The scope of happensbefore based dynamic race detector is limited to the
thread interleaving corresponding to the execution trace inspected. However, a
systematic state space exploration technique such as a stateless model checker
can explore all the thread interleavings, and also identify harmful
manifestations of data races which may happen only when multiple racing memory
accesses are performed in a particular order. Partial order reduction (POR) is
a technique used by stateless model checkers to reduce the state space to be
explored while preserving certain interesting program properties. The existing
POR techniques selectively explore different thread interleavings only to
reorder pairs of racing operations from different threads. However, they fail
to follow this strategy for events and hence end up eagerly exploring all
possible ordering between event handlers executed on the same thread, even if
reordering them does not lead to different states. We propose a new POR
technique based on a novel backtracking set called the "dependencecovering
set". Events handled by the same thread are reordered by our POR technique only
if necessary. We prove that exploring dependencecovering sets suffices to
detect all deadlock cycles and assertion violations. To evaluate effectiveness
of our POR scheme, we have implemented a dynamic algorithm to compute
dependencecovering sets. On execution traces obtained from a few Android
applications, we demonstrate that our technique explores many fewer transitions
—often orders of magnitude fewer— compared to exploration based on persistent
sets, a popular POR technique which explores all possible orderings between
events.
The tools, (i) Droidracer, a race detector for Android applications, (ii) a
standalone implementation of EventTrack, to compute HB relation over Android
execution traces, and (iii) EMExplorer  a proofofconcept model checking
framework which simulates the nondeterministic behaviour exhibited by Android
applications, developed by this thesis have been made public.
 Series: M.Sc. (Engg) Colloquium Title: Towards a characterization of the symmetries of
NisanWigderson polynomial family  Speaker: Mr. Nikhil Gupta
M.Sc. (Engg)student
Dept. of CSA
IISc  Faculty Advisor: Dr. Chandan Saha
 Date and Time: Friday, July 14, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Understanding the structure and complexity of a polynomial family is a
fundamental problem of arithmetic circuit complexity. There are various approaches like
studying the lower bounds,which deals with finding the smallest circuit required to compute a
polynomial, studying the orbit and stabilizer of a polynomial with respect to an invertible
transformation etc to do this.
We have a rich understanding of some of the well known polynomial families
like determinant, permanent, IMM etc. In this thesis we study some of the structural
properties of the polynomial family called the NisanWigderson polynomial family. This polynomial
family is inspired from a well known combinatorial design called NisanWigderson design and is
recently used to prove strong lower bounds on some restricted classes of arithmetic circuits
([KayalLimayeSahaSrinivasan(14)], [KayalSahaSaptharishi(14)],[KayalSahaTavenas(16)]).
But unlike determinant, permanent, IMM etc, our understanding of the
NisanWigderson polynomial family is inadequate. For example we do not know if this polynomial family
is in VP or VNP complete or VNPintermediate assuming VP not equal to VNP, nor do we have an
understanding of the complexity of its equivalence test. We hope that the knowledge of some of the inherent
properties of Nisan Wigderson polynomial like group of symmetries and Lie algebra would provide
us some insights in this regard.
An invertible square matrix A of size n^2 is called a symmetry of an
nvariate polynomial f if f(A·X) = f(X), where X is the set of n variables. The set of symmetries of
f forms a subgroup of general linear group, which is also known as group of symmetries of f,
denoted G_f . A vector space is attached to G_f to get the complete understanding of the
symmetries of f. This vector space is known as Lie algebra of group of symmetries of f (or
Lie algebra of f), represented as L_f . Lie algebra of f contributes some elements of G_f,
known as continuous symmetries of f. Lie algebra has also been instrumental in designing
efficient randomized equivalence tests for some polynomial families like determinant, permanent,
IMM etc ([Kayal(11)], [KayalNairSahaTavenas(17)]).
In this work we completely characterize the Lie algebra of NisanWigderson
polynomial family. We show that L_NW contains diagonal matrices of a specific type. The Lie
algebra L_NW turns out to be a proper linear subspace of L_Perm , the Lie algebra of the
symbolic Permanent polynomial. The knowledge of L_NW not only helps us to completely figure
out the continuous symmetries of the NisanWigderson polynomial family, but also gives some
crucial insights into the other symmetries of NisanWigderson polynomial (i.e. the discrete
symmetries). Thereafter using Hessian matrix of NisanWigderson polynomial and the concept of
evaluation dimension, we are able to almost completely identify the structure of G_NW . In
particular we prove that any A in G_NW is a product of diagonal and permutation matrices of certain kind
that we call blockpermuted permutation matrix. Finally, we give explicit examples of
nontrivial blockpermuted permutation matrices using the Frobenious automorphisms that establishes
the richness of the discrete symmetries of the NisanWigderson polynomial family.
 Series: Ph.D. Thesis Defense Title: Grobner Basis Algorithms for Polynomial Ideal
Theory over Noetherian Commutative Rings  Speaker: Ms. Maria Francis
 Faculty Advisor: Prof. Ambedkar Dukkipati
 Date and Time: Thursday, July 13, 2017, 2:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract One of the fundamental problems in commutative algebra and algebraic
geometry is to understand the nature of the solution space of a system of
multivariate polynomial equations over a field $Bbbk$, such as real or
complex numbers. An important algorithmic tool in this study is the notion
of Gr"obner bases citep{Buchberger:1965:thesis}. Given a system of
polynomial equations, $f_1 =0, ldots, f_m =0$, Gr"obner basis is a
``canonical" generating set of the ideal generated by $f_1, ldots, f_m$,
that can answer, constructively, many questions in computational ideal theory.
It generalizes several concepts of univariate polynomials like resultants to
the multivariate case, and answers decisively the ideal membership problem.
The dimension of the solution set of an ideal $mathfrak{a}$ called the
affine variety, an important concept in algebraic geometry, is equal to the
Krull dimension of the corresponding coordinate ring, $Bbbk[x_1, ldots,x_n]/mathfrak{a}$.
Gr"obner bases were first introduced to compute $Bbbk$vector space bases
of $Bbbk[x_1, ldots,x_n]/mathfrak{a}$ and use that to characterize
zerodimensional solution sets. Since then, Gr"obner basis techniques
have provided a generic algorithmic framework for computations in control
theory, cryptography, formal verification, robotics, etc, that involve
multivariate polynomials over fields.
The main aim of this thesis is to study problems related to computational
ideal theory over Noetherian commutative rings (e.g: the ring of integers,
$mathbb{Z}$, the polynomial ring over a field, $Bbbk[y_1, ldots, y_m]$,
etc ) using the theory of Gr"obner bases. These problems surface in many
domains including lattice based cryptography, control systems, systemonchip design, etc.
Although, formal and standard techniques are available for polynomial rings
over fields, the presence of zero divisors and non units make developing
similar techniques for polynomial rings over rings challenging.
Given a polynomial ring over a Noetherian commutative ring, $A$ and an
ideal $mathfrak{a}$ in $A[x_1, ldots, x_n]$, the first fundamental
problem that we study is whether the residue class polynomial ring,
$A[x_1, ldots, x_n]/mathfrak{a}$ is a free $A$module or not.
Note that when $A=Bbbk$, the answer is always `yes' and the
$Bbbk$vector space basis of $Bbbk[x_1, ldots, x_n]/mathfrak{a}$
plays an important role in computational ideal theory over fields.
In our work, we give a Gr"obner basis characterization for $A[x_1,
ldots,x_n]/mathfrak{a}$ to have a free $A$module representation w.r.t. a
monomial ordering. For such $A$algebras, we give an algorithm to compute
its $A$module basis. This extends the MacaulayBuchberger basis theorem to
polynomial rings over Noetherian commutative rings. These results help us
develop a theory of border bases in $A[x_1, ldots,x_n]$ when the residue class polynomial
ring is finitely generated. The theory of border bases is handled as two
separate cases: (i) $A[x_1, ldots,x_n]/mathfrak{a}$ is free and (ii)
$A[x_1, ldots,x_n]/mathfrak{a}$ has torsion submodules.
For the special case of $A = mathbb{Z}$, we show how short reduced
Gr"obner bases and the characterization for a free $A$module
representation help identify the cases when $mathbb{Z}[x_1,
ldots,x_n]/mathfrak{a}$ is isomorphic to $mathbb{Z}^N$ for some $Nin
mathbb{N}$. Ideals in such $mathbb{Z}$algebras are called ideal
lattices. These structures are interesting since this means we can use the
algebraic structure, $mathbb{Z}[x_1, ldots, x_n]/mathfrak{a}$ as a
representation for point lattices and extend all the computationally hard
problems in point lattice theory to $mathbb{Z}[x_1, ldots, x_n]/mathfrak{a}$.
Univariate ideal lattices are widely used in lattice based cryptography for they are a more
compact representation for lattices than matrices. In this thesis, we
give a characterization for multivariate ideal lattices and construct
collision resistant hash functions based on them using Gr"obner
basis techniques. For the construction of hash functions, we define a
worst case problem, shortest substitution problem w.r.t. an ideal in
$mathbb{Z}[x_1,ldots, x_n]$, and establish hardness results for this problem.
Finally, we develop an approach to compute the Krull dimension of
$A[x_1,ldots,x_n]/mathfrak{a}$ using Gr"obner bases, when $A$ is a
Noetherian integral domain. When $A$ is a field, the Krull dimension of
$A[x_1,ldots,x_n]/mathfrak{a}$ has several equivalent algorithmic
definitions by which it can be computed. But this is not true in the case
of arbitrary Noetherian rings. We introduce the notion of combinatorial
dimension of $A[x_1, ldots, x_n]/mathfrak{a}$ and give a Gr"obner
basis method to compute it for residue class polynomial rings that have a
free $A$module representation w.r.t. a lexicographic ordering.
For such $A$algebras, we derive a relation between Krull dimension and
combinatorial dimension of $A[x_1, ldots, x_n]/mathfrak{a}$. For
$A$algebras that have a free $A$module representation w.r.t.
degree compatible monomial orderings, we introduce the concepts of Hilbert
function, Hilbert series and Hilbert polynomials and show that Gr"obner
basis methods can be used to compute these quantities. We then proceed to
show that the combinatorial dimension of such $A$algebras is equal to the
degree of the Hilbert polynomial. This enables us to extend the relation
between Krull dimension and combinatorial dimension to $A$algebras with a
free $A$module representation w.r.t. a degree compatible ordering as well.
 Series: M.Sc. (Engg) Colloquium Title: Fully Resilient NonInteractive Idbased Hierarchical Key
Agreement  Speaker: Mayank Tiwari (MSc Engg.)
 Faculty Advisor: Dr. Sanjit Chatterjee
 Date and Time: Tuesday, July 11, 2017, 12:00 PM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract NonInteractive Key Agreement (NIKA) is a cryptographic
primitive which allows two parties to agree on a shared key without any
interaction. Identitybased NonInteractive Key Agreement (IDNIKA)
allows
each party to compute shared secret key using its own secret key and the
peer's identity. IDNIKA can be used to establish shared secret keys in
Adhoc networks using minimal battery power and communication.
Mobile Adhoc NETwork (MANET) is a network of mobile and moderately
resource constrained devices communicating through a wireless medium.
Examples of standard MANET devices are laptops, cellphones etc. Due to
the
inherent characteristics like mobility, dynamic topology and lack of
centralized infrastructure, MANETs face some serious security issues. We
are particularly interested about IDNIKA in MANETs. This is of crucial
interest for secure communication between two nodes in MANETs.
In 2008, Gennaro et al. introduced a scheme called Hybrid Hierarchical
Key
Agreement Scheme (HHKAS). HHKAS uses a linear hierarchical key
agreement
scheme at the nonleaf levels and a key agreement scheme due to Sakai et
al. (referred as SOKKAS) at the leaf level. HHKAS is (i)
noninteractive,
(ii) identity based, (iii) hierarchical and (iv) fully resilient against
node compromises at leaf level and resilient against node compromises
upto
certain threshold values in nonleaf levels. Thus one can say that
HHKAS
is partially resilient against node compromises. In their paper the
authors claim that there is no key agreement scheme for MANETs in the
literature, with all above four properties. This was motivated as an
interesting open problem in this area.
Guo et al. proposed a scheme known as Strong Key Agreement Scheme (SKAS)
in 2012. The authors claimed it a potential solution to the open problem
posed by Gennaro et al. in their work. However, in 2014, Zhu et al.
showed
a concrete attack on SKAS. This attack makes SKAS practically useless
for
real life applications.
Our main contribution is a hybrid scheme using two existing schemes. Our
scheme uses a deterministic key predistribution scheme by Lee and
Stinson
termed as Basic Id Oneway function Scheme (BIOS) at level 1 (where root
is at level 0). Beyond level 1, we use SOKKAS for key agreement. We
refer
our scheme as BIOSSOK key agreement. BIOS and SOK schemes satisfy
properties (i), (ii) and (iv) but none of them is hierarchical in
nature.
In our work we have made an amalgam of both schemes which is
hierarchical
in nature. Thus, BIOSSOK scheme satisfies (i), (ii), (iii) and is also
fully resilient against arbitrary number of node compromises at any
level.
BIOSSOK scheme also possesses the benefits of low space requirement,
low
shared key computation time and better scalability for many reallife
applications when compared with the scheme of Gennaro et al. In HHKAS,
the key agreement is carried out only at the leaf level. In BIOSSOK
scheme, any two nodes in the hierarchy (at same or different levels) can
compute shared secret key. We also provide a rigorous security analysis
for our scheme in a stronger security model compared to the security
model
used for HHKAS.
 Series: M.Sc. (Engg) Colloquium Title: Improved lower bounds for depth four arithmetic circuits.  Speaker: Mr. Abhijat Sharma
M.Sc. (Engg)  Faculty Advisor: Dr . Chandan Saha
 Date and Time: Tuesday, July 11, 2017, 11:00 AM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract We study the problem of proving lower bounds for depth four arithmetic
circuits. Depth four circuits have been receiving much attraction when it comes
to recent circuit lower bound results, as a result of the series of results
culminating in a proof that strong enough lower bounds for depth four circuits
will imply superpolynomial lower bounds for general arithmetic circuits, and
hence solve one of the most central open problems in algebraic complexity i.e a
separation between the VP and VNP complexity classes.
However despite several efforts, the best known lower bounds for general
circuits remains Omega(N logN), where N is the number of input variables.
In the case of arithmetic formulas, an O(N^2) lower bound is known,
which has not seen any improvement since then.
The situation for depth three arithmetic circuits was also similar for
many years, until a recent result that achieved an almost cubic lower bound
improving upon the previous best quadratic lower bound.
However, unlike depth three circuits, proving a superquadratic lower
bound for depth four circuits remains a prevalent open problem for many years.
Previous results implied a superlinear lower bound of Omega(N^{1.33})
for depth four circuits. As the main contribution of this thesis, we improve
upon this superlinear bound and prove an Omega(N^(1.5)) lower bound on the
size of depth four circuits, computing an explicit Nvariate polynomial in VNP
with degree d = O(sqrt{N}).
Our approach offers a potential route to proving a superquadratic lower
bound for depth four circuits. Taking cue from the numerous successful results
recently, we use the technique of the shifted partial derivatives measures to
achieve the said lower bound.
Particularly, we use the Dimension of Projected Shifted Partials (DPSP)
measure. Coming to the choice of the hard polynomial, we again follow the
status quo and use a variant of the NisanWigderson (NW) polynomial family that
has proved to be very helpful over the past few years in arithmetic circuit
complexity.
Finally, we do a careful analysis of two previous work on general depth
four circuits and compare their techniques to ours.
We conclude that our result can potentially be used as a starting point,
and techniques similar to that of the almost cubic lower bound result for depth
three circuits can likely be used to strengthen our lower bound to
Omega(N^(2.5)), for general depth four arithmetic circuits.
 Series: Department Seminar Title: Approximating the Sparsest Cut on ExpanderLike Graphs  Speaker: Dr. Rakesh Venkat
Postdoctoral Fellow
Hebrew University of Jerusalem, Israel  Date and Time: Monday, July 10, 2017, 3:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract The SparsestCut problem is a fundamental NPHard problem in
graph optimization that has many practical applications. Simultaneously,
on the theoretical side, designing approximation algorithms for it has
revealed intriguing connections to questions in geometry involving
metric spaces and embeddings. The best known algorithm for the Sparsest
Cut problem due to Arora, Rao and Vazirani (2004) gives a $O(sqrt{log
n})$ approximation on nvertex graphs, and this is believed to be the
best possible in general. However, one could ask if better approximation
algorithms exist for specific graph classes. In this talk, I will survey
the relevant background, and describe better approximation algorithms
for the Sparsest Cut problem on an important class of graphs called low
thresholdrank graphs, that are generalizations of the wellknown
class of expander graphs. (Based on joint works with Amit Deshpande,
Prahladh Harsha and Yuval Rabani).
Speaker Bio: Rakesh Venkat is currently a Postdoctoral Fellow at the Hebrew
University of Jerusalem, Israel with Prof. Yuval Rabani. He received his
PhD from TIFR, Mumbai under the supervision of Prof. Prahladh Harsha.
His research interests include topics in approximation algorithms,
complexity theory and communication complexity.
Host Faculty: Dr. Anand Louis
 Series: M.Sc. (Engg) Colloquium Title: A Fine Grained Dynamic Information Flow Analysis for Android Apps  Speaker: Shyam Sankaran
 Faculty Advisor: Prof. Aditya Kanade
 Date and Time: Friday, July 07, 2017, 11:30 AM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Android has been steadily gaining popularity ever since its launch in 2008. One
of the major factors for this is the easy availability of a large variety of apps.
They range from simple apps such as calculator apps to apps which can help
people maintain their schedules and thus manage many aspects of their lives.
In addition, a lot of free apps are available to the user thanks to the power of
inapp purchases and advertisements. However, these also raise many security
concerns. Apps are privy to a lot of private information regarding the user,
such as his contacts, location, etc. It is essential to ascertain that apps do not
leak such information to untrustworthy entities. In order to solve this problem,
there have been many static and dynamic analyses which aim to track private
data accessed or generated by the app to its destination. Such analyses are
commonly known as Information Flow analyses. Dynamic analysis techniques,
such as TaintDroid, tracks private information and alerts the user when it is
accessed by specific API calls. However, they do not track the path taken by
the information, which can be useful in debugging and validation scenarios.
We propose a model to perform dynamic information flow analysis, inspired
by FlowDroid and TaintDroid, at a fine granularity so that we can obtain the
path taken by the information flow. The model uses pathedges to track the
information flows during a dynamic run which can then be used to obtain the
paths taken by the flows. We have implemented this model and used it to
obtain pathedges which can then be seeded into FlowDroid to obtain sound
and quicker results compared to standalone runs of FlowDroid. We tested the
model on 10 realworld apps where we find on average about 20% of the total
pathedges found by FlowDroid which gave us an average of 1.8x speedup of
the analysis.
 Series: Department Seminar Title: Testing and Analysis of Web Applications using Page Models  Speaker: Snigdha Athaiya
PhD student  Faculty Advisor: Prof. K V Raghavan
 Date and Time: Wednesday, July 05, 2017, 4:00 PM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Web applications are difficult to analyze using codebased tools
because dataflow and controlflow through the application occurs via both
serverside code and clientside pages. Clientside pages are typically
specified in a scripting language that is different from the main
serverside language; moreover, the pages are generated dynamically from the scripts.
To address these issues we propose a staticanalysis approach that automatically
constructs a ``model'' of each page in a given application. A page model is a code
fragment in the same language as the serverside code, which faithfully
overapproximates the possible elements of the page as well as the
controlflows and dataflows due to these elements. The serverside code
in conjunction with the page models then becomes a standard (nonweb)
program, thus amenable to analysis using standard codebased tools.
We have implemented our approach in the context of J2EE applications. We
demonstrate the versatility and usefulness of our approach by applying three
standard analysis tools on the resultant programs from our approach: a
concolicexecution based model checker (JPF), a dynamic fault localization
tool (Zoltar), and a static slicer (Wala).
*This is a practice talk for ISSTA 2017*
 Series: Ph.D. Thesis Defense Title: Automatic Storage Optimization of Arrays in Affine Loop Nests  Speaker: Somashekaracharya G Bhaskaracharya
 Faculty Advisor: Uday Kumar Reddy B
 Date and Time: Wednesday, July 05, 2017, 11:30 AM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Efficient memory usage is crucial for dataintensive applications as a
smaller memory footprint ensures better cache performance and allows
one to run a larger problem size given a fixed amount of main
memory. The solutions found by existing techniques for automatic
storage optimization for arrays in affine loopnests, which minimize
the storage requirements for the arrays, are often far from good or
optimal and could even miss nearly all storage optimization
potential. In this work, we present a new automatic storage
optimization framework and techniques that can be used to achieve
intraarray as well as interarray storage reuse within affine
loopnests with a predetermined schedule.
Over the last two decades, several heuristics have been developed for
achieving complex transformations of affine loopnests using the
polyhedral model. However, there are no comparably strong heuristics
for tackling the problem of automatic memory footprint
optimization. We tackle the problem of storage optimization for arrays
by formulating it as one of finding the right storage partitioning
hyperplanes: each storage partition corresponds to a single storage
location. Statementwise storage partitioning hyperplanes are
determined that partition a unified global array space so that values
with overlapping live ranges are not mapped to the same partition. Our
integrated heuristic for exploiting intraarray as well as interarray
reuse opportunities is driven by a fourfold objective function that
not only minimizes the dimensionality and storage requirements of
arrays required for each highlevel statement, but also maximizes
interstatement storage reuse.
We built an automatic polyhedral storage optimizer called SMO using
our storage partitioning approach. Storage reduction factors and
other results that we obtained from SMO demonstrate the effectiveness of our
approach on several benchmarks drawn from the domains of image
processing, stencil computations, highperformance computing, and the
class of tiled codes in general. The reductions in storage requirement
over previous approaches range from a constant factor to asymptotic in
the loop blocking factor or array extents  the latter being a
dramatic improvement for practical purposes.
As an incidental and related topic, we also studied the problem of
polyhedral compilation of graphical dataflow programs. While
polyhedral techniques for program transformation are now used in
several proprietary and open source compilers, most of the research on
polyhedral compilation has focused on imperative languages such as C,
where the computation is specified in terms of statements with zero or
more nested loops and other control structures around them. Graphical
dataflow languages, where there is no notion of statements or a
schedule specifying their relative execution order, have so far not
been studied using a powerful transformation or optimization approach.
The execution semantics and referential transparency of dataflow
languages impose a different set of challenges. In this work, we
attempt to bridge this gap by presenting techniques that can be used
to extract polyhedral representation from dataflow programs and to
synthesize them from their equivalent polyhedral representation. We
then describe PolyGLoT, a framework for automatic transformation of
dataflow programs that we built using our techniques and other popular
research tools such as Clan and Pluto. For the purpose of experimental
evaluation, we used our tools to compile LabVIEW, one of the most
widely used dataflow programming languages. Results show that
dataflow programs transformed using our framework are able to
outperform those compiled otherwise by up to a factor of seventeen,
with a mean speedup of 2.30$times$ while running on an 8core Intel
system.
Speaker Bio: Somashekaracharya G. B is a PhD (ERP) student in the
Multicore Computing Lab, Department of Computer Science and
Automation, Indian Institute of Science. He obtained his
Bachelor's degree in Computer Science from R. V. College of
Engineering, Bangalore in 2008. He is currently working as
an Advanced Senior Software Engineer at National Instruments
R&D, India. His interests are broadly in compilers, dataflow
languages, and parallel programming.
Host Faculty: Uday Kumar Reddy B
 Series: Department Seminar Title: DomLock: A New MultiGranularity Locking Technique for Hierarchies  Speaker: Prof. Rupesh Nasre
 Date and Time: Wednesday, July 05, 2017, 10:00 AM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract This talk will present a new locking technique for hierarchies. Finegrained and coarsegrained are two extreme ways of locking, and their generalization, multigranularity locking (MGL) offers several options in between. MGL has been implemented in current database systems such as Oracle and MySQL using a traversalbased protocol called Intention Locks. I would highlight limitations of intention locks and propose a new way to perform MGL based on a DFSlike numbering of the hierarchy. Key to the technique is modeling of the hierarchical structure in numerical manner. The talk would end highlighting limitations of the proposed technique and a few interesting open problems.
Speaker Bio: Rupesh Nasre is an Assistant Professor in the CSE department at IIT Madras. He completed PhD from IISc Bangalore and PostDoctoral Fellowship from the University of Texas at Austin. His research focus is in Compilers and Parallelization. He is a recipient of the Young Faculty Recognition Award at IIT Madras, NVIDIA Special Prize for CodeForScience, a winner of the Yahoo! University Hack Day, and holds five US patents.
Host Faculty: Uday Kumar Reddy B
 Series: Department Seminar Title: Efficient Computation of HappensBefore Relation for EventDriven Programs  Speaker: Ms. Pallavi Maiya
PhD Student  Faculty Advisor: Prof. Aditya Sunil Kanade
 Date and Time: Monday, July 03, 2017, 12:00 PM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract An emerging style of programming is to use both threads and events to
achieve better scalability. The improved scalability comes at the price
of increased complexity, as both threads and events can follow
nondeterministic schedules. The happensbefore (HB) relation captures
the space of possible schedules and forms the basis of various concurrency
analyses. Improving efficiency of the HB computation can speed up these
analyses.
In this paper, we identify a major bottleneck in computation of the HB
relation for such eventdriven programs. Eventdriven programs are designed
to interact continuously with their environment, and usually receive a large
number of events even within a short span of time. This increases the cost of
discovering the HB order among the events. We propose a novel data structure,
called event graph, that maintains a subset of the HB relation to efficiently
infer order between any pair of events. We present an algorithm, called
EventTrack, which improves efficiency of vector clock based HB computation for
eventdriven programs using event graphs.
We have implemented EventTrack and evaluated it on traces of eight Android
applications. Compared to the stateoftheart technique, EventTrack gave an
average speedup of 4.9X. The speedup ranged from 1.8X to 10.3X across the
applications.
*This is a practice talk for ISSTA 2017*
 Series: M.Sc. (Engg) Thesis Defense Title: Automatic Optimization of Geometric Multigrid Methods using a DSL Approach  Speaker: Mr. Vinay Vasista
M.Sc. (Engg) Student
Dept. of CSA
IISc  Faculty Advisor: Prof. Uday Kumar Reddy B
 Date and Time: Friday, June 30, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract The Geometric Multigrid (GMG) method is widely used in numerical analysis to
accelerate the convergence of partial differential equations solvers using a
hierarchy of grid discretizations. Multiple grid sizes and recursive
expression of multigrid cycles make the task of program optimization tedious
and errorprone. A highlevel language that aids domain experts to quickly
express complex algorithms in a compact way using dedicated constructs for
multigrid methods and with good optimization and parallelization support is
thus valuable.
In this work, we demonstrate how high performance can be achieved along with
enhanced programmability for GMG, with new language and optimization support
in the PolyMage domain specific language framework. We compare our approach
with (a) handoptimized codes, (b) handoptimized codes optimized in
conjunction with techniques based on the polyhedral compiler framework, and
(c) the existing PolyMage optimizer adapted for Multigrid. We use benchmarks
varying in Multigrid cycle structure, smoothing steps, and other aspects, in
addition to the NAS Multigrid benchmark for evaluation.
On a 24core Intel Xeon Haswell multicore system, our automatically
optimized codes achieve a mean improvement of 3.2x over a straightforward
parallelization, and an improvement of 1.31x over the PolyMage DSL
optimizer. In most cases, we were also able to match or outperform handoptimized
implementations available. The results demonstrate the effectiveness of the
approach on delivering high performance and high programmer productivity.
Speaker Bio: Vinay Vasista is an M.Sc student in thr Multicore Computing Lab,
Department of CSA, Indian Institute of Science. He got his bachelor's degree in Computer
Science from Sri Jayachamarajendra College of Engineering, Mysore in 2013. His
interests are in High Performance Computing, Parallel programming, Compilers
and program optimization. He is currently working as Software Developer at
Mentor Graphics Corporation, India.
 Series: Ph.D. Colloquium Title: Static analysis and automated testing of fileprocessing programs  Speaker: M. Raveendra Kumar
 Faculty Advisor: K. V. Raghavan
 Date and Time: Thursday, June 29, 2017, 11:30 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Programs that process data residing in files are widely used in domains
such as banking, healthcare, networking, and logfile analysis. For these
programs, there is a strong need for automated tool support for analysis
and transformation tasks such as bug detection, program understanding, and
program remodularization. However, due to the complex idioms used in
fileprocessing programs, precise analysis of these programs as well as
automated testing of these programs is a challenging task. These are the
problems that we address in this thesis.
Our first contribution is about dataflow analysis of fileprocessing
programs. Our key insight here is that the analysis of a file processing
program can be made more precise and useful if knowledge of the input file
format of the program is made available to the analysis and incorporated
into the transfer functions. We show empirical results for this using
standard dataflow problems. We also develop two novel applications of this
technique  the first one to specialize a program to a given "restricted"
input file format, and the second one to verify if a program "conforms" to
a given input file format. Experiments on a set of benchmark programs
showed that our approach provides precise output compared to naive
approaches.
For precise dataflow analysis of programs with multiple procedures,
*context sensitive* interprocedural data flow analysis is required. It is
very expensive in its full form. As our second contribution, we propose a
*prefix* approximation for contextsensitive analysis, wherein a prefix of
the full context stack is used to tag dataflow facts. Our technique, which
is in contrast with the standard *suffix* approximation technique, is
designed to be more scalable on programs that have distinct *application*
and *library* layers, while offering comparable or even better precision
within the application layer. We analyzed several large enterprise
fileprocessing programs using an implementation of our technique, and
compared it with the suffix approximation technique. Our approach's
performance was several times better than that of the suffix approximation,
while generally not suffering any precision penalty in the application
layer of the programs. It is notable that this approach could be equally
applicable in the context of non fileprocessing programs as well.
Our final contribution is regarding automated testing of fileprocessing
programs. *Fuzz testing* is an automated softwaretesting technique that
aims to uncover runtime errors by executing the target program with a
large number of inputs generated automatically and systematically. *Grey
box* fuzzing is a kind of fuzzing that employs lightweight instrumentation
of the program, and observes the behaviors exhibited on a set of test
runs. It then uses this information to generate new test inputs that might
exercise other behaviors. This process is iterated until as many behaviors
as possible get exhibited. AFL is a stateoftheart fuzzer for
fileprocessing programs. With AFL, if a test input causes a new region of
code in the program to be reached, it is considered as a new behavior. In
other words, code coverage is the objective. However, a vulnerability at a
program point may not get exploited in every visit to the program point;
only certain program paths that lead to that point may expose the
vulnerability. We extend greybox fuzzing to take into account a given
vulnerability of interest (e.g., buffer overruns), and to try explicitly to
traverse paths that are likely to expose such vulnerabilities. We have
implemented our approach by enhancing AFL, in the context of the buffer
overrun vulnerability. We evaluated our approach on a suite of benchmark
programs, and compared it with AFL in its standard form. Our approach
could uncover many more number of buffer overflow errors in these
benchmarks compared to standard AFL for the same given amount of fuzzing
time.
Speaker Bio: M. Raveendra Kumar is a PhD student at the Department of CSA, IISc. He is also a researcher with TCS Innovation Labs. His interests are in static analysis and dynamic analysis of enterprise applications.
 Series: M.Sc. (Engg) Colloquium Title: Deep learning models for Fewshot and Metric Learning  Speaker: Akshay Mehrotra
 Faculty Advisor: Prof. Ambedkar Dukkipati
 Date and Time: Tuesday, June 27, 2017, 12:00 PM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Deep neural networks achieve unprecedented performance levels over many
tasks in the traditional supervised setting and scale well with large
quantities of data. On the other hand improving performance in the lowdata
regime is understudied and the sheer number of parameters versus small
amounts of data makes it a challenging task. The problem of learning from a
very small amount is known as fewshot learning. A related problem is that
of metric learning over disjoint training and testing classes, where the
task is to learn a function that maps a data point to a lowdimensional
manifold such that similar points are closer in the lowdimensional space.
In this thesis, we propose a couple of deep learning approaches for
fewshot learning, and then extend them to the related problem of metric
learning over disjoint training and testing classes.
We first argue that a more expressive pairwise similarity matching
component is crucial for solving the fewshot learning problem. Towards
that end, a network design with a learnable and more expressive similarity
objective is proposed by extending the deep residual network. This residual
pairwise network approximates a learned metric in the representation space
and outperforms previous stateoftheart on the challenging miniImagenet
dataset for fewshot learning by getting over 55% accuracy for the 5way
classification task over unseen classes. We also evaluate the
generalization behaviour of deep residual networks with a varying number of
parameters over classes not observed during training.
Next, since regularization plays a key role in learning with small amounts
of data, an additional generator network is proposed by extending the
Generative Adversarial Network (GAN) framework for disjoint training and
testing classes. This provides a strong regularizer by leveraging the
generated data samples. The proposed model can generate plausible
variations of exemplars over unseen classes and outperforms strong
discriminative baselines with L2 regularization for few shot classification
tasks.
Finally, the idea of regularizing by adversarial generation is extended to
the metric learning setting over disjoint training and testing classes. The
proposed model is an efficient alternative to the hard negative mining
scheme necessary when training with triplets in the large margin nearest
neighbours setting. The proposed model shows performance and efficiency
improvement over models trained without negativemining or with
approximations commonly used for metric learning with triplet loss
respectively.

