Events 

Seminars 

Prof. I. G. Sarma Memorial Lecture Series 

Prof. Priti Shankar Memorial Seminar Series 

Multimedia Archive 



UPCOMING SEMINARS 

Series: M.Sc. (Engg) Thesis Defense Title: Automatic Optimization of Geometric Multigrid Methods using a DSL Approach  Speaker: Mr. Vinay Vasista
M.Sc. (Engg) Student
Dept. of CSA
IISc  Faculty Advisor: Prof. Uday Kumar Reddy B
 Date and Time: Friday, June 30, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract The Geometric Multigrid (GMG) method is widely used in numerical analysis to
accelerate the convergence of partial differential equations solvers using a
hierarchy of grid discretizations. Multiple grid sizes and recursive
expression of multigrid cycles make the task of program optimization tedious
and errorprone. A highlevel language that aids domain experts to quickly
express complex algorithms in a compact way using dedicated constructs for
multigrid methods and with good optimization and parallelization support is
thus valuable.
In this work, we demonstrate how high performance can be achieved along with
enhanced programmability for GMG, with new language and optimization support
in the PolyMage domain specific language framework. We compare our approach
with (a) handoptimized codes, (b) handoptimized codes optimized in
conjunction with techniques based on the polyhedral compiler framework, and
(c) the existing PolyMage optimizer adapted for Multigrid. We use benchmarks
varying in Multigrid cycle structure, smoothing steps, and other aspects, in
addition to the NAS Multigrid benchmark for evaluation.
On a 24core Intel Xeon Haswell multicore system, our automatically
optimized codes achieve a mean improvement of 3.2x over a straightforward
parallelization, and an improvement of 1.31x over the PolyMage DSL
optimizer. In most cases, we were also able to match or outperform handoptimized
implementations available. The results demonstrate the effectiveness of the
approach on delivering high performance and high programmer productivity.
Speaker Bio: Vinay Vasista is an M.Sc student in thr Multicore Computing Lab,
Department of CSA, Indian Institute of Science. He got his bachelor's degree in Computer
Science from Sri Jayachamarajendra College of Engineering, Mysore in 2013. His
interests are in High Performance Computing, Parallel programming, Compilers
and program optimization. He is currently working as Software Developer at
Mentor Graphics Corporation, India.
 Series: M.Sc. (Engg) Colloquium Title: Deep learning models for Fewshot and Metric Learning  Speaker: Akshay Mehrotra
 Faculty Advisor: Prof. Ambedkar Dukkipati
 Date and Time: Tuesday, June 27, 2017, 12:00 PM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Deep neural networks achieve unprecedented performance levels over many
tasks in the traditional supervised setting and scale well with large
quantities of data. On the other hand improving performance in the lowdata
regime is understudied and the sheer number of parameters versus small
amounts of data makes it a challenging task. The problem of learning from a
very small amount is known as fewshot learning. A related problem is that
of metric learning over disjoint training and testing classes, where the
task is to learn a function that maps a data point to a lowdimensional
manifold such that similar points are closer in the lowdimensional space.
In this thesis, we propose a couple of deep learning approaches for
fewshot learning, and then extend them to the related problem of metric
learning over disjoint training and testing classes.
We first argue that a more expressive pairwise similarity matching
component is crucial for solving the fewshot learning problem. Towards
that end, a network design with a learnable and more expressive similarity
objective is proposed by extending the deep residual network. This residual
pairwise network approximates a learned metric in the representation space
and outperforms previous stateoftheart on the challenging miniImagenet
dataset for fewshot learning by getting over 55% accuracy for the 5way
classification task over unseen classes. We also evaluate the
generalization behaviour of deep residual networks with a varying number of
parameters over classes not observed during training.
Next, since regularization plays a key role in learning with small amounts
of data, an additional generator network is proposed by extending the
Generative Adversarial Network (GAN) framework for disjoint training and
testing classes. This provides a strong regularizer by leveraging the
generated data samples. The proposed model can generate plausible
variations of exemplars over unseen classes and outperforms strong
discriminative baselines with L2 regularization for few shot classification
tasks.
Finally, the idea of regularizing by adversarial generation is extended to
the metric learning setting over disjoint training and testing classes. The
proposed model is an efficient alternative to the hard negative mining
scheme necessary when training with triplets in the large margin nearest
neighbours setting. The proposed model shows performance and efficiency
improvement over models trained without negativemining or with
approximations commonly used for metric learning with triplet loss
respectively.
 Series: Department Seminar Title: Provably Secure Computing using Certified Software and Trusted
Hardware  Speaker: Prof. Sanjit A. Seshia, EECS Dept., UCBerkeley
 Date and Time: Tuesday, June 27, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Securitycritical applications face a range of threats including
exploits in privileged software such as the operating system and
hypervisor. In order to protect sensitive data from such privileged
adversaries, there is increasing interest in developing and using secure
hardware primitives, such as Intel SGX, ARM TrustZone, and Sanctum
RISCV extensions. In this talk, I will discuss the problem of building
applications with provable security guarantees by including only these
hardware primitives (and very little software) in the trusted computing
base. We have developed compiler and verification techniques to develop
applications that can be certified (at the level of machine code) to not
leak secrets, both via their outputs and certain side channels.
Furthermore, I will discuss how formal models of trusted hardware can be
leveraged to port programs and their correctness proofs across different
hardware platforms.
Speaker Bio: Sanjit A. Seshia is a Professor in the Department of Electrical
Engineering and Computer Sciences at the University of California,
Berkeley. He received an M.S. and Ph.D. in Computer Science from
Carnegie Mellon University, and a B.Tech. in Computer Science and
Engineering from IITBombay. His research interests are in dependable
computing and computational logic, with a current focus on applying
automated formal methods to problems in cyberphysical systems, computer
security, electronic design automation, and synthetic biology. His Ph.D.
thesis work on the UCLID verifier and decision procedure helped pioneer
the area of satisfiability modulo theories (SMT) and SMTbased
verification. He is coauthor of a widelyused textbook on embedded,
cyberphysical systems and has led the development of technologies for
cyberphysical systems education based on formal methods. His awards and
honors include a Presidential Early Career Award for Scientists and
Engineers (PECASE), an Alfred P. Sloan Research Fellowship, the
Frederick Emmons Terman Award for contributions to electrical
engineering and computer science education, and the School of Computer
Science Distinguished Dissertation Award at Carnegie Mellon University.
Host Faculty: Prof. Vinod Ganapathy


PAST SEMINARS 

Series: Department Seminar Title: Understanding the cost and the opportunity to enhance
programmability of heterogeneous processors  Speaker: Dr Arkaprava Basu, AMD Research, Austin TX
 Date and Time: Monday, June 19, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract As the benefits from transistor scaling diminish, the computing
is turning increasingly heterogeneous. In a heterogeneous processor
domainspecific accelerators like graphics processing unit (GPU) are
coupled with general purpose CPUs for better performance and energy
efficiency. However, programming such heterogeneous processors, consisting
of multiple disparate computing element, is a challenge. In this talk, we
will discuss two emerging features that can address this  namely, (1)
the shared virtual memory, and (2) the ability to invoke operating system
services from an accelerator (e.g., GPU). Shared virtual memory or SVM
makes it easy to share data across the CPU and accelerators by presenting
them with the same view of the memory. Here, we will discuss possible
performance pitfall of the SVM implementation in a heterogeneous processor
and the opportunities to enhance its future iterations. Next, we will
discuss how an accelerator's ability to invoke operating system services
like page fault, file access, is crucial to make it a true peer to the
CPU. We will then discuss the performance implications of this capability
and avenues for betterment and further exploration.
Finally, we will conclude the talk by discussing several future research
opportunities to embrace heterogeneity across the hardware and the
software stack and across the compute and the memory.
Speaker Bio: Arkaprava (Arka) is a researcher with AMD Research at
Austin,TX. Arka’s research interest broadly lies in the area of computer
architecture. Since joining AMD Research in January of 2014, Arka has been
exploring various aspects of virtual memory management techniques and
heterogeneous processing as part of US Department of Energy’s Exascale
computing initiative. Before that he obtained his PhD in Computer Science
from the University of WisconsinMadison under the supervision of Prof.
Mark D. Hill and Prof. Michael M. Swift in December, 2013. Prior to that
Arka obtained M.Tech in computer science from IIT Kanpur and B.Tech from
Government Engineering College in Kalyani, West Bengal.
Host Faculty: Prof. Matthew Jacob
 Series: Department Seminar Title: Rethinking TLB Designs in Virtualized Environments: A Very Large PartofMemory TLB  Speaker: Dr. Nagendra Gulur Dwarakanath, Texas Instruments
 Date and Time: Friday, June 16, 2017, 12:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract With increasing deployment of virtual machines for cloud services and
server applications, memory address translation overheads in virtualized
environments have received great attention. In the radix4 type of page
tables used in x86 architectures, a TLB miss necessitates up to 24 memory
references for one guest to host translation. While dedicated page walk
caches and such recent enhancements eliminate many of these memory references,
our measurements on the Intel Skylake processors indicate that many programs
in virtualized mode of execution still spend hundreds of cycles for
translations that do not hit in the TLBs.
This talk presents an innovative scheme to reduce the cost of address
translations by using a very large Translation Lookaside Buffer
that is part of memory, the POMTLB. In the POMTLB, only one access
is required instead of up to 24 accesses required in commonly used 2D walks
with radix4 type of page tables. Even if many of the 24 accesses may
hit in the page walk caches, the aggregated cost of the many hits plus the
overhead of occasional misses from page walk caches still exceeds the cost
of one access to the POMTLB. Since the POMTLB is part of the memory space,
TLB entries (as opposed to multiple page table entries) can be cached in
large L2 and L3 data caches, yielding signifiicant benefits. Through detailed
evaluation running SPEC, PARSEC and graph workloads, we demonstrate that the
proposed POMTLB improves performance by approximately 10% on average.
The improvement is more than 16% for 5 of the benchmarks. It is
further seen that a POMTLB of 16MB size can eliminate nearly all TLB
misses in 8core systems.
Speaker Bio: Dr. Nagendra Gulur Dwarakanath works for Texas Instruments, in the area of
executable specifications for automotive processors and micro controllers.
He obtained his M.E. and PhD degrees from IISc. His research interests are
primarily in Computer Architecture with specific focus on memory systems,
heterogeneous architectures, accelerators, emerging workloads and programming
models.
Host Faculty: R. Govindarajan
 Series: Department Seminar Title: Static Deadlock Detection for Asynchronous C# Programs  Speaker: Mr. Anirudh Santhiar
Ph.D Student
Dept. of CSA
IISc  Faculty Advisor: Prof. Aditya Sunil Kanade
 Date and Time: Wednesday, June 14, 2017, 12:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Asynchronous programming is a standard approach for designing responsive
applications. Modern languages such as C# provide async/await primitives
for the disciplined use of asynchrony. In spite of this, programs can
deadlock because of incorrect use of blocking operations along with
nonblocking (asynchronous) operations. While developers are aware of this
problem, there is no automated technique to detect deadlocks in
asynchronous programs.
We present a novel representation of control flow and scheduling of
asynchronous programs, called the continuation scheduling graph, and
formulate necessary conditions for a deadlock to occur in a program. We
design static analyses to construct continuation scheduling graphs of
asynchronous C# programs and to identify deadlocks in them. We have
implemented the static analyses in a tool called DeadWait. Using DeadWait,
we found 43 previously unknown deadlocks in 11 asynchronous C# libraries.
We reported the deadlocks to the library developers. They have confirmed
and fixed 40 of them.
*This is a practice talk for PLDI 2017*
 Series: Department Seminar Title: Cumulative prospect theory meets bandits and reinforcement learning  Speaker: Dr. Prashanth L.A
Department of Computer Science and Engineering
Indian Institute of Technology Madras  Date and Time: Monday, June 12, 2017, 11:30 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Traditional optimization approaches rely on the central
assumption that expected (value) utility appropriately captures
preferences, but there is overwhelming evidence in behavioral economics
research that indicates that this is not the case for human decision
making. Violations of the expected valuebased preferences in humanbased
decision making systems can be addressed by incorporating distortions in
the underlying probabilities of the system. We incorporate probabilistic
distortions in a stochastic optimization framework, speciï¬cally using the
wellestablished cumulative prospect theory (CPT) of Tversky and Kahneman.
The stochastic optimization setting presents two particular challenges when
CPT is applied: estimating the CPT objective requires estimations of the
entire distribution of the underlying random variable, and traditional
optimization approaches must deal with biased (albeit asymptotically
unbiased) samples of the CPT objective. We propose sampleeï¬ƒcient
estimation schemes and use the resulting estimators for optimizing an
appropriate CPT objective in both ï¬nite discrete optimization (e.g.,
multiarmed bandit models, ranking & selection) and in settings with
continuous decision variables (e.g., policy optimization for Markov
decision processes) using gradientbased and global stochastic optimization
algorithms. Two transportation applications will be used to test and
validate the proposed algorithms in the CPT framework.
Speaker Bio: Prashanth L.A. is an Assistant Professor in the Department of Computer
Science and Engineering at Indian Institute of Technology Madras. Prior to
this, he was a postdoctoral researcher at the Institute for Systems
Research, University of Maryland  College Park from 2015 to 2017 and at
INRIA Lille  Team SequeL from 2012 to 2014. From 2002 to 2009, he was with
Texas Instruments (India) Pvt Ltd, Bangalore, India.
He received his Masters and Ph.D degrees in Computer Science and Automation
from Indian Institute of Science, in 2008 and 2013, respectively. He was
awarded the third prize for his Ph.D. dissertation, by the IEEE Intelligent
Transportation Systems Society (ITSS).
He is the coauthor of a book entitled `Stochastic Recursive Algorithms for
Optimization: Simultaneous Perturbation Methods', published by Springer in
2013. His research interests are in reinforcement learning, stochastic
optimization and multiarmed bandits, with applications in transportation
systems, wireless networks and recommendation systems.
Host Faculty: Prof. Shalabh Bhatnagar
 Series: Ph.D. Thesis Defense Title: Ranking from Pairwise Comparisons: The Role of the Pairwise Preference
Matrix  Speaker: Mr. Arun Rajkumar
Ph.D Student
Dept. of CSA
IISc  Faculty Advisor: Prof. Chiranjib Bhattacharyya
 Date and Time: Friday, June 09, 2017, 9:30 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Ranking a set of candidates or items from pairwise comparisons is a
fundamental problem that arises in many settings such as elections,
recommendation systems, sports team rankings, document rankings and so on.
Indeed it is well known in the psychology literature that when a large
number of items are to be ranked, it is easier for humans to give pairwise
comparisons as opposed to complete rankings. The problem of ranking from
pairwise comparisons has been studied in multiple communities such as
machine learning, operations research, linear algebra, statistics etc.,
and several algorithms (both classic and recent) have been proposed.
However, it is not well understood under what conditions these different
algorithms perform well. In this thesis, we aim to fill this fundamental
gap, by elucidating precise conditions under which different algorithms
perform well, as well as giving new algorithms that provably perform well
under broader conditions. In particular, we consider a natural statistical
model wherein for every pair of items (i,j), there is a probability P_ij
such that each time items i and j are compared, item j beats item i with
probability P_ij. Such models, which we summarize through a matrix
containing all these pairwise probabilities, have been used explicitly or
implicitly in much previous work in the area; we refer to the resulting
matrix as the pairwise preference matrix, and elucidate clearly the
crucial role it plays in determining the performance of various
algorithms.
In the first part of the thesis, we consider a natural generative model
where all pairs of items can be sampled and where the underlying
preferences are assumed to be acyclic. Under this setting, we elucidate
the conditions on the pairwise preference matrix under which popular
algorithms such as matrix Borda, spectral ranking, least squares and
maximum likelihood under a BradleyTerryLuce (BTL) model produce optimal
rankings that minimize the pairwise disagreement error. Specifically, we
derive explicit sample complexity bounds for each of these algorithms to
output an optimal ranking under interesting subclasses of the class of all
acyclic pairwise preference matrices. We show that none of these popular
algorithms is guaranteed to produce optimal rankings for all acyclic
preference matrices. We then propose a novel support vector machine based
rank aggregation algorithm that provably does so.
In the second part of the thesis, we consider the setting where
preferences may contain cycles. Here, finding a ranking that minimizes the
pairwise disagreement error is in general NPhard. However, even in the
presence of cycles, one may wish to rank 'good' items ahead of the rest.
We develop a framework for this setting using notions of winners based on
tournament solution concepts from social choice theory. We first show that
none of the existing algorithms are guaranteed to rank winners ahead of
the rest for popular tournament solution based winners such as top cycle,
Copeland set, Markov set etc. We propose three algorithms  matrix
Copeland, unweighted Markov and parametric Markov  which provably rank
winners at the top for these popular tournament solutions. In addition to
ranking winners at the top, we show that the rankings output by the matrix
Copeland and the parametric Markov algorithms also minimize the pairwise
disagreement error for certain classes of acyclic preference matrices.
Finally, in the third part of the thesis, we consider the setting where
the number of items to be ranked is large and it is impractical to obtain
comparisons among all pairs. Here, one samples a small set of pairs
uniformly at random and compares each pair a fixed number of times; in
particular, the goal is to come up with good algorithms that sample
comparisons among only O(n log(n)) item pairs (where n is the number of
items). Unlike existing results for such settings, where one either
assumes a noisy permutation model (under which there is a true underlying
ranking and the outcome of every comparison differs from the true ranking
with some fixed probability) or assumes a BTL or Thurstone model, we
develop a general algorithmic framework based on ideas from matrix
completion, termed lowrank pairwise ranking, which provably produces an
optimal ranking by comparing only O(n log(n)) pairs, O(log(n)) times each,
not only for popular classes of models such as BTL and Thurstone, but also
for much more general classes of models wherein a suitable transform of
the pairwise probabilities leads to a lowrank matrix; this subsumes the
guarantees of many previous algorithms in this setting.
Overall, our results help to understand at a fundamental level the
statistical properties of various algorithms for the problem of ranking
from pairwise comparisons, and under various natural settings, lead to
novel algorithms with improved statistical guarantees compared to existing
algorithms for this problem.
 Series: Department Seminar Title: Common Information based Markov Perfect Equilibrium in Dynamic
Games with Asymmetric Information  Speaker: Dr. Abhishek Gupta
 Date and Time: Thursday, June 08, 2017, 3:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract In many engineering systems, agents compete to optimize their own
objective functions. Agents may have different information. For
instance, when a control system is attacked by a malicious attacker, the
attacker may not know the complete details about the system or the
underlying states of the system. Motivated by these considerations, we
developed a framework to compute Nash equilibrium in dynamic games with
asymmetric information.
To compute the equilibrium, we divided the agents’ information into
common information and private information. We show that the belief on
state and private information given the common information forms a
Markov state of a certain high dimensional virtual game. Under certain
assumptions, we can compute a Markov perfect equilibrium of this new
virtual game, which can be projected into the original game to compute a
Nash equilibrium of the original game. We will discuss the extension of
this technique to linearquadraticGaussian games and finite zerosum games.
Speaker Bio: Abhishek Gupta is an Assistant Professor in the ECE department at The
Ohio State University. Prior to joining OSU, he was a postdoctoral
researcher at the University of Southern California. He received his PhD
in Aerospace Engineering (2014), MS in Applied Mathematics (2012), and
MS in Aerospace Engineering (2011) from UIUC, and B.Tech. in Aerospace
Engineering from IIT Bombay (2009). His research interests are
optimization, game theory, market design, and cybersecurity
Host Faculty: Prof. Siddharth Barman
 Series: Ph.D. Colloquium Title: Efficient Static Analyses for Concurrent Programs  Speaker: Suvam Mukherjee
 Faculty Advisor: Deepak D'Souza
 Date and Time: Thursday, June 01, 2017, 4:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Concurrent programs are pervasive owing to the increasing adoption of
multicore systems across the entire computing
spectrum. Unfortunately, writing correct and efficient concurrent
programs is a hard endeavour. Automatically reasoning about the
behaviours of such programs is harder still  an exponential number of
behaviours are possible due to interactions between the threads. Thus,
concurrent programs provide fertile grounds for a large class of
insidious defects, which are difficult to detect and can cause
failures.
Static analysis techniques infer semantic properties of programs
without executing them. They are attractive because they are sound
(they can guarantee the absence of bugs), can execute with a fair
degree of automation, and do not depend on test cases. However,
current static analyses techniques for concurrent programs are either
precise and prohibitively slow, or fast but imprecise.
In this thesis, we partially address this problem by designing novel
efficient static analyses for concurrent programs.
In the first part of the thesis, we provide a framework for designing
and proving the correctness of data flow analyses that target data
race free (DRF) programs, and which are in the same spirit as the
"syncCFG" analysis originally proposed in [De2011]. To achieve this, we
first propose a novel concrete semantics for DRF programs called
LDRF, that is threadlocal in nature with each thread operating on
its own copy of the data state. We show that abstractions of our
semantics allow us to reduce the analysis of DRF programs to a
sequential analysis. This aids in rapidly porting existing sequential
analyses to scalable analyses for DRF programs. Next, we parameterize
the semantics with a partitioning of the program variables into
"regions" which are accessed atomically. Abstractions of the
regionparameterized semantics yield more precise analyses for
"regionrace free" concurrent programs. We instantiate these
abstractions to devise efficient relational analyses for race free
programs, which we have implemented in a prototype tool called
RATCOP. On the benchmarks, RATCOP was able to prove up to 65% of the
assertions, in comparison to 25% proved by a version of the analysis
from [De2011]. In a comparative study with a recent abstract
interpretation based concurrent static analyser, RATCOP was up to 5
orders of magnitude faster on a set of looselycoupled concurrent programs.
In the second part of the thesis, we provide techniques to detect all
"highlevel" data races in an RTOS kernel. A highlevel race occurs
when an execution interleaves instructions corresponding to
userannotated critical accesses to shared memory structures. Such
races are good indicators of atomicity violations. We propose a
technique for detecting all highlevel dataraces in a system library,
like the kernel API of a realtime operating system (RTOS) that relies
on flagbased scheduling and synchronization. Our methodology is based
on modelchecking, but relies on a metaargument to bound the number
of task processes needed to orchestrate a race. We describe our
approach in the context of FreeRTOS, a popular RTOS in the embedded
domain. Using our technique, a user is able to soundly disregard 99.8%
of an estimated 41,000 potential highlevel races. Our tool detected
38 highlevel data races in FreeRTOS, out of which 16 were harmful.
Speaker Bio: Suvam Mukherjee is a PhD student in the CSA department.
Host Faculty: Deepak D'Souza
 Series: M.Sc. (Engg) Colloquium Title: Large Scale Graph Processing in a Distributed Environment  Speaker: Nitesh Upadhyay
 Faculty Advisor: Y.N. Srikant
 Date and Time: Thursday, June 01, 2017, 12:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Graph algorithms are ubiquitously used across domains. They exhibit parallelism,
which can be exploited on parallel architectures, such as, multicore processors
and accelerators. However, real world graphs are massive in size and cannot fit
into the memory of a single machine. Such large graphs are partitioned and
processed in a distributed cluster environment which consist of multiple GPUs
and CPUs.
Existing frameworks that facilitate large scale graph processing in the distributed
cluster have their own style of programming and require extensive involvement by
the user in communication and synchronization aspects. Adaptation of these
frameworks appears to be an overhead for a programmer. Furthermore, these frameworks
have been developed to target only CPU clusters and lack the ability to harness the
GPU architecture.
We provide a backend framework to the graph Domain Specific Language, Falcon,
for large scale graph processing on CPU and GPU clusters. The Motivation behind
choosing this DSL as a frontend is its sharedmemory based imperative programmability
feature. Our framework generates Giraph code for CPU clusters. Giraph code runs
on the Hadoop cluster and is known for scalable and faulttolerant graph processing.
For GPU cluster, Our framework applies a set of optimizations to reduce computation
and communication latency, and generates efficient CUDA code coupled with MPI.
Experimental evaluations show the scalability and performance of our framework for
both CPU and GPU clusters. The performance of the framework generated code is
comparable to the manual implementations of various algorithms in distributed
environments.
Speaker Bio: Nitesh Upadhyay is an M.Sc(Engg.) student in the CSA department.
Host Faculty: Y.N. Srikant
 Series: M.Sc. (Engg) Colloquium Title: HAShCache: Heterogeneity Aware Shared DRAMCache for Integrated Heterogeneous Systems  Speaker: Adarsh Patil
 Faculty Advisor: R. Govindarajan
 Date and Time: Wednesday, May 31, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Integrated Heterogeneous System (IHS) processors pack throughput oriented GPGPUs
alongside latency oriented CPUs on the same die sharing certain resources, e.g.,
shared last level cache, networkonchip (NoC), and the main memory. The demands for
memory accesses and other shared resources from GPU cores can exceed that of CPU
cores by 2 to 3 orders of magnitude. This disparity poses significant problems in
exploiting the full potential of these architectures.
In this work, we propose adding a large capacity stacked DRAM, used as a shared last
level cache, for the IHS processors. However, adding the DRAMCache naively, leaves
significant performance on the table due to the disparate demands from CPU and GPU
cores for DRAMCache and memory accesses. In particular, the imbalance can
significantly reduce the performance benefits that the CPU cores would have
otherwise enjoyed with the introduction of the DRAMCache, necessitating a
heterogeneity aware management of this shared resource for improved performance.
Further, in this work we propose three simple techniques to enhance the performance
of CPU application while ensuring very little to no performance impact to the GPU.
Specifically, we propose (i) PrIS, a prioritization scheme for scheduling CPU
requests at the DRAMCache controller (ii) ByE, a selective and temporal bypassing
scheme for CPU requests at the DRAMCache (iii) Chaining, an occupancy controlling
mechanism for GPU lines in the DRAMCache through pseudoassociativity. The resulting
cache, HAShCache, is heterogeneityaware and can adapt dynamically to address the
inherent disparity of demands in an IHS architecture.
We enhance the gem5gpu simulator to model an IHS architecture with a stacked DRAM
as cache, complete with coherence and unified physical memory. Using this setup we
perform detailed experimental evaluation of the proposed HAShCache which results in
an average system performance improvement of 41% over a naive DRAMCache and over
200% improvement over a baseline system with no stacked DRAMCache.
Speaker Bio: Adarsh Patil is an M.Sc[Engg] student of CSA. He has obtained his B.Tech
from M.S.Ramaiah Institute of Technology and has 2 years industry experience
at Goldman Sachs in the Virtualization and Linux engineering team.
 Series: CSA Distinguished Lecture Title: Old and New Algorithms for Guarding Polygons  Speaker: Prof. Subir Kumar Ghosh
Professor
Ramakrishna Mission Vivekananda University  Date and Time: Thursday, May 25, 2017, 10:30 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract The art gallery problem is to determine the number of guards (or
cameras) that are sufficient to cover or see every point in the interior
of an art gallery. An art gallery can be viewed as a polygon P (with or
without holes) with a total of n vertices, and guards as points in P. Any
point z in P is said to be visible from a guard g if the line segment
joining z and g does not intersect the exterior of P. This problem was
first posed by Victor Klee in a conference in 1973, and in the course of
time, it has become one of the important problems in computational
geometry with extensive applications to surveillance of buildings like
airport terminals, railway stations etc.
A polygon is said to be guarded by a set of chosen guards if every interior
point of P is visible from some guard within the guard set. In this lecture,
we will first present an O(log n)approximation algorithm to find the
minimum number of guards necessary for guarding P, which was designed by
Ghosh way back in 1986. For this problem, the approximation ratio was
improved to O(loglog OPT) by King and Kirkpatrickin 2011. Ghosh had also
conjectured in 1986 that a constantfactor approximation algorithm exists
for this problem in case of polygons without holes. This conjecture was
settled recently for a special class of P (without holes) by Bhattacharya,
Ghosh and Roy, when they presented a 6approximation algorithm for this
problem. An outline of this algorithm will be provided in the lecture.
Chromatic art gallery problems arise from applications in areas like
landmarkbased navigation and wireless networks. One such problem is the
weakconflict free guarding of polygons, where the objective is to colour
a guard set for P using minimum number of colours such that each point
within P sees at least one guard with an unique colour. So finally, we
will present an algorithm for weak conflictfree guarding of P (without
holes)where the guard set is coloured using only O(log n) colours.
Speaker Bio: Subir Kumar Ghosh was a professor of computer science and discrete applied
mathematics at Tata Institute of Fundamental Research, Mumbai till July,
2015. Since then, he is a professor at RKM Vivekananda University, Belur,
West Bengal. He is a Fellow of Indian Academy of Sciences since 1998.
Recently, he has been awarded a professorship by the National Board for
Higher Mathematics for his lifetime contributions to Mathematics. He
worked
as visiting scientists at many reputed universities and research institutes
around the world, and gave several invited lectures in the national and
international conferences, workshops and in the institutes in India and
abroad. Currently, he is the Chair of the Steering Committee of an
international conference on algorithms and discrete applied mathematics
(CALDAM). He is the author of a book and around sixty six papers in the
fields of computational geometry, robot path planning, geometric graph
theory and algorithms (sequential, parallel, online and approximation).
Host Faculty: Prof. Sathish Govindarajan
 Series: M.Sc. (Engg) Colloquium Title: COMPILATION OF GRAPH ALGORITHMS FOR HYBRID, CROSSPLATFORM AND DISTRIBUTED ARCHITECTURES  Speaker: Parita Patel
 Faculty Advisor: Y.N. Srikant
 Date and Time: Tuesday, May 23, 2017, 12:00 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Graph algorithms are abundantly used in various disciplines. These algorithms
perform poorly due to random memory access and negligible spatial locality. In
order to improve performance, parallelism exhibited by these algorithms can be
exploited by leveraging modern high performance parallel computing resources.
Implementing graph algorithms for these parallel architectures requires manual
thread management and memory management which becomes tedious for a programmer.
Large scale graphs cannot fit into the memory of a single machine. One solution
is to partition the graph either on multiple devices of a single node or on
multiple nodes of a distributed network. All the available frameworks for such
architectures demand unconventional programming which is difficult and error
prone.
To address these challenges, we propose a framework for compilation of graph
algorithms written in an intuitive graph domainspecific language, Falcon. The
framework targets shared memory parallel architectures, computational
accelerators and distributed architectures (CPU and GPU cluster). First, it
analyses the abstract syntax tree (generated by Falcon) and gathers essential
information. Subsequently, it generates optimized code in OpenCL for
sharedmemory parallel architectures and computational accelerators, and OpenCL
coupled with MPI code for distributed architectures. Motivation behind generating
OpenCL code is its platformagnostic and vendoragnostic behavior, i.e., it is
portable to all kinds of devices. Our framework makes memory management, thread
management, message passing, etc., transparent to the user. To the best of our
knowledge, none of the available domainspecific languages, frameworks or parallel
libraries handle portable implementations of graph algorithms.
Experimental evaluations demonstrate that the generated code performs comparably
to the stateoftheart nonportable implementations and handtuned implementations.
The results also show portability and scalability of our framework.
Speaker Bio: Parita Patel is an M.Sc(Engg.) student in the CSA department.
Host Faculty: Y.N. Srikant
 Series: Department Seminar Title: Deep Learning made easy with DARVIZ  Speaker: Shreya Khare and Anush Sankaran
 Date and Time: Thursday, May 18, 2017, 3:30 PM
 Venue: CSA Lecture Hall (Room No. 117, Ground Floor)
Abstract Struggling with implementing research papers in deep learning? Confused
between which deep learning framework to use  Tensorflow, Theano,
CAFFE, etc? Wasting hours of precious time in debugging code for building
the right deep learning model? You will not longer face these challenges!
DARVIZ. Its a visual programming IDE, where you could design a deep learning
model using an intuitive draganddrop framework. Once designed, DARVIZ
could write the execution ready deep learning code for you in three platforms
 Tensorflow, CAFFE, and Theano. This tool helps in extreme rapid prototyping
of deep learning models and implementation of state of art papers.
Speaker Bio: Shreya Khare is software engineer working in the area of deep learning and automation at IBM Research in Bangalore, India. She is currently working on DARVIZ, a deep learning IDE for implementing, visualizing and validating deep learning models. Her research interests include Machine Learning, Speech Processing, Computer Vision and Natural Language Processing. She has completed her MS by Research from Department of Computer Science, IIT Madras.
Anush Sankaran is a Researcher in the Cognitive Solutions and Services team. His research interests include deep learning, image processing, human cognition, and their applications. He is now primarily leading efforts in two projects: DARVIZ and Machine Learning for Creativity. He is currently pursuing the Ph.D. degree with the Indraprastha Institute of Information Technology, New Delhi, India. He was a recipient of the TCS Ph.D. Research Fellowship from 2010 to 2015. He has written many peerreviewed conferences and journals and also has the Best Poster Awards in the IEEE BTAS 2013 and the IEEE IJCB 2014. Anush received the B.Tech. degree (Gold Medal) in computer science from the Coimbatore Institute of Technology, Coimbatore, India, in 2010.
Host Faculty: Aditya Kanade
 Series: Department Seminar Title: Software Transactional Memory Systems for Processing LargeScale Graph Analytics  Speaker: Sathya Peri
 Date and Time: Wednesday, May 17, 2017, 3:30 PM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Nowadays, multicore systems are quite ubiquitous from cell phones to
high performance computation servers. These systems were introduced
around 2004, 50 years of exponential improvement in the performance of
sequential computers ended. In order to get a continued speedup on
these processors, applications need to be able to harness the
parallelism of the underlying hardware. This is commonly achieved
using multithreading. Yet writing correct and scalable multithreaded
programs is far from trivial. In multithreaded programs sets of
semantically related actions may need to execute in mutual exclusion
to avoid semantic inconsistencies.
Traditionally, multithreaded programs were developed in conjunction
with locks to address these issues. But programming with locks has
many disadvantages such as deadlocks, priority inversion etc. and
makes it difficult to build scalable software systems. Importantly,
lock based software components are difficult to compose i.e. build
larger software systems using simpler software components. In recent
years, Software Transactional Memory (STM) has garnered significant
interest as an elegant alternative for developing concurrent
code. Software transactions address many of the shortcomings of lock
based systems. Multithreaded programs used with STMs address many of
these challenges.
In the past few years, analytics over large data, (also commonly
referred to as BigData Analytics) has become an very important in
various industries. Many of the problems of large data and datamining
can be cast as problems of largescale graph analytics problems. This
would typically involve looking for operations on large graphs having
millions of vertices and edges.
In this talk, I will give an overview of Software Transactional
Memory (STMs). I will describe the correctness requirements of STM
systems, describe its current state. Then, I will briefly describe how
STMs can benefit largescale graph analytics. I will specifically
describe about how STMs can benefit the problem of graph coloring
having millions of vertices.
Speaker Bio: Dr. Sathya Peri is an Associate Professor at IIT Hyderabad. He
received his masters in Computer Science from Madurai Kamaraj University, India in
2001. After working as a software engineer at HCL Technologies, India
for around one year he joined University of Texas at Dallas as a
graduate student. He obtained Masters degree and Ph.D. degrees in
Computer Science from the University of Texas at Dallas in 2004 and
2007 respectively. He worked under the guidance of Prof. Neeraj Mittal
in the area of monitoring distributed systems. Then he worked at
INRIA, Rennes, France as a postdoc from Sept 2007 to Dec 2008 in the
area of Gossip Protocols for Peer to Peer systems. From Jan 2009 to
Apr 2010, he was a Postdoc at Memorial University working with
Prof. Vidyasankar in the area of Software Transactional Memory.
Host Faculty: Deepak D'Souza
 Series: M.Sc. (Engg) Colloquium Title: An Exploratory Framework for Cyclone Identification and Tracking  Speaker: Akash Anil Valsangkar
 Faculty Advisor: Prof. Vijay Natarajan
 Date and Time: Wednesday, May 17, 2017, 11:00 AM
 Venue: CSA Multimedia Class (Room No. 252, First Floor)
Abstract Analyzing depressions plays an important role in meteorology, particularly
in the study of cyclones. In particular, the study of the temporal evolution
of cyclones requires a robust depression tracking framework. Depressions
and cyclones are not welldefined objects and their shape and size
characteristics change over time. Further, current extraction and tracking
methods depend on many parameters and are often specialized for particular
regions. In visualization literature, most of the tracking algorithms use
spatial analysis for feature identification and noise removal. With state
of the art methods, temporal noise leads to cluttered visualization.
We propose a pipeline for identification and tracking of cyclones over time.
Our method requires only a few intuitive parameters. This method for cyclone
identification within each time step is based on wellestablished topological
concepts enabling a robust tracking. Candidate tracks are computed from an
optical flow field. These tracks are clustered within a moving time window
and forwarded to a final tracking step. An exploratory framework helps in
the study of cyclone movement and identifying smooth, representative tracks.
Multiple case studies demonstrate the effectiveness of the method in
tracking cyclones, both in the northern and southern hemisphere.
 Series: M.Sc. (Engg) Thesis Defense Title: Utilizing Worker Groups and Task Dependencies in Crowdsourcing  Speaker: Mr. Prakhar Ojha
 Faculty Advisor: Dr. Partha Pratim Talukdar
 Date and Time: Monday, May 08, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Crowdsourcing has evolved from solving simpler tasks, like
imageclassification, to more complex tasks such as document editing,
language translation, product designing etc. Unlike microtasks performed
by a single worker, these complex tasks require a team of workers and group
output is often the atomic unit of evaluation. If the taskrequester is
interested in making individual payments based on their respective efforts
in the group, she will need a strategy to discriminate between workers (who
contribute positively) from idlers (who do not contribute to group task).
It is a nontrivial problem, especially since only group output is
evaluated. In the first part of this talk, I shall demonstrate how ideas
from Group Testing may be effectively used for this challenging problem.
This is the first application of Group Testing to crowdsourcing. I shall
also list out open problems in this area.
In the second part of the talk, I shall focus on Relational Crowdsourcing
(RelCrowd), a novel crowdsourcing paradigm where human intelligence tasks
(HITs) are created by taking task interdependencies. I shall describe how
this framework may be used in the evaluation of large knowledge graphs
(KG), we call the resulting system KGeval. Estimating accuracy of
automatically constructed KGs is a challenging problem due to their size
and diversity. We shall show that the objective optimized by KGEval is
submodular and NPhard, thereby allowing for approximation algorithms with
guarantees. Through experiments on realworld datasets, our results
demonstrate that KGEval is able to estimate KG accuracy more accurately
compared to other competitive baselines, while requiring significantly
lesser number of human evaluations.
 Series: CSA Distinguished Lecture Title: The Magnificent Seven  Speaker: Prof. Vivek Shripad Borkar
Institute Chair Professor
Department of Electrical Engineering
Indian Institute of Technology, Bombay  Date and Time: Thursday, May 04, 2017, 2:30 PM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract This talk will give an overview of restless bandits and then
will introduce individually, seven restless bandits. The summary in
brief is:
Bandits are a pest
if they don't get much rest,
so do relax a little
your constraints a la Whittle.
Then all that it takes
is a simple index.
Speaker Bio: Prof. Vivek Borkar has made stellar contributions in the areas
of stochastic control and optimization, reinforcement learning, random
processes, and several other areas in control and optimization. He is a
J.C. Bose National Fellow and a Fellow of IEEE, INSA, IASc, INAE, NASI, and
TWAS. He is a recipient of several best paper awards and is widely known
for many pathbreaking fundamental results in the above areas.
Host Faculty: Prof. Y Narahari
 Series: M.Sc. (Engg) Thesis Defense Title: Stochastic Newton methods with enhanced Hessian estimation  Speaker: Mr. Danda Sai Koti Reddy
M.Sc (Engg.) student at CSA department  Faculty Advisor: Prof. Shalabh Bhatnagar
 Date and Time: Thursday, May 04, 2017, 11:00 AM
 Venue: CSA Seminar Hall (Room No. 254, First Floor)
Abstract Optimization problems involving uncertainties are common in a variety
of engineering disciplines such as transportation systems, manufacturing,
communication networks, healthcare and finance. The large number of input
variables and the lack of a system model prohibit a precise analytical
solution and a viable alternative is to employ simulationbased optimization.
The idea here is to simulate a few times the stochastic system under consideration
while updating the system parameters until a good enough solution is obtained.
Formally, given only noisecorrupted measurements of an
objective function, we wish to find a parameter which minimizes the
objective function. Iterative algorithms using statistical methods search
the feasible region to improve upon the candidate parameter. Stochastic
approximation algorithms are best suited, most studied and applied
algorithms for finding solutions when the feasible region is a
continuously valued set. One can use information on the gradient/Hessian
of the objective to aid the search process. However, due to the lack of
knowledge of the noise distribution, one needs to estimate the
gradient/Hessian from noisy samples of the cost function obtained from
simulation. Simple gradient search schemes take many iterations to
converge to a local minimum and are heavily dependent on the choice of
stepsizes. Stochastic Newton methods, on the other hand, can counter the
illconditioning of the objective function as they incorporate
secondorder information into the stochastic updates. Stochastic Newton
methods are often more accurate than simple gradient search schemes.
We propose enhancements to the Hessian estimation scheme used in
two recently proposed stochastic Newton methods, based on ideas of
random directions stochastic approximation (2RDSA) and simultaneous
perturbation stochastic approximation with three function measurements
(2SPSA3), respectively. The proposed scheme reduces the error in the
Hessian estimate by (i) incorporating a zeromean feedback term; and (ii)
optimizing the stepsizes used in the Hessian recursion. We prove that
both 2RDSA and 2SPSA3 with our Hessian improvement procedure converge
asymptotically to the true Hessian. The advantage with both 2RDSA and
2SPSA3 is that they require only 75% of the simulation cost periteration
for 2SPSA with improved Hessian estimation (2SPSAIH). Numerical
experiments show that 2RDSAIH outperforms both 2SPSAIH and 2RDSA without
the improved Hessian estimation procedure.

