Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India

UPCOMING SEMINARS

PAST SEMINARS

Series: Distinguished AI Seminar
Title: Recent Research in Machine Perception at Google

  • Speaker: Rahul Sukthankar, VP Research, Google
  • Date and Time: Friday, May 20, 2022, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In this talk I will present some recent progress in machine perception at Google Research, both on fundamental research problems and on the technology powering several popular Google products. I will give a historical retrospective on some important research problems, a snapshot of relevant current work, and observations about the future. On the applied side, I will provide some background on how we build systems at Google that interpret, reason about, and transform sensory data, and why this aspect of AI is increasingly critical across so many products.

Speaker Bio:
Dr. Rahul Sukthankar is a VP at Google Research, where he co-leads the Machine Perception org. He also served as an adjunct research professor at the Robotics Institute at Carnegie Mellon (1997-2020) and as courtesy faculty at the University of Central Florida (2007-2020). Dr. Sukthankar received his Ph.D. in Robotics from Carnegie Mellon in 1997 and his B.S.E. in Computer Science from Princeton in 1991. He has organized several major conferences in his field (e.g., General Chair, CVPR'21), served as Editor-in-Chief of the Machine Vision and Applications journal, and is a Fellow of the IEEE.

A high tea at the CSA lawns will follow the talk.

Microsoft Teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MzUwZGE2ZTUtZTExYi00OWMwLTg0MDktNmJmM2JiZGMxNzI4%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22c747ccaa-ceaa-4197-b4cb-ce2f1d4694da%22%7d

Host Faculty: Prof. Partha Pratim Talukdar


Series: Theory Seminar
Title: Interventional Complexity of Learning Causal Graphs

  • Speaker: Vibhor Porwal,
                   Research Associate II,
                   Adobe Research, Bangalore
  • Date and Time: Friday, May 20, 2022, 11:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
A well-studied challenge that arises in the structure learning problem of causal directed acyclic graphs (DAGs) is that using observational data, one can only learn the graph up to a "Markov equivalence class" (MEC). The remaining undirected edges have to be oriented using interventions, which can be very expensive to perform in applications. Thus, the problem of minimizing the number of interventions needed to fully orient the MEC has received a lot of recent attention. In this talk, I will describe a new universal lower bound on the number of single-node interventions that any algorithm (whether active or passive) would need to perform in order to orient a given MEC. I will then discuss the tightness of this lower bound and compare it with previously known lower bounds. I will also describe the notion of CBSP orderings, which are topological orderings of DAGs without v-structures, and underlie the main technical idea for proving our lower bound.

This is based on joint work with Piyush Srivastava (TIFR) and Gaurav Sinha (Adobe Research). Paper link: https://arxiv.org/abs/2111.05070
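For background: only collider patterns ("v-structures") are orientable from observational data, and the remaining edges are what interventions must resolve. As a toy illustration (not from the talk), the sketch below detects the v-structures of a known DAG — triples a -> c <- b where a and b are non-adjacent:

```python
from itertools import combinations

def v_structures(nodes, edges):
    """Return the set of v-structures (a, c, b) in a DAG given its directed edges."""
    parents = {v: set() for v in nodes}
    adjacent = set()
    for u, v in edges:
        parents[v].add(u)
        adjacent.add(frozenset((u, v)))
    out = set()
    for c in nodes:
        for a, b in combinations(sorted(parents[c]), 2):
            if frozenset((a, b)) not in adjacent:  # collider with non-adjacent parents
                out.add((a, c, b))
    return out

# X -> Z <- Y is a v-structure; X -> W is not part of one.
print(v_structures(["X", "Y", "Z", "W"], [("X", "Z"), ("Y", "Z"), ("X", "W")]))
# {('X', 'Z', 'Y')}
```

Every DAG in the same MEC shares this skeleton and these v-structures; the talk's lower bound concerns how many single-node interventions are needed to pin down the rest.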

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Sruthi Gorantla and Rahul Madhavan


Series: Department Seminar
Title: Automating Distributed Heterogeneous Computing for Python Programmers

  • Speaker: Prof. Vivek Sarkar,
                   Chair, School of Computer Science,
                   and Stephen Fleming Chair for Telecommunications,
                   College of Computing,
                   Georgia Institute of Technology
  • Date and Time: Thursday, May 19, 2022, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Multiple simultaneous disruptions are currently under way in both hardware and software, as we consider the implications for future HPC systems. In hardware, “extreme heterogeneity” has become critical to sustaining cost and performance improvements after Moore's Law, but poses significant productivity challenges for developers. In software, the rise of large-scale data science and AI applications is being driven by domain experts from diverse backgrounds who demand the programmability that they have come to expect from high-level languages like Python.

While current foundations for compiler and runtime technologies have served us well for many decades, we now see signs of their limitations in the face of these disruptions. This talk makes a case for new approaches to enable productivity and programmability of future HPC systems for domain scientists, and discusses preliminary results obtained in response to the challenge of automating distributed heterogeneous computing for Python programmers.

Speaker Bio:
Vivek Sarkar is Chair of the School of Computer Science and the Stephen Fleming Chair for Telecommunications in the College of Computing at the Georgia Institute of Technology. He conducts research in multiple aspects of programmability and productivity in parallel computing, including programming languages, compilers, runtime systems, and debuggers for parallel, heterogeneous, and high-performance computer systems. Sarkar started his career at IBM Research after obtaining his Ph.D. from Stanford University, supervised by John Hennessy. His research projects at IBM included the PTRAN automatic parallelization system led by Fran Allen, the ASTI optimizer for IBM's XL Fortran product compilers, the open-source Jikes Research Virtual Machine for the Java language, and the X10 programming language developed in the DARPA HPCS program. He was a member of the IBM Academy of Technology during 1995-2007. After moving to academia, Sarkar has mentored over 30 Ph.D. students and postdoctoral researchers in the Habanero Extreme Scale Software Research Laboratory, first at Rice University since 2007 and now at Georgia Tech since 2017. Researchers in his lab have developed the Habanero-C/C++ and Habanero-Java programming systems for parallel, heterogeneous, and distributed platforms. While at Rice, Sarkar held the E.D. Butcher Chair in Engineering, served as Chair of the Department of Computer Science, and created a new sophomore-level course on the fundamentals of parallel programming as well as a three-course Coursera specialization on parallel, concurrent, and distributed programming. Sarkar is an ACM Fellow and an IEEE Fellow. He has served as a member of the US Department of Energy's Advanced Scientific Computing Advisory Committee (ASCAC) since 2009 and on CRA's Board of Directors since 2015. He is also the recipient of the 2020 ACM-IEEE CS Ken Kennedy Award.
Microsoft teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZjgwMGJhNTItMDUyMi00MTRhLWI3MmItMmI5M2JmM2ZkYTJm%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22c747ccaa-ceaa-4197-b4cb-ce2f1d4694da%22%7d

Host Faculty: Prof. Uday Kumar Reddy B


Series: Theory Seminar
Title: Finding Adversarially Robust Representations

  • Speaker: Aravindan Vijayaraghavan,
                   Associate Professor,
                   CS department,
                   Northwestern University.
  • Date and Time: Thursday, May 12, 2022, 9:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Adversarial robustness measures the susceptibility of a machine learning algorithm to small perturbations made to the input either at test time or at training time. Our current theoretical understanding of adversarial robustness is limited, and has mostly focused on supervised learning tasks. In this talk, I will consider a natural extension of Principal Component Analysis (PCA) where the goal is to find a low dimensional subspace to represent the given data with minimum projection error, and that is in addition robust to small perturbations. Unlike PCA, which is solvable in polynomial time, this formulation is computationally intractable to optimize as it generalizes the well-studied sparse PCA problem. I will describe an efficient algorithm that finds approximately optimal solutions and show how it can be used as a robust subroutine for many downstream learning tasks, including training more certifiably robust neural networks. Based on joint works with Pranjal Awasthi, Xue Chen, Vaggos Chatziafratis, Himanshu Jain, and Ankit Singh Rawat.
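For context, the classical (non-robust) PCA objective that the talk's formulation extends is solvable exactly via the SVD; a minimal sketch on synthetic data:

```python
import numpy as np

# Plain PCA baseline: find the rank-k subspace minimizing the total squared
# projection error. The talk asks for the same objective plus robustness to
# small input perturbations, which makes the problem intractable in general.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 0] *= 10.0                       # one high-variance direction

k = 1
U, S, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:k].T @ Vt[:k]                 # projector onto top-k principal subspace
err = np.linalg.norm(X - X @ P, "fro") ** 2

# Optimal rank-k error = sum of the discarded squared singular values.
assert np.isclose(err, np.sum(S[k:] ** 2))
```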

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Sruthi Gorantla and Rahul Madhavan


Series: Department Seminar
Title: Contextual Concurrency Control

  • Speaker: Prof. Sanidhya Kashyap
                   
                   EPFL, Switzerland
  • Date and Time: Wednesday, May 11, 2022, 2:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Kernel synchronization primitives are of paramount importance to achieving good performance and scalability for applications. However, they are usually invisible and out of the reach of application developers. Instead, kernel developers and synchronization experts make all the decisions regarding kernel lock design.

In this talk, I will propose a paradigm, called contextual concurrency control (C3), that enables applications to tune concurrency control in the kernel. C3 allows developers to change the behavior and parameters of kernel locks, to switch between different lock implementations and to dynamically profile one or multiple locks for a specific scenario of interest. This approach opens up a plethora of opportunities to fine-tune concurrency control mechanisms on the fly.
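A rough user-space analogue of the switching idea (illustrative only; C3 itself targets kernel locks and is not a Python API) is a single lock interface whose underlying implementation can be swapped at runtime:

```python
import threading

# Illustrative sketch: one lock object, interchangeable implementations.
class SwitchableLock:
    def __init__(self, impl=threading.Lock):
        self._impl = impl()

    def switch(self, impl):
        # Swap in a different implementation (e.g. threading.RLock for a
        # reentrant phase). A real system must quiesce waiters first.
        self._impl = impl()

    def __enter__(self):
        self._impl.acquire()
        return self

    def __exit__(self, *exc):
        self._impl.release()

counter = 0
lock = SwitchableLock()

def work():
    global counter
    for _ in range(1000):
        with lock:
            counter += 1

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 4000
```

C3's contribution is doing this safely for kernel locks, with per-scenario profiling and tuning; the sketch only conveys the interface/implementation split.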

Speaker Bio:
Sanidhya Kashyap is a systems researcher and an assistant professor at the EPFL School of Computer and Communication Sciences (IC). His research focuses on designing systems software that are performant and robust for heterogeneous hardware. He is broadly interested in the area of systems with a particular focus on operating systems, concurrency, storage, scheduling, data analytics, and robustness.

Host Faculty: Arkaprava Basu


Series: Theory Seminar
Title: New techniques and results for Support Recovery in Mixture Models

  • Speaker: Soumyabrata Pal,
                   Post-Doctoral Researcher,
                   Google Research, India
  • Date and Time: Friday, May 06, 2022, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Mixture models with high dimensional parameter vectors are widely used to fit complex multimodal datasets as they allow representation of latent sub-populations within the overall population. The primary difficulty in learning mixture models is that the observed data does not identify the subpopulation to which an individual observation belongs.

We study the problem of support recovery in mixture models parameterized by sparse vectors, i.e., our goal is to recover the set of non-zero indices of each of the unknown vectors. We present a very generic framework (including a novel tensor-based algorithm) for support recovery that uses estimates of the number of unknown vectors having non-zero entries in small subsets of indices. We then show a variety of techniques to estimate these quantities in different mixture models. Our results for support recovery are quite general: they apply to (1) mixtures of many canonical distributions, including Uniform, Poisson, Laplace, and Gaussian, and (2) mixtures of linear regressions and linear classifiers.

Based on joint works (https://arxiv.org/abs/2202.11940, https://arxiv.org/abs/2106.05951) with Arya Mazumdar and Venkata Gandikota.
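As a toy illustration of the setting (far simpler than the tensor-based framework in the papers), thresholding per-coordinate energy recovers the union of supports in a mixture of Gaussians with sparse means:

```python
import numpy as np

# Synthetic mixture of two Gaussians with sparse mean vectors. Coordinates
# outside every support carry only unit-variance noise (E[x_i^2] = 1), while
# on-support coordinates carry extra energy, so a simple threshold separates
# them. The actual talk recovers each component's support, a harder task.
rng = np.random.default_rng(1)
d, n = 20, 4000
means = np.zeros((2, d))
means[0, [1, 5]] = 3.0                # support of component 1
means[1, [5, 9]] = -3.0               # support of component 2

z = rng.integers(0, 2, size=n)        # latent component labels (unobserved)
X = means[z] + rng.normal(size=(n, d))

energy = np.mean(X ** 2, axis=0)      # ~1 off-support, noticeably larger on-support
union = set(np.flatnonzero(energy > 2.0))
print(union)  # {1, 5, 9}
```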

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Sruthi Gorantla and Rahul Madhavan


Series: Theory Seminar
Title: Near Optimal Split-state Non-malleable Codes

  • Speaker: Dr. Sai Lakshmi Bhavana,
                   Post Doctoral Researcher,
                   Microsoft Research, India
  • Date and Time: Friday, April 29, 2022, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
At ITCS 2010, Dziembowski, Pietrzak, and Wichs introduced Non-malleable Codes (NMCs) which protect against tampering of a codeword of a given message into the codeword of a related message. A well-studied model of tampering is the 2-split-state model where the codeword consists of two independently tamperable states. As with standard error-correcting codes, it is of great importance to build codes with high rates.

Following a long line of work, Aggarwal and Obremski (FOCS 2020) showed the first constant-rate non-malleable code in the 2-split-state model; however, this constant was a minuscule 10^{-6}! In our work [1], we build a non-malleable code with rate 1/3 (nearly matching the rate-1/2 upper bound for this model). This work will be the focus of my talk!

[1] Rate One-Third Non-malleable Codes, STOC 2022. Divesh Aggarwal, Sruthi Sekar, Bhavana Kanukurthi, Maciej Obremski, Sai Lakshmi Bhavana Obbattu

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Arindam Khan


Series: M.Tech (Research) Colloquium - ON-LINE
Title: Boolean Functional Synthesis using Gated Continuous Logic Networks

  • Speaker: Mr. Ravi Raja,
                   M.Tech (Research) student,
                   Dept. of C.S.A
  • Faculty Advisor: Prof. Chiranjib Bhattacharyya and Prof. Deepak D'Souza
  • Date and Time: Thursday, April 28, 2022, 9:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Boolean Functional Synthesis (BFS) is a well-known challenging problem in the domain of automated program synthesis from logical specifications. This problem aims to synthesize a Boolean function that is correct-by-construction with respect to the declared specification; this specification symbolically relates the inputs and outputs of the function to be synthesized. Since Boolean functions are the basic building blocks of modern digital systems, BFS has applications in a wide range of areas, including QBF-SAT solving, circuit repair and debugging. This has motivated the community to develop practically efficient algorithms for synthesizing compact Boolean functions, which is a non-trivial endeavor. However, to the best of our knowledge, current techniques are unable to specify a bound on the Boolean function size during synthesis. Specifying a bound on the size of the formula offers flexibility in synthesizing minimal-sized Boolean functions.

Learning Boolean functions from logical specifications using neural networks is a difficult problem, as it requires the network to represent Boolean functions. Boolean functions are discrete and consequently non-differentiable, so learning a Boolean function directly with traditional neural networks is not possible. Recently, Ryan et al. proposed the Gated Continuous Logic Network (GCLN) model, which builds on fuzzy logic to represent Boolean and linear integer operators, in the context of learning invariants for programs. In this work, we investigate the use of the GCLN model to synthesize solutions to the BFS problem. Our model lets us bound the number of clauses used in the synthesized Boolean function.
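The core relaxation idea can be sketched with standard fuzzy-logic t-norms (GCLN's exact gates may differ): replace Boolean AND/OR/NOT with smooth operators that agree with the discrete ones on {0, 1} but are differentiable in between, so gradients can flow:

```python
# Common fuzzy-logic relaxation of Boolean operators (illustrative; not
# necessarily the exact gates used in GCLN).
def fuzzy_and(x, y):
    return x * y                      # product t-norm

def fuzzy_or(x, y):
    return x + y - x * y              # probabilistic sum (dual t-conorm)

def fuzzy_not(x):
    return 1.0 - x

# On crisp {0, 1} inputs these reduce to ordinary Boolean logic...
assert fuzzy_or(fuzzy_and(1.0, 0.0), fuzzy_not(0.0)) == 1.0
# ...while on soft inputs they are smooth, enabling gradient-based training.
print(fuzzy_and(0.9, 0.8))  # ~0.72
```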

We implement this approach in our tool BNSynth (for Bounded Neural Synthesis), which also uses sampling and counterexample-guided techniques to synthesize Boolean functions. We validate our hypothesis that this system can learn smaller functions compared to a state-of-the-art tool, over custom benchmarks. We observe a 2.4X average improvement in formula size for these benchmarks. This empirically demonstrates that our system can synthesize smaller Boolean functions than the state of the art.

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MzJlZDc0YTMtYTRmYS00N2Y0LWE5YWQtYzI1YjY2OGI2ODk1%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22e2a9daf6-b4d3-4abe-ae30-2ccb517ed18d%22%7d


Series: M.Tech (Research) Colloquium - ON-LINE
Title: Neural Approaches for Natural Language Query Answering over Source Code

  • Speaker: Ms. Madhurima Mandal,
                   M.Tech (Research) student,
                   Dept. of C.S.A
  • Faculty Advisor: Prof. Aditya Kanade and Prof. Shirish Shevade
  • Date and Time: Thursday, April 28, 2022, 11:30 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
During software development, developers need to ensure that the developed code is bug-free and the best coding practices are followed during the code development process. To guarantee this, the developers require answers to queries about specific aspects of the code relevant to the development. Powerful code-query languages such as CodeQL have been developed for this purpose. Use of such code-query languages, however, requires expertise in writing formal queries. For each separate query, one needs to write several lines in a code-query language.

To remedy these problems, we propose to represent each query by a natural language phrase and to answer such queries using neural networks. We aim to train a single model that can answer multiple queries, as opposed to writing a separate formal query for each task. Such a model can answer these queries against unseen code. With this motivation, we introduce the novel NlCodeQA dataset. It includes 171,346 labeled examples, where each input consists of a natural language query and a code snippet, and each label is the answer span in the input code snippet with respect to the input query. State-of-the-art BERT-style neural architectures were trained on NlCodeQA. Preliminary experimental results show that the proposed model achieves an exact-match accuracy of 86.30%.

The proposed use of natural language queries and neural models for query understanding will help increase the productivity of software developers, and paves the way for machine-learning-based code analysis tools that complement existing code analysis systems for complex queries that are hard or expensive to express in a formal query language.

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZjAzZTBhYzQtYmVjZC00MjEyLTlkZDUtYTNmMTE4OGQ0MTdh%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%229c3e2bfe-1b0f-4d7b-a589-832878069dc6%22%7d


Series: M.Tech (Research) Colloquium - ON-LINE
Title: A Context-Aware Neural Approach for Explainable Citation Link Prediction

  • Speaker: Ms. Soumya Jain,
                   M.Tech (Research) student,
                   Dept. of C.S.A
  • Faculty Advisor: Prof. Shirish Shevade
  • Date and Time: Thursday, April 28, 2022, 10:30 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Citations have become an integral part of scientific publications. They play a crucial role in supporting authors' claims throughout a scientific paper. However, citing related work is a challenging and laborious task, especially for novice researchers who are not very familiar with the literature and have little or no experience in writing citation text. In this work, we study the task of Citation Link Prediction and propose a novel neural architecture called ExCite, which predicts the existence of a citation link between a pair of scientific documents within a given context. More importantly, it also generates the corresponding citation text at the same time. For this purpose, ExCite leverages diverse role-based views of the documents to learn robust document representations. The proposed model achieves state-of-the-art performance on both the citation link prediction and citation text generation subtasks. We performed an extensive set of experiments to show the effectiveness of each module in the proposed neural architecture, and evaluated our explanations using a wide range of state-of-the-art automatic evaluation metrics. Through qualitative and quantitative analyses, we show that ExCite generates high-quality citation text that is highly coherent with the citation context.

Microsoft teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MzQxMmQ2OTktNGVlNi00NDI0LTgwYTQtNjc5ZmNlMWJjNDhk%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%229c3e2bfe-1b0f-4d7b-a589-832878069dc6%22%7d


Series: Department Seminar
Title: Future of Compute

  • Speaker: Jim Keller, CTO and President, Tenstorrent
  • Date and Time: Wednesday, April 27, 2022, 2:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Designing a high performance architecture to run continually evolving machine learning models is a complex problem. At the same time, rapid adoption of AI is increasing the need for scalable systems to run such models. This technical talk will apply first-principles thinking to system architecture that can be the foundation for AI/ML compute.

We will discuss the structure of a machine learning model and how next generation AI compute uses Scalar, Vector, Matrix and Tensor architecture. We will also expand on the components that will become the backbone of AI architectures, such as new data formats and RISCV - an open source instruction set architecture. Finally we will look at the implications of building chips and systems in this new era of artificial intelligence and how it will drive Foundry, Silicon, SOC, IPs, Data Center, Cloud and Software 2.0, the next iteration in software development.

The talk will be followed by a Networking and Interaction Session with the Students.

Speaker Bio:
Jim Keller is the CTO and President of Tenstorrent, a next-generation computing company with the mission of addressing the rapidly growing compute demands of Software 2.0. Prior to joining Tenstorrent, he served two years as Senior Vice President of Intel's Silicon Engineering Group. He has held roles as Tesla's Vice President of Autopilot and Low Voltage Hardware, Corporate Vice President and Chief Cores Architect at AMD, and Vice President of Engineering and Chief Architect at P.A. Semi, which was acquired by Apple Inc. During his long and impactful career he has helped create some of today's most iconic technology, including the AMD Zen, K7 (Athlon), and K8 micro-architectures, the Apple A4-A7 processors, the x86-64 instruction set, and the HyperTransport interconnect.

Note: The talk will start at 2:30 PM; the networking and interaction session, with coffee and snacks, will run from 3:30 PM to 4:15 PM.

Host Faculty: R. Govindarajan


Series: Department Seminar
Title: Performance Characterization of .NET Benchmarks

  • Speaker: Dr. Gagan Gupta
                   Principal Researcher, Microsoft Research
  • Date and Time: Tuesday, April 26, 2022, 2:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Managed language frameworks are pervasive, especially in modern datacenters. .NET is one such framework that is used widely in Microsoft Azure but has received little attention from computer architects. In this work we organize a set of representative .NET benchmarks and characterize them for performance bottlenecks on modern hardware. Our study reveals that .NET applications have different characteristics than traditional programs due to the managed runtime. This affects the tradeoffs when designing hardware for such applications. We use Principal Component Analysis (PCA) and hierarchical clustering to create representative subsets of open-source .NET and ASP.NET benchmarks. We perform microarchitecture and application-level characterization of these subsets and show that they are significantly different from SPEC CPU2017 benchmarks in branch and memory behavior. We also analyze the effect of managed runtime events such as JIT (Just-in-Time) compilation and GC (Garbage Collection). Among other findings, GC improves cache performance and JITing could benefit from aggressive prefetching. As computing increasingly moves to the cloud and managed languages become more popular, it is important to include .NET-like benchmarks in architecture studies.
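The subsetting methodology can be sketched as follows (illustrative only; the study's actual feature set, PCA tooling, and clustering settings differ):

```python
import numpy as np

# Two synthetic "benchmark" groups, each with 6 hypothetical counter features.
rng = np.random.default_rng(2)
features = np.vstack([rng.normal(c, 0.1, size=(5, 6)) for c in (0.0, 3.0)])

# PCA: center the data, then project onto the top-2 principal components.
Xc = features - features.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T

# Greedy single-linkage-style grouping: join a cluster if any member is
# within distance 1 in PCA space, else start a new cluster.
clusters = []
for i in range(len(Z)):
    for c in clusters:
        if min(np.linalg.norm(Z[i] - Z[j]) for j in c) < 1.0:
            c.append(i)
            break
    else:
        clusters.append([i])

# Keep one representative per cluster: the member closest to the cluster mean.
reps = [min(c, key=lambda j, c=c: np.linalg.norm(Z[j] - Z[c].mean(axis=0)))
        for c in clusters]
print(len(clusters), reps)
```

The representatives then stand in for the full suite in detailed (and expensive) microarchitectural characterization.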

Speaker Bio:
Gagan Gupta is a Principal Researcher at Microsoft, where he leads a team of architects to study workloads and inform next-gen datacenter architecture. At Microsoft Gagan has also researched DNA storage and ISA designs for more sustainable computing. Gagan has held engineering and executive leadership positions in the semiconductor industry: architecting and launching commercially successful processors at LSI Logic (now Broadcom), Huawei, and ARC International (now Synopsys), and influencing R&D strategies at major corporations such as Intel and Sandisk. Gagan has a Ph.D. in Computer Sciences from the University of Wisconsin-Madison. His work has been recognized with awards at multiple industrial and academic fora and is cited in trade publications such as the New York Times.

Host Faculty: Arkaprava Basu


Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Title: Novel First-order Algorithms for Non-smooth Optimization Problems in Machine Learning

  • Speaker: Mr. Achintya Kundu,
                   Ph.D (Engg.) student,
                   Dept. of C.S.A
  • Faculty Advisor: Prof. Chiranjib Bhattacharyya
  • Date and Time: Tuesday, April 12, 2022, 10:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
This thesis is devoted to designing efficient optimization algorithms for machine learning (ML) problems where the underlying objective function to be optimized is convex but not necessarily differentiable. Such non-smooth objective functions are ubiquitous in ML, mainly due to the use of one or more of the following: (a) non-differentiable loss functions (e.g., the hinge loss in binary classification), (b) sparsity-promoting regularization terms (e.g., the L1 norm in regression), and (c) constraint sets (e.g., the elliptope in graph embedding) that induce specific structure on the parameters to be learned. Motivated by the wide range of learning problems that can be cast as optimization of a non-smooth convex objective, we focus on developing first-order algorithms with non-asymptotic convergence rate guarantees for solving such problems in a large-scale setting. Based on the shortcomings of existing research in this domain, we address the following specific issues in this thesis.

First, we consider the problem of minimizing a convex function over a feasible set given by the intersection of finitely many simple sets, each of which is equipped with a projection oracle. Examples of constraint sets that possess such structure include the set of doubly stochastic matrices, the elliptope, and the intersection of the PSD cone with an L1-norm ball. The main difficulty lies in computing the projection of a point onto the feasible set. Existing approaches yield an infeasible point with no guarantees on its infeasibility. Exploiting the intersecting sets' linear regularity property, we present an exact penalty approach that leads to first-order algorithms with explicit guarantees on the approximate solution's distance from the feasible set. Further, we show improved iteration-complexity when the objective possesses structural smoothness / strong convexity. This is achieved through a saddle-point reformulation where the proximal operators required by the primal-dual algorithms can be computed in closed form. We illustrate the benefits of our approach on a graph transduction problem and graph matching.

Next, we consider the classic composite problem of minimizing the sum of two convex functions: a smooth one (possessing a Lipschitz continuous gradient) and a simple non-smooth one with an easy-to-compute proximal operator. The well-known FISTA algorithm (and Nesterov's accelerated gradient method) achieves the optimal O(1/T^2) non-ergodic convergence rate for this problem. One of the drawbacks of these fast gradient methods is that they require computing gradients of the smooth function at points different from those on which the convergence rate guarantee applies. Inspired by Polyak's Heavy Ball method and the use of past gradients as momentum in training deep nets, we propose an accelerated gradient algorithm that rectifies this drawback while keeping the convergence rate intact. We achieve this through a judicious choice of momentum in both the primal and dual space. To the best of our knowledge, this is the first accelerated gradient algorithm that achieves the O(1/T^2) convergence rate guarantee on the iterates at which gradients are evaluated. Next, we propose modifications of Nesterov's accelerated gradient method that call the first-order oracle at the same points on which the convergence rate guarantees apply. The algorithms thus obtained additionally achieve a linear convergence rate when the objective function is strongly convex. This fills a significant research gap, as Polyak's Heavy Ball method guarantees an accelerated convergence rate through gradient momentum only for a restrictive class of twice differentiable and strongly convex objective functions.
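The gap described above can be seen numerically. A minimal comparison of gradient descent and standard Nesterov acceleration on a convex quadratic (standard textbook methods, not the thesis's new algorithms); note that NAG evaluates gradients at the extrapolated points y_k, not at the iterates x_k for which its rate is stated:

```python
import numpy as np

# f(x) = 0.5 * x' A x with an ill-conditioned diagonal A.
A = np.diag(np.linspace(1e-3, 1.0, 50))
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
L = 1.0                                  # Lipschitz constant of grad f
x0 = np.ones(50)

xg = x0.copy()                           # gradient-descent iterate
x, y = x0.copy(), x0.copy()              # NAG iterate and extrapolation point
for k in range(200):
    xg = xg - grad(xg) / L               # GD: O(1/T) rate
    x_new = y - grad(y) / L              # NAG: gradient taken at y, not x
    y = x_new + (k / (k + 3)) * (x_new - x)
    x = x_new

# On this problem NAG reaches a much smaller objective value than GD.
assert f(x) < f(xg) < f(x0)
print(f"GD: {f(xg):.2e}  NAG: {f(x):.2e}")
```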

Third, we focus on the problem of learning a positive semidefinite (PSD) kernel matrix from m similarity matrices under a general convex loss. The existing algorithms do not apply if one employs arbitrary loss functions and often cannot handle the m > 1 case. Based on the black-box approach of Mirror Descent (MD), we present several provably convergent iterative algorithms that exploit the availability of off-the-shelf support vector machine (SVM) solvers. One of the significant contributions involves an extension of the well-known MD algorithm for the simplex to handle the Cartesian product of PSD matrices, leading to a novel algorithm called Entropic Multiple Kernel Learning. We also show simulation results on protein structure classification involving several similarity matrices to demonstrate the proposed algorithms' efficacy.
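The classical building block being extended here is entropic mirror descent (the exponentiated-gradient update) on the probability simplex; a minimal sketch minimizing a linear loss (illustrative only — the thesis extends this to products of PSD matrices):

```python
import numpy as np

# Entropic mirror descent on the simplex: multiplicative update followed by
# renormalization (the Bregman projection under the entropy mirror map).
c = np.array([0.3, 0.1, 0.7, 0.5])    # linear loss <c, p> to minimize
p = np.full(4, 0.25)                  # uniform starting point
eta = 0.5                             # step size

for _ in range(300):
    p = p * np.exp(-eta * c)          # multiplicative-weights step
    p /= p.sum()                      # project back onto the simplex

print(np.round(p, 3))                 # mass concentrates on argmin c (index 1)
```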

Finally, we consider the class of problems possessing a convex-concave saddle point structure with bilinear interaction. This model encompasses most convex optimization problems arising in ML and includes minimizing the sum of many simple non-smooth convex functions as a special case; thereby, it subsumes learning problems involving complex regularization terms such as total-variation-based image denoising, overlapping group lasso, isotonic regression, etc. We first propose a primal-dual algorithm for this general class of problems that achieves an O(1/T) convergence rate guarantee on the non-ergodic primal-dual iterate pair. Further, assuming strong convexity in the primal domain, we show an improved non-ergodic convergence rate of O(1/T^2). In contrast, the existing primal-dual algorithms achieve such convergence rates only in the ergodic or semi-ergodic setting. Further, we propose another primal-dual algorithm utilizing an additional prox-operator computation (in the primal space) per iteration. This variant additionally enjoys a non-ergodic accelerated linear convergence rate when the objective function is strongly convex-strongly concave. The proposed algorithms were designed by cleverly incorporating the key ingredients of Nesterov's accelerated gradient method into the standard primal-dual algorithmic framework of Chambolle-Pock.

Microsoft Teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_YmY5ZDc3YTctNGJiNC00MDkyLThjNTktM2RjNDcxNjQxZTVh%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22620fb6db-36c5-4f95-ba45-6ed8e824aa28%22%7d

Video Archive Go Top

 

Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Title: Reinforcement Learning Algorithms for Off-Policy, Multi-Agent Learning and Applications to Smart Grids

  • Speaker: Mr. D Raghu Ram Bharadwaj,
                   Ph.D (Engg.) student,
                   Dept. of C.S.A
  • Faculty Advisor: Prof. Shalabh Bhatnagar
  • Date and Time: Friday, April 08, 2022, 3:30 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Reinforcement Learning (RL) algorithms are a popular class of algorithms for training an agent to learn desired behavior through interaction with an environment whose dynamics is unknown to the agent. RL algorithms combined with neural network architectures have enjoyed much success in various disciplines like games, medicine, energy management, economics, and supply chain management. In our thesis, we study interesting extensions of standard single-agent RL settings, like off-policy and multi-agent settings. We discuss the motivations and importance of these settings and propose convergent algorithms to solve these problems. Finally, we consider one of the important applications of RL, namely smart grids. The goal of the smart grid is to develop a power grid model that intelligently manages its energy resources. In our thesis, we propose RL models for efficient smart grid design.

Learning the value function of a given policy (the target policy) from data samples obtained from a different policy (the behavior policy) is an important problem in Reinforcement Learning (RL). This problem is studied under the setting of off-policy prediction. Temporal Difference (TD) learning algorithms are a popular class of algorithms for solving prediction problems. TD algorithms with linear function approximation are convergent when the data samples are generated from the target policy itself (known as on-policy prediction). However, it has been well established in the literature that off-policy TD algorithms under linear function approximation may diverge. In the first part of the thesis, we propose a convergent online off-policy TD algorithm under linear function approximation. The main idea is to penalize the updates of the algorithm to ensure convergence of the iterates. We provide a convergence analysis of our algorithm. Through numerical evaluations, we further demonstrate the effectiveness of our proposed scheme.

Subsequently, we consider the "off-policy" control setup in RL, where an agent's objective is to compute an optimal policy based on the data obtained from a behavior policy. As the optimal policy can be very different from the behavior policy, learning optimal behavior is very hard in the "off-policy" setting compared to the "on-policy" setting, wherein the data is collected from the new policy updates. In this work, we propose the first deep off-policy natural actor-critic algorithm that utilizes state-action distribution correction for handling the off-policy behavior and the natural policy gradient for sample efficiency. Unlike existing natural-gradient-based actor-critic algorithms that use only fixed features for policy and value function approximation, the proposed natural actor-critic algorithm can utilize a deep neural network's power to approximate both policy and value function. We illustrate the benefit of the proposed off-policy natural gradient algorithm by comparing it with the Euclidean gradient actor-critic algorithm on benchmark RL tasks.

In the third part of the thesis, we consider the problem of two-player zero-sum games. In this setting, there are two agents, both of whom aim to optimize their payoffs. Both agents observe the same state of the game, and each agent's objective is to compute a strategy that maximizes its payoff. However, the payoff of the second agent is the negative of the payoff obtained by the first agent; therefore, the objective of the second agent is to minimize the total payoff obtained by the first agent. This problem is formulated as a min-max Markov game in the literature. In this work, we compute the solution of the two-player zero-sum game utilizing the technique of successive relaxation. Successive relaxation has been successfully applied in the literature to obtain a faster value iteration algorithm in the context of Markov Decision Processes. We extend the concept of successive relaxation to two-player zero-sum games. We then derive a generalized minimax Q-learning algorithm that computes the optimal policy when the model information is unknown. Finally, we prove the convergence of the proposed generalized minimax Q-learning algorithm utilizing stochastic approximation techniques. Through experiments, we demonstrate the advantages of our proposed algorithm.
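A toy illustration of the successive-relaxation idea in the simpler MDP setting (not the two-player game, and not the exact relaxation operator from the thesis): value iteration with a relaxed update v <- (1 - w)v + w T(v), where w = 1 recovers standard value iteration. The MDP and the choice w = 1.2 are illustrative assumptions.

```python
import numpy as np

def relaxed_value_iteration(P, R, gamma, w=1.0, iters=1000):
    """Value iteration with a successive-relaxation step:
    v <- (1 - w) * v + w * T(v), where T is the Bellman optimality
    operator. w = 1 is standard VI; the thesis's scheme chooses the
    relaxation using self-transition probabilities (not reproduced here)."""
    v = np.zeros(R.shape[0])
    for _ in range(iters):
        q = R + gamma * P @ v          # q[s, a], with P of shape (S, A, S)
        v = (1 - w) * v + w * q.max(axis=1)
    return v

# Tiny MDP: in state 0, action 0 self-loops with reward 1; action 1 moves
# to the absorbing, zero-reward state 1. Optimal values: [1/(1-gamma), 0].
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[0, 1, 1] = P[1, 0, 1] = P[1, 1, 1] = 1.0
R = np.array([[1.0, 0.0], [0.0, 0.0]])
v_std = relaxed_value_iteration(P, R, gamma=0.5, w=1.0)
v_sor = relaxed_value_iteration(P, R, gamma=0.5, w=1.2)
```

On this example both settings reach the same fixed point; the point of a well-chosen w > 1 is a better per-iteration contraction when states have self-transition mass.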

Next, we consider a cooperative stochastic games framework where multiple agents work towards learning optimal joint actions in an unknown environment to achieve a common goal. In many real-world applications, however, constraints are often imposed on the actions that the agents can jointly take. In such scenarios, the agents aim to learn joint actions to achieve a common goal (minimizing a specified cost function) while meeting the given constraints (specified via certain penalty functions). Our work considers the relaxation of the constrained optimization problem by constructing the Lagrangian of the cost and penalty functions. We propose a nested actor-critic solution approach to solve this relaxed problem. In this approach, an actor-critic scheme is employed to improve the policy for a given Lagrange parameter update on a faster timescale as in the classical actor-critic architecture. Using this faster timescale policy update, a meta actor-critic scheme is employed to improve the Lagrange parameters on the slower timescale. Utilizing the proposed nested actor-critic scheme, we develop three Nested Actor-Critic (N-AC) algorithms.
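The two-timescale structure of the nested scheme can be sketched on a deterministic toy problem (no actor-critic machinery; the problem, schedules, and constants are purely illustrative): minimize f(x) = x^2 subject to g(x) = 1 - x <= 0 by descending the Lagrangian in x on a fast schedule while ascending in the multiplier on a slow one.

```python
# Toy constrained problem standing in for the CMDP: minimize f(x) = x^2
# subject to g(x) = 1 - x <= 0.  Lagrangian L(x, lam) = x^2 + lam * (1 - x);
# the saddle point is x* = 1, lam* = 2.
x, lam = 0.0, 0.0
for t in range(1, 200001):
    a = 1.0 / t ** 0.6                  # faster timescale: "policy" update
    b = 1.0 / t                         # slower timescale: multiplier update
    x = x - a * (2 * x - lam)           # descend L in x (grad_x L = 2x - lam)
    lam = max(0.0, lam + b * (1 - x))   # ascend L in lam, projected to >= 0
```

Because a decays more slowly than b, the x-iterate equilibrates against an almost-frozen multiplier, mirroring how the actor-critic loop runs on the faster timescale under the slowly-updated Lagrange parameters.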

In recent times, actor-critic algorithms with attention mechanisms have been successfully applied to obtain optimal actions for RL agents in multi-agent environments. In the fifth part of our thesis, we extend this algorithm to the constrained multi-agent RL setting considered above. The idea here is that optimizing the common goal and satisfying the constraints may require different modes of attention. Thus, by incorporating different attention modes, the agents can select useful information required for optimizing the objective and satisfying the constraints separately, thereby yielding better actions. Through experiments on benchmark multi-agent environments, we discuss the advantages of our proposed attention-based actor-critic algorithm.

In the last part of our thesis, we study the applications of RL algorithms to Smart Grids. We consider two important problems - on the supply-side and demand-side, respectively, and study both in a unified framework. On the supply side, we study the problem of energy trading among microgrids to maximize profit obtained from selling power while at the same time satisfying the customer demand. On the demand side, we consider optimally scheduling the time-adjustable demand - i.e., of loads with flexible time windows in which they can be scheduled. While previous works have treated these two problems in isolation, we combine these problems and provide a unified Markov decision process (MDP) framework for these problems.

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MzEzMmI3MTUtOGUxMS00ZWZiLTgwZjAtMDcwMTI4ZDYzMzRh%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22d931ce1d-4f82-48c0-8053-96272991d288%22%7d


 

Series: Theory Seminar
Title: Alice in the PA-Land

  • Speaker: Maciej Obremski,
                   Senior Research Fellow,
                   National University of Singapore, Centre for Quantum Technologies.
  • Date and Time: Thursday, April 07, 2022, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
The speaker will take us on a wondrous journey through different variants of the Privacy Amplification problem (with different levels of Eavesdropper's power) and the extractor variants associated with them. We'll start with a toy version of the problem, then discuss the classical DW09 result, finally culminating in a grand finale: a variant with corrupted randomness sources and tampered memory from AORSS20 and COA21 (order randomized). The main result includes the latest construction of a two-source non-malleable extractor, which surpasses all known constructions of non-malleable seedless/seeded extractors. We will only show a general overview of the construction/compiler.
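As a minimal illustration of the basic privacy-amplification step (far simpler than the non-malleable extractors in the talk), the sketch below hashes a partially secret bit string with a 2-universal hash chosen by a public uniform seed; by the leftover hash lemma, the output is close to uniform whenever its length is sufficiently below the source's min-entropy. The parameters and the fully random hash matrix are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def universal_hash_extract(x_bits, seed_bits, m):
    """Hash the n-bit source with a random m x n binary matrix over GF(2),
    chosen entirely by the public seed. Random binary matrices form a
    2-universal family, so the leftover hash lemma applies."""
    H = seed_bits.reshape(m, -1)        # seed interpreted as the hash matrix
    return (H @ x_bits) % 2             # matrix-vector product mod 2

n, m = 16, 4
x = rng.integers(0, 2, n)               # partially secret source bits
seed = rng.integers(0, 2, m * n)        # public, uniform seed
key = universal_hash_extract(x, seed, m)
```

Alice and Bob, holding the same source and exchanging only the seed, both compute the same shortened key; an eavesdropper who knows the seed and partial information about the source still sees a near-uniform output.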

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Prof. Bhavana Kanukurthi


 

Series: M.Tech (Research) Thesis Defense
Title: 2-Level Page Tables (2-LPT): A building block for efficient address translation in virtualized environment

  • Speaker: Akshay Baviskar
  • Faculty Advisor: R. Govindarajan
  • Date and Time: Monday, April 04, 2022, 3:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Efficient address translation mechanisms are gaining more and more attention as the virtual address range of processors keeps expanding and the demand for machine virtualization increases with cloud and data-center-based services. Traditional radix-tree-based address translation can incur significant overheads in big-data applications, particularly under virtualization, due to multi-level tree walks and nested translation. The overheads stem primarily from unnecessary generality --- the ability to support several hundreds of thousands of virtual memory regions in the virtual address space --- in current processors.

We observe that in the common case, however, a process's virtual address space contains only a few contiguously allocated sections, which can be efficiently translated using a shallow tree with two levels. We propose such a compact structure, called 2-Level Page Table (2-LPT), which is a key building block for address translation in virtualized environments. A key advantage of 2-LPT is that it maintains two levels of page tables irrespective of the size of the virtual address space. Translating a virtual address (VA) using 2-LPT is fast: a walk on a 2-LPT requires at most two memory accesses. In practice, however, the root-level table is well cached in the page walk cache (PWC); thus, a single memory access is sufficient. Under native execution, 2-LPT reduces the time spent in page walks by up to 20.9% (9.38% on average) and improves performance by up to 10.1% (1.66% on average) over conventional four-level radix-tree page tables, on a set of memory-intensive applications.
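The two-access walk can be sketched as follows. This is an illustrative model only: the index widths and entry formats are assumptions for the sketch, not the actual 2-LPT layout defined in the thesis.

```python
# Illustrative split of a 48-bit virtual address for a two-level walk:
# [root index : 27 bits][leaf index : 9 bits][page offset : 12 bits].
PAGE_SHIFT, LEAF_BITS = 12, 9

class TwoLevelPageTable:
    def __init__(self):
        self.root = {}                  # root index -> leaf table

    def map_page(self, va, pa):
        vpn = va >> PAGE_SHIFT
        root_idx, leaf_idx = vpn >> LEAF_BITS, vpn & ((1 << LEAF_BITS) - 1)
        self.root.setdefault(root_idx, {})[leaf_idx] = pa >> PAGE_SHIFT

    def translate(self, va):
        vpn = va >> PAGE_SHIFT
        root_idx, leaf_idx = vpn >> LEAF_BITS, vpn & ((1 << LEAF_BITS) - 1)
        leaf = self.root.get(root_idx)  # access 1: root level (often in PWC)
        if leaf is None or leaf_idx not in leaf:
            raise KeyError("page fault")
        pfn = leaf[leaf_idx]            # access 2: leaf page table entry
        return (pfn << PAGE_SHIFT) | (va & ((1 << PAGE_SHIFT) - 1))

pt = TwoLevelPageTable()
pt.map_page(0x7F1234567000, 0x89ABC000)
pa = pt.translate(0x7F1234567ABC)       # -> 0x89ABCABC
```

The first lookup touches the root table, which in hardware is typically absorbed by the page walk cache, leaving a single memory access on the common translation path.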

2-LPT is more beneficial under virtualization. The proposed 2-LPT design reduces the cost of a nested page walk from 24 to 8 memory accesses. To achieve further reduction, we propose two optimizations: (i) Enhanced Partial Shadow Paging (ePSP), which employs a limited form of shadow paging for the root level of 2-LPT, and (ii) Host PTE Mirroring (HPM), which allows accessing the host page table entry without performing a host page table walk. These allow us to largely avoid slow VM exits while effectively reducing the number of memory accesses on a nested address translation to just one, on average. 2-LPT speeds up applications by 5.6%-50.9% (24.4% on average) over the baseline with conventional nested page walks. Importantly, it reduces the page walk cycles and execution time of the best-performing state-of-the-art proposal by 17.1%-57.1% and 3.9%-43.9%, respectively.

Online Meeting Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_OWU2Y2IxZDAtMGVkNC00NmUwLWJkMjMtZWVhNTk0ZmYyODEx%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%224bcd3d56-e405-4b06-99fb-27742262f261%22%7d

Host Faculty: R. Govindarajan


 

Series: M.Tech (Research) Colloquium- ON-LINE
Title: Model-based Safe Deep Reinforcement Learning and Empirical Analysis of Safety via Attribution

  • Speaker: Mr. Ashish Kumar Jayant,
                   M.Tech (Research) student,
                   Dept. of CSA
  • Faculty Advisor: Prof. Shalabh Bhatnagar
  • Date and Time: Friday, April 01, 2022, 2:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
During the initial iterations of training in most Reinforcement Learning (RL) algorithms, agents perform a significant number of random exploratory steps, which, in the real world, limit the practicality of these algorithms as they can lead to potentially dangerous behavior. Hence safe exploration is a critical issue in applying RL algorithms in the real world. This problem is well studied in the literature under the Constrained Markov Decision Process (CMDP) framework, where, in addition to single-stage rewards, state transitions incur single-stage costs as well. The prescribed cost functions are responsible for mapping undesirable behavior at any given time-step to a scalar value. We then aim to find a feasible policy that maximizes reward returns and keeps cost returns below a prescribed threshold during training as well as deployment.

We propose a novel on-policy model-based Safe Deep RL algorithm in which we learn the transition dynamics of the environment in an online manner and find a feasible optimal policy using Lagrangian Relaxation-based Proximal Policy Optimization. This combination of transition dynamics learning and a safety-promoting RL algorithm leads to roughly 3-4 times fewer environment interactions and fewer cumulative hazard violations compared to the model-free approach. We use an ensemble of neural networks with different initializations to tackle the epistemic and aleatoric uncertainty issues faced during environment model learning. We present our results on a challenging Safe Reinforcement Learning benchmark, the OpenAI Safety Gym.

In addition to this, we perform an attribution analysis of the actions taken by the Deep Neural Network-based policy at each time step. This analysis helps us to:

1. Identify the features in the state representation that are significantly responsible for the current action.

2. Empirically provide evidence of the safety-aware agent's ability to deal with hazards in the environment, provided that hazard information is present in the state representation. To perform the above analysis, we assume the state representation carries meaningful information about hazards and goals. We then compute an attribution vector of the same dimension as the state using a well-known attribution technique, Integrated Gradients. The resultant attribution vector provides the importance of each state feature for the current action.
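The Integrated Gradients computation itself is compact enough to sketch. Below, the gradient function, baseline, and toy linear model are illustrative stand-ins for the policy network and its state input:

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=100):
    """Midpoint Riemann-sum approximation of Integrated Gradients:
    IG_i = (x_i - x'_i) * integral_0^1 dF/dx_i(x' + a*(x - x')) da,
    where x' is the baseline and grad_fn returns dF/dx at a point."""
    alphas = (np.arange(steps) + 0.5) / steps     # midpoints of [0, 1]
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy linear "policy output" F(s) = <w, s>: against a zero baseline the
# attribution is exactly w_i * s_i, so IG is easy to sanity-check.
w = np.array([2.0, -1.0, 0.5])
s = np.array([1.0, 3.0, 4.0])
attr = integrated_gradients(lambda z: w, s, np.zeros(3))
```

For a linear model the path integral is exact, which makes this a convenient unit test before applying the same routine to a deep policy's gradients.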

Microsoft teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_YzU5MGY4YTctNThmYi00NjlkLTllZmItNDc5ZjExMzY2ZTU4%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%2271844033-661c-432d-9a6f-418de5b8c819%22%7d


 

Series: M.Tech (Research) Colloquium- ON-LINE
Title: An Evaluation of Basic Protection Mechanisms in Financial Apps on Mobile Devices

  • Speaker: Mr. Nikhil Agrawal,
                   M.Tech (Research) Student,
                   Dept. of CSA.
  • Faculty Advisor: Prof. K Gopinath and Prof. Vinod Ganapathy
  • Date and Time: Friday, April 01, 2022, 10:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Mobile devices have become an integral part of the payment ecosystem. Payments are facilitated by financial applications (like mobile banking and UPI apps), which have in turn soared in popularity. Given the increasing dependence on the financial app ecosystem and the sensitive nature of the data handled by financial apps (including the bank/card details of payees and payers), we set out to study a fundamental question: do the developers of financial apps include self-defense checks to make their apps more secure? If yes, how trivial is it for attackers to bypass such checks?

This thesis concerns the robustness of security checks in financial mobile applications. The best practices recommended by the Open Web Application Security Project (OWASP) for developing such apps, demand that developers include several checks in these apps, such as detection of running on a rooted device, certificate checks, and so on. Ideally, these checks must be introduced in a sophisticated way and must not be locatable through trivial static analysis, so that attackers cannot bypass them trivially. In this work, we conduct a large-scale study focused on financial apps on the Android platform and determine the robustness of these checks.

Our study shows that a significant fraction of the financial apps don't have the various self-defense checks recommended by OWASP. We then showed that, among the apps with at least one security check, in more than 50% of them at least one check could be trivially bypassed. Some of these financial apps have installation counts exceeding 100 million on Google Play. The entire process of detecting a self-defense check and bypassing it is automated. We believe that the results of our study can guide developers of these apps in inserting security checks in a more robust fashion.

Speaker Bio:
Nikhil Agrawal is an M.Tech (Research) student at the CSA department. He is a member of the Computer Architecture System Lab (CASL) and Computer System Security Lab (CSSL). During his research journey, he analyzed many financial apps in depth on the Android platform. Before joining IISc in August 2019, he completed his engineering degree at R.V. College of Engineering (RVCE), Bengaluru, in 2018. He will soon join Cradlewise as a Software Engineer. Microsoft teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_N2VmMzYwNGMtMjFhYS00Zjc3LThjZjAtOGY5NzVlYmQ3NzY4%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%220432f3a0-d225-405c-b0f4-ff1ffaf4f1fd%22%7d


 

Series: Theory Seminar
Title: Partitioning over Submodular Structures – Part II

  • Speaker: Prof. Karthekeyan Chandrasekaran,
                   Associate Professor, Department of Industrial and Enterprise Systems Engineering,
                   Affiliate, Department of Computer Science,
                   University of Illinois, Urbana-Champaign,
  • Date and Time: Thursday, March 24, 2022, 9:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
In submodular k-partitioning problems, the input is a submodular set function (given via an evaluation oracle) along with a fixed positive integer k (e.g., k = 2, 3, 4, …) and the goal is to partition the ground set into k non-empty parts in order to minimize certain natural objectives of interest. Submodular k-partitioning generalizes partitioning problems over several interesting structures including graphs and hypergraphs. The case of 2-partitioning corresponds to the classic and well-studied submodular minimization problem which is polynomial-time solvable. In this talk, I will present a polynomial time algorithm for minmax symmetric submodular k-partitioning for every fixed k.

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Anand Louis


 

Series: Theory Seminar
Title: Partitioning over Submodular Structures – Part I

  • Speaker: Prof. Karthekeyan Chandrasekaran,
                   Associate Professor, Department of Industrial and Enterprise Systems Engineering,
                   Affiliate, Department of Computer Science,
                   University of Illinois, Urbana-Champaign
  • Date and Time: Thursday, March 17, 2022, 9:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
In submodular k-partitioning problems, the input is a submodular set function (given via an evaluation oracle) along with a fixed positive integer k (e.g., k = 2, 3, 4, …) and the goal is to partition the ground set into k non-empty parts in order to minimize certain natural objectives of interest. Submodular k-partitioning generalizes partitioning problems over several interesting structures including graphs and hypergraphs. The case of 2-partitioning corresponds to the classic and well-studied submodular minimization problem which is polynomial-time solvable. In this talk, I will survey some recent progress towards polynomial-time solvability of submodular k-partitioning for fixed constants k>=3. As a main technical result, I will present a random contraction algorithm for min-cut in hypergraphs – this is a generalization of Karger's algorithm for min-cut in graphs.
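For the graph case, Karger's random-contraction algorithm is short enough to sketch in full (the hypergraph generalization presented in the talk is not reproduced here; the edge representation and trial count are illustrative choices):

```python
import random

def karger_once(edges, n, rng):
    """One run of Karger's random-contraction algorithm on a multigraph
    with vertices 0..n-1 given as a list of (u, v) edges."""
    parent = list(range(n))
    def find(u):                        # union-find with path compression
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    remaining, pool = n, list(edges)
    while remaining > 2:
        u, v = pool[rng.randrange(len(pool))]  # pick a uniformly random edge
        parent[find(u)] = find(v)              # contract its endpoints
        remaining -= 1
        pool = [e for e in pool if find(e[0]) != find(e[1])]  # drop self-loops
    return len(pool)                    # edges crossing the final 2-way cut

def karger_min_cut(edges, n, trials=200, seed=0):
    """Repeat the contraction; each run finds the min cut with probability
    at least 2/(n*(n-1)), so many trials find it with high probability."""
    rng = random.Random(seed)
    return min(karger_once(edges, n, rng) for _ in range(trials))

# Two triangles joined by a single bridge edge: the min cut has size 1.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cut = karger_min_cut(edges, 6)
```

The key property the talk generalizes is that a uniformly random edge is unlikely to cross a minimum cut, so contractions tend to preserve it.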

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Anand Louis


 
Copyright: CSA, IISc 2018      Phone: +91-80-22932368          Fax: +91-80-23602911 Travel Blog    Feedback    Credits