Department of Computer Science and Automation, Indian Institute of Science (IISc), Bangalore, India

UPCOMING SEMINARS

Series: Theory Seminar
Title: Matrix Discrepancy from Quantum Communication

  • Speaker: Abhishek Shetty,
                   PhD student,
                   Department of Computer Science,
                   University of California at Berkeley
  • Date and Time: Friday, January 28, 2022, 11:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
In this talk, we will discuss a novel connection between discrepancy minimization and (quantum) communication complexity. As an application, we resolve a substantial special case of the Matrix Spencer conjecture. In particular, we show that for every collection of $n$ symmetric $n \times n$ matrices $A_1, \dots, A_n$ with spectral norm bounded by 1 and Frobenius norm bounded by $n^{1/4}$, there is a signing $x \in \{-1,+1\}^n$ such that $\|\sum_i x_i A_i\| \leq \sqrt{n}$. We give a polynomial-time algorithm based on partial coloring and semidefinite programming to find such a signing.
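Written in one line (a restatement of the special case above, with nothing new added): if $\|A_i\| \leq 1$ and $\|A_i\|_F \leq n^{1/4}$ for every $i$, then there exists a signing $x \in \{-1,+1\}^n$ with $\left\| \sum_{i=1}^{n} x_i A_i \right\| \leq \sqrt{n}$.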

The proof of our main result combines a simple compression scheme for transcripts of repeated (quantum) communication protocols with quantum state purification, the Holevo bound from quantum information, and tools from sketching and dimensionality reduction. Our approach also offers a promising avenue to resolve the Matrix Spencer conjecture completely -- we show it is implied by a natural conjecture in quantum communication complexity.

The talk is based on joint work with Sam Hopkins (MIT) and Prasad Raghavendra (UC Berkeley).

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Anand Louis


Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Title: Neural Models for Personalized Recommendation Systems with External Information

  • Speaker: Mr. Vijaikumar M,
                   Ph.D (Engg.) student,
                   Dept. of CSA
  • Faculty Advisor: Prof. M Narasimha Murty and Prof. Shirish K Shevade
  • Date and Time: Thursday, January 27, 2022, 3:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Personalized recommendation systems use the data generated by user-item interactions (for example, in the form of ratings) to predict different users' interests in available items and recommend a set of items or products to the users. The sparsity of data, cold start, and scalability are some of the important challenges faced by the developers of recommendation systems. These problems are alleviated by using external information, which can be in the form of a social network, a heterogeneous information network, or cross-domain knowledge. This thesis develops novel neural network models for designing personalized recommendation systems using the available external information.

The first part of the thesis studies the top-N item recommendation setting where the external information is available in the form of a social network or a heterogeneous information network. Unlike in a simple recommendation setting, capturing complex relationships amongst entities (users, items, and connected objects) becomes essential when a social or heterogeneous information network is available. In a social network, not all socially connected users have equal influence on each other. Further, estimating the quantum of influence among entities in a user-item interaction network is important when only implicit ratings are available. We address these challenges by proposing a novel neural network model, SoRecGAT, which employs a multi-head and multi-layer graph attention mechanism. The attention mechanism helps the model learn the influence of entities on each other more accurately. Further, we exploit heterogeneous information networks (HINs) to gather multiple views for the items. A novel neural network model, GAMMA (Graph and Multi-view Memory Attention mechanism), is proposed to extract relevant information from HINs. The proposed model is an end-to-end model which eliminates the need for learning a similarity matrix offline using some manually selected meta-paths before optimizing the desired objective function.

In the second part of the thesis, we focus on top-N bundle recommendation and list continuation setting. Bundle recommendation is the task of recommending a group of products instead of individual products to users. We study two interesting challenges -- (1) how to personalize and recommend existing bundles to users and (2) how to generate personalized novel bundles targeting specific users. We propose GRAM-SMOT -- a graph attention-based framework that considers higher-order relationships among the users, items, and bundles and the relative influence of items present in the bundles. For efficiently learning the embeddings of the entities, we define a loss function based on the metric-learning approach. A strategy that leverages submodular optimization ideas is used to generate novel bundles.

We also study the problem of top-N personalized list continuation where the task is to curate the next items to user-generated lists (ordered sequence of items) in a personalized way by using the sequential information of the items in the list. The main challenge in this task is understanding the ternary relationships among the users, items, and lists. We propose HyperTeNet -- a self-attention hypergraph and Transformer-based neural network architecture for the personalized list continuation task. Here, graph convolutions are used to learn the multi-hop relationship among entities of the same type. A self-attention-based hypergraph neural network is proposed to learn the ternary relationships among the interacting entities via hyperlink prediction in a 3-uniform hypergraph. Further, the entity embeddings are shared with a Transformer-based architecture and are learned through an alternating optimization procedure.

The final part of the thesis focuses on the personalized rating prediction setting where external information is available in the form of cross-domain knowledge. We propose an end-to-end neural network model, NeuCDCF, that provides a way to alleviate data sparsity problems by exploiting the information from related domains. NeuCDCF is based on a wide and deep framework and learns the representations jointly using matrix factorization and deep neural networks. We study the challenges involved in handling diversity between domains and learning complex non-linear relationships among entities within and across domains.

We conduct extensive experiments in each of these settings using several real-world datasets and demonstrate the efficacy of the proposed models.

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_NjRmMDg2OTktMjM5ZC00NjMyLWFiZmMtNmQ1NGY2NTkzNGU2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%229c3e2bfe-1b0f-4d7b-a589-832878069dc6%22%7d


PAST SEMINARS

Series: Theory Seminar
Title: Planar Partition Oracles and When they strike gold: An exp(1/epsilon^2) Tester for All Planar Properties

  • Speaker: Akash Kumar,
                   Postdoctoral Researcher,
                   Theoretical Computer Science,
                   EPFL, Switzerland,
  • Date and Time: Friday, January 21, 2022, 5:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Take a planar graph with maximum degree d. These graphs admit hyperfinite decompositions: for a sufficiently small ϵ > 0, one can remove ϵdn edges to obtain connected components whose size is independent of n. An important tool for sublinear algorithms and property testing on such classes is the partition oracle. A partition oracle is a local procedure that gives consistent access to a hyperfinite decomposition, without any preprocessing. Given a query vertex v, the partition oracle outputs the component containing v in time independent of n. All the answers are consistent with a single hyperfinite decomposition. In this talk, I will describe a partition oracle that runs in time poly(d/ϵ). I will also describe how this machinery strikes a fortune and helps in obtaining a constant-time tester for all planar properties. This is easily obtained by a better error analysis on the seminal result of Newman and Sohler [SICOMP 2013]. Based on joint works with Sabyasachi Basu, C. Seshadhri and Andrew Stolman.
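To make the interface concrete, below is a small illustrative Python sketch of the query contract a partition oracle exposes: a per-vertex query that returns the piece containing the queried vertex, with all answers consistent with one fixed partition into small connected components. This toy builds a global partition eagerly, so it is emphatically not the sublinear-time, poly(d/ϵ) construction from the talk; it only illustrates what "consistent access" means.

    from collections import deque

    class ToyPartitionOracle:
        """Illustrative only: answers component(v) queries consistently with one
        fixed partition into small connected pieces. A real partition oracle
        answers each query locally, in time independent of n."""

        def __init__(self, adj, max_size):
            # adj: dict vertex -> list of neighbours; max_size: size cap per piece
            self.adj = adj
            self.piece_of = {}
            for v in adj:  # eagerly carve the graph into small connected pieces
                if v in self.piece_of:
                    continue
                piece, queue = [v], deque([v])
                self.piece_of[v] = piece
                while queue and len(piece) < max_size:
                    u = queue.popleft()
                    for w in self.adj[u]:
                        if w not in self.piece_of and len(piece) < max_size:
                            self.piece_of[w] = piece
                            piece.append(w)
                            queue.append(w)

        def component(self, v):
            """Return the piece containing v (the same answer on every query)."""
            return frozenset(self.piece_of[v])

    # Example: a path on 6 vertices, pieces of size at most 3.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
    oracle = ToyPartitionOracle(adj, max_size=3)
    assert oracle.component(0) == oracle.component(2)  # consistent answers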

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Anand Louis


Series: M.Tech (Research) Colloquium
Title: Achieving practical secure non-volatile memory system with in-Memory Integrity Verification (iMIV)

  • Speaker: Rajat Jain
  • Faculty Advisor: Arkaprava Basu
  • Date and Time: Wednesday, January 19, 2022, 2:00 PM
  • Venue: https://teams.microsoft.com/l/meetup-join/19%3ameeting_NWNlOTU1NGYtZWJmNC00MWQyLWI1YTctYWI0MTEzN2QwZDcw%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%2282f39501-c5b2-4bfb-87c3-f17ca74c00b6%22%7d

Abstract
Recent commercialization of Non-Volatile Memory (NVM) technology in the form of Intel Optane enables programmers to write recoverable programs. However, the data on NVM is susceptible to a plethora of data remanence attacks, which makes confidentiality and integrity protection of data essential for a secure NVM system. Providing these guarantees requires computing and maintaining a large amount of security metadata (encryption counters, message authentication codes (MACs), and Bonsai Merkle Tree (BMT) nodes). Furthermore, crash consistency guarantees require the system to persist the security metadata and data atomically back to NVM, incurring high overheads. There is thus a trade-off between providing complete security guarantees and the performance and recovery time of an NVM system.

To ensure the confidentiality and integrity of data, a substantial quantity of security metadata is required. Of these, persisting Bonsai Merkle Tree (BMT) nodes, which are essential for fine-grain integrity verification, adds substantial cost owing to the massive amount of data that must be moved off-chip to the bandwidth-constrained NVM. Thus, prior works often trade off performance against fine-grain verifiability, or forego verifiability entirely in favor of performance.
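As a rough illustration of why persisting integrity-tree updates is costly, the sketch below uses a generic binary Merkle tree over memory blocks: every write to a block forces re-hashing (and, for crash consistency, persisting) every node on the path to the root. This is a textbook Merkle tree, not the specific BMT organisation used by iMIV or prior NVM designs.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    class ToyMerkleTree:
        """Generic binary Merkle tree over a fixed number of memory blocks."""

        def __init__(self, blocks):
            n = len(blocks)                  # assume n is a power of two
            self.n = n
            self.nodes = [b""] * (2 * n)     # nodes[1] is the root
            for i, blk in enumerate(blocks):
                self.nodes[n + i] = h(blk)
            for i in range(n - 1, 0, -1):
                self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])

        def update(self, index, new_block):
            """Write one block and count the tree nodes that must be re-hashed
            (and persisted) to keep the root consistent."""
            i = self.n + index
            self.nodes[i] = h(new_block)
            updated = 1
            i //= 2
            while i >= 1:
                self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])
                updated += 1
                i //= 2
            return updated                   # log2(n) + 1 nodes per block write

    tree = ToyMerkleTree([bytes([b]) * 64 for b in range(8)])
    print(tree.update(3, b"\xff" * 64))      # 4 nodes re-hashed for 8 blocks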

The goal of this thesis is to maintain strong security and verifiability guarantees while limiting the cost of BMT updates; it accomplishes this by leveraging in-memory integrity verification. We make fine-grain integrity verifiability realizable with a radically different approach of using in-memory computing for integrity verification. Our proposal, iMIV, draws inspiration from the fact that today's commercial Optane NVM performs encryption onboard the DIMM. We argue that memory-intensive integrity verification operations should be performed near the (non-volatile) memory to avoid off-chip data movement. iMIV aims to minimize off-chip memory transfers, mitigate the effect of the bandwidth wall, and scale to larger NVM capacities in future systems with per-DIMM BMTs.

The experiments and analysis are carried out on VANS, a trace-driven, cycle-accurate simulator that mimics the internal micro-architecture of Intel Optane memory DIMMs. The experimental results show that, in comparison to a baseline scheme with write-through caches and a strict persistency model, which also provides complete security guarantees, iMIV reduces system runtime by 1.8 times for persistent-memory-aware workloads and 3.4 times for persistent-memory-agnostic workloads. iMIV's recovery time after a system crash is on the microsecond scale, without compromising tamper detection or the ability to quickly pinpoint the unverifiable region. iMIV limits the performance overheads associated with fine-grain integrity verifiability to less than 55 percent compared to a system that offers no security.

Speaker Bio:
I, Rajat Jain, am an M.Tech (Research) student in the Department of Computer Science and Automation at IISc, Bangalore. I am part of the Computer Systems Lab (CSL) and work under the guidance of Prof. Arkaprava Basu. During my M.Tech (Research), I have worked on improving the performance of secure NVM systems to make them practical to use by mitigating the constrained NVM write bandwidth bottleneck. I am interested in continuing my research in the domain of persistent memory (PM) and figuring out ways to optimize the performance of various workloads utilizing PM. I am also interested in figuring out ways to best utilize heterogeneous memory systems comprising DRAM and PM.

Host Faculty: Arkaprava Basu


Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Title: Analysis and Methods for Knowledge Graph Embeddings

  • Speaker: Mr. Chandrahas,
                   Ph.D (Engg.) student,
                   Dept. of CSA,
  • Faculty Advisor: Prof. Partha Pratim Talukdar
  • Date and Time: Tuesday, January 11, 2022, 10:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Knowledge Graphs (KGs) are multi-relational graphs where nodes represent entities, and typed edges represent relationships among entities. These graphs store real-world facts such as (Lionel Messi, plays-for-team, Barcelona) as edges, called triples. KGs such as NELL, YAGO, Freebase, and WikiData have been very popular and support many AI applications such as Web Search, Query Recommendation, and Question Answering. Although popular, these KGs suffer from incompleteness. Learning Knowledge Graph Embeddings (KGE) is a popular approach for predicting missing edges (i.e., link prediction) and representing entities and relations in downstream tasks. While numerous KGE methods have been proposed in the past decade, our understanding and analysis of such embeddings have been limited. Further, such methods only work well with ontological KGs. In this thesis, we address these gaps.

Firstly, we study various KGE methods and present an extensive analysis of these methods, resulting in interesting insights. Next, we address an under-explored problem of link prediction in Open Knowledge Graphs (OpenKGs) and present a novel approach that improves the type compatibility of predicted edges. Lastly, we present an adaptive interaction framework for learning KG embeddings that generalizes many existing methods.

Analysis of KG Embeddings: In the first part, we present a macro and a micro analysis of embeddings learned by various KGE methods.

Despite the popularity and effectiveness of KG embeddings, their geometric understanding (i.e., arrangement of entity and relation vectors in vector space) is unexplored. We initiate a study to analyze the geometry of KG embeddings and correlate it with task performance and other hyper-parameters. Firstly, we present a set of metrics (e.g., Conicity, ATM) to analyze the geometry of a group of vectors. Using these metrics, we find sharp differences between the geometry of embeddings learned by different classes of KGE methods. The vectors learned by a multiplicative model lie in a narrow cone, unlike additive models where the vectors are spread out in the space. This behavior of multiplicative models is amplified by increasing the number of negative samples used for training. Further, a very high Conicity value is negatively correlated with the performance on the link prediction task.
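For intuition, the two geometry metrics mentioned above can be sketched as follows. This is my reading of the standard definitions (ATM of a vector is its cosine similarity to the mean vector of the set, and Conicity is the average ATM over the set); treat it as an illustrative sketch rather than the thesis's exact formulation.

    import numpy as np

    def atm(v, mean_vec):
        """Alignment To Mean: cosine similarity of v with the mean vector."""
        return float(np.dot(v, mean_vec) /
                     (np.linalg.norm(v) * np.linalg.norm(mean_vec)))

    def conicity(vectors):
        """Average ATM over a set of vectors: values near 1 mean the vectors lie
        in a narrow cone; values near 0 mean they are spread out in space."""
        V = np.asarray(vectors, dtype=float)
        mean_vec = V.mean(axis=0)
        return float(np.mean([atm(v, mean_vec) for v in V]))

    rng = np.random.default_rng(0)
    spread = rng.normal(size=(1000, 50))              # isotropic: low conicity
    cone = np.abs(rng.normal(size=(1000, 50))) + 2.0  # all-positive: high conicity
    print(round(conicity(spread), 3), round(conicity(cone), 3))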

We also study the problem of understanding the semantics of KG embeddings and propose an approach to learn more coherent dimensions. A dimension is coherent if its top entities have similar types (e.g., person). In this work, we formalize the notion of coherence using entity co-occurrence statistics and propose a regularizer term that maximizes coherence while learning KG embeddings. The proposed approach significantly improves coherence while achieving performance comparable to the baseline on the link prediction and triple classification tasks. Further, based on a human evaluation, we demonstrate that the proposed approach learns more coherent dimensions than the baseline.

A Method for OpenKG Embedding: In the second part, we address the problem of learning KG embeddings for Open Knowledge Graphs (OpenKGs), focusing on improving link prediction. An OpenKG refers to a set of (head noun phrase, relation phrase, tail noun phrase) triples such as (tesla, return to, new york) extracted from a text corpus using OpenIE tools. While OpenKGs are easy to bootstrap for a domain, they are very sparse. Therefore, link prediction becomes an important step while using these graphs in downstream tasks.

Learning OpenKG embeddings is one approach for link prediction that has received some attention lately. However, on careful examination, we find that current algorithms often predict noun phrases (NPs) with incompatible types for given noun and relation phrases. We address this problem and propose OKGIT, which improves OpenKG link prediction using a novel type compatibility score and type regularization. With extensive experiments on multiple datasets, we show that the proposed method achieves state-of-the-art performance while producing type-compatible NPs in the link prediction task.

An Adaptive Framework for KG Embeddings: In the third part, we address the problem of improving KGE models.

Firstly, we show that the performance of existing approaches varies across different datasets, and a simple neural network-based method can consistently achieve better performance on these datasets. Upon analysis, we find that KGE models depend on fixed sets of interactions among the dimensions of entity and relation vectors.

Therefore, we investigate ways to learn such interactions automatically during training. We propose an adaptive interaction framework for learning KG embeddings, which can learn appropriate interactions while training. We show that some of the existing models could be seen as special cases of the proposed framework. Based on this framework, we also present two new models, which outperform the baseline models on the link prediction task. Further analysis demonstrates that the proposed approach can adapt to different datasets by learning appropriate interactions.

Microsoft teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_NmVhNDk3ZDctOWVlNC00MWE0LWEzMTItNWIyNDE3YTlhNDFh%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%2258c2a810-9762-463d-90d0-20832b3d1592%22%7d


Series: Theory Seminar
Title: Playing Unique Games on Certifiable Small-set Expanders

  • Speaker: Mitali Bafna,
                   PhD Student,
                   Theoretical Computer Science,
                   Harvard University
  • Date and Time: Friday, January 07, 2022, 5:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
The Unique Games Conjecture (UGC) is a central open question in computational complexity and algorithms. In short, the UGC stipulates that distinguishing between almost satisfiable and highly unsatisfiable instances of a certain 2-variable constraint satisfaction problem (CSP) called Unique Games is NP-hard. We build algorithms for UG on a large class of restricted instances called certifiable small-set expanders. In doing so we give new tools to analyze Sum-of-Squares SDPs and connect the UGC to another important complexity theoretic conjecture, the Small-Set-Expansion Hypothesis. The talk will start from a basic introduction to the UG problem and will not assume prior knowledge about UG or sum-of-squares algorithms.

The talk is based on joint work: https://arxiv.org/abs/2006.09969

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Anand Louis


Series: Ph.D. Colloquium
Title: Operating System and Hypervisor Support for Mitigating the Address Translation Wall

  • Speaker: Ashish Panwar
  • Faculty Advisor: K. Gopinath and Arkaprava Basu
  • Date and Time: Monday, January 03, 2022, 5:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_YjA2OGFmMmYtNjBjMi00NDJlLTk2NTItZjI5YTBlOTY3Yjgw%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%2282f39501-c5b2-4bfb-87c3-f17ca74c00b6%22%7d

Virtual memory has proven to be an extremely powerful abstraction for its programmability benefits. Unfortunately, virtual memory is becoming a performance bottleneck due to the address translation wall. Modern applications with large memory footprints necessitate frequent page table walks to perform the virtual to physical address translation. Consequently, the hardware spends 30-50% of the total CPU cycles in servicing TLB misses alone. Virtualization and non-uniform memory access (NUMA) architectures further exacerbate this overhead. For example, virtualized systems involve two-dimensional page table walks that require up to 24 memory accesses for each TLB miss, with current 4-level page tables. The address translation performance drops further on NUMA systems, depending on the distance between the CPU and page tables. These overheads will increase in the future, where deeper page tables and multi-tiered memory systems will enable even larger applications. Virtual memory, therefore, is showing its age in the era of data-centric computing.
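As a quick unpacking of the 24-access figure above (standard accounting for nested walks, stated here only for the reader's convenience): with an n-level guest page table and an m-level host page table, each of the n guest levels costs m host accesses to translate its guest-physical pointer plus one access to read the guest entry itself, and the final guest physical address costs m more host accesses. This gives n(m + 1) + m = (n + 1)(m + 1) - 1 accesses, i.e., 4 × 5 + 4 = 24 for today's 4-level tables, and the count grows quadratically as page tables get deeper.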

This thesis investigates the role of an operating system (OS) and hypervisor in improving address translation performance. First, we focus on huge pages, which can significantly reduce the frequency and cost of TLB misses. Huge pages are widely available in modern systems; e.g., the x86 architecture supports 2MB and 1GB huge pages in addition to regular 4KB pages. While huge pages are great in theory, real-world OSs have often delivered disappointing performance while using them. This is because memory management of huge pages is fraught with multiple challenges. We propose several enhancements in OS-level policies and mechanisms to make huge pages beneficial, even under multi-dimensional constraints such as latency, capacity, and fairness.

Second, we investigate the effect of NUMA on address translation performance. NUMA architectures mandate careful data placement to hide the effect of variable memory access latency from applications. Several decades of research on NUMA systems have optimized access to user-level application data. However, prior research has ignored the access performance of kernel data, including page tables, due to their small memory footprint. We argue that it is time to revisit page table management for NUMA-like systems.

The core contributions of this thesis include four systems: Illuminator, HawkEye, Trident, and vMitosis, as summarized below:

Illuminator: We first expose some subtle implications of external memory fragmentation on huge pages. We show that despite proactive measures employed in the memory management subsystem of Linux, unmovable kernel objects (e.g., inodes and page tables) can deny huge pages to user applications. In a long-running system, unmovable objects fragment physical memory, often permanently, and cause high de-fragmentation overheads. Over time, their effects manifest in performance regressions, OS jitter, and latency spikes. Illuminator effectively clusters kernel objects in a subset of physical memory regions and makes huge page allocations feasible even under heavily fragmented scenarios.

HawkEye: In this work, we deal with OS-based huge page management policies that need to balance complex trade-offs between TLB coverage, memory bloat, latency, and the number of page faults. In addition, we consider performance and fairness issues that appear under fragmentation when memory contiguity is limited. In HawkEye, we propose asynchronous page pre-zeroing to simultaneously optimize for low latency and few page faults. We propose automatic bloat recovery to effectively deal with the trade-offs between TLB coverage and memory bloat at runtime. HawkEye addresses the performance and fairness challenges by allocating huge pages based on their estimated profitability.

Trident: Illuminator and HawkEye try to extract maximum benefits from 2MB huge pages. However, recent findings have shown that even after employing 2MB pages, more than 20% of the total CPU cycles are wasted in handling TLB misses for data center applications. We address this problem using 1GB huge pages that provide up to 1TB per-core TLB coverage on modern systems. Leveraging insights from our earlier work, we propose a multi-level huge page framework called Trident that judiciously allocates 1GB, 2MB, and 4KB pages as deemed suitable at runtime.

vMitosis: In this work, we focus on the effect of NUMA on address translation in virtualized servers. We show that page table walks often involve remote memory accesses on NUMA systems, which can slow down large-memory applications by more than 3x. Interestingly, the slowdown observed due to remote page table accesses can even outweigh that of accessing remote data, even though page tables consume less than 1% of the overall application memory footprint. vMitosis mitigates the effect of NUMA on page table walks by enabling each core to handle TLB misses from its local socket. We achieve this by judiciously migrating and replicating page tables across NUMA sockets.

Overall, with this thesis, we show that adequate OS and hypervisor support can help virtual memory thrive even in the era of data-centric computing. We have implemented our proposed systems in the Linux OS kernel and KVM hypervisor. Our optimizations are transparent to the users, and using them does not require any hardware or application modifications.

Speaker Bio:
Ashish Panwar is a Ph.D. candidate at the Indian Institute of Science. He is expected to graduate by February 2022. During his Ph.D., he has worked on improving memory access performance and mitigating NUMA overheads for large-memory CPU-based applications. He is currently involved in optimizing GPU software runtimes to accelerate machine learning workloads, especially deep neural networks. Prior to his Ph.D., he worked at Intel and NetApp for a total of three years in Bangalore, India.

Host Faculty: Arkaprava Basu


Series: Department Seminar
Title: Energy-efficient Communication Architectures for beyond von-Neumann DNN Accelerators

  • Speaker: Sumit Mandal, Univ. of Wisconsin, USA
  • Date and Time: Monday, December 20, 2021, 8:00 AM
  • Venue: Online Meeting

Abstract
Data communication plays a significant role in the overall performance of hardware accelerators for Deep Neural Networks (DNNs). For example, crossbar-based in-memory computing significantly increases on-chip communication volume since the weights and activations are on-chip. State-of-the-art interconnect methodologies for in-memory computing deploy a bus-based network or a mesh-based NoC. Our experiments show that up to 90% of the total inference latency of DNN hardware is spent on on-chip communication when the bus-based network is used. To reduce communication latency, we propose a methodology to generate an NoC architecture and a scheduling technique customized for different DNNs. We prove mathematically that the developed NoC architecture and corresponding schedules achieve the minimum possible communication latency for a given DNN. Experimental evaluations on a wide range of DNNs show that the proposed NoC architecture enables a 20%-80% reduction in communication latency with respect to state-of-the-art interconnect solutions.

Graph convolutional networks (GCNs) have shown remarkable learning capabilities when processing data in the form of graphs, which are found inherently in many application areas. To take advantage of the relations captured by the underlying graphs, GCNs distribute the outputs of neural networks embedded in each vertex over multiple iterations. Consequently, they incur a significant amount of computation and irregular communication overheads, which call for GCN-specific hardware accelerators. We propose a communication-aware in-memory computing architecture (COIN) for GCN hardware acceleration. Besides accelerating the computation using custom compute elements (CE) and in-memory computing, COIN aims at minimizing the intra- and inter-CE communication in GCN operations to optimize performance and energy efficiency. Experimental evaluations with various datasets show up to 174x improvement in energy-delay product with respect to an Nvidia Quadro RTX 8000 and edge GPUs for the same data precision.

MS Teams Meeting Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZTIxYTBlY2ItZDkyZS00YjVhLTg0YTEtMzNiN2M3OTA4NjMy%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%224bcd3d56-e405-4b06-99fb-27742262f261%22%7d

Speaker Bio:
Sumit K. Mandal received his dual (B.Tech + M.Tech) degree in Electronics and Electrical Communication Engineering from IIT Kharagpur in 2015. After that, he was a Research & Development Engineer at Synopsys, Bangalore (2015-2017). Currently, he is pursuing a Ph.D. at the University of Wisconsin-Madison and is expected to graduate in June 2022. Details of his research work can be found at https://sumitkmandal.ece.wisc.edu/

Host Faculty: R. Govindarajan


Series: M.Tech (Research) Thesis Defence - ONLINE
Title: Security of Post-Quantum Multivariate Blind Signature Scheme: Revisited and Improved

  • Speaker: Ms. Aalo Majumdar,
                   M.Tech (Research) student,
                   Dept. of C.S.A
  • Faculty Advisor: Prof. Sanjit Chatterjee
  • Date and Time: Monday, December 20, 2021, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Current ubiquitous cryptosystems face an imminent threat from quantum algorithms like Shor's and Grover's, leading us to a future of post-quantum cryptography. Multivariate signatures are prominent in post-quantum cryptography due to their fast, low-cost implementations and shorter signatures. Blind signatures are a special category of digital signatures with two security notions: blindness and one-more unforgeability (OMF). Our work primarily focuses on the multivariate blind signature scheme (MBSS) proposed by Petzoldt et al. We construct a formal proof along the lines of the heuristic sketch given by the authors of MBSS, based on the proposed universal one-more unforgeability (UOMF) model, a weaker variant of OMF. The construction of this proof led us to identify various issues in the security argument, mainly the difficulty of simulating the responses to the blind signature queries without knowing the secret key of the underlying Rainbow scheme. Since our investigation revealed the difficulty of reducing UOMF security to the hardness assumption used by the authors, we design a new class of hardness assumptions: (1) the Single Target Inversion Problem (PR-STI); (2) a modified version of the Single Target Inversion Problem (PR-mSTI); and (3) the Chosen Target Inversion Problem (PR-CTI). Armed with this new class of computational problems, we reduce the UOMF and OMF security of MBSS to these problems. We begin by giving an improved security argument for MBSS in the UOMF security model using the PR-mSTI assumption. We employ the general forking algorithm in this security reduction. However, we cannot apply the forking algorithm directly, owing to the wrapper algorithm being split and the presence of the blind signature oracle. We thus suitably modify the algorithm and then derive the corresponding forking probability. To argue the security of MBSS in the standard security model, i.e., in the OMF model, we try using the PR-CTI assumption. The PR-CTI problem demands computing the solution for more than one target. With the high cost of forking, a significant degradation factor appears in the success probability. So, instead, we reduce the OMF security of MBSS to the PR-mSTI assumption (in a restricted setting) and give a comparative analysis of the security arguments in the UOMF and OMF models.

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_YWM1NzljNGYtOTM1MS00NDY1LTliMDktYTRiZTZlOGU5MWJj%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22f24581ee-9350-45bb-b76b-26beecd9d45f%22%7d


Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Title: MPCLeague: Robust MPC Platform for Privacy-Preserving Machine Learning

  • Speaker: Mr. Ajith S,
                   Ph.D (Engg.) student,
                   Dept. of CSA
  • Faculty Advisor: Prof. Arpita Patra
  • Date and Time: Monday, December 20, 2021, 11:30 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
In the modern era of computing, machine learning tools have demonstrated their potential in vital sectors, such as healthcare and finance, to derive proper inferences. The sensitive and confidential nature of the data in such sectors raises genuine concerns for data privacy. This motivated the area of Privacy-preserving Machine Learning (PPML), where privacy of data is guaranteed. Typically, machine learning techniques require significant computing power, which leads clients with limited infrastructure to rely on the method of Secure Outsourced Computation (SOC). In the SOC setting, the computation is outsourced to a set of specialized and powerful cloud servers and the service is availed on a pay-per-use basis. In this thesis, we design an efficient platform, MPCLeague, for PPML in the SOC setting using Secure Multi-party Computation (MPC) techniques.

MPC, the holy-grail problem of secure distributed computing, enables a set of n mutually distrusting parties to perform joint computation on their private inputs in a way that no coalition of t parties can learn more information than the output (privacy) or affect the true output of the computation (correctness). While MPC, in general, has been a subject of extensive research, the area of MPC with a small number of parties has drawn popularity of late mainly due to its application to real-time scenarios, its efficiency, and its simplicity. This thesis focuses on designing efficient MPC frameworks for 2, 3 and 4 parties that tolerate at most one corruption and support computation over rings.

Our platform aims at achieving the most substantial security notion of robustness, where the honest parties are guaranteed to obtain the output irrespective of the behaviour of the corrupt parties. A robust protocol prevents the corrupt parties from repeatedly causing the computations to rerun, thereby upholding the trust in the system. While on the roadmap to attain robustness, our frameworks also demonstrate constructions with improved performance that achieve the relaxed notions of security with abort and fairness. A fair protocol enforces the restriction that either all parties or none of them receive the output. Under security with abort, on the other hand, honest parties may fail to receive the output while corrupt parties do.

The general structure of the computation involves the execution of the protocol steps once the participating parties have supplied their inputs. Finally, the output is distributed to all the parties. However, to enhance practical efficiency, many recent works resort to the preprocessing paradigm, which splits the computation into two phases: a preprocessing phase, where input-independent (but function-dependent) computationally heavy tasks are carried out, followed by a fast online phase. Since the same functions in ML are evaluated several times, this paradigm naturally fits the case of PPML, where the ML algorithm is known beforehand.
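As a generic illustration of this preprocessing paradigm (and only that: this is the classic Beaver-triple trick, not the specific ASTRA/SWIFT/Tetrad/ABY2.0 protocols), the sketch below shows how an input-independent multiplication triple prepared offline turns secure multiplication into a cheap online step over additively shared values, here for two semi-honest parties over a 64-bit ring.

    import secrets

    MASK = (1 << 64) - 1        # arithmetic over the ring Z_{2^64}

    def share(x):
        """Split x into two additive shares."""
        r = secrets.randbelow(1 << 64)
        return r, (x - r) & MASK

    def reconstruct(s0, s1):
        return (s0 + s1) & MASK

    # Preprocessing (input-independent): a random shared triple with c = a * b.
    a, b = secrets.randbelow(1 << 64), secrets.randbelow(1 << 64)
    c = (a * b) & MASK
    a_sh, b_sh, c_sh = share(a), share(b), share(c)

    # Online phase: multiply secret inputs x and y using the triple.
    x, y = 1234, 5678
    x_sh, y_sh = share(x), share(y)

    # The parties open d = x - a and e = y - b (these reveal nothing about x, y).
    d = reconstruct((x_sh[0] - a_sh[0]) & MASK, (x_sh[1] - a_sh[1]) & MASK)
    e = reconstruct((y_sh[0] - b_sh[0]) & MASK, (y_sh[1] - b_sh[1]) & MASK)

    # Local share of x*y: [c] + d*[b] + e*[a], with d*e added by one party.
    z_sh = []
    for party in (0, 1):
        z = (c_sh[party] + d * b_sh[party] + e * a_sh[party]) & MASK
        if party == 0:
            z = (z + d * e) & MASK
        z_sh.append(z)

    assert reconstruct(z_sh[0], z_sh[1]) == (x * y) & MASK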

At the heart of this thesis are four frameworks - ASTRA, SWIFT, Tetrad, ABY2.0 - catered to different settings.

- ASTRA: We begin with the setting of 3 parties (3PC), which forms the base case for honest majority. If a majority of the participating parties are honest, then the setting is deemed an honest majority setting. In the set of 3 parties, at most one party can be corrupt, and this framework tackles semi-honest corruption, where the corrupt party follows the protocol steps but tries to glean more information from the computation. ASTRA acts as a stepping stone towards achieving a stronger security guarantee against active corruption. Our protocol requires communication of 2 ring elements per multiplication gate during the online phase, attaining a per-party cost of less than one element. This is achieved for the first time in the regime of 3PC.

- SWIFT: Designed for 3 parties, this framework tackles one active corruption where the corrupt party can arbitrarily deviate from the computation. Building on ASTRA, SWIFT provides a multiplication that improves the communication by at least 3x over state of the art, besides improving security from abort to robustness. In the regime of malicious 3PC, SWIFT is the first robust and efficient PPML framework. It achieves a dot product protocol with communication independent of the vector size for the first time.

- Tetrad: Designed for 4 parties in the honest majority, the fair multiplication protocol in Tetrad requires communication of only 5 ring elements instead of 6 in the state-of-the-art. The fair framework is then extended to provide robustness without inflating the costs. A notable contribution is the design of the multiplication protocol that supports on-demand applications where the function to be computed is not known in advance.

- ABY2.0: Moving on to the stronger corruption model where a majority of the parties can be corrupt, we explore the base case of 2 parties (2PC). Since robustness is proven to be impossible under active corruption in this setting, we restrict ourselves to semi-honest corruption. The prime contribution of this framework is the scalar product, for which the online communication is two ring elements irrespective of the vector dimension. This is a feature achieved for the first time in the 2PC literature.



Our frameworks provide the following contributions in addition to the ones mentioned above. First, we support multi-input multiplication for the arithmetic and boolean worlds, improving the online phase in rounds and communication. Second, all our frameworks except SWIFT incorporate truncation without incurring any overhead. Finally, we introduce an efficient instantiation of the garbled world, tailor-made for the mixed-protocol framework, for the first time. The mixed-protocol approach, combining arithmetic, boolean and garbled-style computations, has demonstrated its potential in several practical use-cases like PPML. To facilitate the computation, we also provide the conversion mechanisms to switch between the computation styles.

The practicality of our framework is argued through improvements in the benchmarking of widely used ML algorithms -- Linear Regression, Logistic Regression, Neural Networks, and Support Vector Machines. We propose two variants for each of our frameworks, with one variant aiming to minimise the execution time while the other focuses on the monetary cost.

The concrete efficiency gains of our frameworks coupled with the stronger security guarantee of robustness make our platform an ideal choice for a real-time deployment of privacy-preserving machine learning techniques.

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MDgwNTIyZmQtM2M1NC00ZGNjLTg4NjItMWNlYWM4NzliM2Fl%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22a787cc01-57cc-4fc1-b7e1-4e9d51923f6d%22%7d


Series: Department Seminar
Title: Provable and Efficient Algorithms for Federated, Reinforcement and Batch Learning

  • Speaker: Dr. Avishek Ghosh, Univ. of California, San Diego, USA
  • Date and Time: Thursday, December 16, 2021, 8:00 AM
  • Venue: Online Meeting

Abstract
In this talk, I will propose and analyze iterative algorithms that are computationally efficient, statistically sound, and adaptive to the problem complexity. Three different frameworks in which data is presented to the learner are considered: (a) the Federated (Distributed) Learning (FL) setup, where data is only available at the edge, and a center machine learns various models by iteratively interacting with the edge nodes; (b) the Reinforcement Learning (RL) framework, where agents successively interact with the environment to learn an optimal policy; and (c) the supervised batch setup, where all the data and label pairs are available to the learner at the beginning. In particular, I will explain my contributions in the FL and RL frameworks and (very) briefly mention the supervised batch setup. In the Federated Learning (FL) framework, I'll address the canonical problems of device heterogeneity, communication bottleneck and adversarial robustness for large-scale high-dimensional problems, and propose efficient and provable algorithms that obtain the optimal statistical rate. In the Reinforcement Learning (RL) setup, I will focus on the problem of structure adaptation (model selection) for large-scale problems, an aspect that most practical RL-based systems, such as autonomous cars and recommendation systems, crucially require. The model selection problem is studied both with function approximation (parameterized RL) as well as for the generic model-based RL setup, and the algorithms we propose obtain (near) optimal regret guarantees.

Teams Meeting Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MGUwNjY3NmEtZWZkYy00NzJkLWIzZjMtOWUzMTAyOTgwNzg3%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%224bcd3d56-e405-4b06-99fb-27742262f261%22%7d

Speaker Bio:
I am an HDSI (Data Science) Post-doctoral fellow at the University of California, San Diego, working with Prof. Arya Mazumdar and Prof. Tara Javidi. Prior to this, I completed my Ph.D. in the EECS department of UC Berkeley, advised by Prof. Kannan Ramchandran and Prof. Aditya Guntuboyina. My research interests are broadly in Theoretical Machine Learning, including Federated Learning and multi-agent Reinforcement/Bandit Learning. I spent the summer of 2020 working as a Research Scientist at Amazon, New York, in the Supply Chain Optimization Team (SCOT), where I worked with Dean Foster and Alexander (Sasha) Rakhlin. Before coming to Berkeley, I completed my master's degree at the Indian Institute of Science (IISc), Bangalore, working with Prof. Anurag Kumar and Prof. Aditya Gopalan. Before IISc, I spent 4 years at Jadavpur University, where I completed my bachelor's degree and worked with Prof. Amit Konar. I am a recipient of the `Excellence Award' from UC Berkeley and a Gold Medal from Jadavpur University.

Host Faculty: R. Govindarajan


Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Title: Deep Learning over Hypergraphs

  • Speaker: Mr. Naganand Yadati,
                   Ph.D (Engg.) student
                   Dept. of CSA
  • Faculty Advisor: Prof. Partha Pratim Talukdar
  • Date and Time: Tuesday, December 14, 2021, 9:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Though graphs have been extensively used for modelling real-world relational datasets, they are restricted to pairwise relationships, i.e., each edge connects exactly two vertices. Many real-world relational datasets such as academic networks, chemical reaction networks, and email communication networks contain group-wise relationships that go beyond pairwise associations. Hypergraphs can flexibly model such datasets by relaxing the notion of an edge to connect an arbitrary number of vertices, and they provide a mathematical foundation for understanding and learning from large amounts of real-world heterogeneous data. The state-of-the-art techniques for learning from graph data with pairwise relationships use graph-based deep learning models such as graph neural networks. A prominent observation that inspires this thesis is that deep neural networks are still under-explored for hypergraph data with group-wise relationships. Hypergraphs have been utilised as primary data structures in many machine learning tasks such as vertex classification, hypergraph link prediction, and knowledge base completion. However, the fundamental limitation of most existing non-neural techniques is that they cannot leverage high-dimensional features on vertices, especially those which are not present in relational data (e.g., text attributes of documents in academic networks). In this thesis, we propose novel deep learning-based methods for hypergraph data with high-dimensional vertex features.

1) Deep Learning for Hypergraph Vertex-level Predictions: In the first part of the thesis, we focus on addressing the limitations of existing methods for vertex-level tasks over hypergraphs. In particular, we propose HyperGraph Convolutional Network (HyperGCN) for semi-supervised vertex classification over hypergraphs. Unlike existing methods, HyperGCN principally bridges tools from graph neural networks and spectral hypergraph theory.
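To fix ideas about the data structure itself, here is a minimal generic sketch of a hypergraph and one round of mean-style message passing over it (vertices to incident hyperedges and back). This is not HyperGCN, NHP, or G-MPNN; it only illustrates the group-wise relationships those models operate on.

    import numpy as np

    # Hypergraph: each hyperedge is a set of vertex ids, of arbitrary size.
    hyperedges = [{0, 1, 2}, {1, 3}, {2, 3, 4, 5}]
    num_vertices, feat_dim = 6, 4

    rng = np.random.default_rng(42)
    X = rng.normal(size=(num_vertices, feat_dim))  # vertex features

    def hypergraph_mean_pass(X, hyperedges):
        """One naive message-passing round: each hyperedge averages its members'
        features, then each vertex averages the messages of its incident hyperedges."""
        edge_msgs = [X[list(e)].mean(axis=0) for e in hyperedges]
        out = X.copy()
        for v in range(X.shape[0]):
            incident = [m for e, m in zip(hyperedges, edge_msgs) if v in e]
            if incident:
                out[v] = np.mean(incident, axis=0)
        return out

    X1 = hypergraph_mean_pass(X, hyperedges)
    print(X1.shape)  # (6, 4): updated vertex representations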

2) Deep Learning for Hypergraph Link Prediction: In the second part, we focus on the task of predicting group-wise relationships (i.e., link prediction over hypergraphs). We propose Neural Hyperlink Predictor (NHP), a novel neural network-based method for link prediction over hypergraphs. NHP uses a novel scoring layer that principally enables us to predict group relationships on incomplete hypergraphs where hyperedges need not represent similarity.

3) Deep Learning for Multi-Relational and Recursive Hypergraphs: In the third and final part, we explore more complex structures such as multi-relational hypergraphs, in which each hyperedge is typed (i.e., belongs to a relation type), and recursive hypergraphs, in which hyperedges can act as vertices in other hyperedges. We first propose Generalised Message Passing Neural Network (G-MPNN) for learning vertex representations on multi-relational ordered hypergraphs. G-MPNN generalises existing MPNNs on graphs, hypergraphs, multi-relational graphs, heterogeneous graphs, and multi-layer networks. We then propose MPNN-Recursive, a novel framework, to handle recursively structured data. Extensive experimentation on real-world hypergraphs shows the effectiveness of our proposed models.

Microsoft teams Link to Online Defence:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTk2NjE1YTYtMGNjYS00NzE0LTkxZTEtYTJhZDFkMzViZTAx%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%2258c2a810-9762-463d-90d0-20832b3d1592%22%7d


Series: M.Tech (Research) Thesis Defence - ONLINE
Title: Modeling and verification of database-accessing applications

  • Speaker: Mr. Geetam Chawla,
                   M.Tech (Research) student
                   Dept. of CSA
  • Faculty Advisor: Prof. K V Raghavan
  • Date and Time: Monday, December 13, 2021, 9:30 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Databases are central to the functioning of most IT-enabled processes and services. In many domains, databases are accessed and updated via applications written in general-purpose languages, as such applications need to contain the business logic and workflows that are key to the organization. Therefore, automated tools are required not only for the creation and testing of database schemas and queries, but also for the analysis, testing, and verification of database-accessing applications. In this work, we describe a novel approach for modeling, analysis and verification of database-accessing applications. We target applications that use Object Relational Mapping (ORM), which is the common database-access paradigm in most Model-View-Controller (MVC) based application development frameworks. In contrast with other approaches that try to directly analyze and prove properties of complex database-accessing ORM-based code, our approach infers a relational algebra specification of each controller in the application. This specification can then be fed into any off-the-shelf relational algebra solver to check properties (or assertions) given by a developer. We have implemented this approach as a tool that works for Spring-based MVC applications. A preliminary evaluation reveals that the approach is scalable and quite precise.

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MDdkODcwY2MtNzM3OS00NWVlLWI1YTItNzVhZTIyMTEyYjZl%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22cd42250e-1d66-4966-a431-6a8d7d5235ba%22%7d


Series: M.Tech (Research) Thesis Defence - ONLINE
Title: Hadwiger's conjecture and total coloring

  • Speaker: Mr. Ankur Naskar
                   M.Tech (Research) student,
                   Dept. of CSA
  • Faculty Advisor: Prof. Sunil L Chandran
  • Date and Time: Monday, December 13, 2021, 11:00 AM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Hadwiger's conjecture is one of the most important and long-standing conjectures in graph theory. It was established recently that proving Hadwiger's conjecture is equivalent to proving the conjecture for the special class of graphs that can be expressed as the square of some other graph. However, it is difficult to prove Hadwiger's conjecture even for the squares of highly specialised graph classes. We decided to investigate the squares of subdivided graphs (a subdivided graph is a graph that can be obtained from another graph H by replacing each edge uv of H by a path uwv, where w is a new vertex). It turns out that squares of subdivided graphs are exactly the class of total graphs, well-studied in the context of the total coloring conjecture, another well-known and long-standing conjecture in graph theory. The total graph of G, denoted by T(G), is defined on the vertex set V(G) ⊔ E(G), with c1, c2 ∈ V(G) ⊔ E(G) adjacent whenever c1 and c2 are adjacent to or incident on each other in G. The total chromatic number χ′′(G) of a graph G is defined to be equal to the chromatic number of its total graph, that is, χ′′(G) = χ(T(G)). The total coloring conjecture (TCC) states that for every graph G, χ′′(G) ≤ ∆(G) + 2. Though Hadwiger's conjecture was proved for the closely related class of line graphs 20 years ago, it has not yet been studied in the context of total graphs. In this thesis, the following results are proved:

(i) There exists a constant C such that, if the connectivity of G is at least C, then Hadwiger's conjecture is true for T(G).

(ii) Let F be a class of graphs that is closed under the operation of taking subgraphs. If a weaker version of the total coloring conjecture (weak TCC), namely χ′′(G) ≤ ∆(G) + 3, is true for the class F, then Hadwiger's conjecture is true for the class {T(G) : G ∈ F}.

The second statement motivates one to look for classes of graphs that satisfy weak TCC. It may be noted that a complete proof of TCC even for 4-colorable graphs (in fact, even for planar graphs) has remained elusive after decades of effort; but weak TCC can be proved easily for 4-colorable graphs. It was noticed that, in spite of interest in studying χ′′(G) in terms of χ(G) right from the initial days, weak TCC has not been proven for k-colorable graphs, even for k = 5. In the latter half of the thesis, an important contribution to the total coloring literature is made by proving that χ′′(G) ≤ ∆(G) + 3 for every 5-colorable graph G. Being close to some of the well-studied topics in total coloring, this seems to be a long-pending addition to the literature.
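For readers less familiar with total graphs, the following small sketch constructs T(G) directly from the definition above using networkx; the representation choices (edges of G stored as frozensets) are illustrative and not from the thesis.

    import networkx as nx

    def total_graph(G: nx.Graph) -> nx.Graph:
        """Build T(G) on V(G) ∪ E(G): two elements are adjacent iff they are
        adjacent or incident in G (vertex-vertex, edge-edge, or vertex-edge)."""
        T = nx.Graph()
        edge_nodes = {frozenset(e) for e in G.edges()}
        T.add_nodes_from(G.nodes())
        T.add_nodes_from(edge_nodes)
        T.add_edges_from(G.edges())          # vertex-vertex adjacency from G
        for e in edge_nodes:
            u, v = tuple(e)
            T.add_edge(u, e)                 # vertex-edge incidence
            T.add_edge(v, e)
        for e in edge_nodes:                 # edge-edge adjacency (shared endpoint)
            for f in edge_nodes:
                if e != f and e & f:
                    T.add_edge(e, f)
        return T

    G = nx.path_graph(4)                     # the path 0-1-2-3
    T = total_graph(G)
    print(T.number_of_nodes(), T.number_of_edges())  # 7 nodes: 4 vertices + 3 edges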

Microsoft teams link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZWEyMmNiMWUtNzY1ZS00ZmY3LTkyYmUtNGVjOTNkNGU0NWQx%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22ed2e5ccd-b870-455c-862e-26e9ab1908be%22%7d


Series: Theory Seminar
Title: Some Recent Advances in Dynamic Algorithms for Maximum Matching and Minimum Set Cover (Part 2)

  • Speaker: Dr. Sayan Bhattacharya,
                   Associate Professor,
                   Department of Computer Science,
                   University of Warwick, United Kingdom
  • Date and Time: Friday, December 10, 2021, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Consider a dynamic graph G = (V, E) that is undergoing a sequence of edge insertions/deletions. We want to design an algorithm that maintains an (approximately) maximum matching in this dynamic graph G with small update time. Here, the update time of an algorithm refers to the time it takes to handle the insertion/deletion of an edge in G. We would like to ensure that the update time of our algorithm is polylogarithmic in the number of nodes of G.

This problem has received considerable attention in the past decade. In these two talks, I will present an overview of some recent advances on this question. Specifically, I will describe a simple primal-dual algorithm that maintains a (2+epsilon)-approximate "fractional" matching in G in polylogarithmic update time (Part 1), and show how to efficiently "round" this fractional matching into an integral one in the dynamic setting (Part 2). Along the way, I will explain some immediate connections between dynamic fractional matching algorithms and the literature on dynamic set cover.

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d

For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Arindam Khan


Series: Theory Seminar
Title: Some Recent Advances in Dynamic Algorithms for Maximum Matching and Minimum Set Cover (Part 1)

  • Speaker: Dr. Sayan Bhattacharya,
                   Associate Professor,
                   Department of Computer Science,
                   University of Warwick,
                   United Kingdom
  • Date and Time: Friday, December 03, 2021, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Consider a dynamic graph G = (V, E) that is undergoing a sequence of edge insertions/deletions. We want to design an algorithm that maintains an (approximately) maximum matching in this dynamic graph G with small update time. Here, the update time of an algorithm refers to the time it takes to handle the insertion/deletion of an edge in G. We would like to ensure that the update time of our algorithm is polylogarithmic in the number of nodes of G.

This problem has received considerable attention in the past decade. In these two talks, I will present an overview of some recent advances on this question. Specifically, I will describe a simple primal-dual algorithm that maintains a (2+epsilon)-approximate "fractional" matching in G in polylogarithmic update time (Part 1), and show how to efficiently "round" this fractional matching into an integral one in the dynamic setting (Part 2).

Along the way, I will explain some immediate connections between dynamic fractional matching algorithms and the literature on dynamic set cover.

Microsoft Teams Link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZGE3NDg5NzktMWQ0Zi00MzFmLTg5OTgtMTMyYWM4MWQyYjI2%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227c84465e-c38b-4d7a-9a9d-ff0dfa3638b3%22%7d



For more details about the seminar please visit the website at https://www.csa.iisc.ac.in/iisc-msr-seminar/

Host Faculty: Dr. Arindam Khan


Series: M.Tech (Research) Thesis Defence - ONLINE
Title: Quantum-Safe Identity-Based Signature Scheme in Multivariate Quadratic Setting

  • Speaker: Ms. Akansha Dimri,
                   M.Tech (Research) student,
                   Dept. of CSA
  • Faculty Advisor: Prof. Sanjit Chatterjee
  • Date and Time: Wednesday, December 01, 2021, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Cryptographic techniques are essential for the security of communication in modern society. Today, nearly all public-key cryptographic schemes used in practice are based on two problems: factoring large integers and computing discrete logarithms. However, as the world grapples with the possibility of widespread quantum computing, these schemes are the ones most threatened. Multivariate Public Key Cryptography is one of the possible candidates for security in a post-quantum world, especially in the area of digital signatures. This thesis uses the setting of multivariate cryptography to propose an identity-based signature scheme.

Our proposal is based on the Rainbow signature scheme and the multivariate 3-pass identification scheme, both of which have been subjected to scrutiny by cryptographers all over the world and have emerged as strong post-quantum candidates. In our construction, we use the identity of a user to generate their private key with the Rainbow signature scheme. Thereafter, we use these user private keys to sign messages by applying the Fiat-Shamir transform to the 3-pass identification scheme. We support the proposed scheme with suitable security proofs under appropriate computational assumptions, using the standard notions of security. We study the known attacks against multivariate schemes in general, and against Rainbow and MQDSS in particular, and use this analysis to propose concrete parameter sets for our construction. We implement the proposed scheme on an x86-64 PC platform and provide timing results. Our implementation shows that the construction is both practical and efficient. Thus, our proposed scheme stands as a potential post-quantum multivariate signature candidate in the identity-based setting.
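
To illustrate the generic Fiat-Shamir pattern used in the construction (a sketch assuming an abstract 3-pass scheme; the commit/respond/check callbacks below are placeholders for the multivariate identification scheme and are not instantiated here):

# Generic Fiat-Shamir transform of a 3-pass (commitment, challenge, response)
# identification scheme into a signature scheme. Illustrative sketch only.
import hashlib

def fs_sign(message: bytes, secret_key, commit, respond):
    # Pass 1: the prover's commitment (and private state) from the secret key.
    commitment, state = commit(secret_key)
    # Pass 2: the verifier's random challenge is replaced by a hash of the
    # message together with the commitment (the Fiat-Shamir heuristic).
    challenge = hashlib.sha3_256(message + commitment).digest()
    # Pass 3: the prover's response to the derived challenge.
    response = respond(state, challenge)
    return commitment, response

def fs_verify(message: bytes, signature, public_key, check):
    commitment, response = signature
    challenge = hashlib.sha3_256(message + commitment).digest()
    # Accept iff the transcript (commitment, challenge, response) is valid
    # for the underlying identification scheme.
    return check(public_key, commitment, challenge, response)

In the identity-based setting described above, the Rainbow-derived user private key plays the role of secret_key, so signatures are bound to the signer's identity through the key-extraction step; in practice the identification scheme is typically repeated over many parallel rounds to reach the target security level.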

Microsoft teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_YjQzMmQ1N2YtMDY5NC00YTk5LWE1ZGQtOWZjNzFjOTkyYzVi%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%227b997a2d-ed18-48f7-99c1-eaec40b37793%22%7d

Video Archive Go Top

 

Series: M.Tech (Research) Colloquium
Title: 2-Level Page Tables (2-LPT): A building block for efficient address translation in virtualized environment

  • Speaker: Akshay Baviskar
  • Faculty Advisor: R. Govindarajan
  • Date and Time: Tuesday, November 30, 2021, 4:30 PM
  • Venue: Online Meeting

Abstract
Efficient address translation mechanisms are gaining increasing attention as the virtual address range of processors keeps expanding and the demand for machine virtualization grows with cloud and data-center-based services. Traditional radix-tree-based address translation can incur significant overheads in big-data applications, particularly under virtualization, due to multi-level tree walks and nested translation. The overheads stem primarily from the unnecessary generality supported by current processors --- the ability to support several hundreds of thousands of virtual memory regions in the virtual address space.

We observe that in the common case, however, a process's virtual address space contains only a few contiguously allocated sections, which can be efficiently translated using a shallow tree with two levels. We propose such a compact structure, called the 2-Level Page Table (2-LPT), which is a key building block for address translation in virtualized environments. A key advantage of 2-LPT is that it maintains two levels of page tables irrespective of the size of the virtual address space. Translating a virtual address (VA) using 2-LPT is fast: a walk on a 2-LPT requires at most two memory accesses, and in practice the root-level table is well cached in the page walk cache (PWC), so a single memory access is sufficient. Under native execution, 2-LPT reduces the time spent in page walks by up to 20.9% (9.38% on average) and improves performance by up to 10.1% (1.66% on average) over conventional four-level radix-tree page tables, on a set of memory-intensive applications.
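
As a toy model of what a two-level walk looks like (hypothetical bit widths, with Python dictionaries standing in for page-table pages; the actual 2-LPT layout is as defined in the thesis):

# Schematic two-level page-table walk. Field widths are illustrative only.
PAGE_SHIFT = 12   # 4 KiB pages
LEAF_BITS  = 18   # bits resolved by the leaf-level table (assumed)
ROOT_BITS  = 18   # bits resolved by the root-level table (assumed)

def translate(va, root_table):
    offset   = va & ((1 << PAGE_SHIFT) - 1)
    leaf_idx = (va >> PAGE_SHIFT) & ((1 << LEAF_BITS) - 1)
    root_idx = (va >> (PAGE_SHIFT + LEAF_BITS)) & ((1 << ROOT_BITS) - 1)

    leaf_table = root_table.get(root_idx)   # memory access 1 (often hit in the PWC)
    if leaf_table is None:
        raise KeyError("page fault: unmapped region")
    pfn = leaf_table.get(leaf_idx)          # memory access 2
    if pfn is None:
        raise KeyError("page fault: unmapped page")
    return (pfn << PAGE_SHIFT) | offset

# Example: virtual page (root index 0, leaf index 3) maps to physical frame 0x1234.
root = {0: {3: 0x1234}}
print(hex(translate((0 << 30) | (3 << 12) | 0x1AB, root)))  # 0x12341ab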

2-LPT is even more beneficial under virtualization. The proposed 2-LPT design reduces the cost of a nested page walk from 24 to 8 memory accesses. To achieve a further reduction, we propose two optimizations: (i) Enhanced Partial Shadow Paging (ePSP), which employs a limited form of shadow paging for the root level of 2-LPT, and (ii) Host PTE Mirroring (HPM), which allows accessing the host page table entry without performing a host page table walk. These allow us to largely avoid slow VM exits while reducing the number of memory accesses per nested address translation to just one, on average. 2-LPT speeds up applications by 5.6%-50.9% (24.4% on average) over the baseline with conventional nested page walks. Importantly, it reduces the page walk cycles and execution time of the best-performing state-of-the-art proposal by 17.1%-57.1% and 3.9%-43.9%, respectively.
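
For intuition on the 24-to-8 reduction quoted above, note that under the usual accounting for two-dimensional nested page walks, an n-level guest table walked under an m-level host table costs roughly

\text{cost}(n, m) = nm + n + m \quad \text{memory references,}

so \text{cost}(4, 4) = 16 + 4 + 4 = 24 for conventional four-level tables on both sides, while \text{cost}(2, 2) = 4 + 2 + 2 = 8 when both guest and host use two-level tables. (This is our reading of the numbers above; the precise accounting is given in the thesis.)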

MS Team's Meeting Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZTEwNmU0NmEtYTIxMS00YWFkLTk1ZTktZDY4MGJmMjRhMzZl%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%224bcd3d56-e405-4b06-99fb-27742262f261%22%7d

Host Faculty: R. Govindarajan

Video Archive Go Top

 

Series: Department Seminar
Title: Optimal Resource Allocation Problems in Large-Scale Networks

  • Speaker: Dr. Abhishek Sinha,
                   Assistant Professor,
                   Department of Electrical Engineering,
                   Indian Institute of Technology Madras,
  • Date and Time: Monday, November 29, 2021, 4:00 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
Optimal resource allocation in networks gives rise to some of the most fundamental problems at the intersection of algorithms, stochastic processes, and learning. In this talk, we will discuss our recent contributions to three canonical resource allocation problems, namely caching, routing, and scheduling.

First, we will consider the optimal caching problem in both stand-alone and networked settings. However, instead of minimizing the competitive ratio (the classical metric of choice for caching problems), we will look at the problem from an online learning perspective that aims to minimize the regret. We will show that this viewpoint leads to an entirely new class of caching policies with provably better performance than the classical ones. We will also discuss some tight regret lower bounds for this problem.

Next, we will consider the problem of throughput-optimal dynamic routing of a broad traffic class, including unicast, multicast, broadcast, and anycast flows on a network with arbitrary link scheduling constraints. We will present a unified algorithmic framework based on precedence relaxations, leading to an efficient policy that provably outperforms the state-of-the-art Backpressure routing algorithm.

Finally, we will discuss the user scheduling problem for reliable and fresh information delivery over unreliable wireless channels. However, contrary to the existing literature, which predominantly considers stochastic channels, we investigate a non-stationary environment modeled using a new adversarial framework. We will describe competitive algorithms in this setting along with some tight lower bounds. We will supplement each part of the talk with a set of open problems.
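
One standard way to formalize the online-learning viewpoint mentioned above (notation ours, for illustration; the talk may use a different but equivalent formulation) is the regret of a caching policy over a horizon of T requests:

R_T \;=\; \max_{y \in \mathcal{Y}} \sum_{t=1}^{T} q(y, x_t) \;-\; \mathbb{E}\left[ \sum_{t=1}^{T} q(y_t, x_t) \right],

where \mathcal{Y} is the set of feasible cache configurations, x_t is the request arriving at time t, y_t is the (possibly randomized) configuration chosen by the policy before seeing x_t, and q(y, x) counts the cache hits that configuration y scores on request x. A policy has no regret, in the online-learning sense, if R_T grows sublinearly in T for every request sequence.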

Speaker Bio:
Abhishek Sinha is currently an Assistant Professor in the Department of Electrical Engineering at the Indian Institute of Technology Madras. He received his Ph.D. degree from the Massachusetts Institute of Technology in 2017, where he was affiliated with the Laboratory for Information and Decision Systems (LIDS). After his Ph.D., Abhishek worked as a senior engineer at Qualcomm Research, San Diego, in the 5G standardization group. He obtained his M.E. degree in Telecommunication Engg. from the Indian Institute of Science, Bangalore, and his B.E. degree in Electronics and Telecommunication Engg. from Jadavpur University, Kolkata, India. He is a recipient of the INSA Medal for Young Scientists (2021), Best Paper Awards at INFOCOM 2018 and MobiHoc 2016, the Prof. Jnansaran Chatterjee memorial gold medal and T.P. Saha Memorial gold-centered silver medal from Jadavpur University, and the Jagadis Bose National Science Talent Search (JBNSTS) scholarship, Kolkata, India. His areas of interest include networks, information theory, theoretical machine learning, and applied probability.

Microsoft teams link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MTI5NTU5YjUtYTRjYS00MWYxLWJkNGUtNDg0M2E3N2E2NTNj%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22c747ccaa-ceaa-4197-b4cb-ce2f1d4694da%22%7d

Host Faculty: Prof. Chiranjib Bhattacharyya

Video Archive Go Top

 

Series: M.Tech (Research)Thesis Defence -ONLINE
Title: High-Performance GPU Tensor Core Code Generation for Matmul using MLIR

  • Speaker: Mr. Navdeep Kumar Katel,
                   M.Tech (Research) student,
                   Dept. of CSA
  • Faculty Advisor: Prof. Uday Kumar Reddy .B
  • Date and Time: Monday, November 29, 2021, 3:30 PM
  • Venue: Microsoft Teams - ON-LINE

Abstract
State of the art in high-performance deep learning is primarily driven by highly tuned libraries. These libraries are often hand-optimized and tuned by expert programmers using low-level abstractions with significant effort. Much of this effort may have to be repeated for similar hardware and future ones. This process is thus not modular or reusable to the same extent that compiler infrastructures like LLVM are. Manual optimization does not typically use a standard intermediate representation (IR) or transformations and passes on such IRs, although the optimizations performed can be encoded as a sequence of transformation steps and customized passes on an IR.

We believe that until the recent introduction of MLIR (Multi-Level Intermediate Representation), IR infrastructure was not geared to tackle the problem of automatic generation of libraries in an effective manner. In particular, it was hard to represent and transform compute abstractions at high, middle, and low levels using a single IR. Multiple levels of abstraction in a single IR permit the user to apply transformations and optimizations at the most suitable level, and even to reuse them for different targets or front-ends.

Some previous works have optimized matrix-matrix multiplication (matmul) for different GPU microarchitectures. All of these works exploit very low-level details of the hardware: some are written directly in assembly, while others use a combination of CUDA C++ and inline assembly. While the set of high-level optimizations is largely the same across them, this dependence on low-level hardware details limits their reusability. Going against this trend, we show that, by using a set of simple optimizations, suitable abstractions, and lowering passes on such abstractions in MLIR, we can obtain performance competitive with hand-written libraries.

To achieve this, we put together a lowering pipeline that can automatically generate code (without hand-writing any code) for matmul on NVIDIA GPUs while utilizing their tensor cores. We have used and extended some existing utilities in MLIR, such as tiling, loop unrolling, loop permutation, and the generation of fast memory buffers for input operands. Additional utilities, types, and operations necessary for optimal code generation were implemented from scratch. These include adding WMMA operations and types to provide fundamental support for programming tensor cores, adding loop normalization support, adding multi-level tiling support in the affine dialect, creating WMMA operations to load, compute, and store matrix products in a given matmul nest, detecting and hoisting invariant WMMA load-store pairs, hiding the latency of global-to-shared data movement, and adding support for mapping and converting parallel loops to warps.
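
To make the tiling hierarchy concrete, here is a conceptual NumPy sketch (not MLIR, and not the generated code) of the loop structure that such a pipeline materializes; tile sizes are illustrative, with the outer tile standing in for a thread block, the middle tile for a warp, and the innermost fragment for a WMMA/tensor-core operation:

# Conceptual multi-level tiled matmul. On the GPU, the innermost fragment
# computation would be a single WMMA (tensor core) operation.
import numpy as np

def tiled_matmul(A, B, tb_tile=64, warp_tile=32, frag=16):
    M, K = A.shape
    _, N = B.shape
    C = np.zeros((M, N), dtype=A.dtype)
    for i0 in range(0, M, tb_tile):                                # thread-block tile
        for j0 in range(0, N, tb_tile):
            for i1 in range(i0, min(i0 + tb_tile, M), warp_tile):  # warp tile
                for j1 in range(j0, min(j0 + tb_tile, N), warp_tile):
                    for k in range(0, K, frag):                    # WMMA-sized fragment
                        a = A[i1:i1 + warp_tile, k:k + frag]
                        b = B[k:k + frag, j1:j1 + warp_tile]
                        C[i1:i1 + warp_tile, j1:j1 + warp_tile] += a @ b
    return C

# Sanity check against NumPy's matmul.
A = np.random.rand(128, 128).astype(np.float32)
B = np.random.rand(128, 128).astype(np.float32)
assert np.allclose(tiled_matmul(A, B), A @ B, atol=1e-3)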

On the set of problem sizes we evaluated, performance results show that we attain 95-119% and 80-160% of cuBLAS performance for FP32 and FP16 accumulate, respectively, on NVIDIA's Ampere-microarchitecture-based GeForce RTX 3090. A similar evaluation on NVIDIA's Turing-based RTX 2080 Ti shows that we achieve 86-111% and 72-89% of cuBLAS for FP32 and FP16 accumulate, respectively.

We take our approach further by fusing common pointwise operations with matrix-matrix multiplication. This is the first work to demonstrate fusion of operations for tensor core matmul using a systematic IR-based approach. Fusion is done with the support of additional WMMA operations, which perform warp-level matrix operations such as ReLU and constant addition. We see significant gains on small to medium problem sizes when evaluating our fused kernels against a combination of library kernels and custom kernels. On Ampere, consumer-fusion performance ranges from 95% to 167% of the respective unfused implementations; the corresponding range on Turing is 84% to 150%. We also present preliminary results, as a proof of concept, for producer fusion, i.e., fusion of pointwise operations on the inputs with matmul. The performance of ReLU on the C input fused with matmul, measured against a custom ReLU kernel followed by cuBLAS matmul, ranges from 98% to 138% on Ampere and from 91% to 133% on Turing.

We believe that these results could be used as a foundation and motivation for further research and development on automatic code and library generation using IR infrastructure for similar specialized accelerators.

Microsoft teams online link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_OTcwZjA3NjktMDA2MC00YjE4LWIwYjAtNzQzNmIzNWQ1OTNl%40thread.v2/0?context=%7b%22Tid%22%3a%226f15cd97-f6a7-41e3-b2c5-ad4193976476%22%2c%22Oid%22%3a%22171d9abc-cf43-429a-9680-c05b9523fa9a%22%7d

Video Archive Go Top

 
