Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India

UPCOMING SEMINARS

Series: Department Seminar
Title: Randomized Incremental Construction of Delaunay Triangulations of Epsilon-Nets

  • Speaker: Dr. Kunal Dutta
                   Postdoctoral Researcher
                   Data Shape group
                   Inria Sophia
                   France
  • Date and Time: Monday, July 23, 2018, 3:30 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The randomized incremental construction (RIC) paradigm for building geometric data structures, introduced by Clarkson and Shor [Disc. & Comp. Geom., 1989], has found wide application in computational geometry and related areas such as graphics, statistical inference and machine learning. It has been analyzed extensively from the point of view of worst-case distributions. In many practical situations, however, we face nicer distributions. A natural question that arises is: do the usual RIC algorithms automatically adapt when the point samples are nicely distributed? We answer this question positively for the case of the Delaunay triangulation of epsilon-nets. Under some additional assumptions, our proof works in geodesic metric spaces. A key component of our proof is a bound on the probabilities of certain non-monotone events under sampling without replacement, which may be of general interest. Joint work with J.-D. Boissonnat, O. Devillers and M. Glisse.

Host Faculty: Prof. Sunil L Chandran


Series: CSA Faculty Colloquium
Title: Efficient Algorithms for Structured Prediction Problems

  • Speaker: Prof. Shirish K Shevade
  • Date and Time: Friday, July 20, 2018, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In many real-world prediction problems, the output is a structured object such as a sequence, a tree or a graph. Such problems are referred to as structured prediction problems. Support vector machines (SVMs) have shown promise for solving them. Learning structural SVMs amounts to solving a convex quadratic program with a huge number of constraints; the number of constraints is typically exponential, which makes the problem intractable for general-purpose optimization methods. In this talk, we discuss a fast sequential dual method for training structural SVMs. We also discuss a simple and efficient algorithm for solving a non-convex optimization problem for structured prediction in a semi-supervised setting.
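The exponential constraint set can be made concrete with a toy sketch: in a tiny sequence-labeling task there is one constraint per candidate labeling, and structural SVM solvers only ever materialize the single most violated one per example. The task, feature map and sizes below are illustrative assumptions, not the talk's actual formulation:

```python
import itertools

import numpy as np

def feature_map(x, y, n_labels=2):
    """Joint feature map phi(x, y): sum each position's input vector into
    the block belonging to its label."""
    d = x.shape[1]
    phi = np.zeros(n_labels * d)
    for xi, yi in zip(x, y):
        phi[yi * d:(yi + 1) * d] += xi
    return phi

def most_violated_constraint(w, x, y_true, n_labels=2):
    """Loss-augmented inference by brute force over all n_labels**len(x)
    labelings -- there is one QP constraint per labeling, which is why
    solvers generate only the most violated constraint per example."""
    best_y, best_score = None, -np.inf
    for y in itertools.product(range(n_labels), repeat=len(x)):
        hamming = sum(a != b for a, b in zip(y, y_true))
        score = hamming + w @ feature_map(x, y, n_labels)
        if score > best_score:
            best_y, best_score = y, score
    # second return value is the structured hinge loss of w on (x, y_true)
    return best_y, best_score - w @ feature_map(x, y_true, n_labels)
```

With w = 0, every labeling scores exactly its Hamming distance from the ground truth, so the most violated constraint flips every label.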

Speaker Bio:
Shirish Shevade received his Ph.D. from the Indian Institute of Science, Bangalore, India, in 2002. He is currently an Associate Professor in the Department of Computer Science and Automation at the Indian Institute of Science. His research interests span many areas of Machine Learning such as Support Vector Machines, Gaussian Processes and semi-supervised learning. He is a Senior Member of IEEE.

Host Faculty: Prof. Sunil Chandran & Prof. Shalabh Bhatnagar


Series: M.Tech (Research) Thesis Defense
Title: Design of Trusted Market Platforms using Permissioned Blockchains and Game Theory

  • Speaker: Ms. Shivika Narang
                   M.Tech. (Research) Student
                   Dept. of CSA
                   IISc
  • Faculty Advisor: Prof. Y. Narahari
  • Date and Time: Thursday, July 19, 2018, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The blockchain concept forms the backbone of a new wave of technology that promises to be deployed extensively in a wide variety of industrial and societal applications. Governments, financial institutions, banks, industrial supply chains, service companies, and even educational institutions and hospitals are investing substantially in blockchain technology in the hope of improving business efficiency and operational robustness. This thesis work is concerned with designing trustworthy business-to-business (B2B) market platforms drawing upon blockchain technology and game theory.

The proposed platform is built upon three key ideas. First, we use permissioned blockchains with smart contracts as a technically sound approach for building the B2B platform. The blockchain deploys smart contracts that govern the interactions of enterprise buyers and sellers. Second, the smart contracts are designed using a rigorous analysis of a repeated game model of the strategic interactions between buyers and sellers. We show that such smart contracts induce honest behavior from buyers and sellers. Third, we embed cryptographic regulation protocols into the permissioned blockchain to ensure that business sensitive information is not revealed to the competitors. We believe our work is an important step in the direction of building a powerful B2B platform that maximizes social welfare and enables trusted collaboration between strategic enterprise agents.


Series: CSA Distinguished Lecture
Title: A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics

  • Speaker: Professor Moses Charikar
                   Donald E. Knuth Professor of Computer Science
                   Stanford University
  • Date and Time: Tuesday, July 17, 2018, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
We study the Stochastic Gradient Langevin Dynamics (SGLD) algorithm for non-convex optimization. The algorithm performs stochastic gradient descent, where in each step it injects appropriately scaled Gaussian noise into the update. We analyze the algorithm's hitting time to an arbitrary subset of the parameter space. Two results follow from our general theory. First, we prove that for empirical risk minimization, if the empirical risk is pointwise close to the (smooth) population risk, then the algorithm finds an approximate local minimum of the population risk in polynomial time, escaping suboptimal local minima that exist only in the empirical risk. Second, we show that SGLD gives an alternate approach to, and an improvement for, learning linear classifiers under zero-one loss.

Joint work with Yuchen Zhang and Percy Liang. (COLT '17)
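The update rule described in the abstract can be sketched in a few lines; the step size, inverse temperature, iteration count and the toy double-well objective are all illustrative choices, not part of the paper's analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(w, grad, step_size, inverse_temp):
    """One SGLD update: a gradient step plus Gaussian noise scaled by
    sqrt(2 * step_size / inverse_temp)."""
    noise = rng.normal(size=w.shape)
    return w - step_size * grad(w) + np.sqrt(2.0 * step_size / inverse_temp) * noise

# Toy non-convex double-well objective f(w) = (w^2 - 1)^2, minima at w = +/-1.
grad_f = lambda w: 4.0 * w * (w ** 2 - 1.0)

w = np.array([3.0])
for _ in range(2000):
    w = sgld_step(w, grad_f, step_size=1e-3, inverse_temp=100.0)
```

The hitting-time view asks how many such steps the iterates need before entering a target set, for example a small neighbourhood of a local minimum of the population risk.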

Speaker Bio:
Moses Charikar is the Donald E. Knuth Professor of Computer Science at Stanford University. He obtained his PhD from Stanford in 2000, spent a year in the research group at Google, and was on the faculty at Princeton from 2001 to 2015. He is broadly interested in the design and analysis of algorithms, with an emphasis on approximation algorithms for hard problems, metric embeddings and algorithmic techniques for big data. He won the best paper award at FOCS 2003 for his work on the impossibility of dimension reduction, the best paper award at COLT 2017, and the 10-year best paper award at VLDB 2017. He was jointly awarded the 2012 Paris Kanellakis Theory and Practice Award for his work on locality-sensitive hashing, and was named a Simons Investigator in theoretical computer science in 2014.

Host Faculty: Dr. Anand Louis


Series: Department Seminar
Title: A Case for (Limited) Explicit Dataflow Graph Execution

  • Speaker: Dr. Gagan Gupta
                   Computer Architecture Researcher
                   Microsoft Research
                   USA
  • Date and Time: Tuesday, July 17, 2018, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
As Moore’s law slows down, we must find new ways of scaling processor performance. Von Neumann ISAs have been successful because they provide a clean conceptual target to software while running the complete gamut of algorithms reasonably well. We badly need clean new abstractions that utilize fine-grain parallelism and run energy efficiently. Prior work (such as the UT-Austin TRIPS EDGE ISA and others) showed how to form blocks of computation containing limited-scope dataflow graphs, which can be thought of as small structures (DAGs) mapped to silicon. In this presentation I describe some post-TRIPS work that addresses the limitations of the original ISA, and how those extensions can provide energy-efficient execution for single threads compared to a conventional out-of-order superscalar design. I will also describe a specific microarchitecture based on these extensions, and show early results. Finally, I will describe how this notion of “structured computation” can be extended to form accelerators dynamically with minor changes to the CPU, or extended to synthesize efficient accelerators that are specialized to a specific workload.

Speaker Bio:
Gagan Gupta has worked in the semiconductor industry for over two decades. Presently he is a computer architecture researcher at Microsoft Research. He has led engineering and marketing teams to launch commercially successful processors at LSI Logic, Huawei, and ARC International, and influenced R&D strategies in the industry (e.g., at Intel and Sandisk). Gagan has a Ph.D. in Computer Sciences from the University of Wisconsin-Madison. His work has been recognized with awards at multiple industrial and academic fora, and is regularly cited in trade publications.

Host Faculty: Dr. Arkaprava Basu


PAST SEMINARS

Series: M.Sc. (Engg) Colloquium
Title: Optimizing Matrix Computations with PolyMage

  • Speaker: Ms. Kumudha K N
  • Faculty Advisor: Prof. Uday Kumar Reddy
  • Date and Time: Friday, July 13, 2018, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Linear algebra computations and other arbitrary affine accesses are ubiquitous in applications from domains like scientific computing and digital signal processing (DSP). Libraries such as OpenBLAS, MKL, and FFTW provide efficient hand-optimized implementations of the matrix and vector primitives used in these domains for various architectures, and applications are built upon these standard library routines to obtain high performance. However, these libraries do not perform well for all matrix sizes, obtaining sub-optimal performance for small matrices, and their interfaces can be fairly complex, requiring several input parameters; an even higher level of abstraction is therefore often desired to improve productivity. Further, using these libraries forfeits the opportunity to optimize across different library calls, and exploiting this locality through traditional programming over library functions becomes complex.

The work in this thesis proposes (i) a tile size selection model which works for any arbitrary affine access and eschews auto-tuning, (ii) a simple heuristic to determine the profitability of mapping to a library call, falling back to generated code otherwise, (iii) an intra-tile optimization technique to expose inner-loop parallelism, enabling a general-purpose compiler's vectorizer to generate vector instructions, and (iv) a DSL approach with high-level primitives and functions to allow expressing computations efficiently. The proposed optimizations are implemented in PolyMage, a domain-specific language (DSL) for image processing pipelines whose optimizer performs fusion, tiling, and other loop optimizations. We extend PolyMage's compiler to support the above optimizations, so that it can optimize computations from additional domains, including dense linear algebra and certain DSP computations.
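Loop tiling of the kind the tile size selection model targets can be sketched on a matrix multiply; the fixed tile size below is an arbitrary illustration, whereas the thesis's model would derive it from the access pattern:

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Blocked (tiled) matrix multiply: the i/j/k loops are partitioned so
    that the working set of A, B and C tiles stays cache-resident.
    NumPy slicing past an array bound truncates, so ragged edge tiles
    are handled automatically."""
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n, p))
    for ii in range(0, n, tile):
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                C[ii:ii + tile, jj:jj + tile] += (
                    A[ii:ii + tile, kk:kk + tile] @ B[kk:kk + tile, jj:jj + tile]
                )
    return C
```

The result is identical to an untiled multiply; only the order in which data is touched changes, which is what makes tile size a pure performance knob.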

We experimentally evaluate these optimizations on modern multicore systems using representative benchmarks from PolyBench, digital signal processing, image processing and neural networks, and compare the results to state-of-the-art optimization approaches and frameworks in each domain. On DSP benchmarks, our optimizations achieve a mean speedup of 7.7x over the existing PolyMage optimizer, 5.1x over NumPy and 1.9x over MATLAB toolboxes with parallel computing support. On linear algebra computations from the PolyBench suite, we obtain a speedup of 7.5x over the existing PolyMage optimizer, 3.6x over Pluto and 4.1x over PPCG.


Series: M.Sc. (Engg) Colloquium
Title: Active Learning for Efficient Testing of Student Programs

  • Speaker: Mr. Ishan Rastogi
                   M.Sc. (Engg) Student
                   Dept. of CSA
  • Faculty Advisor: Prof. Shirish Shevade and Prof. Aditya Kanade
  • Date and Time: Friday, July 13, 2018, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Automated evaluation of student programs is an important problem, since it eases the instructor's burden by reducing or altogether eliminating the time required for program evaluation. A solution to this problem would also make it possible to scale courses with computer programming content to a larger audience, such as in a massive open online course (MOOC) setting. The problem is, however, inherently difficult, as students can make a variety of mistakes spanning the entire gamut from shallow syntactic errors to more involved semantic ones. Among these, identifying whether a program has a logical error is perhaps the most challenging, since the error may surface only on a subset of all valid test cases. Additionally, unlike syntax errors, logical errors are task-specific and hence require an additional input from the instructor (for example, a set of quality test cases) for every new task introduced into the system.

In this talk, we will discuss our work, Automated Testing using Active learning and Symbolic-execution (ATAS), an algorithm designed to identify whether a student program is logically incorrect in an online setting. The algorithm builds upon recent advances in both symbolic execution and active learning; symbolic execution is a program analysis technique which can generate test cases through symbolic constraint solving. Our method uses a reference implementation of the task as its sole input. We compare our method with a symbolic-execution-based baseline on six programming tasks retrieved from CodeForces, comprising about 23,000 student submissions. We demonstrate an average improvement of over 2.5x over the baseline in terms of runtime (making it more suitable for online evaluation), without significant degradation in evaluation accuracy.
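The role of the reference implementation can be seen in a deliberately naive sketch: plain random testing stands in here for ATAS's symbolic execution and active learning, and both the task (return the maximum of a list) and the buggy submission are invented for illustration:

```python
import random

def reference(xs):
    """Instructor's reference implementation -- the sole task-specific
    input the approach needs."""
    return max(xs)

def student(xs):
    """A logically incorrect submission: the bad initialisation only
    surfaces when every element of the input is negative."""
    best = 0
    for v in xs:
        if v > best:
            best = v
    return best

def find_counterexample(trials=1000, seed=0):
    """Search for an input on which the submission disagrees with the
    reference. ATAS aims to find such inputs with far fewer, better
    chosen program executions."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(1, 6))]
        if student(xs) != reference(xs):
            return xs
    return None
```

Because the bug fires only on all-negative inputs, a grader that checks a fixed handful of test cases can easily mark this submission correct, which is exactly the logical-error difficulty the abstract describes.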

Speaker Bio:
Ishan is a masters student at the CSA department, working in Intelligent Systems Lab. He is advised by Prof. Shirish Shevade and Prof. Aditya Kanade.


Series: Department Seminar
Title: Leveraging Data and AI for Early Warning of Social and Economic Disruption

  • Speaker: Prof. Rohini Srihari
                   Professor,
                   Dept. of Computer Science and Engineering
                   State University of New York, Buffalo.
  • Date and Time: Thursday, July 12, 2018, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
According to the Institute of Economics and Peace, conflict costs the world 13.3% of its GDP, or $13.6 trillion per year. And these numbers don’t even begin to tell the story of the human costs of violence, which disproportionately affects women and youth.

This talk focuses on the design and development of a software platform, groundTruth, for early warning of social and economic disruption in emerging markets and fragile states. This initiative leverages the following developments: (i) major advances in artificial intelligence, (ii) widespread adoption of mobile devices and the internet in emerging countries, and (iii) the availability of an unprecedented amount of data, including social media and sensor data. A taxonomy of risk drivers, types of disruptions, and key indicators useful in predicting disruptions will be presented. Predictive analytics algorithms based on machine learning models (including deep learning models) will be discussed along with initial results. These have been trained and tested on rich, manually curated, global event databases. A demo of the current system will be presented along with plans for further development.

Speaker Bio:
Rohini Srihari is an educator, entrepreneur and computer scientist. She is a Professor in the Dept. of Computer Science and Engineering at the State University of New York, Buffalo. Her research has focused on various aspects of multilingual text mining and natural language processing and has been funded by agencies such as DARPA and the National Science Foundation. She has founded and led language technology companies, delivering content analytics solutions and services to enterprise customers across multiple industries. She is currently serving as the Chief Data Scientist with PeaceTech Lab based in Washington, DC where she is overseeing data and technology programs with the potential for major societal impact. She received a B. Math. from the University of Waterloo, Canada and a PhD in Computer Science from the University at Buffalo.

Host Faculty: Prof. M Narasimha Murty


Series: Ph.D. Colloquium
Title: Recommendations in Complex Networks: Unifying Structure into Random Walk.

  • Speaker: Sharad Kumar Nandanwar
  • Faculty Advisor: Prof. M. Narasimha Murty
  • Date and Time: Thursday, July 12, 2018, 10:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Making recommendations, or predicting links which are likely to exist in the future, is one of the central problems in network science and graph mining. In spite of modern state-of-the-art approaches to link prediction, traditional approaches like the Resource Allocation Index and Adamic-Adar still find heavy use because of their simplicity and competitive performance. Our preliminary investigation reveals that a major fraction of the missing links observed in the near future are links between one-hop distant nodes. Current "friend of a friend is a friend" based approaches provide a mechanism for aggregating the effect of the triangles closed by the inclusion of an edge. In this work we look beyond these triangles and hunt for higher-order structures in the common neighborhood. However, with the idealistic choice of cliques as higher-order structures, the problem easily gets out of hand, even for common ego-networks involving 50-100 nodes. In the wake of this, we define a dense structure, the k-support, as an approximation to the clique. Further, for a given k value, we propose a goodness measure to capture the contribution of the common neighborhood, w.r.t. the k-support, towards the link strength. The proposed goodness measure with different k values is then exploited in learning a supervised model.
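The classical common-neighborhood indices mentioned above reduce to a few lines; the toy graph in the test is an invented example:

```python
import math

def common_neighbour_scores(adj, u, v):
    """Adamic-Adar and Resource Allocation scores for a candidate link
    (u, v): each common neighbour w contributes 1/log(deg(w)) (AA) or
    1/deg(w) (RA), so low-degree common neighbours count for more.
    `adj` maps each node to its set of neighbours. (Degree-1 common
    neighbours would need special-casing for AA, since log(1) = 0.)"""
    common = adj[u] & adj[v]
    aa = sum(1.0 / math.log(len(adj[w])) for w in common)
    ra = sum(1.0 / len(adj[w]) for w in common)
    return aa, ra
```

For instance, if u and v share one neighbour of degree 2 and one of degree 3, RA scores 1/2 + 1/3 while AA scores 1/ln 2 + 1/ln 3.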

We next take a multi-class classification view of the recommendation problem, suited to cases where the cardinality of the set of target objects is reasonably small. Classification of entities based on the underlying network structure is an important problem and has been studied extensively in the literature. Networks encountered in practice are sparse and have many missing and noisy links. Statistical learning techniques have been used in intra-network classification; however, they typically exploit only the local neighborhood and so may not perform well. We propose a novel structural neighborhood-based classifier learned using a random walk, where we label a node based on how the nodes in its k-th-level neighborhood are labeled. We observe that random walks of short length are helpful in classification, while emphasizing the role of longer random walks may cause the underlying Markov chain to converge to a stationary distribution. Considering this, we take a lazy random walk based approach with a variable termination probability for each node, based on the node's structural properties, including its degree.
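One way such a walk-based vote could look is sketched below; the degree-based stopping rule, walk counts and the two-community toy graph in the test are illustrative guesses, not the colloquium's actual construction:

```python
import random
from collections import Counter

def classify_by_lazy_walks(adj, labels, node, n_walks=200, max_len=10, seed=0):
    """Label `node` by short random walks that vote with the first
    labelled node they reach. A walk may also terminate early at an
    unlabelled node w with probability deg(w)/(deg(w)+1), so that long
    walks -- which would only reflect the stationary distribution --
    are discouraged."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_walks):
        cur = node
        for _ in range(max_len):
            cur = rng.choice(sorted(adj[cur]))
            if cur in labels:
                votes[labels[cur]] += 1
                break
            if rng.random() < len(adj[cur]) / (len(adj[cur]) + 1.0):
                break  # variable, degree-based termination
    return votes.most_common(1)[0][0] if votes else None
```

Short walks keep the vote dominated by the node's own community rather than by globally central nodes.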

Further, we observe that conventional loss functions penalize a node based on a function of its predicted label and target label. Such loss functions underperform when learning on a network with overlapping classes. In a relational setting, even though the ground truth is not known for the unlabeled nodes, some evidence is present in the form of the labeling acquired by the nodes in their neighborhood. Considering this, we propose a structural loss function for learning in networks, based on the hypothesis that loss is induced when a node fails to acquire a label consistent with the labels of the majority of the nodes in its neighborhood. We further combine this with a novel semantic regularizer, which we call the homophily regularizer, to capture the smooth transition of discriminatory power and behavior of semantically similar nodes.

Hybrid recommender systems have shown the power of exploiting relationships among objects which directly or indirectly affect the recommendation task. However, the effect of all relations is not equal, and choosing their right balance for the recommendation problem at hand is non-trivial. We model these interactions using a Heterogeneous Information Network and propose a systematic framework for learning their influence weights for a given recommendation task. We also address the issue of redundant results, which is prevalent in recommender systems. To alleviate redundancy in recommendations we use a Vertex Reinforced Random Walk (a non-Markovian random walk) over a heterogeneous graph. It works by boosting transitions to influential nodes while simultaneously shrinking the weights of others. This discourages recommending multiple influential nodes which lie in close proximity to each other, thus ensuring diversity.


Series: Department Seminar
Title: Coincidence-based uniformity testing achieves $l_2$-tolerance

  • Speaker: Mr. Sutanu Gayen
                   Ph.D. student
                   Department of Computer Science and Engineering
                   University of Nebraska-Lincoln
  • Date and Time: Tuesday, July 10, 2018, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Distribution testing is an area at the intersection of computer science, statistics and information theory. In recent years, there has been a surge of interest in this area due to its importance in data analytics. In this talk I will start by defining the central question in the area: given sample access to an unknown distribution $D$ on $n$ items, decide between the following two cases: 1) does $D$ have a certain property? versus 2) is $D$ far from having the property?

In particular, the focus of this talk is uniformity testing: testing whether the unknown distribution $D$ is uniform or far from uniform. Algorithms have been proposed for a number of variations of uniformity testing, which I will mention. I will then present a new result of ours which gives new guarantees for a known algorithm for this problem. In particular, we obtain a tester by slightly modifying the coincidence-based tester of Paninski from 2008, and show that this new tester achieves the strongest known guarantee for uniformity testing. Interestingly, our analysis is quite simple.
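The coincidence (collision) statistic at the heart of such testers is simple to state; the acceptance threshold below is an illustrative placeholder, not the guarantee proved in the talk:

```python
from collections import Counter
from math import comb

def collisions(samples):
    """Number of coincident pairs among the samples. Its expectation is
    C(m, 2) * sum_i p_i^2, minimised (at C(m, 2)/n) exactly when the
    distribution is uniform, so excess collisions witness l_2 distance
    from uniform."""
    return sum(comb(c, 2) for c in Counter(samples).values())

def looks_uniform(samples, n, slack=2.0):
    """Accept iff the collision count is within `slack` times the
    uniform expectation C(m, 2)/n (threshold chosen for illustration)."""
    m = len(samples)
    return collisions(samples) <= slack * comb(m, 2) / n
```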

This is a joint work with A. Pavan and N.V. Vinodchandran.

Speaker Bio:
Sutanu Gayen is a fifth year Ph.D. student in the Department of Computer Science and Engineering, University of Nebraska-Lincoln. He is working with Dr. Vinodchandran N. Variyam. He works in the area of streaming algorithms, distribution testing, and probabilistic algorithms in general. His homepage is at https://cse.unl.edu/~sgayen/.

Host Faculty: Dr. Arnab Bhattacharyya


Series: M.Sc. (Engg) Colloquium
Title: IO Pattern Aware Methods to Improve the Performance and Lifetime of NAND SSD

  • Speaker: Mr. Arpith K
                   M.Sc (Engg)Student
                   Dept. of CSA
  • Faculty Advisor: Prof. K Gopinath
  • Date and Time: Tuesday, July 10, 2018, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Solid state drives (SSDs) based on NAND flash use transistors to store information persistently; data is stored as a threshold voltage in each flash cell. Due to their energy efficiency, small form factor and resilience to physical shock, they are the storage medium of choice in a variety of mobile devices. Internal parallelism enables SSDs to deliver significantly higher IO performance than traditional magnetic disks, and hence they are also used in data centers. Modern flash memories have transistors that can store multiple bits, enabling the production of SSDs with higher capacities and a lower cost per bit. As of 2018, cells that store three bits are widely used, and Intel and Micron have just announced the availability of the first commercial SSD with quad-level cells.

However, such high-density SSDs suffer from longer latencies to program and read data, resulting in reduced throughput compared to flash memories that store a single bit per cell. They also suffer from reduced reliability, and the mechanisms that detect bit errors and prevent data loss add further performance overheads.

In the talk, we will discuss two proposed system-level solutions which, using knowledge of the workload's IO patterns, can improve the performance and lifetime of NAND-based solid state drives. The first work proposes to combine the various page types in a wordline into a single logical page called a Melded-Page. This improves the read performance of an SSD by mitigating the overheads involved in the read operation. Using this method, we achieve performance improvements of up to 44% on distributed workloads that use the Hadoop Distributed File System (HDFS).

Second, to improve the write performance and lifetime of an SSD, we propose a modified programming scheme called the Hot Page Aware Relaxed Program Sequence scheme. Experimental results show an average improvement of 56% in the performance of the SSD compared to methods that use the shadow program sequence scheme. We also observe a reduction in the number of pages backed up, by an average of 85%. Compared to methods that use dynamic SLC caching, the proposed scheme reduces the number of block erases by an average of 61%.

Speaker Bio:
Arpith is a masters student at the CSA department, working in Computer Architecture and Systems Lab. He is advised by Prof. K. Gopinath.


Series: CSA Faculty Colloquium
Title: Going beyond worst case analysis in the design of approximation algorithms

  • Speaker: Dr. Anand Louis
                   Assistant Professor
                   Dept. of CSA
  • Date and Time: Friday, June 29, 2018, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
In the study of algorithms for combinatorial optimization problems, the best known algorithms often provide a somewhat underwhelming performance guarantee; however, simple heuristics perform remarkably well in practice. A possible explanation for this phenomenon is that for many problems, the instances arising in practice tend to have some inherent structure that makes them "easier" than worst-case instances. Many attempts have been made to understand the structural properties of these instances and to use them in designing algorithms specifically for such instances, which could perform much better than algorithms for general instances. I will talk about a few vignettes in this direction.

Speaker Bio:
Anand is an Assistant Professor at CSA. He received his Ph.D. in 'Algorithms, Combinatorics, and Optimization' from Georgia Tech in 2014, and held a postdoctoral position at Princeton University before moving to IISc in September 2016. His research interests lie in the theory of algorithms and optimization.

Host Faculty: Prof. Sunil L Chandran & Prof. Shalabh Bhatnagar


Series: Department Seminar
Title: Hardness of learning noisy halfspaces using polynomial thresholds

  • Speaker: Mr. Suprovat Ghoshal
                   Ph.D. student
                   Department of CSA
  • Faculty Advisor: Dr. Arnab Bhattacharyya and Dr. Siddharth Barman
  • Date and Time: Wednesday, June 27, 2018, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
We prove the hardness of weakly learning halfspaces in the presence of adversarial noise using polynomial threshold functions (PTFs). In particular, we prove that for any positive integer d and parameter ε, it is NP-hard to decide, given a set of {−1,1}-labeled points in R^n, whether (YES case) there exists a halfspace that classifies a (1−ε)-fraction of the points correctly, or (NO case) every degree-d PTF classifies at most a (1/2+ε)-fraction of the points correctly. This strengthens to all constant degrees the previous NP-hardness of learning using degree-2 PTFs shown by Diakonikolas et al. (2011). The latter result had remained the only progress over the works of Feldman et al. (2006) and Guruswami et al. (2006) ruling out weakly proper learning of adversarially noisy halfspaces.

Joint work with Arnab Bhattacharyya and Rishi Saket.

Speaker Bio:
Suprovat Ghoshal is a Ph.D. student at the CSA department, co-advised by Arnab Bhattacharyya and Siddharth Barman. *This is a practice talk for COLT 2018.*

Host Faculty: Dr. Arnab Bhattacharyya


Series: Department Seminar
Title: Multiplayer Parallel Repetition for Expanding Games

  • Speaker: Dr. Rakesh Venkat
                   Postdoctoral Fellow
                   Hebrew University of Jerusalem
                   Israel
  • Date and Time: Tuesday, June 26, 2018, 3:00 PM
  • Venue: CSA Multimedia Class (Room No. 252, First Floor)

Abstract
A two-player game is an important construct used in conjunction with the PCP theorem to show hardness of approximating various NP-Hard problems. It consists of two non-communicating players, Alice and Bob, trying to win against a verifier V, who sends them questions drawn from a known distribution D. The goal of Alice and Bob is to come up with strategies to provide answers to these questions that are consistent with a known predicate checked by the verifier. Since they cannot communicate after receiving their respective questions, they might not be able to win the game with probability 1. The maximum probability (over D) of winning such a game is called the value of the game.

Raz first showed that repeating a two-player game in parallel n times (where n question pairs are drawn independently from D and given to the players) drives down the probability of the players winning *all* the rounds exponentially in n. In contrast to two-player games, very little is known about the parallel repetition of games with k players for k >= 3. The only known upper bound on the value of a general n-repeated, k-player game is due to Verbitsky, which gives only a weak inverse-Ackermann(n) decay. Some special classes of games have been shown to have exponential decay in value w.r.t. n.

In this talk, I will show that a large class of multiplayer games do indeed exhibit an exponential decay in value under parallel repetition. Our result recovers previously known exponential-decay theorems for special classes as a corollary. We also point out a simple game not handled by the above class, and conjecture that it is, in fact, the hardest case. (Based on joint work with Irit Dinur, Prahladh Harsha and Henry Yuen.)

Speaker Bio:
The speaker is a postdoctoral fellow at the Hebrew University of Jerusalem, Israel. He obtained his PhD from TIFR, Mumbai. His research interests are in Approximation Algorithms, Spectral graph Theory and Hardness of Approximation.

Host Faculty: Dr. Anand Louis


Series: Ph.D. Thesis Defense
Title: Algorithms for Multilingual IR in Low Resource Languages using Weakly Aligned Corpora

  • Speaker: Mr. Goutham Tholpadi
                   Ph.D student
                   Dept. of CSA
                   IISc
  • Faculty Advisor: Prof. Chiranjib Bhattacharyya & Prof. Shirish K Shevade
  • Date and Time: Tuesday, June 26, 2018, 2:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
A large section of the world population is multilingual, and a large proportion of the digitized textual content generated today is in non-English languages. As a result, multilingual information retrieval (MLIR) has become a real and pressing need. Most MLIR methods rely on machine translation and other linguistic resources such as dictionaries, taggers, parsers, etc. However, most of these resources are unavailable in low resource languages, e.g. in many Indian languages. For many MLIR problems, an alternative approach that has shown some success is to map all texts to an abstract semantic space of concepts where similarity can be computed uniformly across languages. These concepts are induced using statistical models learned from multilingual "weakly aligned" corpora, which are available even in low resource languages. A number of models have been proposed for learning topical correspondence from weakly aligned corpora that are document-aligned. In recent times, however, there has been growing interest in modeling topical correspondence in user generated content (UGC) data from social networks, commenting widgets, etc. A general pattern seen in such data is the presence of a main object (such as an article, post, tweet, image, etc.) and a collection of dependent objects (such as comments, replies, retweets, tags, etc.). However, very little work has been done on topical correspondence models for corpora where the dependent objects are multilingual. This task is made more challenging by the presence of romanization, sparsity, and noise.

Topic models are often used to compute topical affinity between textual units, which is a core task in most MLIR problems. However, the topic models that can be learned from weak alignment are not directly applicable to the computation of all kinds of topical affinities. In particular, the size of the textual units involved in the affinity computation has a significant bearing on the applicability of a model, and on the method of application. In this thesis, we focus on low resource languages and develop a series of hierarchical Bayesian models for capturing topical correspondence in multilingual news-comment corpora. The final model, called Multi-glyphic Correspondence Topic Model (MCTM), captures topical correspondence between comments and the article, between comments and specific sections of the article, among the comments themselves, and between comments and the other articles. The model handles problems of data sparsity, noise and comment romanization and is shown to be a better fit for article-comment data when compared to existing models.

From an application perspective, we consider three different MLIR problems corresponding to different levels with respect to the size of the textual units involved in the topical affinity computation step. The first MLIR problem (at the level of single words) involves inducing translation correspondences in arbitrary language pairs using Wikipedia data. We present an approach for translation induction that leverages Wikipedia corpora in auxiliary languages using the notion of translingual themes. We propose extensions that enable the computation of cross-lingual semantic relatedness between words and apply the method to the task of cross-lingual Wikipedia title suggestion. The second MLIR problem (at the level of sentences or paragraphs) involves the detection of topical relevance between specific portions of articles and comments. Using MCTM, we devise methods for topical categorization of comments with respect to the article. In the third MLIR problem (at the level of document collections), the task is to generate keyword summaries for a cluster of documents. We develop an architecture for performing Scatter/Gather on multilingual document collections and a labeling procedure that generates keyword labels on-the-fly to summarize a multilingual document cluster in any language.
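As a minimal illustration of the affinity computation that recurs throughout these problems, the sketch below scores two textual units in different languages by comparing the topic-proportion vectors a shared multilingual topic model would assign them. The vectors and the function name are hypothetical, not taken from the thesis:

```python
import math

def topic_affinity(theta_a, theta_b):
    # Cosine similarity between two topic-proportion vectors that live
    # in the same (language-independent) topic space.
    dot = sum(x * y for x, y in zip(theta_a, theta_b))
    norm_a = math.sqrt(sum(x * x for x in theta_a))
    norm_b = math.sqrt(sum(x * x for x in theta_b))
    return dot / (norm_a * norm_b)

# Hypothetical topic proportions inferred for an English article and a
# romanized Kannada comment under a shared multilingual topic model.
article = [0.70, 0.20, 0.10]
comment = [0.65, 0.25, 0.10]
print(round(topic_affinity(article, comment), 3))   # 0.996
```

Because both units are mapped into the same topic space, the comparison needs no dictionary or translation resource, which is what makes the approach viable for low resource languages.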


Series: Department Seminar
Title: Using Inherent Structures to design Lean 2-layer RBMs

  • Speaker: Mr. Abhishek Bansal
                   Software Engineer
                   IBM India Research Lab, Delhi
  • Date and Time: Monday, June 25, 2018, 4:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
The representational power of Restricted Boltzmann Machines (RBMs) with multiple layers is poorly understood and remains an important area of active research. Motivated by the Inherent Structure formalism (Stillinger & Weber, 1982), extensively used in analysing spin glasses, we propose a novel measure called Inherent Structure Capacity (ISC), which characterizes the capacity of a fixed-architecture RBM by the expected number of modes of the distributions emanating from the RBM with parameters drawn from a prior distribution. Though ISC is intractable, we provide a tractable upper bound. The bound suggests that for a single-layer RBM the ISC approaches a finite constant, and that to increase it one needs to add additional layers. Furthermore, we introduce Lean RBMs, which are multi-layer RBMs where each layer can have at most O(n) units, n being the number of visible units. We show that for every single-layer RBM with Omega(n^(2+r)), r >= 0, hidden units there exists a two-layered lean RBM with Theta(n^2) parameters with the same ISC, establishing that 2-layer RBMs can achieve the same representational power as single-layer RBMs using far fewer parameters. To the best of our knowledge, this is the first result which quantitatively establishes the need for layering.
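The mode-counting quantity behind ISC can be made concrete with a toy experiment: for a small single-layer RBM, enumerate all visible configurations, evaluate the unnormalised marginal over visible units, and count configurations that beat all Hamming-distance-1 neighbours, averaging over parameters drawn from a Gaussian prior. This is only an illustrative sketch of the definition, not the paper's bound:

```python
import itertools
import math
import random

def log_unnorm_p(v, W, b, c):
    # log of the unnormalised RBM marginal over visible units:
    # p(v) proportional to exp(b.v) * prod_j (1 + exp(c_j + sum_i W_ij v_i))
    s = sum(bi * vi for bi, vi in zip(b, v))
    for j in range(len(c)):
        act = c[j] + sum(W[i][j] * v[i] for i in range(len(v)))
        s += math.log1p(math.exp(act))
    return s

def count_modes(W, b, c, n):
    # A mode is a visible configuration strictly more probable than
    # every Hamming-distance-1 neighbour.
    lp = {v: log_unnorm_p(v, W, b, c)
          for v in itertools.product((0, 1), repeat=n)}
    modes = 0
    for v, p in lp.items():
        nbrs = (v[:i] + (1 - v[i],) + v[i + 1:] for i in range(n))
        if all(p > lp[u] for u in nbrs):
            modes += 1
    return modes

# Expected mode count under a Gaussian prior on parameters — a toy
# stand-in for the ISC of a 4-visible, 4-hidden single-layer RBM.
random.seed(0)
n, m, trials = 4, 4, 50
total = 0
for _ in range(trials):
    W = [[random.gauss(0, 2) for _ in range(m)] for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    c = [random.gauss(0, 1) for _ in range(m)]
    total += count_modes(W, b, c, n)
print(total / trials)
```

The exhaustive enumeration is exponential in n, which is exactly why the paper's tractable upper bound on ISC matters for realistic sizes.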

Speaker Bio:
The speaker is a Software Engineer at IBM India Research Lab, Delhi. He completed his BTech in Manufacturing Sc & Engg from IIT Delhi and his ME in CSE from IISc Bengaluru. *This is a practice talk for ICML 2018.*

Host Faculty: Prof. Chiranjib Bhattacharyya


Series: CSA Distinguished Lecture
Title: Lecture 4: Generative Models in Deep learning

  • Speaker: Professor Sargur N. Srihari
                   SUNY Distinguished Professor
                   Department of Computer Science and Engineering
                   University at Buffalo
                   The State University of New York
                   USA
  • Date and Time: Monday, June 25, 2018, 10:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Although early approaches to statistical pattern recognition emphasized generative models (the “Bayes Classifier”, Naive Bayes Classifier, Hidden Markov Models), they proved to be less useful than discriminative models (logistic regression, nearest-neighbor, conditional random fields, neural networks). Probabilistic graphical models made a comeback for generative models by overcoming some computational issues (fewer parameters), but others remained (inference is #P-complete). Deep Learning has created a resurgence of interest in generative models. Starting with the Restricted Boltzmann machine, spectacular results have been obtained with Variational Autoencoders and Generative Adversarial Networks. The talk will outline this progression of models and indicate some of our results with deep generative models.

Speaker Bio:
Sargur Srihari is a computer scientist whose work is on automated systems for pattern recognition and machine learning. The principal impact of his work has been on statistical methods, on the analysis and recognition of handwriting, and on computational methods for forensic impression evidence. He is currently a SUNY Distinguished Professor in the Department of Computer Science and Engineering at the University at Buffalo, The State University of New York, where he also holds adjunct professorships in the Department of Biostatistics and the Department of Electrical Engineering. He teaches courses in machine learning and probabilistic graphical models. With support from the United States Postal Service for over 20 years, he founded CEDAR, the Center of Excellence for Document Analysis and Recognition, in 1991, which had a major impact. His research led to: (i) the first large-scale handwritten address interpretation systems in the world (deployed by the IRS and USPS); (ii) post-Daubert court acceptance of handwriting testimony based on handwriting individuality assessment; (iii) a software system in use by forensic document examiners worldwide; (iv) statistical characterization of uncertainty in impression evidence; and (v) the first characterization of document image analysis as a sub-field of pattern recognition. Srihari's honors include the Outstanding Achievements Award of IAPR/ICDAR (Beijing, China, 2011), Distinguished Alumnus of the Ohio State University College of Engineering (1999), Fellow of the International Association for Pattern Recognition (1996), and Life Fellow of the Institute of Electrical and Electronics Engineers (IEEE).

Host Faculty: Prof. M. Narasimha Murty


Series: M.Sc. (Engg) Colloquium
Title: Handling Overloads with Social Consistency

  • Speaker: Ms. Priyanka Singla
                   M.Sc (Engg.) student
                   Dept. of CSA
  • Faculty Advisor: Prof. K Gopinath
  • Date and Time: Friday, June 22, 2018, 11:30 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Cloud computing applications have dynamic workloads, and they often observe spikes in incoming traffic that can overload the system. System overloads are generally handled by load-balancing techniques such as replication and data partitioning. These techniques are effective when the incoming bursty traffic is dominated by reads and by writes to partitionable data, but they become futile against bursts of writes to a single hot object. Further, to provide good performance, the systems that use these load-balancing techniques often adopt a variant of eventual consistency and do not provide strong guarantees to applications and programmers. In this work, we propose a new client-based consistency model, called social consistency, as a solution to this single-object overload problem. Along with handling overloads, the proposed model provides a stronger set of guarantees within subsets of socially related nodes, and eventual consistency across different subsets. We argue that this approach can, in practice, ensure reasonably good consistency among the clients together with a concomitant increase in performance.

We further describe the design of a prototype system, BLAST, which implements this model. It dynamically adjusts resource utilization in response to changes in the workload, thus ensuring nearly constant latency and a throughput that scales with the offered load. In particular, workload spikes for a single hot object are handled by cloning the object and partitioning the clients according to their social connectivity, binding the partitions to different clones, so that each partition has a unique view of the object. The clones and the client partitions are recombined when the spike subsides. We compare the performance of BLAST to the Cassandra database system, and our experiments show that BLAST handles 1.6X (with one split) and 2.4X (with three splits) more workload. We also evaluate BLAST against another load-balancing system and show that BLAST provides 37% better Quality of Experience (QoE) to the clients.
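The clone-and-recombine idea can be sketched for a hot counter object as follows; the class and method names are illustrative, not BLAST's actual interface:

```python
class SocialStore:
    """Toy clone-and-recombine store for one hot counter object.
    The split/merge policy shown is illustrative, not BLAST's API."""

    def __init__(self):
        self.value = 0        # the single hot object (a counter here)
        self.base = 0         # object value at the moment of the split
        self.clones = None    # group -> clone value, present only during a spike

    def split(self, groups):
        # Spike detected: give each social partition its own clone (view).
        self.base = self.value
        self.clones = {g: self.value for g in groups}

    def write(self, group, delta):
        if self.clones is None:
            self.value += delta
        else:
            # Writes stay strongly ordered within a social partition.
            self.clones[group] += delta

    def merge(self):
        # Spike subsided: recombine; counter deltas commute across clones.
        self.value = self.base + sum(v - self.base for v in self.clones.values())
        self.clones = None

store = SocialStore()
store.write("any", 3)              # normal operation
store.split(["friends", "work"])   # overload: two social partitions
store.write("friends", 2)
store.write("work", 5)
store.merge()
print(store.value)                 # 10
```

A counter is the easy case because its deltas commute; for general objects the merge step is where the model's per-partition guarantees versus cross-partition eventual consistency become visible.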


Series: Department Seminar
Title: Algorithms and Data Structures for Geometric Intersection Query Problems

  • Speaker: Dr. Rahul Saladi
                   Computer Science and Engineering
                   University of Illinois Urbana-Champaign
  • Date and Time: Thursday, June 21, 2018, 11:00 AM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
Geometric intersection queries (GIQ), also known as range-searching queries, have been studied extensively by the computational geometry and database communities. In a GIQ problem, the user is not interested in the entire input geometric dataset, but only in a small subset of it, and requests an informative summary of that subset. The goal is to organize the data into a data structure that occupies a small amount of space and yet responds to any user query in real time.

In this talk, I will give an overview of the contributions I have made, along with my co-authors, on some GIQ problems. The focus will be on (a) top-k queries, which are very popular in the database community, and (b) point location and rectangle stabbing queries, which are fundamental problems in the computational geometry community.
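The space/query-time trade-off at the heart of GIQ data structures is visible even in one dimension: sorting the input once lets any interval-counting query be answered in O(log n) time. A minimal sketch (not one of the structures discussed in the talk):

```python
import bisect

class RangeCounter:
    """Toy 1-D GIQ structure: O(n) space after an O(n log n) build,
    O(log n) time per query."""

    def __init__(self, points):
        self.pts = sorted(points)

    def count(self, lo, hi):
        # Number of input points inside the closed query interval [lo, hi].
        return (bisect.bisect_right(self.pts, hi)
                - bisect.bisect_left(self.pts, lo))

rc = RangeCounter([3, 1, 4, 1, 5, 9, 2, 6])
print(rc.count(2, 5))   # points 2, 3, 4, 5 -> 4
```

Higher-dimensional versions of such queries (top-k reporting, point location, rectangle stabbing) require substantially more sophisticated structures to keep both space and query time small.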

Speaker Bio:
Rahul Saladi received his Ph.D. from the University of Minnesota. He obtained his bachelor's and master's degrees in computer science from IIIT-Hyderabad. Rahul was awarded the Computer Science Fellowship in Fall 2011 and the Doctoral Dissertation Fellowship (DDF) for 2014-15. At the Fast Forward Preview Session at ACM SIGSPATIAL GIS 2012 he was awarded the first prize for his presentation. He has published his work in top theoretical computer science venues (SODA, SoCG) and top database venues (PODS, IEEE TKDE).

Host Faculty: Dr. Anand Louis


Series: M.Sc. (Engg) Thesis Defense
Title: Checking observational purity of procedures

  • Speaker: Mr. Himanshu Arora
                   M.Sc. (Engg.) student
                   Dept. of CSA
  • Faculty Advisor: Prof. K V Raghavan
  • Date and Time: Tuesday, June 19, 2018, 2:00 PM
  • Venue: CSA Seminar Hall (Room No. 254, First Floor)

Abstract
We provide two static analysis approaches (using theorem proving) that check whether a given (recursive) procedure behaves as if it were stateless, even when it maintains state in global variables; in other words, whether the procedure behaves like a mathematical function. To eliminate the need for manual annotations, we make use of an invariant built from uninterpreted function symbols. If the procedure is observationally pure, this invariant captures the set of reachable global states across all runs of the procedure; if it is not, the invariant has no semantics. Allowing function symbols makes it easier to generate the invariant automatically. The two static analyses are an existential checker and an impurity witness checker. The impurity witness checker outputs a formula whose unsatisfiability implies that the procedure is observationally pure. The existential checker, in contrast, outputs a formula that constrains the definition of the function that the given procedure may implement; satisfiability of this formula implies that the procedure is observationally pure. The impurity witness approach works better empirically with SMT solvers, whereas the existential approach is more precise on paper. We illustrate our work on examples such as matrix chain multiplication; examples such as these are not addressable by related techniques in the literature. The closest work to ours is by Barnett et al., which cannot handle procedures with self-recursion. We prove both our approaches sound. We have implemented the two static analyses using the Boogie framework and the Z3 SMT solver, and have evaluated our implementation on a number of examples.
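A standard example of the property being checked is a memoizing procedure: it mutates a global cache, yet every call returns a value determined solely by its argument. The sketch below is a hypothetical illustration of observational purity, not the authors' analysis:

```python
# A procedure that maintains state in a global variable (the cache) yet is
# observationally pure: its return value depends only on its argument,
# regardless of the cache contents left behind by earlier calls.
_cache = {}

def fib(n):
    if n in _cache:
        return _cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    _cache[n] = result
    return result

print(fib(10))   # 55, whether or not the cache was warm
```

A checker for observational purity must establish that every reachable cache state is consistent with a single mathematical function, which is precisely what the uninterpreted-function invariant is designed to express.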

Copyright: CSA, IISc 2018      Phone: +91-80-22932368      Fax: +91-80-23602911