Decision Making under Uncertainty: Reinforcement Learning Algorithms and Applications in Cloud Computing, Crowdsourcing and Predictive Analytics
Series: Ph.D (Engg.) Thesis Defence - ON-LINE
Speaker: Ms. Indu John, Ph.D (Engg.) Student, Dept. of CSA
Date/Time: Oct 14 11:00:00
Location: Microsoft Teams - ON-LINE
Faculty Advisor: Prof. Shalabh Bhatnagar
In this thesis, we study both theoretical and practical aspects of decision making, with a focus on reinforcement learning based methods. Reinforcement learning (RL) is a machine learning paradigm in which an agent learns a decision-making strategy by interacting with its environment. We develop novel reinforcement learning algorithms and study decision problems in the domains of cloud computing, crowdsourcing and predictive analytics.
In the first part of the thesis, we develop a model-free reinforcement learning algorithm with faster convergence, named Generalized Speedy Q-learning, and analyze its finite-time performance. This algorithm integrates ideas from the well-known Speedy Q-learning algorithm and the generalized Bellman equation to derive a simple and efficient update rule, whose finite-time bound improves on that of Speedy Q-learning for MDPs with a special structure. Further, we extend our algorithm to deal with large state and action spaces by using function approximation.
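For intuition, the Speedy Q-learning update that this work builds on can be sketched in its synchronous form on a toy MDP. The two-state MDP below and all names in the code are illustrative assumptions, not taken from the thesis; the generalized variant itself is not reproduced here.

```python
import numpy as np

# Toy deterministic 2-state, 2-action MDP (illustrative only):
# action 0 stays in place, action 1 moves to the other state; reward 1 in state 1.
NEXT = np.array([[0, 1], [1, 0]])        # NEXT[s, a] -> next state
R = np.array([[0.0, 0.0], [1.0, 1.0]])   # R[s, a]   -> immediate reward
GAMMA = 0.9

def bellman(Q):
    """Bellman optimality operator T for the toy MDP: (TQ)(s,a) = r + gamma * max_b Q(s',b)."""
    return R + GAMMA * Q.max(axis=1)[NEXT]

def speedy_q_iteration(num_iters=5000):
    """Synchronous Speedy Q-learning iterates:
    Q_{k+1} = Q_k + a_k (T Q_{k-1} - Q_k) + (1 - a_k)(T Q_k - T Q_{k-1}),
    with step size a_k = 1 / (k + 1)."""
    Q_prev = np.zeros((2, 2))
    Q = np.zeros((2, 2))
    for k in range(num_iters):
        a = 1.0 / (k + 1)
        TQ_prev, TQ = bellman(Q_prev), bellman(Q)
        Q_prev, Q = Q, Q + a * (TQ_prev - Q) + (1 - a) * (TQ - TQ_prev)
    return Q

Q = speedy_q_iteration()
print(np.round(Q, 2))
```

For this toy problem the optimal values can be computed by hand (V(0) = 9, V(1) = 10 under gamma = 0.9), so the iterates can be checked directly against the known fixed point.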
Extending the idea behind this algorithm, we develop a novel deep reinforcement learning algorithm by combining the technique of successive over-relaxation (SOR) with Deep Q-Networks. The new algorithm, named SOR-DQN, uses modified targets in the DQN framework with the aim of accelerating training. We study the application of SOR-DQN to the problem of auto-scaling resources for cloud applications, for which existing algorithms suffer from issues such as slow convergence, poor performance during the training phase and lack of scalability.
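To make the idea of a modified target concrete, the sketch below computes batched over-relaxed targets, assuming an SOR-style Bellman operator of the form y = w * (r + gamma * max_a Q(s', a)) + (1 - w) * max_a Q(s, a); this functional form, the function name and its parameters are assumptions for illustration, and the exact operator used by SOR-DQN may differ. Setting w = 1 recovers the standard DQN target.

```python
import numpy as np

def sor_targets(rewards, q_next, q_curr, gamma=0.99, w=1.2, dones=None):
    """Batched over-relaxed DQN-style targets from target-network Q-values.

    rewards : (B,)   rewards r_t
    q_next  : (B, A) target-network Q-values at s_{t+1}
    q_curr  : (B, A) target-network Q-values at s_t
    dones   : (B,)   optional terminal flags; terminal transitions do not bootstrap
    """
    boot = gamma * q_next.max(axis=1)
    if dones is not None:
        boot = boot * (1.0 - dones)
    # Mix the ordinary one-step target with the current state's value,
    # weighted by the relaxation factor w.
    return w * (rewards + boot) + (1.0 - w) * q_curr.max(axis=1)
```

In a DQN training loop, these targets would replace r + gamma * max_a Q_target(s', a) in the squared-error loss, leaving the rest of the framework unchanged.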
Next, we consider an interesting research problem in the domain of crowdsourcing: that of efficiently allocating a fixed budget among a set of tasks with varying difficulty levels. We also tackle the assignment of tasks to workers with different skill levels. This problem is modeled in the RL framework, and an approximate solution is proposed to deal with the exploding state space.
We also study the following problem in predictive analytics: predicting the future values of system parameters well in advance for a large-scale software or industrial system, which is important for avoiding disruptions. An equally challenging and useful exercise is to identify the 'important' parameters and optimize them in order to attain good system performance. In addition to devising an end-to-end solution for the problem, we present a case study on a large-scale enterprise system to validate the effectiveness of the proposed approach.
Link to the Online Thesis Defense: