Publications (All)
Book Chapters
A.G.Joseph and S.Bhatnagar, An Incremental Fast Policy Search Using a Single Sample Path, Shankar B.,
Ghosh K., Mandal D., Ray S., Zhang D., Pal S. (Eds) Pattern Recognition and Machine Intelligence,
Lecture Notes in Computer Science, vol 10597. Springer, 2017 online pdf
S.Bhatnagar, V.S.Borkar and Prashanth L.A., Adaptive feature pursuit: Online adaptation of features in
reinforcement learning, Reinforcement Learning and Approximate Dynamic
Programming for Feedback Control (Ed. F. Lewis and D. Liu), IEEE Press Computational Intelligence Series (jointly published
by IEEE Press and Wiley), Chapter 23, pp. 517–534, 2013 online pdf.
S.Bhatnagar, Simultaneous perturbation and finite difference methods, Wiley Encyclopedia of Operations Research and Management Science (Ed. J. Cochran), Vol. 7, pp. 4969–4991, Wiley, Hoboken, NJ, 2011 online pdf.
P.Viswanath, M.N.Murty and S.Bhatnagar, Pattern synthesis for nonparametric pattern recognition, Encyclopedia of Data Warehousing and Mining, second edition, Ed. J. Wang, Montclair State University, USA, Published by Idea group inc.,USA, 2008.
V.Sudha, L.Gopal, V.Sridhar and S.Bhatnagar, Fuzzy clustering based Ad recommendation for TV programs, Interactive TV: A Shared Experience, Eds. P.Cesar, K.Chorianopoulos and J.F.Jensen, Springer, pp.175–184, 2007.
P.Viswanath, M.N.Murty and S.Bhatnagar, Pattern synthesis for large-scale pattern recognition, Encyclopedia of Data Warehousing and Mining, Ed. J. Wang, Montclair State University, USA, Published by Idea group inc.,USA, 2005, pp. 902–905.
S.Bhatnagar, M.Fu and S.I.Marcus, Two timescale SPSA algorithms for rate-based ABR flow control, Chapter 27, System Theory: Modeling, Analysis and Control, Ed. T.Djaferis and I.Schick, Kluwer Academic, Cambridge, Massachusetts, pp.367–378, 1999.
Journal Papers
A.Ramaswamy and S.Bhatnagar, Analyzing approximate value iteration algorithms, Mathematics of Operations Research,
Vol.47, No.3, pp. 2138–2159, 2022 online pdf arXiv
D.R.Bharadwaj, Chandramouli K., and S.Bhatnagar, A generalized minimax Q-learning algorithm for two-player zero-sum stochastic games, IEEE Transactions on Automatic Control, Vol. 67, No. 9, pp. 4816–4823, 2022 online pdf arXiv
Chandramouli K., D.R.Bharadwaj, and S.Bhatnagar, Generalized Second Order Value Iteration in Markov Decision Processes,
IEEE Transactions on Automatic Control, Vol. 67, Issue 8, pp. 4241–4247, 2022 online pdf
arXiv
P.Karmakar and S.Bhatnagar, Stochastic approximation with iterate-dependent Markov noise under
verifiable conditions in compact state space with the stability of
iterates not ensured, IEEE Transactions on Automatic Control, Vol.66, Issue 12, pp. 5941–5954, Dec 2021 online pdf arXiv
A.Ramaswamy, S.Bhatnagar and D.Quevedo, Asynchronous stochastic approximations with asymptotically biased
errors and deep multi-agent learning, IEEE Transactions on Automatic Control, Vol. 66, Issue 9, pp. 3969–3983, Sep 2021 online pdf
arXiv
K.J.Prabuchandran, S.Penubothula, Chandramouli K., and S.Bhatnagar, Novel first order Bayesian optimization with an application
to reinforcement learning, Applied Intelligence, Springer, Vol. 51, pp. 1565–1579, 2021 online pdf
P.Karmakar and S.Bhatnagar, On tight bounds for function approximation error in risk-sensitive reinforcement learning,
Systems and Control Letters, Vol. 150, 104899:1–7, April 2021 online pdf
A.Singla, Sindhu P.R., and S.Bhatnagar, Memory-based Deep Reinforcement Learning for Obstacle Avoidance in UAV with Limited Environment Knowledge,
IEEE Transactions on Intelligent Transportation Systems, Vol.22, No.1, pp.107–118, January 2021 online pdf
arXiv
V.G.Yaji and S.Bhatnagar, Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-dependent Markov Noise,
Mathematics of Operations Research, Vol. 45, No.4, pp. 1405–1444, November 2020 online pdf
arXiv
Sindhu P.R., Prabuchandran K.J., and S.Bhatnagar, Reinforcement Learning Algorithm for Non-Stationary
Environments, Applied Intelligence, Springer, Vol.50, pp.3590–3606, 2020 online pdf
arXiv
I.John, Chandramouli K., and S.Bhatnagar, Generalized Speedy Q-learning, IEEE Control Systems Letters,
Vol.4, Issue 3, July 2020
online pdf
Prashanth L.A., S.Bhatnagar, N.Bhavsar, M.Fu and S.Marcus, Random directions
stochastic approximation with deterministic perturbations, IEEE Transactions on Automatic
Control, Vol. 65, Issue 6, pp. 2450–2465, June 2020 online pdf
arXiv
V.G.Yaji and S.Bhatnagar, Analysis of Stochastic Approximation Schemes with Set-valued Maps in
the Absence of a Stability Guarantee and their Stabilization, IEEE Transactions on Automatic Control, Vol. 65, Issue 3,
pp. 1100–1115, March 2020 online pdf
arXiv
Chandramouli K., D.R.Bharadwaj and S.Bhatnagar, Successive Over-Relaxation Q-Learning,
IEEE Control Systems Letters (LCSS), Vol. 4, Issue 1, pp. 55–60, Jan 2020
online pdf arXiv
Chandramouli K., D.R.Bharadwaj, Prabuchandran K.J., and S.Bhatnagar, An Online Sample Based Method for Mode Estimation using ODE Analysis
of Stochastic Approximation Algorithms, IEEE Control Systems Letters (LCSS), Vol. 3, Issue 3, pp. 697–702,
July 2019 online pdf arXiv
A.Ramaswamy and S.Bhatnagar, Stability of Stochastic Approximations with ‘Controlled Markov’ Noise and Temporal Difference Learning, IEEE
Transactions on Automatic Control, Vol. 64, Issue 6, pp. 2614–2620, June 2019 online pdf
A.G.Joseph and S.Bhatnagar, An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method,
Machine Learning, Vol. 107, Issue 8–10, pp.1385–1429, 2018 online pdf
arXiv
D.R.Bharadwaj, K.J.Prabuchandran, and S.Bhatnagar, Novel sensor scheduling scheme for intruder tracking in energy efficient sensor networks, IEEE Wireless Communication Letters, Vol. 7, Issue 5, pp. 712–715, Oct 2018
online pdf
A.G.Joseph and S.Bhatnagar, An incremental off-policy search in a model-free Markov decision
process using a single sample path, Machine Learning, Vol.107, Issue 6, pp. 969–1011, 2018 online pdf
arXiv
A.Ramaswamy and S.Bhatnagar, Analysis of Gradient Descent Methods with Non-Diminishing, Bounded Errors, IEEE Transactions on Automatic Control, Vol. 63, Issue 5, pp.1465–1471, 2018 online pdf arXiv
Chandrashekar L., S.Bhatnagar, and C.Szepesvari, A Linearly Relaxed Approximate Linear Program for Markov Decision Processes, IEEE Transactions on Automatic Control, Vol. 63, Issue 4, pp. 1185–1191, 2018
online pdf arXiv
S.Bhatnagar, S.Patel, and Karmeshu, A Stochastic Approximation Approach to Active Queue Management,
Telecommunication Systems (Springer), Vol.68, No.1, pp.89–104, 2018 online pdf
V.G.Yaji and S.Bhatnagar, Stochastic Recursive Inclusions with Non-Additive Iterate-Dependent Markov Noise,
Stochastics, Vol. 90, No. 3, pp. 330–363, 2018 online pdf
arXiv
E.Zhou and S.Bhatnagar, Gradient-based Adaptive Stochastic Search for Simulation Optimization over Continuous Space,
INFORMS Journal on Computing, Vol. 30, No. 1, pp. 154–167, 2018 online pdf
P.Karmakar and S.Bhatnagar, Two Timescale Stochastic Approximation with Controlled Markov noise and
Off-policy Temporal Difference Learning, Mathematics of Operations Research,
Vol. 43, No.1, pp. 130–151, 2018 online pdf arXiv
Chandrashekar L. and S.Bhatnagar, A Stability Criterion
for Two Timescale Stochastic Approximation Schemes, Automatica, Vol.79, pp.108–114, May 2017 online pdf
L.A.Prashanth, S.Bhatnagar, M.Fu, and S.Marcus, Adaptive system optimization using random directions stochastic approximation, IEEE Transactions on Automatic Control, Vol. 62, Issue 5, pp.2223–2238, 2017
online pdf
arXiv
A.Ramaswamy and S.Bhatnagar, A generalization of the Borkar-Meyn theorem for stochastic recursive inclusions, Mathematics of Operations Research, Vol. 42, No. 3, pp. 648–661, 2017 online pdf arXiv
A.Ramaswamy and S.Bhatnagar, Stochastic recursive inclusion
in two timescales with an application to the Lagrangian dual problem, Stochastics, Vol.88, No.8, pp.1173–1187, 2016 online pdf, arXiv
Lakshmanan K. and S.Bhatnagar, Quasi-Newton smoothed functional algorithms
for unconstrained and constrained simulation optimization, Computational
Optimization and Applications (Springer), Vol.66, No.3, pp.533–556, 2017 online pdf
Karmeshu, S.Patel, and S.Bhatnagar, Adaptive mean queue size and its rate of change: queue management with random dropping, Telecommunication Systems (Springer), Vol.65, Issue 2, pp.281–295, 2017 online pdf
L.A.Prashanth, H.L.Prasad, S.Bhatnagar and P.Chandra, A constrained optimization perspective on actor critic algorithms and application to network routing, Systems and Control Letters, Vol.92, pp.46–51, 2016 online pdf
Prabuchandran K.J., S.Bhatnagar and V.S.Borkar, Actor Critic
Algorithms with Online Feature Adaptation, ACM Transactions on Modeling and Computer Simulation, Vol.26, No.4, pp.24:1–24:26, 2016
online pdf
M.S.Abdulla and S.Bhatnagar, Multi-armed bandits based on a
variant of simulated annealing, Indian Journal of Pure and Applied Mathematics (Springer),
Special Issue in Honour of Prof.Vivek Borkar's 60th Birthday, Vol.47, Issue 2, pp.195–212, 2016
online pdf
S.Bhatnagar and Lakshmanan K., Multiscale Q-learning with Linear
Function Approximation, Discrete Event Dynamic Systems, Vol.26, Issue 3, pp.477–509, 2016
online pdf
V.G.Yaji and S.Bhatnagar, Necessary and sufficient
conditions for optimality in constrained general sum stochastic games, Systems and Control Letters, Vol. 85, pp.8–15, 2015 online pdf
Sindhu P.R., Prabuchandran K.J., and S.Bhatnagar,
Energy sharing for multiple sensor nodes with finite buffers, IEEE Transactions
on Communications, Vol.63, No.5, pp.1811–1823, 2015 online pdf
S.Bhatnagar and Prashanth L.A., Simultaneous Perturbation Newton Algorithms for Simulation Optimization, Journal of Optimization Theory and Applications (Springer), Vol. 164, Issue 2, pp.621–643,
2015 online pdf
Prashanth L.A., H.L.Prasad, N.Desai, S.Bhatnagar and G.Dasgupta, Simultaneous perturbation methods for adaptive labor staffing in service systems, Simulation, Vol. 91, No. 5, pp.432–455, 2015 online pdf
Prashanth L.A., A.Chatterjee and S.Bhatnagar, Two timescale convergent Q-learning for sleep–scheduling in wireless sensor networks, Wireless Networks (Springer), Vol. 20, Issue 8, pp.2589–2604, 2014
online pdf
D.Ghoshdastidar, A.Dukkipati and S.Bhatnagar, Newton based stochastic optimization using q-Gaussian smoothed functional algorithms, Automatica (Elsevier), Vol. 50, No.10, pp.2606–2614, 2014
online pdf
D.Ghoshdastidar, A.Dukkipati and S.Bhatnagar, Smoothed functional algorithms for stochastic optimization using q-Gaussian distributions, ACM Transactions on Modeling and Computer Simulation,
Vol. 24, No. 3, pp.17:1–17:26, 2014 online pdf
S. Chakravarty, Sindhu P.R. and S. Bhatnagar, A simulation based algorithm for optimal pricing policy under demand uncertainty, International Transactions in Operational Research (Wiley), Vol.21, Issue 5, pp.737–760, 2014
online pdf
S.Bhatnagar, Smoothed functional algorithms for optimization, Annals of the Indian National Academy of Engineering (INAE), Vol.XI, pp.95–105, April 2014.
S.Bhatnagar, V.S.Borkar and Prabuchandran K.J., Feature search in the Grassmanian in online
reinforcement learning, IEEE Journal of Selected Topics in Signal Processing, Vol.7, No.5, pp.746–758, 2013 online pdf
Prabuchandran K.J., S.K.Meena and S.Bhatnagar, Q-learning based
energy management policies for a single sensor node with finite buffer,
IEEE Wireless Communication Letters, Vol.2, Issue 1, pp.82–85, 2013
online pdf.
H.L.Prasad, L.A.Prashanth, S.Bhatnagar and N.Desai, Adaptive
Smoothed Functional Algorithms for Optimal Staffing Levels in Service
Systems, Service Science (INFORMS), Vol. 5, Issue 1, pp.29–55, March 2013
online pdf.
L.A.Prashanth and S.Bhatnagar, Threshold tuning using stochastic
optimization for graded signal control, IEEE Transactions on Vehicular
Technology, Vol. 61, No. 9, pp.3865–3880, November 2012 online pdf.
H.L.Prasad and S.Bhatnagar, General-Sum Stochastic Games:
Verifiability Conditions for Nash Equilibria, Automatica,
Vol. 48, Issue 11, pp.2923–2930, 2012 online pdf.
K.R.Vemu, S.Bhatnagar and N.Hemachandra, Optimal Multilayered Congestion Based Pricing Schemes for Enhanced QoS, Computer Networks (Elsevier), Vol.56, Issue 4, pp.1249–1262, March 2012. (DOI: 10.1016/j.comnet.2011.12.004)
S.Bhatnagar and Lakshmanan K., An Online Actor–Critic Algorithm with Function Approximation for
Constrained Markov Decision Processes, Journal of Optimization Theory and Applications (Springer), Vol. 153, No. 3, pp.688–708, 2012. (DOI: 10.1007/s10957-012-9989-5)
S.Bhatnagar, V.Mishra and N.Hemachandra, Stochastic algorithms for discrete parameter simulation optimization, IEEE Transactions on Automation Science and Engineering, Vol. 8, Issue 4, pp. 780–793, 2011. (DOI: 10.1109/TASE.2011.2159375)
Karmeshu, S.Bhatnagar and V.Mishra, An optimized SDE model for slotted Aloha, IEEE Transactions on Communications, Vol. 59, No. 6, pp.1502–1508, 2011. (DOI: 10.1109/TCOMM.2011.09.090113)
L.A.Prashanth and S.Bhatnagar, Reinforcement learning with function approximation for traffic signal control, IEEE Transactions on Intelligent Transportation Systems, Vol. 12, No. 2, pp.412–421, 2011. (DOI: 10.1109/TITS.2010.2091408)
S.Bhatnagar, The Borkar-Meyn Theorem for Asynchronous Stochastic Approximations, Systems and Control Letters, Vol. 60, pp. 472–478, 2011. (DOI: 10.1016/j.sysconle.2011.04.002)
S.Bhatnagar and Karmeshu, Monte-Carlo Estimation of Time-Dependent Statistical Characteristics of Random Dynamical Systems, Applied Mathematical Modelling (Elsevier), Vol.35, pp.3063–3079, 2011. (DOI: 10.1016/j.apm.2010.12.024).
S.Bhatnagar, N.Hemachandra and V.Mishra, Stochastic approximation algorithms for constrained optimization via simulation, ACM Transactions on Modeling and Computer Simulation, Vol. 21, Issue 3, pp.15:1–15:22, 2011.
S.Bhatnagar, An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes, Systems and Control Letters, Vol. 59, pp.760–766, 2010 (DOI: 10.1016/j.sysconle.2010.08.013).
G.R.Reddy, S.Bhatnagar, V.Rakesh and V.P.Chaturvedi, An efficient algorithm for scheduling in bluetooth piconets and scatternets, Wireless Networks (Springer), Vol.16, No.7, pp.1799–1816, 2010 (DOI: 10.1007/s11276-009-0229-3).
A.Chakraborty and S.Bhatnagar, Optimized policies for the retransmission probabilities in slotted Aloha, Simulation, Vol.86, No.4, pp.247–261, 2010.
S.Bhatnagar and R.K.Patro, A proof of convergence of the BRED and PRED algorithms for random early detection, IEEE Communication Letters, Vol.13, No.10, pp.809–811, 2009.
S.Bhatnagar, R.S.Sutton, M.Ghavamzadeh and M.Lee, Natural Actor-Critic Algorithms, Automatica, Vol.45, Issue 11, pp.2471–2482, 2009.
S.Bhatnagar, Karmeshu and V.Mishra, Optimal Parameter Trajectory Estimation in Parameterized SDEs: An Algorithmic Procedure, ACM Transactions on Modeling and Computer Simulation (TOMACS), Vol. 19, No. 2, pp. 8:1–8:27, 2009.
R.K.Patro and S.Bhatnagar, A Probabilistic Constrained Nonlinear Optimization Framework to Optimize RED parameters, Performance Evaluation, Vol. 66, Issue 2, pp.81–104, 2009.
S.Bhatnagar and M.S.Abdulla, Simulation-based optimization algorithms for finite horizon Markov decision processes, Simulation, Vol. 84, No. 12, pp. 577–600, 2008.
V.Sudha, L.Gopal, S.Bhatnagar and V.Sridhar, A novel Ad-recommendation system for TV programs, Springer/ACM Multimedia Systems Journal, Vol.14, No.2, pp.73–87, 2008.
C. Vignat and S. Bhatnagar, An extension of Wick's Theorem, Statistics and Probability Letters, Vol.78, Issue 15, pp.2404–2407, 2008.
S.Bhatnagar and K.M.Babu, New algorithms of the Q-learning type, Automatica, Vol.44, Issue 4, pp.1111–1119, 2008.
S.Bhatnagar, An adaptive multivariate three-timescale smoothed functional algorithm for simulation optimization, ACM Transactions on Modeling and Computer Simulation, Vol.18, No.1, pp.2:1–2:35, December 2007.
A.Dukkipati, S.Bhatnagar and M.N.Murty, Gelfand-Yaglom-Perez theorem for generalized relative entropy functionals, Information Sciences, Vol.177, pp.5707–5714, 2007.
A.Dukkipati, S.Bhatnagar and M.N.Murty, On measure-theoretic aspects of nonextensive entropy functionals and corresponding maximum entropy prescriptions, Physica A, Vol.384, pp.758–774, 2007.
M.S.Abdulla and S.Bhatnagar, Reinforcement learning based algorithms for average cost Markov decision processes, Discrete Event Dynamic Systems, Vol.17, No.1, pp.23–52, 2007.
S.Bhatnagar, V.S.Borkar and A.Madhukar, A simulation based algorithm for ergodic control of Markov chains conditioned on rare events, Journal of Machine Learning Research, Vol.7, pp.1937–1962, 2006.
R.Vaidya and S.Bhatnagar, Robust optimization of random early detection, Telecommunication Systems, Vol.33, No.4, pp.291–316, 2006.
P.Viswanath, M.N.Murty and S.Bhatnagar, Partition based pattern synthesis technique with efficient algorithms for nearest neighbor classification, Pattern Recognition Letters, Vol.27, pp.1714–1724, 2006.
S.Bhatnagar and J.R.Panigrahi, Actor-critic algorithms for hierarchical Markov decision processes, Automatica, Vol.42, Issue 4, pp.637–644, 2006.
A.Dukkipati, M.N.Murty and S.Bhatnagar, Nonextensive triangle equality and other properties of Tsallis relative-entropy minimization, Physica A, Vol.361, pp.124–138, 2006.
S.Bhatnagar and H.J.Kowshik, A discrete parameter stochastic approximation algorithm for simulation optimization, Simulation: Transactions of the Society for Modeling and Simulation International, Vol.81, No.11, pp.757–772, 2005.
P.Viswanath, M.N.Murty and S.Bhatnagar, Overlap pattern synthesis with an efficient nearest neighbour classifier, Pattern Recognition, Vol.38, pp.1187–1195, 2005.
S.Bhatnagar and I.B.B.Reddy, Optimal threshold policies for admission control in communication networks via discrete parameter stochastic approximation, Telecommunication Systems, Vol.29, No.1, pp.9–31, 2005.
S.Bhatnagar, Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization, ACM Transactions on Modeling and Computer Simulation (TOMACS), Vol.15, No.1, pp.74–107, January 2005.
P.Viswanath, M.N.Murty and S.Bhatnagar, Fusion of multiple approximate nearest neighbor classifiers for fast and efficient classification, Information Fusion, Vol. 5, pp.239–250, 2004.
S.Bhatnagar and S.Kumar, A simultaneous perturbation stochastic approximation based actor-critic algorithm for Markov decision processes, IEEE Transactions on Automatic Control, Vol. 49, Number 4, pp.592–598, April 2004.
S.Bhatnagar and V.S.Borkar, Multiscale chaotic SPSA and smoothed functional algorithms for simulation optimization, Simulation: Transactions of the Society for Modeling and Simulation International, Vol. 79, Issue 10, pp.568–580, 2003.
S.Bhatnagar, M.C.Fu, S.I.Marcus and I-J.Wang, Two timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences, ACM Transactions on Modelling and Computer Simulation, Vol. 13, No. 2, pp.180–209, 2003.
X.R.Cao, R.Zhiyuan, S.Bhatnagar, M.C.Fu and S.I.Marcus, A time aggregation approach to Markov decision processes, Automatica, Vol. 38, No. 6, pp.929–943, 2002.
S.Bhatnagar, M.C.Fu, S.I.Marcus and P.J.Fard, Optimal structured feedback policies for ABR flow control using two timescale SPSA, IEEE/ACM Transactions on Networking, Vol.9, No.4, pp.479–491, 2001.
S.Bhatnagar, M.C.Fu, S.I.Marcus and S.Bhatnagar, Two timescale algorithms for simulation optimization of hidden Markov models, IIE Transactions (Pritsker special issue on simulation), Vol.3, pp.245–258, 2001.
S.Bhatnagar and V.S.Borkar, A two time scale stochastic approximation scheme for simulation based parametric optimization, Probability in the Engineering and Informational Sciences, Vol.12, pp.519–531, 1998.
V.H.Gupta and S.Bhatnagar, An optimal fuel-injection policy for performance enhancement in internal combustion (I.C.) engines, Sadhana (Indian Academy of Sciences), Vol.22, Part 4, pp.545–552, 1997.
S.Bhatnagar and V.S.Borkar, Multiscale stochastic approximation for parametric optimization of hidden Markov models, Probability in the Engineering and Informational Sciences, Vol.11, pp.509–522, 1997.
S.Bhatnagar and V.S.Borkar, A convex analytic framework for ergodic control of semi-Markov processes, Mathematics of Operations Research, Vol.20, No.4, pp.923–936, 1995.
Preprints Submitted to journals
Our recent papers on arXiv can be found here
Proceedings of International Conferences
S.Bhatnagar and L.A.Prashanth, Generalized simultaneous perturbation stochastic approximation with reduced estimator bias,
57th Annual Conference on Information Sciences and Systems (CISS), Invited Paper, Baltimore, Maryland, March 22–24, 2023
A.K.Jayant and S.Bhatnagar, Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm, NeurIPS 2022, New Orleans, Louisiana, USA, Nov 28 to Dec 04, 2022 arXiv
R.Deb, M.Gandhi, and S.Bhatnagar, Schedule Based Temporal Difference Algorithms, 58th Annual Allerton Conference on Communication, Control, and Computing (IEEE), Monticello, Illinois, USA, Sep 27 to 30, 2022 Online PDF (Invited Paper)
Sindhu P.R., Prabuchandran K.J., S. Ganguly, and S.Bhatnagar, Data Efficient Safe Reinforcement Learning, IEEE International Conference on Systems, Man, and Cybernetics (SMC), Prague, Czech Republic, October 9–12, 2022
D.R.Bharadwaj, P.Jain, Prabuchandran K.J., and S.Bhatnagar, Neural network compatible off-policy natural actor-critic algorithm,
International Joint Conference on Neural Networks (IJCNN), Padova, Italy, July 18–23, 2022 arXiv (Best Student Paper Award)
U.A.Mishra, S.R.Samineni, P.Goel, C.Kunjeti, H.Lodha, A.Singh, A.Sagi, S.Bhatnagar and S.Kolathaya, Dynamic mirror descent based model predictive control for accelerating robot learning, IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, May 23–27, 2022 arXiv
R.Deb and S.Bhatnagar, Gradient Temporal Difference with Momentum: Stability and Convergence, AAAI Conference on Artificial Intelligence,
Vancouver, Feb 22 – Mar 01, 2022 arXiv
Priya S. and S.Bhatnagar, Robust traffic signal timing control using multi-agent twin delayed deep deterministic
policy gradients, 14th International Conference on Agents and Artificial Intelligence (ICAART), Online, Feb 3–5, 2022
P.Parnika, D.R.Bharadwaj, D.S.K.Reddy and S.Bhatnagar, Attention Actor-Critic algorithm for Multi-Agent Constrained Cooperative
Reinforcement Learning, AAMAS (Extended Abstract), Virtual Event, May 3–7, 2021
K.Paigwar, L.Krishna, S.Tirumala, N.Khetan, A.Sagi, A.Joglekar, S.Bhatnagar, A.Ghosal, B.Amrutur, and S.Kolathaya,
Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach, Conference on Robot Learning
(CoRL), Virtual Event, November 16–18, 2020
S.Tirumala, S.G.Venkatesh, K.Paigwar, A.V.Sagi, A.Joglekar, S.Bhatnagar, B.Amrutur and S.N.Y.Kolathaya, Learning Stable
Manoeuvres for Quadruped Robots from Expert Demonstrations, 29th IEEE International Conference on Robot and Human Interactive
Communication (RoMan), Naples, Italy, Aug.31–Sep.04, 2020
S.Nayak, C.A.Ekbote, A.P.S.Chauhan, D.R.Bharadwaj, P.Ray, A.Sikdar, D.S.K.Reddy, and S.Bhatnagar, Stochastic
Game Framework for Efficient Energy Management in Microgrid Networks, IEEE PES Innovative Smart Grid Technologies Conference, The Hague, Netherlands, Oct. 25–28, 2020 arXiv
Sindhu P.R., S. Rao, and S.Bhatnagar, Learning-Based Resource Allocation in Industrial IoT Systems,
IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, London, UK,
Aug. 31 – Sep. 3, 2020
I.John and S.Bhatnagar, Deep Reinforcement Learning with Successive Over-Relaxation and its Application
in Autoscaling Cloud Resources, International Joint Conference on Neural Networks (IJCNN), Glasgow, UK,
July 19–24, 2020
D.R.Bharadwaj, Chandramouli K., and S.Bhatnagar, A convergent off-policy temporal difference algorithm, European Conference on Artificial Intelligence (ECAI), Santiago de Compostela, Spain, June 8–12, 2020
I.John, R.Karumanchi and S.Bhatnagar, Predictive and prescriptive analytics for performance optimization: framework and a case study on a
large-scale enterprise system, IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, Florida, Dec 16–19, 2019
A.Dharmavaram, M.Riemer and S.Bhatnagar, Hierarchical average reward policy gradient algorithms, AAAI-20 Student Abstract and Poster Program,
(to appear in Proceedings of AAAI 2020), New York, Feb 7–12, 2020
A.G.Joseph and S.Bhatnagar, An incremental algorithm for estimating extreme quantiles, Indian Control
Conference (IEEE), pp. 286–291, Hyderabad, Dec. 18–20, 2019
online pdf
A.G.Joseph and S.Bhatnagar, An Adaptive and Incremental Approach to Quantile Estimation, IEEE Conference on Decision and Control,
Nice, France, Dec 11–13, 2019 (to appear in Proceedings of CDC, to be published in IEEE XPlore)
Chandramouli K., D.R.Bharadwaj, Prabuchandran K.J., and S.Bhatnagar, An Online Sample Based Method for Mode Estimation using ODE Analysis
of Stochastic Approximation Algorithms, IEEE Conference on Decision and Control,
Nice, France, Dec 11–13, 2019 (to appear in IEEE Control Systems Letters) online pdf
arXiv
S.Kolathaya, D.Dholakiya, S.Bhatnagar, A.Singla, S.Bhattacharya, A.Ghosal,
B.Amrutur, A.Singh, A.Joglekar, A.V.Sagi, S.Shetty, and A.Gunalan, Trajectory Based Deep Policy
Search for Quadrupedal Walking, 28th IEEE International Conference on Robot and Human Interactive
Communication (RoMan 2019), New Delhi, Oct 14–18, 2019
S.Bhattacharya, A.Singla, A.Singh, D.Dholakiya, S.Bhatnagar, B.Amrutur, A.Ghosal,
and S.Kolathaya,
Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots,
28th IEEE International Conference on Robot and Human Interactive
Communication (RoMan 2019), New Delhi, Oct 14–18, 2019 arXiv
A.G.Joseph and S.Bhatnagar, Stochastic approximation trackers for model-based search, 57th Annual Allerton Conference on Communication,
Control and Computing, Urbana, Illinois, Sep 24–27, 2019
A.Singla, S.Bhattacharya, D.Dholakiya, S.Bhatnagar, A.Ghosal, B.Amrutur and S.Kolathaya, Realizing Learned Quadruped Locomotion
Behaviors through Kinematic Motion Primitives, IEEE International Conference on Robotics and Automation (ICRA), Montreal, 2019 (Accepted)
arXiv
D.R.Bharadwaj, D.S.K.Reddy, K.J.Prabuchandran and S.Bhatnagar, Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning,
18th International Conference on Autonomous Agents and Multiagent Systems, Montreal, pp.1931–1933, 2019
online pdf
D.Dholakia, S.Bhattacharya, A.Gunalan, A.Singla, S.Bhatnagar, B.Amrutur, S.Kolathaya and A.Ghosal,
Design, Development and Experimental Realization of a Quadrupedal Research Platform: Stoch, 5th
International Conference on Control, Automation and Robotics (ICCAR), Beijing, April 19–22, 2019 (Accepted)
arXiv
Indu John and S.Bhatnagar, Efficient budget allocation and task assignment in crowdsourcing, CoDS-COMAD, pp.318–321, Kolkata, Jan 3–5, 2019
(Special Mention Award in the Young Researchers’ Symposium to Indu John) online pdf
A.G.Joseph and S.Bhatnagar, An Adaptive Sampling Algorithm for Policy Evaluation, Fifth IEEE Indian Control Conference (ICC), IIT Delhi, pp.2–9, Jan 9–11, 2019 online pdf
N.Karanjkar, M.Desai and S.Bhatnagar, A simulation-based technique for continuous-space embedding of
discrete-parameter queueing systems, European Simulation and Modeling Conference, Ghent, Belgium, Oct 24–26, 2018 (accepted)
D.R.Bharadwaj, D.S.K.Reddy, K.Narayanam and S.Bhatnagar, A unified decision making framework for supply and demand management
in microgrid networks, IEEE SmartGridComm, Aalborg, Denmark, Oct 29–Nov 1, 2018 online pdf
Chandramouli K., Prabuchandran K.J., D.S.K.Reddy and S.Bhatnagar, Generalized Deterministic Perturbations For Stochastic
Gradient Search, IEEE Conference on Decision and Control, pp. 5734–5739, Fontainebleau, Miami Beach, FL, USA, December 17–19, 2018,
online pdf
S.Kumar, Sindhu P.R., Chandrashekar L., P.Parihar, K.Gopinath and S.Bhatnagar, Scalable Performance Tuning of Hadoop
MapReduce: A Noisy Gradient Approach, IEEE Cloud, Honolulu, Hawaii, June 25–30, 2017
A.G.Joseph and S.Bhatnagar, A Model based Search Method for Prediction in Model-free
Markov Decision Process, Proceedings of International Joint Conference on Neural Networks (IJCNN), Anchorage, Alaska, May 14–19, 2017
A.G.Joseph and S.Bhatnagar, Bounds for Off-policy Prediction in Reinforcement Learning, Proceedings of International Joint Conference on Neural
Networks (IJCNN), Anchorage, Alaska, May 14–19, 2017
D.Saikoti Reddy, L.A.Prashanth, and S.Bhatnagar, Improved Hessian estimation for adaptive random directions stochastic approximation, Proceedings of IEEE Conference on Decision
and Control (CDC), Las Vegas, NV, Dec 12–14, 2016
A.G.Joseph and S.Bhatnagar, Revisiting the Cross Entropy Method with Applications in Stochastic Global Optimization and RL (Full Paper), Proceedings of European Conference on Artificial Intelligence (ECAI), The Hague, Netherlands, Aug.29–Sep.02, 2016
R.K.Maity, Chandrashekar L., Sindhu P.R., and S.Bhatnagar, Shaping Proto-Value Functions using Rewards (Short Paper), Proceedings of European Conference on Artificial Intelligence (ECAI), The Hague, Netherlands, Aug.29–Sep.02, 2016
A.G.Joseph and S.Bhatnagar, A Randomized Algorithm for Continuous Optimization, Proceedings of Winter Simulation Conference (WSC), Arlington, Virginia, USA, Dec. 11–14, 2016
B.N.Ranganath and S.Bhatnagar, Scalable Focussed Entity Resolution,
Proceedings of the International Joint Conference on Neural Networks (IJCNN),
IEEE Press, Vancouver, Canada, July 25–29, 2016
A.G.Joseph and S.Bhatnagar, A stochastic approximation algorithm for
the problem of quantile estimation, Proceedings of 22nd International Conference on Neural Information Processing (ICONIP), Istanbul, Turkey, Nov.9–12, 2015 (to appear)
Prasad H.L., Prashanth L.A., and S.Bhatnagar, Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games, Proceedings of 14th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Istanbul, Turkey, pp.1371–1379, May 4–8, 2015
Chandrashekar L. and S.Bhatnagar, A Generalized Reduced Linear Program for Markov Decision Processes, Proceedings of Association for the Advancement
of Artificial Intelligence (AAAI), Austin, Texas, USA, Jan 25–30, 2015 (to appear)
M.S.Abdulla and S.Bhatnagar, A Transitions-only algorithm for Compact Action Set Markov Decision Processes, Proceedings
of Indian Control Conference (ICC), IEEE, Chennai, Jan 5–7, 2015 (to appear)
M.S.Abdulla and S.Bhatnagar, Stochastic multi-armed bandit algorithms based on simulated annealing, Proceedings
of Indian Control Conference (ICC), IEEE, Chennai, Jan 5–7, 2015 (to appear)
H.Yao, C.Szepesvari, R.Sutton, J.Modayil and S.Bhatnagar, Universal option models, Advances in Neural Information Processing Systems (NIPS), pp.990–998, Dec. 8–11, 2014, Montreal, Canada
Prabuchandran K.J., Hemanth Kumar A.N. and S.Bhatnagar, Multi-agent reinforcement learning for traffic signal control,
Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems, pp.2529–2534, Qingdao, China, Oct. 9–11, 2014
Chandrashekar L., A.Dubey, S.Bhatnagar and C.Balamurugan, A Markov decision process framework for predictable job completion
times on crowdsourcing platforms, Proceedings of HCOMP, pp.34–35, Pittsburgh, Nov. 2–4, 2014
Chandrashekar L. and S.Bhatnagar, Max-plus methods for optimal control and zero-sum games, Proceedings of the IEEE Conference on Decision and Control, Los Angeles, CA, Dec. 15–17, 2014 (to appear)
Prabuchandran K.J., S.Bhatnagar and V.S.Borkar, An actor-critic algorithm based on Grassmanian search, Proceedings of the IEEE Conference on Decision and Control, Los Angeles, CA, Dec. 15–17, 2014 (to appear)
E.Zhou, S.Bhatnagar and X.Chen, Simulation optimization via gradient-based stochastic search, Proceedings of the Winter Simulation Conference, pp.3869–3879, Savannah, GA, Dec. 7–10, 2014
Prashanth L.A., A. Chatterjee and S.Bhatnagar, Adaptive sleep-wake control using reinforcement learning in sensor networks, Proceedings of International Conference on Communication Systems and Networks (COMSNETS), IEEE, pp.1–8, Jan 6–10, 2014, Bangalore online pdf.
Prashanth L.A., Prasad H.L., N.Desai, S.Bhatnagar, Mechanisms for Hostile Agents with Capacity Constraints, Proceedings of Twelfth International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2013), Ito, Jonker, Gini, and Shehory (eds.), pp. 659–666, Saint Paul, Minnesota, May 6–10,
ISBN: 9781450319935, 2013 online pdf.
K.Lakshmanan and S.Bhatnagar, A Novel Q-learning Algorithm with
Function Approximation for Constrained Markov Decision
Processes, Proceedings of the Fiftieth Annual Allerton Conference on Communication, Control and Computing (IEEE Press), UIUC, Illinois, pp.400-405, ISBN: 9781467345378, 2012.
D.Ghoshdastidar, A.Dukkipati and S.Bhatnagar, q-Gaussian based smoothed functional algorithms for stochastic optimization, Proceedings of IEEE International Symposium on Information Theory (ISIT’2012), pp. 1059-1063, E-ISBN: 9781467325783, July 1-6, 2012.
Prashanth L.A., H.L.Prasad, N.Desai, S.Bhatnagar and G.Dasgupta, Stochastic optimization for adaptive labor staffing in service systems, Proceedings of 9th International Conference on Service Oriented Computing (ICSOC), Cyprus, Dec 5-8, 2011, Published in
Service Oriented Computing, LNCS, Vol. 7084, pp.487-494, 2011.
Prashanth L.A. and S.Bhatnagar, Reinforcement Learning with Average Cost for Adaptive Control of Traffic Lights at Intersections, Proceedings of the 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, DC, pp. 1640-1645 (ISBN: 9781457721984), October 5-7, 2011.
K.Lakshmanan and S.Bhatnagar, Smoothed functional and Quasi-Newton algorithms for routing in multistage queueing network with constraints, Proceedings of ICDCIT (Distributed Computing and Internet Technology, Lecture Notes in Computer Science, Vol. 6536/2011, pp.175-186, DOI: 10.1007/978-3-642-19056-8_12), Feb. 9-12, 2011, Bhubaneswar, India.
H.R.Maei, C.Szepesvari,
S.Bhatnagar and R.S.Sutton, Toward Off-Policy Learning Control with Function Approximation, Proceedings of ICML, 2010.
L.A.Prashanth and S.Bhatnagar, Control of traffic lights at junctions using reinforcement learning, Proceedings of Workshop on Computer Aided Transportation Planning and Traffic Engineering, pp.129-138, Dec. 7-11, 2009, Bangalore.
H.Yao, R.S.Sutton, S.Bhatnagar and C.Szepesvari, Multi-Step Dyna Planning for Policy Evaluation and Control, Proceedings of NIPS, 2009.
H.R.Maei, C.Szepesvari, S.Bhatnagar, D.Precup and R.S.Sutton, Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation, Proceedings of NIPS, 2009.
H.Yao, S.Bhatnagar and C.Szepesvari, LMS2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS, Proceedings of IEEE Conference on Decision and Control, Shanghai, 2009.
H.Yao, R.Sutton, S.Bhatnagar, D.Diao and C.Szepesvari, Dyna(k): A multi-step Dyna planning, Proceedings of the ICML/UAI/COLT Workshop on Abstraction in Reinforcement Learning, Montreal, 2009.
H.Yao, S.Bhatnagar and C.Szepesvari, Temporal difference learning by direct preconditioning, Multidisciplinary Symposium on Reinforcement Learning (MSRL), Montreal, 2009
R.S.Sutton, H.R.Maei, D.Precup, S.Bhatnagar, D.Silver, C.Szepesvari and E.Wiewiora, Fast gradient-descent methods for temporal-difference learning with linear function approximation, Proceedings of the International Conference on Machine Learning (ICML), Montreal, 2009.
G.R.Reddy and S.Bhatnagar, An efficient and optimized bluetooth scheduling algorithm for scatternets, Proceedings of IEEE Advanced Networks and Telecommunication Systems (ANTS) Conference, Mumbai, 2008.
S.R.Kolavali and S.Bhatnagar, Ant colony optimization algorithms for shortest path problems, Proceedings of Second Workshop on Network Control and Optimization (NETCOOP) (Published in NETCOOP 2008, Eds. E. Altman and A. Chaintreau, LNCS 5425, pp.37-44, Springer, 2008), September 8-10, 2008, Paris, France.
V.Sudha, S.Bhatnagar, S.V.Basavaraja and V.Sridhar, SPSA based feature relevance estimation for video retrieval, Proceedings of IEEE International Workshop on Multimedia Signal Processing (MMSP), Queensland, Australia, 2008.
R.Patro and S.Bhatnagar, An optimal RIO with statistical delay assurances, Proceedings of National Conference on Communications (NCC), Mumbai, February 23, 2008.
V.P.Chaturvedi, V.Rakesh and S.Bhatnagar, An efficient and optimized bluetooth scheduling algorithm for piconets, Proceedings of International Conference on Distributed Computing and Internet Technology (Published in Distributed Computing and Internet Technology, Eds. T.Janowski and H.Mohanty, LNCS 4882, pp.135-145, Springer, 2007), December 17-20, 2007, Bangalore, India.
K.R.Vemu, S.Bhatnagar and N.Hemachandra, An optimal weighted-average congestion based pricing scheme for enhanced QoS, Proceedings of International Conference on Distributed Computing and Internet Technology (Published in Distributed Computing and Internet Technology, Eds. T.Janowski and H.Mohanty, LNCS 4882, pp.135-145, Springer, 2007), December 17-20, 2007, Bangalore, India.
M.S.Abdulla and S.Bhatnagar, Network flow-control using asynchronous stochastic approximation, Proceedings of IEEE Conference on Decision and Control, New Orleans, USA, December 12-14, 2007.
K.R.Vemu, S.Bhatnagar and N.Hemachandra, Link route pricing for enhanced QoS, Proceedings of IEEE Conference on Decision and Control, New Orleans, USA, December 12-14, 2007.
V.Mishra, S.Bhatnagar and N.Hemachandra, Discrete Parameter Simulation Optimization Algorithms with Applications to Admission Control with Dependent Service Times, Proceedings of IEEE Conference on Decision and Control, New Orleans, USA, December 12-14, 2007.
S.Bhatnagar, A RED algorithm for the Internet, Proceedings of National Conference on Information Technology: Present Practices and Challenges, Aug. 31-Sep. 1, 2007, New Delhi, India.
S.Bhatnagar, R.S.Sutton, M.Ghavamzadeh and M.Lee, Incremental update natural-gradient actor-critic algorithms, Proceedings of Neural Information Processing Systems (NIPS), Vancouver, Canada, December 3-6, 2007.
M.S.Abdulla and S.Bhatnagar, Solving MDPs using two-timescale simulated annealing with multiplicative weights, Proceedings of American Control Conference (ACC), New York, July 11-13, 2007.
M.S.Abdulla and S.Bhatnagar, Parametrized actor-critic algorithms for finite-horizon MDPs, Proceedings of American Control Conference (ACC), New York, July 11-13, 2007.
V.Sudha, L.Gopal, V.Sridhar and S.Bhatnagar, Fuzzy clustering based Ad recommendation for TV programs, Proceedings of fifth European Conference, EuroITV, Amsterdam, Netherlands, 2007.
S.Bhatnagar and K.M.Babu, Two-Timescale Q-Learning Algorithms with an Application to Routing in Networks, Proceedings of International Conference on Advances in Control and Optimization of Dynamical Systems (ACODS), Bangalore, 2007.
Diksha Sharma and S.Bhatnagar, Optimal parameterized policies for resource allocation in communication networks, Proceedings of IEEE International Conference on Signal and Image Processing, Hubli, Karnataka, 2006.
V.L.Raju Chinthalapati and S.Bhatnagar, A simultaneous deterministic perturbation actor-critic algorithm with an application to optimal mortgage refinancing, Proceedings of IEEE Conference on Decision and Control, San Diego, CA, USA, 2006.
S.Bhatnagar and M.S.Abdulla, An actor-critic algorithm for finite horizon Markov decision processes, Proceedings of IEEE Conference on Decision and Control, San Diego, CA, USA, 2006.
R.Patro and S.Bhatnagar, A four-timescale algorithm for constrained stochastic optimization of RED, Proceedings of IEEE Conference on Decision and Control, San Diego, CA, USA, 2006.
M.S.Abdulla and S.Bhatnagar, SPSA with measurement reuse, Proceedings of Winter Simulation Conference, Monterey, CA, USA, 2006.
Diksha Sharma, S.Bhatnagar and S.Chakraborty, An Algorithm for Dynamic Optimal Bandwidth Allocation in Communication Networks, Proceedings of Fifth Asia Pacific International Symposium on Information Technology (APIS5), pp.489-492, Hangzhou, China, 2006.
M.S.Abdulla and S.Bhatnagar, Solution of MDPs using simulation based value iteration, Proceedings of Second IFIP Conference on Artificial Intelligence Applications and Innovations, pp.749-759, Beijing, China, 2005.
A.Dukkipati, M.N.Murty and S.Bhatnagar, Properties of Kullback-Leibler cross-entropy minimization in nonextensive framework, Proceedings of IEEE International Symposium on Information Theory, pp.2374-2378, Adelaide, Australia, 2005.
A.Dukkipati, M.N.Murty and S.Bhatnagar, Information theoretic justification of Boltzmann selection and its generalization to Tsallis case, Proceedings of IEEE Congress on Evolutionary Computation, pp.1667-1674, Vol.2, Edinburgh, U.K., 2005.
S.Bhatnagar and S.Kumar, A reinforcement learning based algorithm for Markov decision processes, Proceedings of International Conference on Intelligent Sensing and Information Processing (ICISIP), pp.199-204, Chennai, India, 2005.
P.Viswanath, M.N.Murty and S.Bhatnagar, A pattern synthesis technique to reduce the curse of dimensionality effect, Proceedings of International Conference on Knowledge Based Computer Systems (KBCS), pp.219-228, Hyderabad, India, 2004.
R.Vaidya and S.Bhatnagar, Correlation based optimization of random early detection, Proceedings of IEEE INDICON, pp.47-51, Kharagpur, India, 2004.
R.Vaidya and S.Bhatnagar, Optimized RIO for diffserv networks, Proceedings of Information and Computer Science (ICICS), pp.227-240, Dhahran, Saudi Arabia, 2004.
J.R.Panigrahi and S.Bhatnagar, Hierarchical decision making in semiconductor fabs using multi-time scale Markov decision processes, Proceedings of IEEE Conference on Decision and Control (CDC), pp.4387-4392, Vol.4, Paradise Island, Nassau, Bahamas, 2004.
P.Viswanath, M.N.Murty and S.Bhatnagar, A pattern synthesis technique with an efficient nearest neighbor classifier for binary pattern recognition, Proceedings of International Conference on Pattern Recognition (ICPR), pp.416-419, Vol.4, Cambridge, U.K., 2004.
A.Dukkipati, M.N.Murty and S.Bhatnagar, Cauchy annealing schedule: an annealing schedule for Boltzmann selection scheme in evolutionary algorithms, Proceedings of IEEE Congress on Evolutionary Computation, pp.55-62, Vol.1, Portland, Oregon, USA, 2004.
A.Dukkipati, M.N.Murty and S.Bhatnagar, Quotient evolutionary space: abstraction of evolutionary process w.r.t. macroscopic properties, Proceedings of IEEE Congress on Evolutionary Computation, pp.846-853, Vol.2, Canberra, Australia, 2003.
P.Viswanath, M.N.Murty and S.Bhatnagar, Synthetic patterns for nearest neighbour classifier design, Proceedings of Knowledge Based Computer Systems (KBCS), pp.323-332, Mumbai, India, 2002.
P.Viswanath, M.N.Murty and S.Bhatnagar, An efficient classifier: using a compact tree structure and novel pattern synthesis, Proceedings of HPC Asia Conference, pp.395-398, Bangalore, India, 2002.
S.Bhatnagar, E.Fernandez-Gaucherand, M.C.Fu, Y.He and S.I.Marcus, A Markov decision process model for capacity expansion and allocation, Proceedings of 38th IEEE Conference on Decision and Control, pp.1156-1161, Phoenix, Arizona, 1999.
S.Bhatnagar, M.C.Fu and S.I.Marcus, Two timescale SPSA algorithms for rate-based ABR flow control, Advances in System Theory Symposium (in honour of Sanjoy K.Mitter on his 65th birthday), Cambridge, USA, October, 1999.
S.Bhatnagar, M.C.Fu and S.I.Marcus, Rate based ABR flow control using two timescale SPSA, Proceedings of SPIE Conference on Performance and Control of Network Systems III, pp.142-149, Boston, September, 1999.
S.Bhatnagar, M.C.Fu, S.I.Marcus, and Y.He, Markov decision processes for semiconductor fab-level decision making, Proceedings of IFAC 14th Triennial World Congress, Beijing, China, pp.145-150, 1999.
S.Bhatnagar and V.Sharma, Optimal control of a feedback queue via stochastic approximation, Proceedings of IEEE Globecom'98, Sydney, Australia, November, 1998.
S.Bhatnagar and V.S.Borkar, Infinitesimal perturbation analysis: an overview and recent trends, Proceedings of the sixth IEEE symposium on intelligent systems, Bangalore, 1997.
S.Bhatnagar and V.H.Gupta, Using stochastic control to save fuel in automobiles, Proceedings of the fifth IEEE symposium on intelligent systems, Bangalore, pp.55-61, Nov. 1996.
Technical Reports
S.Bhatnagar, R.S.Sutton, M.Ghavamzadeh and M.Lee, Natural Actor-Critic Algorithms, Technical Report TR09-10, Department of Computing Science, University of Alberta, Canada, 2009.
H.L.Prasad, S.Bhatnagar and N.Hemachandra, A computational procedure for general-sum stochastic games, Technical Report IISc-CSA-TR-2009-5, Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India, 2009.
K.R.Vemu, S.Bhatnagar and N.Hemachandra, Link-route pricing for optimal QoS, Technical Report IISc-CSA-TR-2007-8, Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India, 2007.
M.S.Abdulla and S.Bhatnagar, Reinforcement learning based algorithms for average cost Markov decision processes, Technical Report IISc-CSA-TR-2005-6, Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India, 2005.
S.Bhatnagar, A simulation based algorithm for Markov decision processes, Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India, 2002.
Y.He, S.Bhatnagar, M.C.Fu and S.I.Marcus, Approximate policy iteration for semiconductor fab-level decision making - a case study, Institute for Systems Research, University of Maryland, URL: http://www.isr.umd.edu/TechReports/ISR/2000/TR_2000-49/, 2000.
S.Bhatnagar, M.C.Fu, S.I.Marcus and S.Bhatnagar, Randomized difference two-timescale simultaneous perturbation stochastic approximation algorithms for simulation optimization of hidden Markov models, Institute for Systems Research, University of Maryland, URL: http://www.isr.umd.edu/TechReports/ISR/2000/TR_2000-13/, 2000.
S.Bhatnagar, M.C.Fu and S.I.Marcus, Optimal multilevel policies for ABR flow control using two timescale SPSA, Institute for Systems Research, University of Maryland, URL: http://www.isr.umd.edu/TechReports/ISR/1999/TR_99-18/, 1999.
