Software Fault Tolerance via Environmental Diversity – Prof. V.V.S. Sarma Memorial Lecture (Second in the series)

Complex systems in different domains contain a significant amount of software. Several studies have established that a significant fraction of system outages are due to software faults. Traditional methods of fault avoidance, fault removal based on extensive testing/debugging, and fault tolerance based on design/data diversity have been found inadequate to ensure high software dependability. The key challenge, then, is how to provide highly dependable software. We discuss a viewpoint of fault tolerance of software-based systems to ensure high dependability. We classify software faults into Bohr bugs and Mandel bugs, and identify aging-related bugs as a subtype of the latter. Traditional methods have been designed to deal with Bohr bugs. The next challenge, then, is to develop mitigation methods for Mandel bugs in general and aging-related bugs in particular. We submit that mitigation methods for Mandel bugs utilize environmental diversity. Retrying an operation, restarting an application, failing over to an identical replica (hot, warm or cold) and rebooting the OS are examples of mitigation techniques that rely on environmental diversity. For software aging-related bugs, it is also possible to utilize a proactive environmental diversity technique known as software rejuvenation. We discuss environmental diversity from both experimental and analytic points of view and cite examples of real systems employing these techniques.
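
As an illustration only, and not taken from the lecture itself, the escalation of retry, restart and failover described above might be sketched as follows. The Replica class, its three fault flags, the TransientFailure exception and the serve_with_recovery function are all hypothetical names invented for this sketch; a real system would detect failures and restart processes through its own mechanisms.

```python
from dataclasses import dataclass


class TransientFailure(Exception):
    """Hypothetical error standing in for a failure caused by a Mandel bug."""


@dataclass
class Replica:
    """Toy replica (an assumption for this sketch, not a real API)."""
    name: str
    transient_faults: int = 0      # failures that disappear when the call is retried
    corrupted_state: bool = False  # failure that only a restart clears
    host_down: bool = False        # failure that only a failover avoids

    def handle(self, request: str) -> str:
        if self.host_down or self.corrupted_state:
            raise TransientFailure(f"{self.name} cannot serve {request}")
        if self.transient_faults > 0:
            self.transient_faults -= 1
            raise TransientFailure(f"{self.name} hit a transient fault")
        return f"{self.name}: {request}"

    def restart(self) -> None:
        # A restart gives the same code a fresh execution environment.
        self.corrupted_state = False
        self.transient_faults = 0


def serve_with_recovery(request, primary, standby, max_retries=2):
    """Try the cheapest recovery action first and escalate only if it fails."""
    actions = []
    # Step 1: initial attempt plus up to max_retries retries on the primary.
    for attempt in range(1 + max_retries):
        try:
            return primary.handle(request), actions
        except TransientFailure:
            if attempt < max_retries:
                actions.append("retry")
    # Step 2: restart the primary and try once more.
    actions.append("restart")
    primary.restart()
    try:
        return primary.handle(request), actions
    except TransientFailure:
        pass
    # Step 3: fail over to the identical standby replica.
    actions.append("failover")
    return standby.handle(request), actions


if __name__ == "__main__":
    primary = Replica("primary", corrupted_state=True)
    standby = Replica("standby")
    print(serve_with_recovery("ping", primary, standby))
```

In this sketch the same code is re-executed each time; only the environment it runs in changes, which is the sense in which these actions rely on environmental diversity. Rebooting the OS and proactive rejuvenation are left out to keep the example short.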


Bio of the Speaker:

Kishor S. Trivedi holds the Hudson Chair in the Department of Electrical and Computer Engineering at Duke University, Durham, NC. He has a B.Tech (EE, 1968) from IIT Mumbai, and an M.S. (CS, 1972) and PhD (CS, 1974) from the University of Illinois, Urbana-Champaign. He has been on the Duke faculty since 1975. He is the author of a well-known text entitled Probability and Statistics with Reliability, Queuing and Computer Science Applications, first published by Prentice-Hall; a thoroughly revised second edition (including its Indian edition) of this book has been published by John Wiley. He has authored several other books. He is a Life Fellow of the Institute of Electrical and Electronics Engineers and a Golden Core Member of the IEEE Computer Society. He has published over 600 articles and has supervised 48 Ph.D. dissertations. His h-index is 107. He is a recipient of the IEEE Computer Society Technical Achievement Award for his research on Software Aging and Rejuvenation, and of the IEEE Reliability Society’s Lifetime Achievement Award. He has worked closely with industry in carrying out reliability/availability analysis, in providing short courses on reliability, availability and performability modelling, and in the development and dissemination of software packages such as SHARPE and SPNP.


About Prof. V.V.S. Sarma (May 1944 – January 2018)

Professor Vallury Subrahmanya Sarma, an extraordinary teacher and researcher, passed away on 13th January 2018 in Bangalore. Professor V.V.S. Sarma was born on May 7, 1944 in Vijayawada. After graduating with a University gold medal in Mathematics, Physics and Chemistry from Andhra University in 1961, he obtained his BE, ME, and PhD degrees from IISc, Bangalore. He served IISc as a faculty member in various capacities from 1967. He became a full professor in 1983, and continued his service until his retirement in 2006. He was a visiting Professor at the University of Southwestern Louisiana, USA, between 1984 and 1986 and at Tata Research Development and Design Centre, Pune, between 1995 and 1997. He was elected to the fellowships of the Indian Academy of Sciences, the Indian National Science Academy and the Indian National Academy of Engineering. Post retirement, he was an Honorary Professor in CSA and an INAE Distinguished Professor.

Professor V.V.S. Sarma, fondly called VVS by his students and friends, initiated research at IISc in the then emerging areas of reliability engineering, pattern recognition, artificial intelligence and machine learning, which are areas of utmost importance in the industry today. His survey paper, “Knowledge-based approaches for scheduling problems: A survey,” from a special issue on AI in management and published with some new material in IEEE Transactions on Knowledge and Data Engineering, was widely cited. He guided a generation of researchers in these areas. His students were drawn from the CSA, ECE, Aerospace, Mathematics, and Metallurgy departments, and included engineers from organizations such as IAF, NAL, ISRO, DRDO and BHEL under the external registration program. Many of his students are currently senior professors in universities or senior engineering researchers in Defense and ISRO across India, the USA and Canada. With his collaborators N. Viswanadham and M.G. Singh, he wrote the book “Reliability of Computer and Control Systems,” published in the North-Holland Systems and Control series in 1987. He co-edited the book “Artificial Intelligence and Expert Systems in Indian Context,” published by Tata McGraw-Hill in 1990, jointly with N. Viswanadham, B.L. Deekshatulu, and B. Yegnanarayana. Prof. VVS Sarma was a very inspiring teacher. He used to enthuse and motivate his students to learn many topics of current research. As early as 1976, when the field was still in its infancy, he taught a course on Artificial Intelligence at IISc. He was a very gentle person and was affectionate towards all his students. In the passing away of Prof. VVS Sarma, the research community has lost a mentor, an influential researcher and an outstanding teacher. All his students have lost a father figure whom they will continue to look up to.


