E0 209 Principles of Distributed Software
January-April 2022, 3.30pm-5.00pm Tue Thu
Instructor:
K. V. Raghavan
TA:
Ashish Sanjay Kankal, Ashish Shashikant Bokil
First lecture: January 6 2022
Lecturing mode: Online initially; may transition to in-person or hybrid
during the semester. The class link is available on the intranet website,
which will be accessible from the CSA home page.
Lecture slides
- Introduction, Jan. 6.
- OS-level virtualization and Docker, Jan. 13, 15.
gs-spring-boot project: first install openjdk-11-jdk and maven. Unzip the
project, change into the gs-spring-boot/ folder, and run "./mvnw package" to
build it. You can then run the jar file inside gs-spring-boot/target, or
build the Docker image.
- Spring, Service Oriented Architecture, Jan. 18, 20.
- Microservices, Jan. 25, 27.
- Feb. 1, 3, 8: Kubernetes.
Two minikube tutorials.
- Feb. 5: Demo on Spring Data JPA
- Feb. 10, 15, 17: Introduction to the actor model and Akka.
Akka-chatroom demo; to compile and run it, unzip it, enter the directory,
and type "mvn compile exec:exec".
- Feb. 22, 24, Mar. 1: Akka Cluster and its applications, such as cluster
sharding, routers, and distributed data.
Sharding demo; to compile and run it, unzip it, enter the directory, and
follow the instructions in the README file.
- Mar. 3: Akka Persistence.
Persistence demo; to compile and run it, unzip it, enter the directory, and
follow the instructions in the README file.
- Mar. 8: BASE paper.
- Mar. 10, 15: CRDT paper.
- Mar. 17, 22, 24: Stronger consistency models paper.
- Mar. 29, 31, Apr. 5, 7: MapReduce, HDFS and YARN, Spark (slides provided
by Dr. Prasad Deshpande).
Motivation
Development of distributed software applications is an important
activity, accelerated in recent years by the growing predominance of
cloud computing. The typical requirements of a modern-day distributed
application are continuous availability even in the presence of software
and hardware faults, the ability to scale up or down on the fly based on
input load (i.e., elasticity), ease of development and maintenance, and
ease of continuous integration and deployment. This course will introduce
the principles, programming models, and frameworks for distributed
applications that help meet these requirements. It will also cover
representative modern languages and technologies that are used to develop
and deploy such applications.
Syllabus
Introduction to distributed software and cloud computing. OS-level
virtualization, developing and deploying containers using Docker.
Services, statelessness and statefulness, Representational State Transfer
(REST). Basics of developing services and web applications using
Spring Boot. The microservices paradigm of distributed software development,
and common microservices architecture patterns. Useful attributes of
the microservices paradigm such as scaling in and out, fault tolerance, and
availability. Cluster management using Kubernetes. Introduction to actors
-- a message-driven programming model that enables large-scale concurrency,
distribution, and fault tolerance. Programming actor-based systems using
the Akka toolkit. Achieving availability and elasticity by distributing
the application over multiple nodes using Akka Cluster. Introduction to
distributed data analytics -- batch architecture patterns, and introduction
to Spark. Consistency of data in distributed programs -- eventual
consistency.
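As an illustration of eventual consistency, the following is a minimal sketch of a grow-only counter (G-Counter), one of the simplest state-based CRDTs. It is not taken from the course material: the class name GCounter, the node identifiers, and the method names are assumptions made for this example, and it uses only the Java standard library rather than Akka. Each replica increments only its own slot, and replicas converge once they have merged each other's state using a per-node maximum.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a grow-only counter (G-Counter), a state-based CRDT.
final class GCounter {
    private final String nodeId;                        // identifier of this replica (assumed unique)
    private final Map<String, Long> counts = new HashMap<>();

    GCounter(String nodeId) {
        this.nodeId = nodeId;
    }

    // Local update: a replica only ever increments its own slot.
    void increment() {
        counts.merge(nodeId, 1L, Long::sum);
    }

    // The counter's value is the sum over all replicas' slots.
    long value() {
        long sum = 0;
        for (long c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    // Merge takes the per-replica maximum. This operation is commutative,
    // associative, and idempotent, so replicas converge regardless of the
    // order or number of times states are exchanged.
    void merge(GCounter other) {
        for (Map.Entry<String, Long> e : other.counts.entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Long::max);
        }
    }

    public static void main(String[] args) {
        GCounter a = new GCounter("node-a");
        GCounter b = new GCounter("node-b");

        // Concurrent updates on two replicas; they temporarily disagree (2 vs. 1).
        a.increment();
        a.increment();
        b.increment();

        // Exchange state in both directions; a repeated merge changes nothing.
        a.merge(b);
        b.merge(a);
        a.merge(b);

        System.out.println(a.value() + " " + b.value()); // prints "3 3"
    }
}
```

The replicas never coordinate while updating, which keeps them available; agreement is reached only after state has been exchanged, which is the trade-off that eventual consistency describes.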
Reading material