Homepage of R. Govindarajan

E0-243 : Computer Architecture

Course Outline:

Processor Architecture: Instruction-Level Parallelism, Superscalar and VLIW architecture; Multi-core processors;
Memory Subsystem: Multilevel caches, Caches in multi-core processors, Memory controllers for multi-core systems;
Multiple processor systems: taxonomy, distributed and shared memory systems, memory consistency models, cache coherence, and interconnection networks;
Advanced topics in architecture

Text

J. L. Hennessy and D. A. Patterson, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publishers, 4th Edition
Current Literature.

Reading Materials

J.E. Smith and G.S. Sohi. Microarchitecture of Superscalar Processors. Proceedings of the IEEE, 83(12), 1609-1624.
K.C. Yeager. The MIPS R10000 Superscalar Processor. IEEE MICRO, 28-40, April 1996.
S. Adve and K. Gharachorloo, Shared Memory Consistency Models: A Tutorial, IEEE Computer, 1995.
T. Austin, E. Larson, and D. Ernst, SimpleScalar: An Infrastructure for Computer System Modeling, IEEE Computer, Feb. 2002.
R.E. Wunderlich, T. Wenisch, B. Falsafi and J.C. Hoe, SMARTS: Accelerating Microarchitecture Simulation via Rigorous Statistical Sampling, 30th ISCA 2003.
S. Rixner, W. J. Dally, U. J. Kapasi, P. Mattson, and J. D. Owens, Memory Access Scheduling, 27th ISCA, 2000.
C. Liu, A. Sivasubramaniam, and M. Kandemir, Organizing the Last Line of Defense before Hitting the Memory Wall for CMPs, HPCA 2004.
M. K. Qureshi and Y. N. Patt, Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches, MICRO 2006.
J. Nickolls, and W. J. Dally, The GPU Computing Era, IEEE Micro, March/April 2010.
A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong and T. M. Aamodt, Analyzing CUDA Workloads Using a Detailed GPU Simulator, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2009.
M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee, D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki, and B. Falsafi, Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware, In the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 2012.

Discussion Papers

D. Genbrugge, S. Eyerman and L. Eeckhout, Interval Simulation: Raising the Level of Abstraction in Architectural Simulation, 16th International Symposium on High-Performance Computer Architecture (HPCA), pp. 307-318, 2010.
M. K. Qureshi, V. Srinivasan, and J. A. Rivers, Scalable High Performance Main Memory System Using Phase-Change Memory Technology, ISCA 2009.
A. Jog, O. Kayiran, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das, Orchestrated Scheduling and Prefetching for GPGPUs, In Proc. of the 40th International Symposium on Computer Architecture (ISCA), Tel-Aviv, Israel, June 2013
Aamer Jaleel, Joseph Nuzman, Adrian Moga, Simon C. Steely Jr., Joel Emer, High performing cache hierarchies for server workloads: Relaxing inclusion to capture the latency benefits of exclusive caches, In Proc. of the 21st International Symposium on High Performance Computer Architecture (HPCA), Feb. 2015

Time :

Evaluation:

Mid-term exams: 20 marks (on Sept. 10 and Oct. 29, 2015)
Term-Project: 30 marks

Proposal due by Oct. 15, 2015
First Review by Nov. 13, 2015
Report and Demo/Presentation: Nov. 27, 2015

Final Exam: 50 marks (on Dec. 2, 2015)

Lecture Notes:

L0: Introduction (pps, pdf)
L1: ILP Processors (pps, pdf)
L2: Parallel Architecture (pps, pdf)

General Rules: