Exploring Energy-Performance Trade-offs for Heterogeneous Interconnect Clustered VLIW Processors
Rahul Nagpal and Y.N. Srikant
IISc-CSA-TR-2005-14
(October 2005)
Filed on October 18, 2005
Updated on February 28, 2006
Clustered architecture processors are preferred for embedded systems because centralized
register file architectures scale poorly in terms of clock rate, chip area, and power
consumption. Although clustering improves clock speed, reduces the energy consumption
of the logic, and simplifies design, it introduces overhead in the form of inter-cluster
communication. This communication takes place over long global wires with high load
capacitance, which delays execution and consumes a significant amount of energy.
Technological advances permit the design of a variety of clustered architectures by varying
the degree of clustering and the type of interconnect. In this report, we explore the
energy-performance trade-offs in going from a unified VLIW architecture to different types
of clustered VLIW architectures. We propose a new instruction scheduling algorithm that
exploits scheduling slacks of instructions and communication slacks of data values together
to achieve better energy-performance trade-offs for clustered architectures. Our instruction
scheduling algorithm for clustered architectures with heterogeneous interconnect
achieves 35% and 40% reductions in communication energy and improves the overall energy-delay
product by 4.5% and 6.5% for 2-cluster and 4-cluster machines, respectively, with
marginal increases of 1.6% and 1.1% in execution time. Our test bed uses the Trimaran compiler
infrastructure.
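
The abstract does not spell out the scheduling algorithm, so the following is only a minimal sketch of the general idea of spending slack on cheaper communication, not the authors' method. It assumes a hypothetical machine with two wire types (FAST and SLOW, with made-up latencies and energies), ignores resource constraints, and uses a simple greedy rule: an inter-cluster transfer is moved to the slow, low-energy wire only if the critical-path length does not grow. All names and numbers are illustrative assumptions.

```python
# Hypothetical wire types: latency in cycles, energy in arbitrary units per transfer.
FAST = {"name": "fast", "latency": 1, "energy": 1.0}
SLOW = {"name": "slow", "latency": 3, "energy": 0.4}


def schedule_length(ops, edges, wire):
    """Critical-path length in cycles, ignoring resource limits.

    ops maps name -> (latency, cluster) and must be listed in topological order.
    """
    start = {}
    for name in ops:
        ready = 0
        for u, v in edges:
            if v != name:
                continue
            # Only values that cross clusters pay the interconnect latency.
            crosses = ops[u][1] != ops[v][1]
            comm = wire[(u, v)]["latency"] if crosses else 0
            ready = max(ready, start[u] + ops[u][0] + comm)
        start[name] = ready
    return max(start[n] + ops[n][0] for n in ops)


def assign_wires(ops, edges):
    """Greedily move inter-cluster transfers to the slow wire when slack allows."""
    wire = {e: FAST for e in edges}
    baseline = schedule_length(ops, edges, wire)
    for u, v in edges:
        if ops[u][1] == ops[v][1]:
            continue  # same cluster: no inter-cluster transfer
        wire[(u, v)] = SLOW
        if schedule_length(ops, edges, wire) > baseline:
            wire[(u, v)] = FAST  # not enough slack, so revert
    return wire, baseline


def communication_energy(ops, edges, wire):
    """Total energy of all inter-cluster transfers under the given wire choice."""
    return sum(wire[(u, v)]["energy"] for u, v in edges if ops[u][1] != ops[v][1])


# Toy dependence graph: name -> (latency, cluster).
ops = {"a": (1, 0), "b": (3, 0), "c": (1, 1), "d": (1, 1)}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
wire, length = assign_wires(ops, edges)
print(length, {e: w["name"] for e, w in wire.items()})
print(communication_energy(ops, edges, wire))
```

In this toy graph the transfer from a to c is off the critical path and moves to the slow wire, while the transfer from b to d is critical and stays on the fast wire, so the schedule length is unchanged at 6 cycles. A real scheduler would also have to account for functional-unit and bus contention, which this sketch leaves out.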
Please bookmark this technical report as http://aditya.csa.iisc.ernet.in/TR/2005/14/. Problems? Contact techrep@csa.iisc.ernet.in
[Updated at 2009-10-22T06:42Z]