Home » Event

Seminars

View all Seminars | Download ICal for this event

Scalable and Effective Polyhedral Auto-transformation without using Integer Linear Programming

Series: Ph.D. Colloquium

Speaker: Mr. Aravind Acharya Ph.D Student Dept. of CSA

Date/Time: Jan 08 14:00:00

Location: CSA Seminar Hall (Room No. 254, First Floor)

Faculty Advisor: Prof. Uday Kumar Reddy .B

Abstract:
In recent years, polyhedral auto-transformation frameworks have gained
significant interest in general-purpose compilation, because of their ability
to find and compose complex loop transformations that extract high performance
from modern architectures. These frameworks automatically find loop
transformations that either enhance locality, parallelism and minimize latency
or a combination of these. Recently, focus has also shifting on developing
intermediate representations, like MLIR, where complex loop transformations and
data-layout optimizations can be incorporated efficiently in a single common
infrastructure.

Polyhedral auto-transformation frameworks typically rely on complex Integer
Linear Programming (ILP) formulations to find affine loop transformations.
However, construction and solving these ILP problems is time consuming which
increases compilation time significantly. Secondly, loop fusion heuristics in
these auto-transformation frameworks are ad hoc, and modeling loop fusion
efficiently would further degrade compilation time.

In this thesis, we first relax the ILP formulation in the Pluto algorithm. We
show that even though LP relaxation reduces the time complexity of the problem,
it does not reduce the compilation time significantly because of the complex
construction of constraints. We also observe that due to relaxation,
sub-optimal loop transformations that result in significant performance degradation
may be obtained. Hence, we propose a new polyhedral auto-transformation
framework, called Pluto-lp-dfp, that finds efficient affine loop
transformations quickly,while relying on Pluto's cost function. The framework
decouples auto-transformation into three components: (1) loop fusion and
permutation (2) loop scaling and shifting and (3) loop skewing components.
In each phase, we solve a Linear Programming (LP) formulation instead of an ILP,
thereby resulting in a polynomial time affine transformation algorithm. We
propose a data structure, called fusion conflict graph, that allows us to model
loop fusion to work in tandem with loop permutations, loop scaling and loop
shifting transformations. We describe three greedy fusion heuristics,
namely,max-fuse, typed-fuse and hybrid-fuse, of which, the hybrid-fuse and
typed-fuse models incorporate parallelism preserving fusion heuristic without
significant compilation time overhead. We also provide a characterization of
time-iterated stencils that have tile-wise concurrent start and employ a
different fusion heuristic in such programs. In our experiments, we demonstrate
that Pluto-lp-dfp framework not only finds loop transformations quickly,
resulting in significant improvements in compilation time, but also outperforms
state-of-the-art polyhedral auto-parallelizers in terms of execution time of
the transformed program. We observe that Pluto-lp-dfp is faster than PoCC and
Pluto by a geomean factor of 461x and 2.2x in terms of compilation time. On
larger NAS benchmarks, Pluto-lp-dfp was faster than Pluto by 246x. PoCC failed
to find a transformation in a reasonable amount of time in these cases. In
terms of execution time, the hybrid-fuse variant in Pluto-lp-dfp outperforms
PoCC by a geomean factor 1.8x, with over 3x improvements in some cases. We
also observe that Pluto-lp-dfp is faster than an improved version of Pluto by a
factor of 7%, with a maximum performance improvement of 2x.

Department of Computer Science and Automation

Seminars

Scalable and Effective Polyhedral Auto-transformation without using Integer Linear Programming

Explore

Quick Links

Resources

Seminars Calendar