Add support for normalizing affine.for ops, i.e., converting the lower bound to
zero and the loop step to one. The upper bound is set to the trip count of the
loop. The exact value of loopIV is recomputed just inside the body of the
affine.for. Currently, only loops whose lower bound has a single result are
supported. No such restriction exists on upper bounds.
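The arithmetic behind the normalization can be sketched in plain C++. This is only an illustration under stated assumptions, not the code in the patch: it assumes constant bounds and a positive step, and the helper names (`tripCount`, `originalIVs`, `normalizedIVs`) are hypothetical. The normalized loop runs from 0 to the trip count with step 1, and the original induction value is recovered in the body as `lb + i * step`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not an MLIR API): number of iterations of
// `for (iv = lb; iv < ub; iv += step)`, assuming step > 0.
int64_t tripCount(int64_t lb, int64_t ub, int64_t step) {
  if (ub <= lb)
    return 0;
  return (ub - lb + step - 1) / step; // ceildiv(ub - lb, step)
}

// Induction values visited by the original (non-normalized) loop.
std::vector<int64_t> originalIVs(int64_t lb, int64_t ub, int64_t step) {
  std::vector<int64_t> ivs;
  for (int64_t iv = lb; iv < ub; iv += step)
    ivs.push_back(iv);
  return ivs;
}

// Normalized loop: lower bound 0, step 1, upper bound = trip count.
// The original induction value is rebuilt at the top of the body.
std::vector<int64_t> normalizedIVs(int64_t lb, int64_t ub, int64_t step) {
  std::vector<int64_t> ivs;
  const int64_t n = tripCount(lb, ub, step);
  for (int64_t i = 0; i < n; ++i) {
    const int64_t iv = lb + i * step; // plays the role of the affine.apply
    ivs.push_back(iv);
  }
  return ivs;
}
```

Both functions visit the same sequence of induction values, which is the property the pass has to preserve. In the pass itself the bounds are affine maps over operands rather than constants, so the same computation would be expressed symbolically.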
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This looks great! Thanks.
mlir/lib/Dialect/Affine/Transforms/AffineLoopNormalize.cpp

| Line | Comment |
|---|---|
| 82–83 | has only a single iteration |
| 84 | Please add a line to say what normalized is. |
| 84 | -> with a single result. |
| 95 | You may want to mention that loops with a max lower bound can't be normalized without additional support (like affine.execute_region's). (Actually those loops whose max's could be hoisted to the outermost position - outside any loops - could still be normalized.) You can just mention this in a comment. |
Hey guys,
I found a problem with this pass; if you feed it the following loop:
    func @main(%arg0: memref<10xf32>, %x: f32) {
      %c0 = constant 0 : index
      %c10 = constant 0 : index
      affine.for %i = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%c10) {
        affine.store %x, %arg0[%i] : memref<10xf32>
      }
      return
    }
then it will be collapsed to a zero-iteration loop:
    #map0 = affine_map<(d0, d1) -> (0)>
    #map1 = affine_map<(d0, d1) -> (d1 + d0)>
    module {
      func @main(%arg0: memref<10xf32>, %arg1: f32) {
        %c0 = constant 0 : index
        %c0_0 = constant 0 : index
        affine.for %arg2 = 0 to #map0(%c0_0, %c0) {
          %0 = affine.apply #map1(%c0_0, %arg2)
          affine.store %arg1, %arg0[%0] : memref<10xf32>
        }
        return
      }
    }
As a workaround it is possible to run canonicalization before normalization.
Should I report it somewhere, like Bugzilla (it doesn't seem to be very active, though)?
Hi sgrechanik,
Thanks for pointing this out. What you are saying is correct, but your test case needs a slight correction: %c10 needs to be 10, not 0. For now, we can use --canonicalize before running this pass, although I will try to incorporate this into the pass itself.