This is an archive of the discontinued LLVM Phabricator instance.

[MLIR][Affine] Add affine.for normalization support
ClosedPublic

Authored by navdeepkk on Nov 27 2020, 7:57 AM.

Details

Summary

Add support for normalizing affine.for ops, i.e., converting the lower bound to
zero and the loop step to one. The upper bound is set to the trip count of the
loop. The original value of the loop IV is recomputed just inside the body of
affine.for. Currently, only loops whose lower bound has a single result are
supported. No such restriction exists on upper bounds.
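
The arithmetic behind the summary above can be sketched in a few lines of Python. This is only an illustration for constant bounds and a positive step, not the pass's implementation; the helper names (ceildiv, trip_count, original_ivs, normalized_ivs) are hypothetical and do not come from the patch.

```python
# Illustrative sketch only: constant bounds, positive step.
# All function names here are hypothetical, not taken from the patch.

def ceildiv(a, b):
    """Ceiling division for a positive divisor b."""
    return -(-a // b)

def trip_count(lb, ub, step):
    """Number of iterations of a loop `for iv = lb to ub step step`."""
    return max(0, ceildiv(ub - lb, step))

def original_ivs(lb, ub, step):
    """Induction variable values seen by the original loop."""
    return list(range(lb, ub, step))

def normalized_ivs(lb, ub, step):
    """Values recomputed inside the normalized loop.

    The normalized loop runs i = 0 .. trip_count with step 1, and the
    original IV is recovered in the body as lb + i * step.
    """
    return [lb + i * step for i in range(trip_count(lb, ub, step))]
```

For example, a loop from 2 to 10 with step 3 has a trip count of 3, and the normalized loop over 0, 1, 2 recovers the original values 2, 5, 8.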

Diff Detail

Event Timeline

navdeepkk created this revision. Nov 27 2020, 7:57 AM
navdeepkk requested review of this revision. Nov 27 2020, 7:57 AM
bondhugula accepted this revision. Dec 3 2020, 5:49 AM

This looks great! Thanks.

mlir/lib/Dialect/Affine/Transforms/AffineLoopNormalize.cpp
82–88

has only a single iteration

84

Please add a line to say what normalized is.

84

-> with a single result.

97

You may want to mention that loops with a max lower bound can't be normalized without additional support (like affine.execute_region's). (Actually those loops whose max's could be hoisted to the outermost position - outside any loops - could still be normalized.) You can just mention this in a comment.

This revision is now accepted and ready to land. Dec 3 2020, 5:49 AM
navdeepkk updated this revision to Diff 309760. Edited Dec 5 2020, 10:07 PM
navdeepkk marked 4 inline comments as done.

Address comments on diff 308060

bondhugula accepted this revision. Dec 7 2020, 1:26 AM
This revision was automatically updated to reflect the committed changes.

Hey guys,
I found a problem with this pass; if you feed it the following loop:

func @main(%arg0: memref<10xf32>, %x: f32) {
  %c0 = constant 0 : index
  %c10 = constant 0 : index
  affine.for %i = affine_map<(d0) -> (d0)>(%c0) to affine_map<(d0) -> (d0)>(%c10) {
    affine.store %x, %arg0[%i] : memref<10xf32>
  }
  return
}

then it will be collapsed to a zero-iteration loop:

#map0 = affine_map<(d0, d1) -> (0)>
#map1 = affine_map<(d0, d1) -> (d1 + d0)>
module  {
  func @main(%arg0: memref<10xf32>, %arg1: f32) {
    %c0 = constant 0 : index
    %c0_0 = constant 0 : index
    affine.for %arg2 = 0 to #map0(%c0_0, %c0) {
      %0 = affine.apply #map1(%c0_0, %arg2)
      affine.store %arg1, %arg0[%0] : memref<10xf32>
    }
    return
  }
}

As a workaround it is possible to run canonicalization before normalization.
Should I report it somewhere like bugzilla (it doesn't seem to be very active though)?

Hi sgrechanik,
Thanks for pointing this out. What you are saying is correct, but your test case needs a slight correction: %c10 needs to be 10, not 0. For now, we can run --canonicalize before running this pass, although I will try to incorporate this into the pass itself.