This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Remove redundant loads
ClosedPublic

Authored by ayzhuang on May 27 2021, 4:55 PM.

Event Timeline

ayzhuang created this revision. May 27 2021, 4:55 PM
ayzhuang requested review of this revision. May 27 2021, 4:55 PM
bondhugula requested changes to this revision. May 30 2021, 3:08 AM

Thanks for extending this pass. Comments below.

mlir/lib/Transforms/MemRefDataFlowOpt.cpp
286

We should have a doc comment here - if needed, this can be brief and refer to the larger comment above.
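
A hypothetical sketch of the kind of brief doc comment being asked for here (the actual function name and wording at this point in the patch may differ):

/// Attempts to eliminate `loadOp` by reusing the result of an earlier load
/// from the same memref location; see the detailed description of the
/// conditions in the larger comment above.
void loadCSE(mlir::AffineReadOpInterface loadOp);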

299–312

You can actually do both in one walk, i.e., if store-to-load forwarding doesn't forward the value, immediately run loadCSE on that load. (The forwarding can be made to return a LogicalResult so the caller can check whether it happened.)
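
A minimal sketch of what this combined walk could look like, assuming forwardStoreToLoad is changed to return LogicalResult; the helper names here are placeholders and need not match the patch:

#include "mlir/Dialect/Affine/IR/AffineOps.h"
#include "mlir/IR/Operation.h"
#include "mlir/Support/LogicalResult.h"

// Assumed helpers; the real ones live in MemRefDataFlowOpt.cpp.
mlir::LogicalResult forwardStoreToLoad(mlir::AffineReadOpInterface loadOp);
void loadCSE(mlir::AffineReadOpInterface loadOp);

void forwardAndCSE(mlir::Operation *funcOp) {
  funcOp->walk([&](mlir::AffineReadOpInterface loadOp) {
    // If no store could be forwarded into this load, fall back to trying to
    // reuse the result of an earlier load from the same location.
    if (mlir::failed(forwardStoreToLoad(loadOp)))
      loadCSE(loadOp);
  });
}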

302

Comment here please. This isn't covered by the comment above.

302–303

Please use llvm::is_contained.
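
For reference, a standalone illustration of llvm::is_contained (from llvm/ADT/STLExtras.h), which replaces the explicit std::find(...) != end() comparison:

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

bool containsValue(const llvm::SmallVector<int, 4> &values, int v) {
  // Equivalent to: std::find(values.begin(), values.end(), v) != values.end()
  return llvm::is_contained(values, v);
}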

321–339

This is a large block of duplicated code that should be factored out and shared. You could have a function that given the storeOp and loadOp returns true/false as to whether the former may be reaching the latter, and this can be used to guard the collection of depSrcStores.
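
A hypothetical sketch of the shape of such a helper and how it could guard the collection of depSrcStores; the names and the actual reachability/dependence check are placeholders, not the patch's code:

#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/SmallVector.h"
#include "mlir/IR/Operation.h"

// Returns true if the value written by `storeOp` may reach `loadOp`; the real
// implementation would use the access-function and dependence checks that
// forwardStoreToLoad and loadCSE currently duplicate.
static bool mayBeReachingStore(mlir::Operation *storeOp,
                               mlir::Operation *loadOp);

static void
collectDepSrcStores(mlir::Operation *loadOp,
                    llvm::ArrayRef<mlir::Operation *> storeOps,
                    llvm::SmallVectorImpl<mlir::Operation *> &depSrcStores) {
  for (mlir::Operation *storeOp : storeOps)
    if (mayBeReachingStore(storeOp, loadOp))
      depSrcStores.push_back(storeOp);
}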

346–349

Actually, post-dominance isn't needed here - just dominance is sufficient (though it has to be used differently) and in fact leads to a less conservative condition. But this can be fixed in a follow-up patch, since the check in forwardStoreToLoad can similarly be made stronger and freed from post-dominance info. (I already have a patch and was incidentally planning to submit it, so I can take care of this in a follow-up.)

This revision now requires changes to proceed. May 30 2021, 3:08 AM
ayzhuang updated this revision to Diff 349057. Jun 1 2021, 12:23 PM

Address code review comments to use a utility function, combine walks, and add more comments.

ayzhuang marked 4 inline comments as done. Jun 1 2021, 12:37 PM
ayzhuang added inline comments.
mlir/lib/Transforms/MemRefDataFlowOpt.cpp
346–349

With your new patch, will %v1 be replaced?

affine.for %i0 = 0 to 10 {
  affine.for %i1 = 0 to 9 {
    %a1 = affine.apply affine_map<(d0, d1) -> (d1 + 1)> (%i0, %i1)
    affine.store %c7, %m[%i0, %a1] : memref<10x10xf32>
    %v0 = affine.load %m[%i0, %i1] : memref<10x10xf32>
  }
}

affine.for %i0 = 0 to 10 {
  affine.for %i1 = 0 to 9 {
    affine.store %c7, %m[%i0, %i1] : memref<10x10xf32>
    %v1 = affine.load %m[%i0, %i1] : memref<10x10xf32>
  }
}
ayzhuang updated this revision to Diff 349125. Jun 1 2021, 4:08 PM

Add shape checks needed for affine vector loads and stores.

bondhugula added inline comments. Jun 2 2021, 5:20 AM
mlir/lib/Transforms/MemRefDataFlowOpt.cpp
346–349

Yes. Put more simply, patterns like this will be handled as well:

affine.store ...
affine.for ... {
   affine.load ...
   affine.store ...
   ... = affine.load 
   ...
}

With the current pass, the store-to-load forwarding inside the nest won't happen. You don't need the last store to postdominate all other relevant stores. The new condition will similarly make load CSE more powerful.

bondhugula accepted this revision. Jun 2 2021, 5:23 AM

Thanks for addressing everything - LGTM!

This revision is now accepted and ready to land. Jun 2 2021, 5:23 AM

@bondhugula Thank you for the review. We'll merge it tomorrow if there are no more comments.

ayzhuang added inline comments. Jun 2 2021, 9:37 AM
mlir/lib/Transforms/MemRefDataFlowOpt.cpp
346–349

Great!

Very helpful! Looks great, thanks.

This revision was automatically updated to reflect the committed changes.