This is an archive of the discontinued LLVM Phabricator instance.

[LoopInterchange] Prevent Loop Interchange for non-affine value store to affine access
Needs ReviewPublic

Authored by geng on Oct 28 2020, 2:31 AM.

Details

Summary
In the following case
  #include <stdio.h>
  int a;
  int c[4][4];

  void test_deps() {
      for (int i = 0; i <= 3; i++) {
          for (int j = 0; j <= 3; j++) {
              a ^= 0x1000;
              c[j][i] = a;
          }
      }
  }

  main() {
      test_deps();
      for (int k = 0; k < 4; k++)
      printf("%d\n", c[0][k]);
  }

If we do the loop interchange, there will be correctness issues. However, the loop interchange can not be prevent by dependence matrix. The reason is that we have only one multi-dimensional array access c[j][i] within the loop, other accesses are scalar access. This patch is supposed to prevent loop interchange from the above case. We think the root cause of correctness issue is that the non-affine value (scalar) is stored to affine address (multi-dimensional array access).

This patch closes https://bugs.llvm.org/show_bug.cgi?id=47915

Diff Detail

Event Timeline

geng created this revision.Oct 28 2020, 2:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2020, 2:31 AM
geng requested review of this revision.Oct 28 2020, 2:31 AM
fhahn added a comment.Oct 28 2020, 1:13 PM

Thanks for the patch. Some initial comments inline.

llvm/lib/Transforms/Scalar/LoopInterchange.cpp
951

nit: re-flow comments

960

Nit: start comment with a capital letter, end with period (.).

961

Usually early returns are used to limit the indentation level. Could this just be something like

if (!isa<Instruction>(*Op) || OutLoop->isLoopInvariant(Op))
  return false;

and drop the else { block?

983

just return the expression? It would also be good to add a brief comment explaining the condition.

991

could this just be something like return any_of(I->operands(), [](Use &U) {});?

llvm/test/Transforms/LoopInterchange/non-affine-store-to-affine.ll
2

you need to add ; REQUIRES: asserts when using -debug, so the test is not run if assertions. are disabled.

42

A few small suggestions for the test case.

it would be helpful to update the names of the BBs to include more information, e.g. outer.header, inner.body or something like that. And re-order them in the file so they are in a more natural order like preheader, outer header, inner, outer.latch, exit.

Might also be good to have an exit block for the loop nest that returns and maybe clean up the names of the values (e.g. use %j for %indvars.iv.i %j.next for the step).

geng updated this revision to Diff 301533.EditedOct 29 2020, 1:02 AM

Thanks for the comments! Changed the code according to the initial comments

fhahn added a comment.Nov 9 2020, 12:26 PM

Thanks for the update!

llvm/lib/Transforms/Scalar/LoopInterchange.cpp
969

I'm not sure I entirely follow the logic here. Can we just check if the PHI is affine by getting the SCEV and checking if it is an affine addrec? also, we might check sibling loops that are not related by using subloops? (granted, sibling sub-loops are not supported at the moment.) If we would start at InnerLoop and traverse the getParent chain until we reach OuterLoop, we should only visit the relevant loops?

Would it make sense to flip the logic to instead of looking backwards to just iterate over a worklist starting at affine PHIs in inner and outer and then propagate the affine property to all uses and add newly discovered affine nodes to the worklist? Might need a bit of extra logic to deal with nodes where the operands are still pending.

Then we just need to check the set of visited node to see if we are dealing with affine operands for the stores?

jdoerfert requested changes to this revision.Nov 9 2020, 12:51 PM
jdoerfert added a subscriber: jdoerfert.

I don't understand the reasoning here, even if I assume for a moment the underlying idea is sound, the patch doesn't seem to be.
For example, it will only check stores of a certain shape, why can we store anything to non-affine accesses? What about non-store accesses?
This looks too much like ad-hoc pattern matching.

The problem, as I see it, is not about "storing" but the fact that the loop carried dependence through a (caused by a ^= 0x1000;) is not taken into account properly. While by itself it looks like a reduction dependency (see for example [0]), it cannot be ignored as intermediate values escape. I haven't read all of loop interchange to determine where/why the dependency is broken, but that is the reason this breaks.

A sketch for an example without a "store of a non-affine value" would look something like this:

void test_deps() {
    for (int i = 0; i <= 3; i++) {
        for (int j = 0; j <= 3; j++) {
            a ^= 0x1000;
            if (a == ...)
              c[j][i] = 42;
        }
    }
}

[0] https://arxiv.org/abs/1505.07716

llvm/lib/Transforms/Scalar/LoopInterchange.cpp
1020

This breaks out of this loop with both true and false return value. That cannot be correct.

This revision now requires changes to proceed.Nov 9 2020, 12:51 PM
mdchen added a subscriber: mdchen.Nov 24 2020, 3:39 AM
geng updated this revision to Diff 308272.Nov 30 2020, 12:15 AM
geng marked 7 inline comments as done.