This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Fix alias checking with potential later stack reuse
AbandonedPublic

Authored by yonghong-song on Oct 9 2020, 10:24 AM.

Details

Summary

This is to address the bug:

https://bugs.llvm.org/show_bug.cgi?id=47591

Currently, when selection dag tries to check whether two frame index
accesses alias to each other, it assumes it cannot alias if for two
different frame indices at least one is not fixed. This does not take into
account later frames can be reused if they are disjoint.

This issue is exposed to BPF as it uses ILP scheduler and x86 does
not have issue as it uses Source. Even x86 uses ILP, it does not
expose the issue as its target code is different from BPF.
I have a BPF backend patch to also use Source scheduler

https://reviews.llvm.org/D88525

and it fixed the issue. But I feel this is not the right fix and
it appears to me that the optimized dag does not sound right.

The patch fixed the issue by assuming two frames may alias due to
possible future stack reuse. The fix caused some failures
for X86 target due to different code sequence. Will fix tests
once we got consensus about what is the right fix for this problem.

TODO: fix tests
tested with llvm/tests and found the following failures on x86:

Failed Tests (25):
  LLVM :: CodeGen/X86/2008-05-12-tailmerge-5.ll
  LLVM :: CodeGen/X86/alias-static-alloca.ll
  LLVM :: CodeGen/X86/arg-copy-elide-win64.ll
  LLVM :: CodeGen/X86/atomic-fp.ll
  LLVM :: CodeGen/X86/atomic-mi.ll
  ...

Diff Detail

Event Timeline

yonghong-song created this revision.Oct 9 2020, 10:24 AM
Herald added a project: Restricted Project. · View Herald Transcript
yonghong-song requested review of this revision.Oct 9 2020, 10:24 AM

ping @courbet @niravd, any comments on this patch?

niravd added inline comments.Oct 14 2020, 8:39 AM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGAddressAnalysis.cpp
116

Removing this would be a bit of a shame as it has some nice improvements in other backends. (Also, is it only showing up in BPF or can you replicate this on X86/ARM?

I don't have the best knowledge on FrameInfo structures, so we should probably loop in someone who gets that better, but we should see if we can extract disjointed information out of the frame info.

At the very least, we should be able to retain that they can't alias if exactly one is fixed. However, the vast majority of improvements from this was catching disjointness of two separate allocations to see if we can keep this.

130

Can you change this so we check the two Index value match no matter what if we show non-aliasing? We really should check the both base and Index if we're going to show non-aliasing.

if (((BasePtr0.getIndex() == BasePtr1.getIndex()) &&

((IsFI0 != IsFI1) || (IsGV0 != IsGV1) || (IsCV0 != IsCV1)) &&
yonghong-song added inline comments.Oct 15 2020, 12:59 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGAddressAnalysis.cpp
116

Removing this would be a bit of a shame as it has some nice improvements in other backends. (Also, is it only showing up in BPF or can you replicate this on X86/ARM?

I did not check ARM, but the issue did not show up on X86. X86 implemented enableMachineScheduler() returning true, and this ensures
Source scheduler is used for selection dag,
createDefaultScheduler():

...
 if (OptLevel == CodeGenOpt::None ||
     (ST.enableMachineScheduler() &&   ST.enableMachineSchedDefaultSched()) ||
     TLI->getSchedulingPreference() == Sched::Source)
   return createSourceListDAGScheduler(IS, OptLevel);

...

For Source dag scheduler, if there are multiple choices, it perfers to follow lexicographic order. So this won't be an issue for X86.

ARM and AArch64 has similar override so it also uses Source dag scheduler and it won't have issue either.

I have a similar patch for BPF backend to also use Source scheduler:

https://reviews.llvm.org/D88525

But I feel that does not fix the root cause, so hence this patch. But Source scheduler should fix the issue for disjoint allocations since even these allocations are reused, Source schedule ensures early load/stores happen before later load/stores...

I don't have the best knowledge on FrameInfo structures, so we should probably loop in someone who gets that better, but we should see if we can extract disjointed information out of the frame info.

At the very least, we should be able to retain that they can't alias if exactly one is fixed. However, the vast majority of improvements from this was catching disjointness of two separate allocations to see if we can keep this.

Yes, the majority cases should be from two separate allocations. But theoretically separate allocations can be reused as long as lifetime start/end are honored for allocations, right? There are some x86 tests failed with my patch, I could spend some time on these failed tests to find patterns so we can try to maintain good cases...

130

Yes, I can do this.

niravd added a comment.EditedOct 15 2020, 1:25 PM

As an alternative, we could update ImproveChain (and visitLIFETIME_END) to limit the aliasing around lifetime_start / end to disallow improving the chain dependence of a mem op node from a different lifetime node that may alias. That should prohibit access from two aliasable frame indices from being concurrent while still allowing us to leverage that disjoint allocs should be disjoint.

As an alternative, we could update ImproveChain (and visitLIFETIME_END) to limit the aliasing around lifetime_start / end to disallow improving the chain dependence of a mem op node from a different lifetime node that may alias. That should prohibit access from two aliasable frame indices from being concurrent while still allowing us to leverage that disjoint allocs should be disjoint.

I am not familiar with selection dag code base. Could you help draft a patch for this? I can help do some testing. Feel free to use my test case at https://reviews.llvm.org/D88525.

ecnelises resigned from this revision.Oct 16 2020, 12:01 AM
dxu added a subscriber: dxu.Nov 30 2020, 11:23 AM

As an alternative, we could update ImproveChain (and visitLIFETIME_END) to limit the aliasing
around lifetime_start / end to disallow improving the chain dependence of a mem op node
from a different lifetime node that may alias. That should prohibit access from two aliasable
frame indices from being concurrent while still allowing us to leverage that disjoint allocs should be disjoint.

@niravd Any update on this?

@niravd Looks like https://reviews.llvm.org/D91833 fixed the issue. I have created a BPF patch https://reviews.llvm.org/D92451 with a test case so if anything happens we can detect the regression. Could you take a look whether https://reviews.llvm.org/D91833 can truely fix the issue?

If D91833 indeed fixed the issue, do you think we can backport it to llvm11?

yonghong-song abandoned this revision.Dec 30 2020, 4:20 PM