This is an archive of the discontinued LLVM Phabricator instance.

[LAA] Avoid non-determinism due to blocks order. PR56672
ClosedPublic

Authored by mkazantsev on Jul 25 2022, 6:29 AM.

Details

Summary

As test in PR56672 shows, LAA produces different results which lead to either
positive or negative vectorization decisions depending on the order of blocks
in loop. The exact reason of this is not clear to me, however this makes investigation
of related bugs extremely complex.

Current order of blocks in the loop is arbitrary. It may change, for example, if loop
info analysis is dropped and recomputed. Seems that it interferes with LAA's logic.
This patch fixates the traversal order of blocks in loops, making it RPOT.

Diff Detail

Event Timeline

mkazantsev created this revision.Jul 25 2022, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 6:29 AM
mkazantsev requested review of this revision.Jul 25 2022, 6:29 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 6:29 AM
This revision is now accepted and ready to land.Jul 25 2022, 7:45 AM
fhahn requested changes to this revision.Jul 25 2022, 7:53 AM

It would be good to clean up the test first.

llvm/test/Transforms/LoopVectorize/X86/pr56672.ll
2 ↗(On Diff #447307)

As this test a LAA issue, could you just use print<loop-accesses> instead?

46 ↗(On Diff #447307)

adressspace not needed.

48 ↗(On Diff #447307)

please replace the undef to avoid the test having UB

54 ↗(On Diff #447307)

Please avoid loads from null, as this is UB.

This revision now requires changes to proceed.Jul 25 2022, 7:53 AM

I am not sure this patch completely fixes the underlying issue causing the mis-compile. I should know more by tomorrow and also be able to add a few cleaned up test cases.

llvm/test/Transforms/LoopVectorize/X86/pr56672.ll
2 ↗(On Diff #447307)

use print<loop-accesses>

The pass name is actually print-access-info, which should probably be fixed.

mkazantsev updated this revision to Diff 447571.EditedJul 25 2022, 10:51 PM
mkazantsev marked 5 inline comments as done.

Updated test as @fhahn has proposed and moved it to LAA tests. Floran, if you can take a look into it and confirm or disprove this is a correct fix, I'd appreciate that. I'm not familiar with LAA's algorithm, so I'm just seeing this as blackbox which behaves non-deterministically depending on blocks order, and I don't know if it's expected or not. If there is a more targeted fix, that'd be nice. I'll hold this patch for a while.

fhahn accepted this revision.Jul 27 2022, 10:23 AM

Updated test as @fhahn has proposed and moved it to LAA tests. Floran, if you can take a look into it and confirm or disprove this is a correct fix, I'd appreciate that. I'm not familiar with LAA's algorithm, so I'm just seeing this as blackbox which behaves non-deterministically depending on blocks order, and I don't know if it's expected or not. If there is a more targeted fix, that'd be nice. I'll hold this patch for a while.

Thanks for your patience! After taking a closer look, it seems like the patch is a good first step and clearly improves the current situation. Fixing the remaining issue will require more time, and visiting the blocks in deterministic order is helpful in general.

LGTM, thanks!

llvm/test/Analysis/LoopAccessAnalysis/pr56672.ll
6

nit: shouldn't be needed

8

nit: Remove reference to vectorizer?

20

nit: test case might be slightly easier to read if the block would come after the loop header (%bb4), potentially also with more descriptive names for the blocks.

33

nit: could this just be an unconditional branch? If not, bb10 could be removed.

This revision is now accepted and ready to land.Jul 27 2022, 10:23 AM

In general, I prefer not to refer to any issue that involves different inputs leading to different outputs as "non-determinism". It might be bad for various reasons to make the output sensitive to minor details of the input, but actual non-determinism is a much more serious problem because the output isn't reproducible.

In general, I prefer not to refer to any issue that involves different inputs leading to different outputs as "non-determinism". It might be bad for various reasons to make the output sensitive to minor details of the input, but actual non-determinism is a much more serious problem because the output isn't reproducible.

Agreed here. Better wording would be to say that we're removing a block order sensitivity.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
2131–2135

nit: replace deterministic with "fixed".

mkazantsev marked an inline comment as done.Jul 27 2022, 11:15 PM
mkazantsev added inline comments.
llvm/test/Analysis/LoopAccessAnalysis/pr56672.ll
6

Not sure, but let's try to remove.

8

Ok

20

No, non-standard blocks order is what seems to be causing the bug. We can't rearrange them. Renamin will do.

33

Yup, we can.

mkazantsev marked an inline comment as done.Jul 27 2022, 11:17 PM
mkazantsev added inline comments.
llvm/test/Analysis/LoopAccessAnalysis/pr56672.ll
33

Rrrr, scratch that. Thank you for finding this. No, removal of this branch still reproduces the bug, but the fix doesn't help for it.

mkazantsev added inline comments.Jul 27 2022, 11:21 PM
llvm/test/Analysis/LoopAccessAnalysis/pr56672.ll
33

Let's frame this patch as "algo depends on blocks order" issue. I will file this separately, along with modified test. The answer for now is "we can't remove this to show *this* bug".

This revision was landed with ongoing or failed builds.Jul 27 2022, 11:37 PM
This revision was automatically updated to reflect the committed changes.