Page MenuHomePhabricator

Fix EarlyCSE to intersect aliasing metadata.
Needs ReviewPublic

Authored by sheredom on Jan 24 2020, 3:52 AM.

Details

Summary

Further to my post on the mailing list, EarlyCSE was not merging the aliasing information of two loads/stores that it wanted to fold, meaning that it was random (program order) which one survived. This meant that aliasing information would be sometimes propagated from inlined-restrict functions, when it should not be.

Diff Detail

Event Timeline

sheredom created this revision.Jan 24 2020, 3:52 AM
nlopes added a subscriber: nlopes.Jan 24 2020, 8:08 AM

I'm not familiar with the code of this pass, but is there a cheap way of identifying that the two operations are in the same basic block?
If so, you could take the intersection of the aliasing information rather than the union. Because if both ops are guaranteed to execute then the tightest aliasing still has to hold.
(being in the same BB doesn't imply that both instructions are executed, but there's code in ValueTracking perhaps that can check that)

nikic added a subscriber: nikic.Jan 24 2020, 8:19 AM

Can you explain in more detail why the current behavior is incorrect? Why is using the aliasing information in program order wrong? If the first load is noalias, then we should be able to deduplicate to that noalias load. If the second load is noalias, then generally we can't assume the first load is also noalias. Unless there is a must-exec relation between them, as @nlopes mentioned.

The patch looks ok to me. Some extra testcases to ensure that common !noalias / !alias.scope metadata is kept could be useful ?

Can you explain in more detail why the current behavior is incorrect? Why is using the aliasing information in program order wrong? If the first load is noalias, then we should be able to deduplicate to that noalias load. If the second load is noalias, then generally we can't assume the first load is also noalias. Unless there is a must-exec relation between them, as @nlopes mentioned.

Hal Finkel provided following example on the mailing list:

int * restrict r = a;
...
int x = noaliasing ? *r : *a;

This shows that the intersection of '*r' and '*a' must be taken.

Can you explain in more detail why the current behavior is incorrect? Why is using the aliasing information in program order wrong? If the first load is noalias, then we should be able to deduplicate to that noalias load. If the second load is noalias, then generally we can't assume the first load is also noalias. Unless there is a must-exec relation between them, as @nlopes mentioned.

We have two issues here;

  1. For any pass, if it changes any instruction with metadata that it does not understand, it should drop that metadata. That's the contract that we need to maintain.
  2. For this AA metadata in particular, we need to intersect the metadata, and this is true to preserve semantics for almost all other known metadata (AA, profiling, fp-precision, debug) although likely for it is not correct to keep only one, or the other, because of potential dynamic dependencies on the results.

For AA, as an example, we can have something like this:

int * restrict r = a;
int x = should_alias ? *a : *r;

(essentially the same is true using casts as TBAA).

In any case, we should be calling the function llvm::combineMetadataForCSE here. Please update the patch to do that.

nikic added a comment.Jan 24 2020, 8:58 AM

Hal Finkel provided following example on the mailing list:

int * restrict r = a;
...
int x = noaliasing ? *r : *a;

This shows that the intersection of '*r' and '*a' must be taken.

So, I guess that the reasoning here is that noalias loads are speculatable and noalias violation is thus poison (not immediate UB), so we can't make inferences from one load to the other. Does that capture it?

Hal Finkel provided following example on the mailing list:

int * restrict r = a;
...
int x = noaliasing ? *r : *a;

This shows that the intersection of '*r' and '*a' must be taken.

So, I guess that the reasoning here is that noalias loads are speculatable and noalias violation is thus poison (not immediate UB), so we can't make inferences from one load to the other. Does that capture it?

Yes. They don't need to be speculatable, but they certainly could be.

Can you explain in more detail why the current behavior is incorrect? Why is using the aliasing information in program order wrong? If the first load is noalias, then we should be able to deduplicate to that noalias load. If the second load is noalias, then generally we can't assume the first load is also noalias. Unless there is a must-exec relation between them, as @nlopes mentioned.

We have two issues here;

  1. For any pass, if it changes any instruction with metadata that it does not understand, it should drop that metadata. That's the contract that we need to maintain.
  2. For this AA metadata in particular, we need to intersect the metadata, and this is true to preserve semantics for almost all other known metadata (AA, profiling, fp-precision, debug) although likely for it is not correct to keep only one, or the other, because of potential dynamic dependencies on the results.

    For AA, as an example, we can have something like this:

    int * restrict r = a; int x = should_alias ? *a : *r;

    (essentially the same is true using casts as TBAA).

    In any case, we should be calling the function llvm::combineMetadataForCSE here. Please update the patch to do that.

So when I use combineMetadataForCSE it breaks the invariant loads tests because the code there just says 'if one thing is invariant load and the other is not, throw the invariant load metadata away'.

What would be the correct fix for this? I don't know a ton about the intracacies of invariant loads.

Can you explain in more detail why the current behavior is incorrect? Why is using the aliasing information in program order wrong? If the first load is noalias, then we should be able to deduplicate to that noalias load. If the second load is noalias, then generally we can't assume the first load is also noalias. Unless there is a must-exec relation between them, as @nlopes mentioned.

We have two issues here;

  1. For any pass, if it changes any instruction with metadata that it does not understand, it should drop that metadata. That's the contract that we need to maintain.
  2. For this AA metadata in particular, we need to intersect the metadata, and this is true to preserve semantics for almost all other known metadata (AA, profiling, fp-precision, debug) although likely for it is not correct to keep only one, or the other, because of potential dynamic dependencies on the results.

    For AA, as an example, we can have something like this:

    int * restrict r = a; int x = should_alias ? *a : *r;

    (essentially the same is true using casts as TBAA).

    In any case, we should be calling the function llvm::combineMetadataForCSE here. Please update the patch to do that.

So when I use combineMetadataForCSE it breaks the invariant loads tests because the code there just says 'if one thing is invariant load and the other is not, throw the invariant load metadata away'.

What would be the correct fix for this? I don't know a ton about the intracacies of invariant loads.

I believe that invariant loads are a bit special because the invariantness of a load is not allowed to have any control dependencies. The LangRef says, "If a load instruction tagged with the !invariant.load metadata is executed, the optimizer may assume the memory location referenced by the load contains the same value at all points in the program where the memory location is known to be dereferenceable; otherwise, the behavior is undefined." Thus, I think that it's valid for (invariant + nothing) to be (invariant) so long as both are guaranteed to be executed.

I would, however, do this as two separate patches. One which updates EarlyCSE to use combineMetadataForCSE (and XFAILs the test), and a second which proposes to update combineMetadataForCSE and fixes the test (and then we can figure out in that review what to do, because we may need a dominance check or a guaranteed-to-execute check or similar).

sheredom updated this revision to Diff 240896.Jan 28 2020, 8:40 AM

Use combineMetadataForCSE to combine the metadata.

hfinkel added inline comments.Feb 18 2020, 8:50 AM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
1099

No { } around single statement if.

1100

Why is DoesKMove true here? OpInst dominated Inst in the original IR, no?

sheredom marked an inline comment as done.Feb 19 2020, 2:25 AM
sheredom added inline comments.
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
1100

My understanding (albeit naive) of combineMetadataForCSE comes from the fact that at the definition the parameter is called KDominatesJ - which for the load case K does dominate J (whereas for the store case it does not, so I set it false there).

The fact the declaration and definition have different terms for the same boolean seems troublesome now I think about it.

hfinkel added inline comments.Feb 28 2020, 7:34 PM
llvm/lib/Transforms/Scalar/EarlyCSE.cpp
1100

The fact the declaration and definition have different terms for the same boolean seems troublesome now I think about it.

Yes, and I'm confused by the naming here. The comments in llvm/include/llvm/Transforms/Utils/Local.h say, "Some metadata can only be kept if K dominates J. For this to be correct, K cannot be hoisted." The comment on combineMetadata likewise says, "Some metadata kinds can only be kept if K does not move, meaning it dominated J in the original IR." The parameter is named DoesKMove, implying that when false, K did not originally dominate J. However, in Local.cpp, the parameter to combineMetadataForCSE is named KDominatesJ. This seems backward (i.e., DoesKMove seems like it should be false if K dominates J). In any case, the implementation of llvm::combineMetadata seems to do the more conservative thing when DoesKMove is true (when DoesKMove is false, it will prefer the metadata from K in some cases).

Thus, I think that it should be false here.