This is an archive of the discontinued LLVM Phabricator instance.

[InlineFunction] don't add noalias metadata for unknown objects
ClosedPublic

Authored by shchenz on Jun 7 2022, 4:38 AM.

Details

Summary

The unidentified objects recognized in getUnderlyingObjects may still alias to the noalias parameter because getUnderlyingObjects may not check deep enough to get the underlying object because of MaxLookup. The real underlying object for the unidentified object may still be the noalias parameter.

Don't add noalias metadata for these unidentified objects.

Diff Detail

Event Timeline

tingwang created this revision.Jun 7 2022, 4:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 4:38 AM
tingwang requested review of this revision.Jun 7 2022, 4:38 AM
tingwang edited the summary of this revision. (Show Details)Jun 7 2022, 4:39 AM
nikic added a reviewer: nikic.Jun 7 2022, 5:11 AM
nikic added a subscriber: nikic.

Please add a test case.

in AddAliasScopeMetadata: the default value of MaxLookup is not enough to identify the underlying object, and the alias info added by AddAliasScopeMetadata() is wrong, resulting in wrong schedule decisions in machine-scheduler.

AddAliasScopeMetadata is also buggy then..

tingwang updated this revision to Diff 435051.Jun 7 2022, 11:26 PM

Add test case for this particular issue.

nikic added inline comments.Jun 8 2022, 1:34 AM
llvm/test/Transforms/Inline/inline-noalias.ll
82 ↗(On Diff #435051)

This test case looks like it can be minimized further. Do we need any of the alias.scope/noalias metadata in the input IR? I think we're only interested in what inlining adds. All the arithmetic instructions are probably unnecessary as well. You can also switch to opaque pointer IR to avoid unnecessary types and bitcasts.

nikic added inline comments.Jun 8 2022, 1:39 AM
llvm/test/Transforms/Inline/inline-noalias.ll
82 ↗(On Diff #435051)

Here is a minimized test case that clearly shows the issue: https://llvm.godbolt.org/z/c8EhMddja

Okay, I think the core problem here is that the code assumes that the underlying object may be either a) an argument, b) an identified function local object or c) an escape source -- in the sense of only being able to alias a noalias argument if it has been captured earlier. This assumption does mostly hold up in practice, because the implementations of getUnderlyingObjects() and CaptureTracking are carefully synchronized (see in particular https://github.com/llvm/llvm-project/blob/20ca739701d7b2e2aa4028f1a7853f6d0fa0c412/llvm/lib/Analysis/ValueTracking.cpp#L4446-L4458). Looking through history, I think it might be synchronized for this specific usage actually. Of course, the assumption doesn't hold up if we stop walking underlying objects early, in which case we may alias without the argument being captured.

So this can be fixed in two ways: Avoid the assumption, or avoid stopping the walk early. For the latter case, something to be careful about is that using MaxLookup=0 also removes infinite loop protection for unreachable code. You'll want to add a test case for that. Likely we'd actually want something like D86669 as the implementation, which does avoid infinite loops.

Okay, I think the core problem here is that the code assumes that the underlying object may be either a) an argument, b) an identified function local object or c) an escape source -- in the sense of only being able to alias a noalias argument if it has been captured earlier. This assumption does mostly hold up in practice, because the implementations of getUnderlyingObjects() and CaptureTracking are carefully synchronized (see in particular https://github.com/llvm/llvm-project/blob/20ca739701d7b2e2aa4028f1a7853f6d0fa0c412/llvm/lib/Analysis/ValueTracking.cpp#L4446-L4458). Looking through history, I think it might be synchronized for this specific usage actually. Of course, the assumption doesn't hold up if we stop walking underlying objects early, in which case we may alias without the argument being captured.

So this can be fixed in two ways: Avoid the assumption, or avoid stopping the walk early. For the latter case, something to be careful about is that using MaxLookup=0 also removes infinite loop protection for unreachable code. You'll want to add a test case for that. Likely we'd actually want something like D86669 as the implementation, which does avoid infinite loops.

Looks like we need some more check on results from getUnderlyingObjects(). @efriedma proposed check like all_of(Objects, isIdentifiedObject), however there are cases for which object is not able to be identified, not due to the MaxLookup limit. I'm thinking maybe we need to know if walk stopped early or not in AddAliasScopeMetadata() to better handle this situation?

tingwang updated this revision to Diff 436272.Jun 12 2022, 10:08 PM
tingwang retitled this revision from [InlineFunction] Call to getUnderlyingObjects inside AddAliasScopeMetadata shall set MaxLookup = 0 to [InlineFunction] Handle early exit during getUnderlyingObjects due to MaxLookup.
tingwang edited the summary of this revision. (Show Details)

(1) Conditionally detect early exit during getUnderlyingObjects by invoke getUnderlyingObjects a second time. If we got new pointers, then the first walk had trouble, and be conservative in this case.
(2) Updated using nikic's simple test case.

tingwang added inline comments.Jun 12 2022, 10:11 PM
llvm/test/Transforms/Inline/inline-noalias.ll
82 ↗(On Diff #435051)

Thank you! This is much simple indeed!

tingwang added a comment.EditedJun 13 2022, 1:12 AM

I have the feeling that by adding this kind of check, I'm creating another hole which could be escaped by some other case, for example if those objects form a loop, and the loop step could be MaxLookup. I need more understand on code to decide if this check can be escaped or not.

I have the feeling that by adding this kind of check, I'm creating another hole which could be escaped by some other case, for example if those objects form a loop, and the loop step could be MaxLookup. I need more understand on code to decide if this check can be escaped or not.

I suspect that the test case which will cause infinite loop in getUnderlyingObjects() when MaxLookup = 0 will escape the proposed check, since we could get identical result from the second walk and hence the infinity. Unfortunately I do not have such case right now to observe...

tingwang abandoned this revision.Jun 16 2022, 3:25 AM
shchenz commandeered this revision.Jun 23 2022, 3:15 AM
shchenz reclaimed this revision.
shchenz updated this revision to Diff 439309.
shchenz added a reviewer: tingwang.
shchenz added inline comments.Jun 23 2022, 3:26 AM
llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll
55

The change here seems like another bug fix.

Before the patch:

%2 = bitcast i8* %0 to %f.Frame**

%FramePtr.i1 = load %f.Frame*, %f.Frame** %2, align 8, !alias.scope !3

%input.reload.addr13.i = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr.i1, i64 0, i32 2
%input.reload14.i = load i32, i32* %input.reload.addr13.i, align 4, !noalias !3
%n.val3.reload.addr11.i = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr.i1, i64 0, i32 1
%n.val3.reload12.i = load i32, i32* %n.val3.reload.addr11.i, align 4, !noalias !3

store i32 %sum7.i, i32* %n.val3.reload.addr11.i, align 4, !noalias !3
store i32 4, i32* %input.reload.addr13.i, align 4, !noalias !3

%0 is a noalias argument when do inlining. So before the change all the latter loads/stores are not aliased to the noalias argument %0, but seems it is not right. The base for the loads/stores are exactly %0 which is loaded from a pointer of pointer. The pointer of pointer is casted from %0.

getUnderlyingObjects() can not check underlying object inside loads for now.

Hi @tingwang thanks for great work for addressing the root cause. As discussed offline, I will try to fix this bug based on your analysis.

Dear reviewers, could you please help to have a look at the current solution. Thanks in advance.

shchenz added inline comments.Jun 23 2022, 3:44 AM
llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll
55

Please ignore this comments. Seems the analysis is wrong. The latter loads/stores are not load/store from base %0.

So we now lose some alias info but have correct functionality.

nikic added inline comments.Jun 23 2022, 7:45 AM
llvm/lib/Transforms/Utils/InlineFunction.cpp
1109

This looks overly aggressive to me, and probably loses most of the point of the PointerMayBeCapturedBefore check. We should be classifying objects into isIdentifiedObject and isEscapeSource, where we can use the capture-based logic for isEscapeSource. (It's currently static in BasicAliasAnalysis, but can be exported.)

shchenz updated this revision to Diff 439668.Jun 24 2022, 1:49 AM

address @nikic comment

shchenz marked an inline comment as done.Jun 24 2022, 1:51 AM
shchenz added inline comments.
llvm/lib/Transforms/Utils/InlineFunction.cpp
1109

Thanks. Updated.

nikic added a comment.Jun 24 2022, 3:44 AM
llvm/lib/Transforms/Utils/InlineFunction.cpp
1109

I think the basic logic here is now right. I still find the way the flags are set somewhat confusing though, especially that CanDeriveViaCapture is always set, even for non-escape-sources.

I think something like this would make the different cases clearer: https://gist.github.com/nikic/d900c8b2e901fc9c3ab1eb30650af9c3

llvm/test/Transforms/Coroutines/coro-retcon-opaque-ptr.ll
38

Do you know why these changes happen? (Just wondering, it looks harmless.)

shchenz updated this revision to Diff 439710.Jun 24 2022, 5:31 AM
shchenz marked 2 inline comments as done.

address @nikic comments

shchenz marked an inline comment as done.Jun 24 2022, 5:33 AM
shchenz added inline comments.
llvm/lib/Transforms/Utils/InlineFunction.cpp
1109

Haha, thanks very much. I just uploaded your patch shameless. : )

llvm/test/Transforms/Coroutines/coro-retcon-opaque-ptr.ll
38

The !noalias metadata is added when do inlining:

define i32 @main() local_unnamed_addr {
  %cont1 = call i8* @f.resume.0(i8* nonnull %.sub, i1 zeroext false)
}
define internal i8* @f.resume.0(i8* noalias nocapture nonnull align 4 dereferenceable(8) %0, i1 zeroext %1) {

  %n.val.reload.addr = bitcast i8* %0 to i32*
  %n.val.reload = load i32, i32* %n.val.reload.addr, align 4
  %inc = add i32 %n.val.reload, 1
  store i32 %inc, i32* %n.val.reload.addr, align 4

  tail call void @print(i32 %inc)
}

Before the change, underlying object %inc = add i32 %n.val.reload, 1 for tail call void @print(i32 %inc) parameter is RequiresNoCaptureBefore and PointerMayBeCapturedBefore() returns false, so there will be noalias metadata added for the function call.

But after the change, underlying object %inc = add i32 %n.val.reload, 1 is treated as a UsesUnknownObject.

Can the add (one operand is a load an the other is a constantInt) be handled in the isEscapeSource()?

nikic added inline comments.Jun 24 2022, 5:56 AM
llvm/test/Transforms/Coroutines/coro-retcon-opaque-ptr.ll
38

An add should never be an underlying object, because it's not a pointer.

I believe it gets added due to this code: https://github.com/llvm/llvm-project/blob/9081d3d8097af5e89905faf0ed568ba7e848e8df/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1046-L1052 I don't think the comment there is correct, and we should drop the IsArgMemOnlyCall && check. If a pointer is passed as an integer, then the ptrtoint will already capture it. And https://github.com/llvm/llvm-project/blob/9081d3d8097af5e89905faf0ed568ba7e848e8df/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1111-L1114 will always check for captures for non-argmemonly function calls.

shchenz marked an inline comment as done.Jun 24 2022, 6:32 AM
shchenz added inline comments.
llvm/test/Transforms/Coroutines/coro-retcon-opaque-ptr.ll
38

Yes, should I posted another patch to fix this issue? I think this fix should not reply on this patch, right? We can just ignore the non-pointer arguments in the PtrArgs collection phase and PointerMayBeCapturedBefore() will always check captures for the parameter even it is not a pointer for a non-argmemonly function calls.

nikic added inline comments.Jun 24 2022, 6:35 AM
llvm/test/Transforms/Coroutines/coro-retcon-opaque-ptr.ll
38

Yeah, sounds good.

shchenz added inline comments.
llvm/test/Transforms/Coroutines/coro-retcon-opaque-ptr.ll
38

Posted D128529. Thanks.

Gentle ping.

The impact regression cases are all back in D128529 suggested by @nikic

nikic added a comment.Jun 30 2022, 2:14 AM

@shchenz Can you please land D128529 first and then rebase this change?

This comment was removed by shchenz.
nikic accepted this revision.Jun 30 2022, 3:20 AM

LGTM, but the patch description could use an update.

This revision is now accepted and ready to land.Jun 30 2022, 3:20 AM
shchenz retitled this revision from [InlineFunction] Handle early exit during getUnderlyingObjects due to MaxLookup to [InlineFunction] don't add noalias metadata for unknown objects.Jun 30 2022, 10:53 PM
shchenz edited the summary of this revision. (Show Details)
jeroen.dobbelaere requested changes to this revision.Jun 30 2022, 11:16 PM
jeroen.dobbelaere added inline comments.
llvm/test/Transforms/Inline/inline-noalias-unidentify-object.ll
2

You'll need to add the option to check full lines, otherwise the check on line 17 (store i32...) will still match a store containing a noalias

This revision now requires changes to proceed.Jun 30 2022, 11:16 PM
This revision was not accepted when it landed; it landed in state Needs Revision.Jun 30 2022, 11:17 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
shchenz marked an inline comment as done.Jun 30 2022, 11:38 PM
shchenz added inline comments.
llvm/test/Transforms/Inline/inline-noalias-unidentify-object.ll
2

92a3e1b5c9e7ddd1b981393eac391a3745eff9a8 is committed to address this comment.

Sorry, I didn't noticed this comment when I committed the patch, seems we are almost at the same time : )

llvm/test/Transforms/Inline/inline-noalias-unidentify-object.ll
2

Thx !