This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Propagate some func/arg/ret attributes from caller to callsite
AbandonedPublic

Authored by goldstein.w.n on Jun 1 2023, 2:42 PM.

Details

Summary

Some function attributes imply the attribute for all/some instructions
in the function. These attributes can be safely propagated to
callsites within the function that are missing the attribute. This can
be useful if the original caller is later inlined, as if the
attributes are not propagated, they will be lost.

This patch only implements the attributes can can be blindly propagated.

Function attributes:

  • mustprogress, willreturn

If any of the above are true for the function, they MUST apply to any
callsite in the function (along with any other instruction).

Function access attributes:

  • readonly, readnone, writeonly

These can be applied to a callsite if all the callsite arguments'
underlying objects are globals or arguments to the caller. In this
case, since the callsite has the same visible pointer set as the
caller, the gurantees map 1-1.

Argument attributes:

  • noundef, nonnull, nofree, readonly, readnone, writeonly

If any of the above are true for an argument to the caller it will
hold true as an argument for the callsite.

Return attributes:

  • noundef, nonnull

IFF the callsite's value is used as the caller's return value, we can
propagate those attributes.

Diff Detail

Event Timeline

goldstein.w.n created this revision.Jun 1 2023, 2:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2023, 2:42 PM
goldstein.w.n requested review of this revision.Jun 1 2023, 2:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2023, 2:42 PM
nikic requested changes to this revision.Jun 1 2023, 2:59 PM

This patch only implements the attributes can can be blindly propagated.

At least the function-level nofree and memory attributes cannot be propagated. E.g. a function that allocates and frees internally is nofree. Writes and reads to allocas are memory(none) from a function perspective, etc. You can see multiple miscompiles due to this in the test diffs.

More generally, I'm not sure this code belongs in InstCombine.

This revision now requires changes to proceed.Jun 1 2023, 2:59 PM

This patch only implements the attributes can can be blindly propagated.

At least the function-level nofree and memory attributes cannot be propagated. E.g. a function that allocates and frees internally is nofree. Writes and reads to allocas are memory(none) from a function perspective, etc. You can see multiple miscompiles due to this in the test diffs.

Will drop nofree.

Ahh, think the memory attrs would also apply for malloc/free memory even if malloc/free is local to function so can apply if all arguments are non-alloca base. If all args non-alloca base then can apply to full function.

More generally, I'm not sure this code belongs in InstCombine.

I somewhat agree although I'm not sure where else? I thought about putting it in the attributor but 1) the attributor is not enabled (and its not clear when/if that will happen) and 2) the attributor seems to be for IPO which doesn't really describe this well imo. Figured since its an exact transform on the specific instructions for the sake of other optimization passes instcombine fits the mold reasonable well.

llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll
217 ↗(On Diff #527632)

The reason for these being deleted is the functions are marked readnone. I think that attribute is probably a mistake but don't think this is buggy to delete.

goldstein.w.n edited the summary of this revision. (Show Details)Jun 1 2023, 10:02 PM

@nikic fixed the nofree issue.

Also made it so that readnone, readonly, and writeonly are only passed on to the callsite iff all the arguments to the callsite have underlying objects of either function argument or global.

All is green on llvm-test-suite (although there were some failures before).
Since I seem to have a pattern of under thinking these attribute level changes will make sure to test them all on llvm-test-suite before submitting more changes / between patch iterations.

fhahn added a subscriber: fhahn.Jun 3 2023, 1:22 PM

FYI I shared D152081 which is a different option to enable callsite propagation and it might be a good place to discuss where to best solve the problem

nikic added a comment.Jun 4 2023, 7:42 AM

I somewhat agree although I'm not sure where else? I thought about putting it in the attributor but 1) the attributor is not enabled (and its not clear when/if that will happen) and 2) the attributor seems to be for IPO which doesn't really describe this well imo. Figured since its an exact transform on the specific instructions for the sake of other optimization passes instcombine fits the mold reasonable well.

FunctionAttrs seems like a reasonable place to put it?

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3652

Both of these cannot be null in this context.

3671

These attributes only exist on arguments, not functions.

3725

This is incorrect for UB-implying attributes like noundef, if the call is not guaranteed to transfer to the return.

3729

BB->getTerminator() cannot be null.

I somewhat agree although I'm not sure where else? I thought about putting it in the attributor but 1) the attributor is not enabled (and its not clear when/if that will happen) and 2) the attributor seems to be for IPO which doesn't really describe this well imo. Figured since its an exact transform on the specific instructions for the sake of other optimization passes instcombine fits the mold reasonable well.

FunctionAttrs seems like a reasonable place to put it?

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

goldstein.w.n added inline comments.Jun 4 2023, 10:21 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3725

We check that the return of the callsite is the return value of the function. Are you are worried about the return value escaping then an exception or interrupt between the callsite and return?

I somewhat agree although I'm not sure where else? I thought about putting it in the attributor but 1) the attributor is not enabled (and its not clear when/if that will happen) and 2) the attributor seems to be for IPO which doesn't really describe this well imo. Figured since its an exact transform on the specific instructions for the sake of other optimization passes instcombine fits the mold reasonable well.

FunctionAttrs seems like a reasonable place to put it?

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

Also I think then FunctionAttrs need to be moved to run before inlining:

./bin/opt -O3 -print-pipeline-passes
...cgscc(devirt<4>(inline<only-mandatory>,inline,function-attrs<skip-non-recursive>...
nikic added a comment.Jun 4 2023, 11:43 AM

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

I think it's pretty unlikely that the Attributor will be enabled by default anytime soon, so I wouldn't wait on that. I expect a substantial investment in improving Attributor compile-times will be needed to make this even remotely viable.

Also I think then FunctionAttrs need to be moved to run before inlining:

./bin/opt -O3 -print-pipeline-passes
...cgscc(devirt<4>(inline<only-mandatory>,inline,function-attrs<skip-non-recursive>...

Unless you care about non-trivial SCCs, it does run before inlining due to CGSCC interleaving.

An alternative to FunctionAttrs is doing this directly in InlineFunction -- in fact, we already transfer return attributes there in https://github.com/llvm/llvm-project/blob/36f351098cd50809658493d9b2e22a795874bab0/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1366. This could be extended to other attribute kinds. The benefit of doing it during inlining is that it's possible to transfer attributes from the call-site, rather than just the definition, though I'm not sure this is relevant for the cases you're interested in. You want this for the case where there are frontend-generated attributes on the definition, right?

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3725

Yes.

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

I think it's pretty unlikely that the Attributor will be enabled by default anytime soon, so I wouldn't wait on that. I expect a substantial investment in improving Attributor compile-times will be needed to make this even remotely viable.

Also I think then FunctionAttrs need to be moved to run before inlining:

./bin/opt -O3 -print-pipeline-passes
...cgscc(devirt<4>(inline<only-mandatory>,inline,function-attrs<skip-non-recursive>...

Unless you care about non-trivial SCCs, it does run before inlining due to CGSCC interleaving.

An alternative to FunctionAttrs is doing this directly in InlineFunction -- in fact, we already transfer return attributes there in https://github.com/llvm/llvm-project/blob/36f351098cd50809658493d9b2e22a795874bab0/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1366. This could be extended to other attribute kinds. The benefit of doing it during inlining is that it's possible to transfer attributes from the call-site, rather than just the definition, though I'm not sure this is relevant for the cases you're interested in. You want this for the case where there are frontend-generated attributes on the definition, right?

The motivation is for me was the loss of user-specified attributes, but could imagine other cases this is helpful.
How does making it an analysis function (not sure which file), something like "getPropagatedAttributesToCallsite(ArgAttrsOut, FuncAttrsOut, RetAttrsOut)" so it can be used in inline/functionAttrs.
My only concern with doing inline only is I think something is 'isSpeculativelyExecutable' when running over a readnone function will miss that the callsite is now implied readnone (assuming no allocas) which could also potentially be useful.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
3725

is that not also a concern of nonnull?

fhahn added a comment.Jun 4 2023, 12:55 PM

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

I think it's pretty unlikely that the Attributor will be enabled by default anytime soon, so I wouldn't wait on that. I expect a substantial investment in improving Attributor compile-times will be needed to make this even remotely viable.

Yeah, unfortunately I was hoping for more encouraging compile-time data.

fhahn added a comment.Jun 4 2023, 1:05 PM

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

I think it's pretty unlikely that the Attributor will be enabled by default anytime soon, so I wouldn't wait on that. I expect a substantial investment in improving Attributor compile-times will be needed to make this even remotely viable.

Yeah, unfortunately I was hoping for more encouraging compile-time data.

But I think we should give @jdoerfert some time to chime in, in case there are more knobs to improve compile-time the current patch isn't using.

At the moment FunctionAttrs doesn't do any callsite modifications, only the functions themselves. If you think it fits, however, I'd be okay extending FunctionAttrs. Same for nocapture patch? Also should I wait on D152081 or since thats too the attributor still makes sense to try and get something into FunctionAttrs?

I think it's pretty unlikely that the Attributor will be enabled by default anytime soon, so I wouldn't wait on that. I expect a substantial investment in improving Attributor compile-times will be needed to make this even remotely viable.

Yeah, unfortunately I was hoping for more encouraging compile-time data.

I see. Well I'm working on porting to helper that can be used in inliner/functionattrs (and I guess attributor too). Ill test compile time of both and see what makes sense.

preliminary results for adding callsite attribute to functionattrs:
https://llvm-compile-time-tracker.com/compare.php?from=94407e1bba9807193afde61c56b6125c0fc0b1d1&to=9a4f809588d2da4b58368469daa7b96322b7a540&stat=instructions%3Au

regressions [.05, .37] for -O3 and -O3thinlto which isn't so bad. Proper LTO has multi percent regressions (4% tramp3d-v4).

nikic requested changes to this revision.Jun 6 2023, 8:10 AM

(Removing from queue as superseded by D152226.)

This revision now requires changes to proceed.Jun 6 2023, 8:10 AM
goldstein.w.n abandoned this revision.Jun 6 2023, 10:15 AM