This is an archive of the discontinued LLVM Phabricator instance.

[OpaquePtr] Remove checking pointee type for byval/preallocated type
ClosedPublic

Authored by aeubanks on Jun 21 2021, 12:26 PM.

Details

Summary

These currently always require a type parameter. The bitcode reader
already upgrades old bitcode without the type parameter to use the
pointee type.

In cases where the caller does not have byval but the callee does, we
need to follow CallBase::paramHasAttr() and also look at the callee for
the byval type so that CallBase::isByValArgument() and
CallBase::getParamByValType() are in sync. Do the same for preallocated.

While we're here add a corresponding version for inalloca since we'll
need it soon.

Diff Detail

Event Timeline

aeubanks created this revision.Jun 21 2021, 12:26 PM
aeubanks requested review of this revision.Jun 21 2021, 12:26 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2021, 12:26 PM
dblaikie added a subscriber: rnk.Jun 21 2021, 2:29 PM

"arguably incorrect IR" isn't a phrase that is especially comforting. It is or isn't, and should have well defined behavior (including being explicitly undefined, if that's the case). Any chance we can get a clearer statement one way or the other?

Is this an alternative to that patch that kept getting committed/reverted that I think was related to trying to unify call site and function declaration parameter attributes? Might be worth mentioning that one in this patch description for history/context?

There are other attributes that don't seem to do this kind of "try the call instruction or the function declaration", do they? so I'd be a bit hesitant about this direction. Would like to get @rnk's thoughts on this too, perhaps.

I agree with @dblaikie that it'd be nice to get a clear answer for the behaviour (and documenting it in LangRef). Also, I wonder about ignoring the instruction attribute when the function declaration exists... seems like that is a step toward removing the instruction attribute entirely (maybe that's the right thing to do (not sure it is), but if so then it should be a high-level bit, not a side effect).

Don't worry about preallocated since it's still not really in use, and
is much more involved in setting up, making the chance of a mismatch
very small.

Not sure I agree with the logic here, although I could be misunderstanding it. If this is a supported IR feature, seems like knowingly inserting a bug wouldn't be great... is that what's happening? (Maybe adding the context @dblaikie mentioned would clarify for me...)

llvm/include/llvm/IR/InstrTypes.h
1730–1731

Is it intentional that Ty is not used at all by the early return? If so, it'd be more clear to move this to getParamByValType below the if statement.

rnk added a comment.Jun 21 2021, 5:03 PM

"arguably incorrect IR" isn't a phrase that is especially comforting. It is or isn't, and should have well defined behavior (including being explicitly undefined, if that's the case). Any chance we can get a clearer statement one way or the other?

Is this an alternative to that patch that kept getting committed/reverted that I think was related to trying to unify call site and function declaration parameter attributes? Might be worth mentioning that one in this patch description for history/context?

There are other attributes that don't seem to do this kind of "try the call instruction or the function declaration", do they? so I'd be a bit hesitant about this direction. Would like to get @rnk's thoughts on this too, perhaps.

As someone who has all the context, the commit message makes sense to me. In the change you allude to, we tried to stand on the principle that a call instruction should carry all of the ABI attributes it needs, but in practice, we found that IR producers, in particular instrumentation passes, usually don't set ABI attributes on call sites. It should be a semantics-preserving transform to rinse the target of a call instruction through an identity function, but in practice, that would break all these IR producers. It would make most direct calls indirect, and the ABI attributes would no longer be correct.

llvm/include/llvm/IR/InstrTypes.h
1731–1735

We should still prefer the type on the call site to the one on the called function if it is non-null.

"arguably incorrect IR" isn't a phrase that is especially comforting. It is or isn't, and should have well defined behavior (including being explicitly undefined, if that's the case). Any chance we can get a clearer statement one way or the other?

Is this an alternative to that patch that kept getting committed/reverted that I think was related to trying to unify call site and function declaration parameter attributes? Might be worth mentioning that one in this patch description for history/context?

There are other attributes that don't seem to do this kind of "try the call instruction or the function declaration", do they? so I'd be a bit hesitant about this direction. Would like to get @rnk's thoughts on this too, perhaps.

As someone who has all the context, the commit message makes sense to me. In the change you allude to, we tried to stand on the principle that a call instruction should carry all of the ABI attributes it needs, but in practice, we found that IR producers, in particular instrumentation passes, usually don't set ABI attributes on call sites. It should be a semantics-preserving transform to rinse the target of a call instruction through an identity function, but in practice, that would break all these IR producers. It would make most direct calls indirect, and the ABI attributes would no longer be correct.

Would we like to clean up those producers but it's infeasible - or is there good reason not to? (if it'd be good to do so, I'm OK with not necessarily doing it now - but writing it down as the intended goal/intentional technical debt for now)

Sounds like, if I'm following, that the "does everything work OK/the same if you rinse a target of a call through an identity function" test means we no longer think it's a good idea to try to check call attributes against their callee? Is it UB if they mismatch? (doesn't sound like it) are the callee's attributes ignored for the purpose of understanding call instructions? (that sounds correct/consistent with the rinse-test?)

aeubanks updated this revision to Diff 353664.Jun 22 2021, 8:46 AM

return the byval type first if it exists

aeubanks added a reviewer: Restricted Project.Jun 22 2021, 9:06 AM

CallBase::isByValArgument() and CallBase::getParamByValType() are *always* used together, so they need to match. We need to make sure that if isByValArgument() returns true, then getParamByValType() returns a non-null Type. Either we go down this route and preserve existing behavior, or we only look at the argument attributes, which is a change in behavior. For now I'd like to preserve existing behavior.

We could try to define whether or not mismatched byval parameter vs argument attributes is UB or not, but that's not what this patch is about. (I think currently the answer is that it's not, and we typically look at both the call instruction and the callee) I'll update the description to not say that it's "arguably incorrect" since currently it's not.

As for preallocated, my stance is still the same. It's not in used at all anywhere (I'm the person who implemented it but never got around to finishing it up). It's a lot of work to setup a preallocated call that it's very hard to forget the argument attribute.

aeubanks edited the summary of this revision. (Show Details)Jun 22 2021, 12:16 PM

As for preallocated, my stance is still the same. It's not in used at all anywhere (I'm the person who implemented it but never got around to finishing it up).

If it's not fully implemented / supported right now, then maybe that's reasonable. But the difference in behaviour between byval and preallocated seems likely to surprise someone at some point (unless you plan to delete the latter rather than finish it up?)... is there a comment you can leave behind somewhere, or some text that's reasonable to add to LangRef (not necessarily in the same patch) to document the status quo (for byval, preallocated, or both)?

It's a lot of work to setup a preallocated call that it's very hard to forget the argument attribute.

I'm not totally convinced; I could imagine confusion when someone is (e.g.) hand-reducing IR (or has a custom bugpoint pass or something) if behaviour is inconsistent.

aeubanks updated this revision to Diff 353769.Jun 22 2021, 1:32 PM

do the same for preallocated
add inalloca version

aeubanks edited the summary of this revision. (Show Details)Jun 22 2021, 1:32 PM
aeubanks edited the summary of this revision. (Show Details)Jun 22 2021, 1:44 PM

I remembered that I needed the same sort of function for inalloca, and inalloca also needs to match CallBase::isInAllocaArgument(), so I went ahead and did if for preallocated as well.

I never read the LangRef carefully enough, apparently we already expect ABI attributes to match [1]:

All ABI-impacting function attributes, such as sret, byval, inreg, returned, and inalloca, must match.

[1]: https://llvm.org/docs/LangRef.html#id327

rnk added a comment.Jun 22 2021, 3:05 PM

I never read the LangRef carefully enough, apparently we already expect ABI attributes to match [1]:

All ABI-impacting function attributes, such as sret, byval, inreg, returned, and inalloca, must match.

[1]: https://llvm.org/docs/LangRef.html#id327

That text is part of the section documenting how musttail works, and it's trying to make the statement that the prototype of the musttail call site must exactly match the prototype of the function making the tail call.

So far as I know, there isn't a general statement on what happens when attributes mismatch. I would try to infer it from what happens when function prototypes do not match, i.e. does the langref say what happens when you pass i64 to a function expecting i32? I think this area is underspecified. In practice, prototype mismatch cannot be optimized to UB / unreachable. Instcombine will try to "fix up" prototypes that almost match, mainly to support use cases like LTO.

Ah, I read the "in addition" incorrectly, I thought it was a separate section from the musttail.

I've started asking upstream on where we should go in regards to mismatched ABI attributes
but in any case, this patch should be NFC(I) in its current state, because all calls to getParamByValType() are guarded by isByValArgument().

nikic accepted this revision.Jul 7 2021, 2:23 PM
nikic added a subscriber: nikic.

LGTM. Maybe not the final state we want wrt call ABI attributes, but seems sensible for now.

This revision is now accepted and ready to land.Jul 7 2021, 2:23 PM