This is an archive of the discontinued LLVM Phabricator instance.

[X86] Extend the integer parameter if the function isn't local linked
AbandonedPublic

Authored by LuoYuanke on Apr 1 2022, 10:47 PM.

Details

Summary

This patch is based on the D71178.
We have disscussion here:
https://discourse.llvm.org/t/always-extend-the-integer-parameters-of-callee/61319.

We want to make clang behave just like gcc to avoid compatibility issues
between different compilers. We can't remove zeroext/signext on the front-end,
which will cause compatibility issues between the new clang and the previous clang.

There are many test cases waiting to be fixed. I put this patch here to see if
we can do this.

Diff Detail

Unit TestsFailed

Event Timeline

LiuChen3 created this revision.Apr 1 2022, 10:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2022, 10:47 PM
LiuChen3 requested review of this revision.Apr 1 2022, 10:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 1 2022, 10:47 PM
LuoYuanke added inline comments.Apr 1 2022, 11:18 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
3960

Add comments to explain why we ignore sign/zero attribute here?

LuoYuanke added inline comments.Apr 2 2022, 6:29 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
3960

Maybe check F.hasLocalLinkage() to respect sign/zero attribute for local linkage functions.

skan added a subscriber: skan.Apr 6 2022, 12:40 AM
LiuChen3 updated this revision to Diff 421091.Apr 6 2022, 10:46 PM

Address Yuanke's comments: do the extension only if the function isn't local linked.

LiuChen3 retitled this revision from [X86] Always extend the integer arguments of callee. to [X86] Extend the integer parameter if the function isn't local linked.Apr 6 2022, 10:49 PM
skan added a reviewer: skan.Apr 6 2022, 11:13 PM
skan added inline comments.Apr 6 2022, 11:18 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
3956–3957

If you check hasLocalLinkage here, I believe we also need to check hasAddressTaken...

However, I think such optimizations should be done in llvm/lib/Transforms/IPO/*.cpp. instead of here.

LiuChen3 added inline comments.Apr 7 2022, 12:01 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
3956–3957

Sorry that I am not familiar with hasAddressTaken. I have one question, does hasAddressTaken also means hasLocalLinkage ?

skan added inline comments.Apr 7 2022, 1:56 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
3956–3957

hasLocalLinkage does not implies !hasAddressTaken. If a function’s address is taken, then it could be called indirectly by libraries. You can see an example in llvm/lib/Transforms/IPO/GlobalOpt.cpp.

LiuChen3 updated this revision to Diff 421142.Apr 7 2022, 2:45 AM

Address Shenchen's comment

lebedev.ri requested changes to this revision.Apr 7 2022, 2:50 AM
lebedev.ri added a subscriber: lebedev.ri.

Still not okay with it. The fix is to fix whatever is emitting the attribute, not completely ignore it.

This revision now requires changes to proceed.Apr 7 2022, 2:50 AM

Still not okay with it. The fix is to fix whatever is emitting the attribute, not completely ignore it.

I don't quite understand what you mean.
If there is no such attribute, it should not do any extension itself. Regardless of whether this function is only a local linked. The optimization here is only for parameters with zeroext/signext attributes.

Still not okay with it. The fix is to fix whatever is emitting the attribute, not completely ignore it.

I don't quite understand what you mean.
If there is no such attribute, it should not do any extension itself. Regardless of whether this function is only a local linked. The optimization here is only for parameters with zeroext/signext attributes.

If there is an zeroext attribute on the function argument, it tells us that the caller is responsible to ensure that the i8 argument,
which is passed in an i64 register, does not have garbage bits in the upper 64-8 bits, but those bits are zeros, so we don't need to zeroext ourselves.
If the caller forgets to do that, the behavior is undefined. If the attribute is present, but the caller wasn't actually supposed to zeroext (ABI mismatch),
then whoever has put that attribute there is at fault. Sometimes ignoring the attribute and performing zeroext ourselves is a workaround for ABI mismatch.

Sometimes ignoring the attribute and performing zeroext ourselves is a
workaround for ABI mismatch.

Yes, that's precisely the point of this change -- as discussed already!
Clang has been putting zeroext on all C functions taking 8/16 bit types,
incorrectly, for it's entire existence.

And, thus, now we must extend redundantly on both sides of the call, for
external functions.

We must extend in the caller to support calling old-clang code (which
didn't extend in callee). We must extend in the callee to support being
called by non-clang compilers (which don't always extend in the caller).

It seems we all agree we should extend the parameter both in caller and in callee. This is also the behavior of gcc.
For now, the front-end doesn't have an accurate attribute that describes what we're expecting right now and doesn't know whether the function is external linked. By default clang will do the extension only in the caller. This forced us to do some workarounds on the backend.
We also thought about setting these attributes correctly in the IPO, but is there any possibility that IPO won't do in some cases?
D112942 introduces a new parameter attribute NoExt might help us. But this still means that by default, we will do extension in both caller and callee. And only in some optimization cases, we will do extension in caller or callee. Still not the ideal method that lebedev.ri expected.
Besides, I think that as an ABI issue, the correctness should be guaranteed first, optimization is not always guaranteed.
@lebedev.ri, is there any better approach to fix this issue?

Ping? This bug will cause correctness issues, we need to fix it. I think it is possible to fix in the backend in the current situation.
Is there any other better method, @jyknight @lebedev.ri

If there is no better solution, I hope this patch merge in. This is a bug that needs to be fixed.

I'm not really seeing consensus, neither here nor in the RFC, i'm afraid.
I suspect the fix will require keeping the existing attributes intact,
introducing new ones that mean what you want, and using them.

LuoYuanke accepted this revision.Apr 18 2022, 3:10 AM

I think we all agree that this is a bug. Given there is no better solution for it, I'd like check in this patch first. We can revert the patch when there is better solution proposed.

We have been incompatible for years, why the urgency now? What level of testing has been done outside of the lit tests?

rnk requested changes to this revision.Apr 18 2022, 1:04 PM

I think we all agree that this is a bug. Given there is no better solution for it, I'd like check in this patch first. We can revert the patch when there is better solution proposed.

I don't think we've reached agreement that this is a bug. I am concerned that removing these extension assert nodes will result in performance regressions. Here's the summary table from the RFC:

clang: caller do the zero/sign extension while callee do nothing.
gcc: both caller and callee do the zero/sign extension.
icc: callee do the zero/sign extension while caller do nothong.

The way I read that, GCC and Clang are ABI compatible with each other.

The incompatibility exists between ICC and Clang/LLVM. ICC does not extend parameters, but Clang/LLVM assumes they will have been extended in the caller.


I think in cases where we disagree and are blocking each other, that's when we need to resort to flags to make forward progress. What do you think about adding a cl::opt for this feature? At the very least, this will allow folks to do their own performance testing, even if we ultimately default to GCC's behavior. I see the OpenVMS folks have a setting for this as well ("assume clean parameters").

Finally, maybe this should be discussed in the x86_64 psABI issue tracker. At the very least, the psabi docs could be clearer.

This revision now requires changes to proceed.Apr 18 2022, 1:04 PM

gcc: both caller and callee do the zero/sign extension.

The way I read that, GCC and Clang are ABI compatible with each other.

This is not right. GCC doesn't guarantee sign extension on the caller, and it doesn't always do it see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92821 and https://bugs.llvm.org/show_bug.cgi?id=44228 for example.

I don't think we've reached agreement that this is a bug. I am concerned that removing these extension assert nodes will result in performance regressions.

The performance concern looks reasonable. We need collect performance data to prove there is no performance regression for typical benchmarks.

I think in cases where we disagree and are blocking each other, that's when we need to resort to flags to make forward progress. What do you think about adding a cl::opt for this feature? At the very least, this will allow folks to do their own performance testing, even if we ultimately default to GCC's behavior. I see the OpenVMS folks have a setting for this as well ("assume clean parameters").

Adding cl::opt looks good to me.

rjmccall added a comment.EditedApr 18 2022, 9:39 PM

From Michael Matz's comments on the GCC bug and some of the linked conversation, it sounds pretty clear that the psABI does not require extension. They could certainly be more explicit about that in the psABI, but they're under no real obligation to: the baseline assumption should always be that things not stated are not guaranteed. If the psABI intended extension to be required, the document would really need to state it positively, rather than leaving it to be inferred. Since it does not, extension is not guaranteed by the psABI.

Of course, that creates a problem, because clang is not following the psABI. Platforms where clang is the system compiler have the option of declaring clang's current behavior to be their platform ABI and documenting the variance, if they like. Otherwise, clang needs to come into compliance with the psABI, but of course it also has to maintain compatibility with code emitted by old versions of clang, which are extending in the caller and assuming extension in the callee. As has been pointed out, those goals are not incompatible: we just need to conservatively extend in the caller and not assume extension in the callee. That won't be free, but it seems like what we have to do in order to make up for years of miscompilation, at least by default.

That said, I agree with the people who say that it's not the backend's responsibility to make this policy choice. The frontend should be emitting IR correctly to get the behavior it wants. So if we accept that the frontend might legitimately want conservative extension, the immediate question is whether we actually have a way to request it. I suspect that we don't, unless we've recently gotten much stricter about treating call-site and declaration attributes as independent.

So to recap:

  1. We should add a way for frontends to request conservative extension.
  2. clang should request conservative extension on most x86_64 platforms by default.
  3. clang can add a flag to control the ABI it uses for clients that are unwilling to pay the cost of conservative extension.
  4. Platforms that use clang as a system compiler should decide whether they want to (a) default to conservative extension or (b) document that they use caller extension and thus vary from the psABI.

According to our internal tests, the performance of most of the tests is not affected. The most affected test performance dropped by %0.76. Average performance dropped 0.25%. This matches our expectations

It would be great if it could be fixed on the front end but now no one does. The compromise is to do workaround in the backend. I will fix it after the front end solves this issue.

There is a discussion about sign/zero extend parameter years ago: https://groups.google.com/g/x86-64-abi/c/E8O33onbnGQ. Obviously psABI does not guarantee parameter has been sign/zero extended.

If people think that we'll reliably get the right behavior by the frontend adding sext and zext to call sites but not to function arguments, I can walk you through how to make that change in the frontend.

What we can't do in the frontend is just switch to unconditionally not using sext and zext at all on x86_64, because we need to maintain compatibility with old versions of clang. But we can introduce a flag that makes clang generate incompatible code if someone wants to opt in to it for performance.

If people think that we'll reliably get the right behavior by the frontend adding sext and zext to call sites but not to function arguments, I can walk you through how to make that change in the frontend.

What we can't do in the frontend is just switch to unconditionally not using sext and zext at all on x86_64, because we need to maintain compatibility with old versions of clang. But we can introduce a flag that makes clang generate incompatible code if someone wants to opt in to it for performance.

Thanks. @rjmccall . I'd like to hear your approach. If the change in FE is not large, I am willing to do it. But if it is too complicated, I may not be able to do that. Firstly I found we might be able to ignore the attribute in void CodeGenModule::ConstructAttributeList in clang/lib/CodeGen/CGCall.cpp , but it is the target independent part, so I gave up.
I agree introducing one new flag.

If people think that we'll reliably get the right behavior by the frontend adding sext and zext to call sites but not to function arguments, I can walk you through how to make that change in the frontend.

What we can't do in the frontend is just switch to unconditionally not using sext and zext at all on x86_64, because we need to maintain compatibility with old versions of clang. But we can introduce a flag that makes clang generate incompatible code if someone wants to opt in to it for performance.

Thanks. @rjmccall . I'd like to hear your approach. If the change in FE is not large, I am willing to do it. But if it is too complicated, I may not be able to do that. Firstly I found we might be able to ignore the attribute in void CodeGenModule::ConstructAttributeList in clang/lib/CodeGen/CGCall.cpp , but it is the target independent part, so I gave up.

I think you need to add a new ConservativeExtend kind to ABIArgInfo and make X86_64ABIInfo::classifyArgumentType (and classifyReturnType?) use that instead of Extend.

I think you need to add a new ConservativeExtend kind to ABIArgInfo and make X86_64ABIInfo::classifyArgumentType (and classifyReturnType?) use that instead of Extend.

I will take a look. Thanks!

lebedev.ri resigned from this revision.Jan 12 2023, 4:58 PM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.

LuoYuanke commandeered this revision.EditedJan 12 2023, 4:59 PM
LuoYuanke edited reviewers, added: LiuChen3; removed: LuoYuanke.

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.

There is another solution implemented in https://reviews.llvm.org/D124435, so I think we can abandon this patch.

LuoYuanke abandoned this revision.Jan 12 2023, 5:03 PM