This is an archive of the discontinued LLVM Phabricator instance.

[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion
Needs ReviewPublic

Authored by pengfei on Apr 7 2022, 12:07 AM.

Details

Summary

X86 codegen uses function attribute min-legal-vector-width to select the proper ABI. The intention of the attribute is to reflect user's requirement when they passing or returning vector arguments. So Clang front-end will iterate the vector arguments and set min-legal-vector-width to the width of the maximum for both caller and callee.

It is assumed any middle end optimizations won't care of the attribute expect inlining and argument promotion.

  • For inlining, we will propagate the attribute of inlined functions because the inlining functions become the newer caller.
  • For argument promotion, we check the min-legal-vector-width of the caller and callee and refuse to promote when they don't match.

The problem comes from the optimizations' combination, as shown by https://godbolt.org/z/zo3hba8xW. The caller foo has two callees bar and baz. When doing argument promotion, both foo and bar has the same min-legal-vector-width. So the argument was promoted to vector. Then the inlining inlines baz to foo and updates min-legal-vector-width, which results in ABI mismatch between foo and bar.

This patch fixes the problem by expanding the concept of min-legal-vector-width to indicator of functions arguments. That says, any passes touch functions arguments have to set min-legal-vector-width to the value reflects the width of vector arguments. It makes sense to me because any arguments modifications are ABI related and should response for the ABI compatibility.

Diff Detail

Event Timeline

pengfei created this revision.Apr 7 2022, 12:07 AM
Herald added a project: Restricted Project. · View Herald Transcript
pengfei requested review of this revision.Apr 7 2022, 12:07 AM
Herald added a reviewer: baziotis. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript

This patch is a combination of D122594 and D123125, which is to address @jdoerfert 's comments on D123125 .

Gentle ping. I think it is bug we need to fix.

nikic added a subscriber: nikic.Apr 18 2022, 7:58 AM

Could you please update the patch with a description of what and how you are fixing? Possibly with an explanation for why this should not be handled through the existing ABI compatibility check?

pengfei edited the summary of this revision. (Show Details)Apr 18 2022, 8:47 AM
jdoerfert added inline comments.Apr 18 2022, 1:20 PM
llvm/test/Transforms/ArgumentPromotion/byval-3.ll
25 ↗(On Diff #421105)

I don't understand this test. Where does the 512 come from and why does it replace the 0 (which from a min perspective is less). Also, this has no byval even though the name of the test is byval-3.ll. Finally, I'd suggest to avoid the undef and UB.

pengfei added inline comments.Apr 18 2022, 7:28 PM
llvm/test/Transforms/ArgumentPromotion/byval-3.ll
25 ↗(On Diff #421105)

Where does the 512 come from

As I described in the summary, min-legal-vector-width reflects the maximum vector width in function arguments. TL;DR, it comes from the promotion that changes the maximum vector width.

and why does it replace the 0 (which from a min perspective is less)

Because on X86, a <32 x half> with different value of min-legal-vector-width uses different registers, i.e., different ABI. And we can't prevent later inliner pass to promote the value for single caller or callee, we must promote them at the first phase.

Also, this has no byval even though the name of the test is byval-3.ll.

I referenced byval-2.ll. The byval doesn't matter, so I removed it. I can add it back or change the filename.

Finally, I'd suggest to avoid the undef and UB.

Both argument promotion and lining are legal behavior. I think there's no undef/UB for single pass here.

I'm still having a hard time understanding what the intended semantics of this attribute are -- my best guess from this patch is that "min-legal-vector-width" is actually supposed to be a "max-legal-vector-width", which is why we need to increase it during promotion?

Before we start doing any code changes, I think we need a LangRef patch that specifies the precise semantics and requirements of this attribute.

pengfei updated this revision to Diff 423567.Apr 19 2022, 2:35 AM

Address review comments.

  1. Removed test byval-3.ll. The tests in min-legal-vector-width.ll already cover the change.
  2. Add description for "min-legal-vector-width" in LangRef.

I'm still having a hard time understanding what the intended semantics of this attribute are -- my best guess from this patch is that "min-legal-vector-width" is actually supposed to be a "max-legal-vector-width", which is why we need to increase it during promotion?

Before we start doing any code changes, I think we need a LangRef patch that specifies the precise semantics and requirements of this attribute.

Although it's bit against common sense, the "min" is the correct word here. It means the minimum legal vector requirments to backend. For example, a 128 bits vector only requires legality for 128 bits register, but backend are free to have the ability to provide 256 bits and more.
The "min-legal-vector-width" is the maximum type in function arguments because all arguments are supposed to be legal. Otherwise, we may have compatibility issue.

Finally, I'd suggest to avoid the undef and UB.

Both argument promotion and lining are legal behavior. I think there's no undef/UB for single pass here.

To make it clearer and avoid similar situations in the future:

  1. Argument promotion was legal in that test case.
  2. Inlining would have been legal, it wasn't happening for the test case.
  3. Passes with UB are bugs, I was not referring to that.
  4. The functions in the test could not have been executed without triggering UB. Hence, it would have been valid to remove the call and place an unreachable there instead. Valid, but obviously not what you wanted to test.

Talking about tests, I'm confused why the ArgumentPromotion/X86/min-legal-vector-width test is not affected. The Attributor ones are "simply copies".

llvm/lib/IR/Attributes.cpp
2042

Can't this function walk the arguments of Fn and determine Width if necessary?

llvm/lib/Transforms/IPO/Attributor.cpp
2493

Shouldn't we use the DataLayout for size questions?

Finally, I'd suggest to avoid the undef and UB.

Both argument promotion and lining are legal behavior. I think there's no undef/UB for single pass here.

To make it clearer and avoid similar situations in the future:

  1. Argument promotion was legal in that test case.
  2. Inlining would have been legal, it wasn't happening for the test case.
  3. Passes with UB are bugs, I was not referring to that.
  4. The functions in the test could not have been executed without triggering UB. Hence, it would have been valid to remove the call and place an unreachable there instead. Valid, but obviously not what you wanted to test.

No sure if I understood it correctly. Do you mean "the test case" by https://godbolt.org/z/zo3hba8xW? The case was drastically reduced from a big IR. It won't generate valid assembly but demonstrates the problem. The ABI mismatch can be proved by another test https://godbolt.org/z/KnsPorvKa

Talking about tests, I'm confused why the ArgumentPromotion/X86/min-legal-vector-width test is not affected. The Attributor ones are "simply copies".

Please note the different between IS__TUNIT_OPM: attributes and IS__TUNIT_NPM: attributes. The former has one more attribute than the latter, because "min-legal-vector-width"="256" in ATTR4 turns into "min-legal-vector-width"="512", which is the same with ATTR3. So we reduced one, then ATTR6 -> ATTR5 and ATTR7 -> ATTR6
But it seems we lack a test for ArgPromotion, so I think I should add the previous test back.

pengfei added inline comments.Apr 19 2022, 8:37 AM
llvm/lib/IR/Attributes.cpp
2042

I don't think so. For one thing, "min-legal-vector-width" indicates not only for arguments of Fn but also the largest vector arguments in functions called by Fn. For another, The old "min-legal-vector-width" is already >= the largest vector width in old arguments. No need to walk again. We only need to update for the new argument.

llvm/lib/Transforms/IPO/Attributor.cpp
2493

We can get the target capbility by features or target name. The problem here is "min-legal-vector-width" is different per function.

pengfei updated this revision to Diff 423870.Apr 20 2022, 4:40 AM

Add test case back.

Dear reviewers, we have to fix the bug :)

As far as I understand, I have addressed all questions and I didn't see objections for now. The affects of this patch should be minor, so I'm planning to land it by this week end. Feel free to set blocker if you still think it's not the right direction. Thanks!

This revision was not accepted when it landed; it landed in state Needs Review.May 1 2022, 11:15 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
arsenm reopened this revision.Dec 7 2022, 1:52 PM
arsenm added a subscriber: arsenm.

This should not have landed without additional review. I have raised some issues at

https://reviews.llvm.org/rG7c04454227f5c6bd3f515783950a815970c90558

llvm/docs/LangRef.rst
2193–2194

This is an unreasonable expectation for an attribute and

This should not have landed without additional review. I have raised some issues at

https://reviews.llvm.org/rG7c04454227f5c6bd3f515783950a815970c90558

Explained there. This patch is simply a bug fix. The document just describes an existing design. No new implementation and change here.

This should not have landed without additional review. I have raised some issues at

https://reviews.llvm.org/rG7c04454227f5c6bd3f515783950a815970c90558

Explained there. This patch is simply a bug fix. The document just describes an existing design. No new implementation and change here.

To be fair, this went up for review but did never get approval. Committing it this way is discouraged. Further, comments made before have not been addressed, e.g.,
4. The functions in the test could not have been executed without triggering UB. Hence, it would have been valid to remove the call and place an unreachable there instead. Valid, but obviously not what you wanted to test.

All that said, there is post-commit review. Now @arsenm brings up valid points, esp. for the documentation which was lacking before (as per @nikic) and is still not useful.
Just arguing "it is somewhere there" or "we can always improve documentation" is not what I would encourage as response given the way this was landed.

Now to some technical issue:
Why would we even track this in an argument? We are apparently free to modify the IR no matter what the current argument says, but if we do we need to update it "for the (X86) backend". Why doesn't the (X86) backend simply check the argument types itself?

This should not have landed without additional review. I have raised some issues at

https://reviews.llvm.org/rG7c04454227f5c6bd3f515783950a815970c90558

Explained there. This patch is simply a bug fix. The document just describes an existing design. No new implementation and change here.

To be fair, this went up for review but did never get approval. Committing it this way is discouraged. Further, comments made before have not been addressed, e.g.,
4. The functions in the test could not have been executed without triggering UB. Hence, it would have been valid to remove the call and place an unreachable there instead. Valid, but obviously not what you wanted to test.

All that said, there is post-commit review. Now @arsenm brings up valid points, esp. for the documentation which was lacking before (as per @nikic) and is still not useful.
Just arguing "it is somewhere there" or "we can always improve documentation" is not what I would encourage as response given the way this was landed.

Now to some technical issue:
Why would we even track this in an argument? We are apparently free to modify the IR no matter what the current argument says, but if we do we need to update it "for the (X86) backend". Why doesn't the (X86) backend simply check the argument types itself?

I definitely had a patch that did that. See VectorWidthInfer.cpp here https://reviews.llvm.org/D41341 I'm trying to reconstruct all the discussion that lead to that being abandoned.

This should not have landed without additional review. I have raised some issues at

https://reviews.llvm.org/rG7c04454227f5c6bd3f515783950a815970c90558

Explained there. This patch is simply a bug fix. The document just describes an existing design. No new implementation and change here.

To be fair, this went up for review but did never get approval. Committing it this way is discouraged. Further, comments made before have not been addressed, e.g.,
4. The functions in the test could not have been executed without triggering UB. Hence, it would have been valid to remove the call and place an unreachable there instead. Valid, but obviously not what you wanted to test.

I explained above: The case was drastically reduced from a big IR. It won't generate valid assembly but demonstrates the problem.. I thought it addressed your concern :)
It's not clear to me why it triggers UB. All I can see is it won't generate real assemble due to other optimizations. But I think it is clear and legal to both argument promotion and inlining pass.

All that said, there is post-commit review. Now @arsenm brings up valid points, esp. for the documentation which was lacking before (as per @nikic) and is still not useful.
Just arguing "it is somewhere there" or "we can always improve documentation" is not what I would encourage as response given the way this was landed.

Sure. I'm trying to address the comments.
I may not good at documentation. But I'd argue documentation is a relative subjective thing. And it's common a speaker thought he/she had explained clearly but audiences thought nonsense. That's why sometimes we need several rounds discussions/explanations in one topic.
So when I wrote "we can always improve documentation", I mean fine, let's just improve it. I don't find anything improper in the reply. Can you explain it more or suggest a better way how I should follow up?

Now to some technical issue:
Why would we even track this in an argument? We are apparently free to modify the IR no matter what the current argument says, but if we do we need to update it "for the (X86) backend". Why doesn't the (X86) backend simply check the argument types itself?

I think Craig helped addressed it.