Page MenuHomePhabricator

[X86][MS] Fix the wrong alignment of vector variable arguments on Win32
ClosedPublic

Authored by pengfei on Nov 24 2021, 7:06 AM.

Details

Summary

D108887 fixed alignment mismatch by changing the caller's alignment in
ABI. However, we found some cases that still assume the alignment is
vector size. This patch fixes them to avoid the runtime crash.

Diff Detail

Event Timeline

pengfei created this revision.Nov 24 2021, 7:06 AM
pengfei requested review of this revision.Nov 24 2021, 7:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2021, 7:06 AM
rnk added inline comments.Nov 24 2021, 1:50 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
3464

I would prefer to see if we can avoid changing the prototype here. The target checks (is win32) can be calculated internally by looking at the subtarget. The only thing that varies per call site is the IsVarArg part. If we have to change the prototype, please just pass IsVarArg.

4085

Ditto. In this case, we are accumulating consecutive boolean parameters which can reduce readability as well. An alternative solution would be nicer.

4095

Surely there are other kinds of parameters that are clamped to 4 byte alignment on win32. How are doubles handled? Can we handle them the same way?

llvm/test/CodeGen/X86/vaargs-win32.ll
41

As I understand it, clang will not generate this IR. It will either mark the vector with inreg, or it will pass it indirectly (<4 x float>*).

97

I would like to see a test case where we set up a call that has intentionally misaligned parameters, so a C prototype that looks like void f(v4f, int, v4f, ...). This really underscores the need to use movups, because the ABI requires the data to be unaligned.

pengfei updated this revision to Diff 389776.Nov 25 2021, 7:05 AM
pengfei marked 2 inline comments as done.

Address @rnk 's comments. Thanks for the review!

llvm/lib/Target/X86/X86ISelLowering.cpp
4095

I think double might be aligned to 8 too. But I think ignore double handling should be ok. There's no difference with alignment equals to 4 and 8, because we don't have aligned instructions for it.

llvm/test/CodeGen/X86/vaargs-win32.ll
41

Yeah, it did have inreg when generated. I removed it because I thought it's not precise here. Adding it back.

97

We have a similar one above, see testPastArguments. It checks (int, v4f). I think it should be ok.
OTOH, we can also check whether or not to use movups by checking the stack realignment code andl $-16, %esp.

rnk added inline comments.Nov 30 2021, 11:41 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
4095

I think it's important to make sure we have the right alignment values for all types, regardless of whether they have aligned instructions or not. LLVM uses alignment aggressively, so we need to be precise everywhere.

I'd still like to see if we can make this condition less target-specific. What's special about win32 in this case is that we only have 4 byte stack alignment. There are other platforms where this is true as well: i686-darwin for example.

What is the effect of passing Subtarget.getFrameLowering()->getStackAlign() in place of the MaybeAlign parameter?Presumably it would cause some test failures, but maybe we actually want that behavior.

If we do this, do we actually need the IsVarArg parameter at all? To me, it seems unlikely that whether a prototype has varargs or not should affect the way that prototyped arguments are passed. I believe such vectors passed in memory should be passed indirectly.

pengfei added inline comments.Dec 9 2021, 5:23 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
4095

Sorry for the late reply.

What's special about win32 in this case is that we only have 4 byte stack alignment.

That's true, but some variables have their own alignment. See the example in f2. https://godbolt.org/z/xqcvj4YoK
I have this impression since I met the problem previously, but I'm not sure for other types. Seems double is still aligned to 4 byte.

There are other platforms where this is true as well: i686-darwin for example.

Seems not. At least f80 is aligned to 16 bytes. I just fixed the issue in D113739.

do we actually need the IsVarArg parameter at all?

Yes, because the same vector has different alignment between variant and fixed argument.

rnk added inline comments.Dec 13 2021, 4:48 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
4095

Yes, LLVM will realign the stack to store values with high required alignment. I'm saying that, this code, which stores an argument to stack memory, should maybe clamp its alignment assumption to the ABI's stack alignment, which on Windows, happens to be 4. On most other platforms, it will be 16. That seems equivalent to your logic, and more general. I'm asking if this suggestion causes problems in practice. Maybe it causes widespread test failures, I can't say for sure. In any case, I'd like to see a more principled solution.

Yes, because the same vector has different alignment between variant and fixed argument.

I think what I'm trying to get at is that, in these two prototypes, v should be passed identically:

void f1(int x, v4f v, int y);
void f2(int x, v4f v, int y, ...);

The IsVarArg boolean affects the entire call site. Adding and removing an ellipsis to the prototype should not change the instructions used to store the prototyped arguments, they should remain the same, and use the regular alignment assumptions, right?

Sorry, I made a mistake when I wanted to demonstrate the difference between variant and fixed arguments. Yes, you are right. The alignment I showed in f2 is the store of variable instead of ABI's.
A summary from my latest investigation:

  1. For fixed function arguments:
  2. For variant arguments:
    • MSVC allows 3 vector variables at max no matter whether the variables are in the left of comma or in ellipsis in the prototype: https://godbolt.org/z/sYbcheEjv
    • MSVC always use stack to pass vector variables. The alignment for the vector variables is 4.

I think what I'm trying to get at is that, in these two prototypes, v should be passed identically:

No, they are not. On fixed arguments function, vector variables are passed by registers or address. While on variant function, vector variables are limited to 3 and passed by stack.

I'm saying that, this code, which stores an argument to stack memory, should maybe clamp its alignment assumption to the ABI's stack alignment, which on Windows, happens to be 4. On most other platforms, it will be 16.

We have specified each of the type's alignment in the calling conversion when they are passed by stack: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L841
You can find on other 32 bit platforms, the type alignments are almost 4 too. The only exception is f80. Anyway, we don't need to warry about the fixed arguments here.

The IsVarArg boolean affects the entire call site.

Yes, because all vector variables are passed on stack with alignment = 4. No matter they are in the left of comma or right. It is intended to affects the entire call site. .

Adding and removing an ellipsis to the prototype should not change the instructions used to store the prototyped arguments, they should remain the same, and use the regular alignment assumptions, right?

No. Adding and removing an ellipsis are totally different ways when passing arguments. We need to use IsVarArg to distinguish each other.

rnk added a comment.Jan 11 2022, 12:59 PM

Sorry, I made a mistake when I wanted to demonstrate the difference between variant and fixed arguments. Yes, you are right. The alignment I showed in f2 is the store of variable instead of ABI's.
A summary from my latest investigation:

  1. For fixed function arguments:
  2. For variant arguments:
    • MSVC allows 3 vector variables at max no matter whether the variables are in the left of comma or in ellipsis in the prototype: https://godbolt.org/z/sYbcheEjv
    • MSVC always use stack to pass vector variables. The alignment for the vector variables is 4.

I think what I'm trying to get at is that, in these two prototypes, v should be passed identically:

No, they are not. On fixed arguments function, vector variables are passed by registers or address. While on variant function, vector variables are limited to 3 and passed by stack.

I'm saying that, this code, which stores an argument to stack memory, should maybe clamp its alignment assumption to the ABI's stack alignment, which on Windows, happens to be 4. On most other platforms, it will be 16.

We have specified each of the type's alignment in the calling conversion when they are passed by stack: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L841
You can find on other 32 bit platforms, the type alignments are almost 4 too. The only exception is f80. Anyway, we don't need to warry about the fixed arguments here.

The IsVarArg boolean affects the entire call site.

Yes, because all vector variables are passed on stack with alignment = 4. No matter they are in the left of comma or right. It is intended to affects the entire call site. .

Adding and removing an ellipsis to the prototype should not change the instructions used to store the prototyped arguments, they should remain the same, and use the regular alignment assumptions, right?

No. Adding and removing an ellipsis are totally different ways when passing arguments. We need to use IsVarArg to distinguish each other.

Thanks for the info, sorry about the delayed response. I see that adding the ellipsis drastically changes the way vectors are passed, but I think that complexity lives in the frontend. If the function is vararg, the frontend (Clang or other) will pass the vector directly. If it has a fixed prototype, the vector should be passed by address after passing three vectors.

There are no cases when a vector passed on the stack is aligned to 16 bytes, it should always be four byte aligned. Therefore, I don't think we need the IsVarArg boolean, we can go ahead and clamp the alignment on these argument loads.

pengfei updated this revision to Diff 400382.Jan 16 2022, 7:52 AM

Remove IsVarArg boolean.

There are no cases when a vector passed on the stack is aligned to 16 bytes, it should always be four byte aligned.

Unfortunately, there is. See changes in win32-spill-xmm.ll.

The good news is it doesn't look like a reasonable test. See https://godbolt.org/z/hdsPTsbPW
MSVC doesn't pass a 512 bits argument in that way while Clang FE passes it by pointer.
It seems D12337 isn't a valid patch. So we can ignore the change in win32-spill-xmm.ll.
We may need to consider to fix the imcompatible issue between Clang and MSVC, but it is not related to this patch anyway.

rnk accepted this revision.Feb 2 2022, 10:22 AM

lgtm

Sorry for the delay, I was out sick, and this fell out of my inbox.

llvm/lib/Target/X86/X86ISelLowering.cpp
4094

I think this could be simplified to use getStackAlign, but I won't insist.

This revision is now accepted and ready to land.Feb 2 2022, 10:22 AM

lgtm

Sorry for the delay, I was out sick, and this fell out of my inbox.

Thanks for the review. It's all right. Take care!

llvm/lib/Target/X86/X86ISelLowering.cpp
4094

I don't think so. The alignment are inconstant with different types on different OSs. Take CC_X86_64_C for example, https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L592-L603 f80/f128 are determined by layout and vector types are aligned to their size.

This revision was landed with ongoing or failed builds.Feb 12 2022, 6:23 PM
This revision was automatically updated to reflect the committed changes.