D108887 fixed alignment mismatch by changing the caller's alignment in
ABI. However, we found some cases that still assume the alignment is
vector size. This patch fixes them to avoid the runtime crash.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
3475 | I would prefer to see if we can avoid changing the prototype here. The target checks (is win32) can be calculated internally by looking at the subtarget. The only thing that varies per call site is the IsVarArg part. If we have to change the prototype, please just pass IsVarArg. | |
4092 | Ditto. In this case, we are accumulating consecutive boolean parameters which can reduce readability as well. An alternative solution would be nicer. | |
4101 | Surely there are other kinds of parameters that are clamped to 4 byte alignment on win32. How are doubles handled? Can we handle them the same way? | |
llvm/test/CodeGen/X86/vaargs-win32.ll | ||
41 | As I understand it, clang will not generate this IR. It will either mark the vector with inreg, or it will pass it indirectly (<4 x float>*). | |
97 | I would like to see a test case where we set up a call that has intentionally misaligned parameters, so a C prototype that looks like void f(v4f, int, v4f, ...). This really underscores the need to use movups, because the ABI requires the data to be unaligned. |
Address @rnk 's comments. Thanks for the review!
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
4101 | I think double might be aligned to 8 too. But I think ignore double handling should be ok. There's no difference with alignment equals to 4 and 8, because we don't have aligned instructions for it. | |
llvm/test/CodeGen/X86/vaargs-win32.ll | ||
41 | Yeah, it did have inreg when generated. I removed it because I thought it's not precise here. Adding it back. | |
97 | We have a similar one above, see testPastArguments. It checks (int, v4f). I think it should be ok. |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
4101 | I think it's important to make sure we have the right alignment values for all types, regardless of whether they have aligned instructions or not. LLVM uses alignment aggressively, so we need to be precise everywhere. I'd still like to see if we can make this condition less target-specific. What's special about win32 in this case is that we only have 4 byte stack alignment. There are other platforms where this is true as well: i686-darwin for example. What is the effect of passing Subtarget.getFrameLowering()->getStackAlign() in place of the MaybeAlign parameter?Presumably it would cause some test failures, but maybe we actually want that behavior. If we do this, do we actually need the IsVarArg parameter at all? To me, it seems unlikely that whether a prototype has varargs or not should affect the way that prototyped arguments are passed. I believe such vectors passed in memory should be passed indirectly. |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
4101 | Sorry for the late reply.
That's true, but some variables have their own alignment. See the example in f2. https://godbolt.org/z/xqcvj4YoK
Seems not. At least f80 is aligned to 16 bytes. I just fixed the issue in D113739.
Yes, because the same vector has different alignment between variant and fixed argument. |
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
4101 | Yes, LLVM will realign the stack to store values with high required alignment. I'm saying that, this code, which stores an argument to stack memory, should maybe clamp its alignment assumption to the ABI's stack alignment, which on Windows, happens to be 4. On most other platforms, it will be 16. That seems equivalent to your logic, and more general. I'm asking if this suggestion causes problems in practice. Maybe it causes widespread test failures, I can't say for sure. In any case, I'd like to see a more principled solution.
I think what I'm trying to get at is that, in these two prototypes, v should be passed identically: void f1(int x, v4f v, int y); void f2(int x, v4f v, int y, ...); The IsVarArg boolean affects the entire call site. Adding and removing an ellipsis to the prototype should not change the instructions used to store the prototyped arguments, they should remain the same, and use the regular alignment assumptions, right? |
Sorry, I made a mistake when I wanted to demonstrate the difference between variant and fixed arguments. Yes, you are right. The alignment I showed in f2 is the store of variable instead of ABI's.
A summary from my latest investigation:
- For fixed function arguments:
- LLVM will pass the first 3 vector variables by register: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L804
- LLVM will pass the following vector variables by value with alignment = 4: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L788
- Clang FE will emit the address of value instead of the value itself. So we don't have chance to handle the alignment. https://godbolt.org/z/KvvY4hMda
- This is matching what MSVC's doing.
- For variant arguments:
- MSVC allows 3 vector variables at max no matter whether the variables are in the left of comma or in ellipsis in the prototype: https://godbolt.org/z/sYbcheEjv
- MSVC always use stack to pass vector variables. The alignment for the vector variables is 4.
I think what I'm trying to get at is that, in these two prototypes, v should be passed identically:
No, they are not. On fixed arguments function, vector variables are passed by registers or address. While on variant function, vector variables are limited to 3 and passed by stack.
I'm saying that, this code, which stores an argument to stack memory, should maybe clamp its alignment assumption to the ABI's stack alignment, which on Windows, happens to be 4. On most other platforms, it will be 16.
We have specified each of the type's alignment in the calling conversion when they are passed by stack: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L841
You can find on other 32 bit platforms, the type alignments are almost 4 too. The only exception is f80. Anyway, we don't need to warry about the fixed arguments here.
The IsVarArg boolean affects the entire call site.
Yes, because all vector variables are passed on stack with alignment = 4. No matter they are in the left of comma or right. It is intended to affects the entire call site. .
Adding and removing an ellipsis to the prototype should not change the instructions used to store the prototyped arguments, they should remain the same, and use the regular alignment assumptions, right?
No. Adding and removing an ellipsis are totally different ways when passing arguments. We need to use IsVarArg to distinguish each other.
Thanks for the info, sorry about the delayed response. I see that adding the ellipsis drastically changes the way vectors are passed, but I think that complexity lives in the frontend. If the function is vararg, the frontend (Clang or other) will pass the vector directly. If it has a fixed prototype, the vector should be passed by address after passing three vectors.
There are no cases when a vector passed on the stack is aligned to 16 bytes, it should always be four byte aligned. Therefore, I don't think we need the IsVarArg boolean, we can go ahead and clamp the alignment on these argument loads.
Remove IsVarArg boolean.
There are no cases when a vector passed on the stack is aligned to 16 bytes, it should always be four byte aligned.
Unfortunately, there is. See changes in win32-spill-xmm.ll.
The good news is it doesn't look like a reasonable test. See https://godbolt.org/z/hdsPTsbPW
MSVC doesn't pass a 512 bits argument in that way while Clang FE passes it by pointer.
It seems D12337 isn't a valid patch. So we can ignore the change in win32-spill-xmm.ll.
We may need to consider to fix the imcompatible issue between Clang and MSVC, but it is not related to this patch anyway.
lgtm
Sorry for the delay, I was out sick, and this fell out of my inbox.
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
4100 | I think this could be simplified to use getStackAlign, but I won't insist. |
Thanks for the review. It's all right. Take care!
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
4100 | I don't think so. The alignment are inconstant with different types on different OSs. Take CC_X86_64_C for example, https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86CallingConv.td#L592-L603 f80/f128 are determined by layout and vector types are aligned to their size. |
I would prefer to see if we can avoid changing the prototype here. The target checks (is win32) can be calculated internally by looking at the subtarget. The only thing that varies per call site is the IsVarArg part. If we have to change the prototype, please just pass IsVarArg.