In DAGISel, the parameter alignment only have 4 bits to hold the value.
The encode(alignment) would need extra 1 bit, so the max aligment that
ISel can support is 2^14. This patch verify the parameter, return value
and align attribute for parameter.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I think arbitrary/maximum alignment doesn't make sense in reality and usually implies something wrong. So adding a verifier seems a good choice. LGTM but wait one or two days for other reviewers.
llvm/lib/IR/Verifier.cpp | ||
---|---|---|
3156 | At least we need to add comments at ArgFlagsTy::MemAlign to say that we verify the width of bit field in Verify.cpp. |
After digging more in the LLVM code, I find there is MaximumAlignment (1 << 32) in Value.h (https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/Value.h#L793) and constant null pointer would be set the MaximumAlignment at https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/Value.cpp#L977. Below test case would fail with my patch which constrain parameter alignment. To fix the max size conflict between MaximumAlignment and ArgFlagsTy.MemAlign maybe just use unsigned type for MemAlign instead of bitfield? But it seems there is assert to monitor ArgFlagsTy size at https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/CodeGen/TargetCallingConv.h#L69.
So in summary, for now I have no idea how to fix it. Maybe leave the test case of my patch just crash during the lowering and remind middle-end generate the right vector size or alignment attribute. @craig.topper, @RKSimon, @pengfei any ideas?
; RUN: opt -passes=instcombine -S < %s | FileCheck %s define i32 @PR48810() { ; CHECK-LABEL: @PR48810( ; CHECK-NEXT: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 undef, i8* align 4294967296 null, i64 undef, i1 false) ; CHECK-NEXT: ret i32 undef ; %r = call dereferenceable(1) i8* @mempcpy(i8* undef, i8* null, i64 undef) ret i32 undef } declare i8* @mempcpy(i8*, i8* nocapture readonly, i64)
Can we check if the pointer has byval/byref attribute? I think the MemAlign should only be meaningful for them. And if we simply passing a pointer with OrigAlign, it's safe to truncate to the maximum of MemAlign in backend.
Further more, byval/byref together with align attribute are all ABI related. It does make sense to have different alignment with a pure pointer used inside function only. Here's comment from LangRef:
The byval attribute also supports specifying an alignment with the align attribute. It indicates the alignment of the stack slot to form and the known alignment of the pointer specified to the call site. If the alignment is not specified, then the code generator makes a target-specific assumption.
llvm/lib/IR/Verifier.cpp | ||
---|---|---|
3155–3163 | The Assert is defined as below. Let me check if it will fail in release build. #define Assert(C, ...) \ do { if (!(C)) { CheckFailed(__VA_ARGS__); return; } } while (false) | |
llvm/test/Verifier/param-ret-align.ll | ||
2 | Probably not. llvm-as would exit on the first failure. |
llvm/test/Verifier/param-ret-align.ll | ||
---|---|---|
2 | Then? We could check more than one failure in a LIT test. |
llvm/lib/IR/Verifier.cpp | ||
---|---|---|
3155–3163 | This Assert is different from assert. It is enabled no matter NDEBUG is turned on or off |
Still fine to me.
llvm/test/Verifier/param-ret-align.ll | ||
---|---|---|
2 | I understand it is the second function will never be handled once the first failed. With a rough search, I didn't find an example to handle more than one fucntions. |
llvm/lib/IR/Verifier.cpp | ||
---|---|---|
3155–3163 | Good to know, thanks! |
llvm/test/Verifier/param-ret-align.ll | ||
---|---|---|
2 | Oh, I understand what yuanke means now |
Just to throw it out there, we should be able to support a larger alignment in ArgFlagsTy pretty easily, because PointerAddrSpace only needs 24 bits, so we have 8 unused bits still left in that structure.
@LuoYuanke Not sure if this is documented anywhere, but there is this check in the DataLayout implementation: https://github.com/llvm/llvm-project/blob/fea20cb99087208d589d81bb7bbbc84198c7ffa4/llvm/lib/IR/DataLayout.cpp#L250-L256
Yeah, I don't think it's worth to add one more bit just for avoiding error. We don't have the requirment to store such a large alignment. The OrigAlign is another topic, I think we can leave it as is.
As seen in https://github.com/llvm/llvm-project/issues/58304, some intrinsics need a parameter with higher alignment (2 << 15). Should we extend the alignment field to accommodate that?
As seen in https://github.com/llvm/llvm-project/issues/58304, some intrinsics need a parameter with higher alignment (2 << 15). Should we extend the alignment field to accommodate that?
Maybe we can, but it seems an infrastructure change. Not sure what the risk we'd take for the change.
In the case of #58304 it is a @llvm.vector.reduce.or.v16i8488(<16 x i8488> %18) that is incorrectly failing, which is not a real call. Could we exclude intrinsics from this verifier check?
I remove the constrain for intrinsics with https://reviews.llvm.org/D136330. However when testing below intrinsics case, it still crashes on DAGISel.
$ cat param-align.ll define dso_local float @foo(<8192 x float> noundef %vec) { entry: %b = call float @llvm.vector.reduce.fadd.f32.v8192f32(float -0.0, <8192 x float> %vec) ret float %b } declare float @llvm.vector.reduce.fadd.f32.v8192f32(float %start_value, <8192 x float> %a) $ llc param-align.ll llc: /export/users/workspace/opensrc/llvm-project-github/llvm/include/llvm/CodeGen/TargetCallingConv.h:150: void llvm::ISD::ArgFlagsTy::setMemAlign(llvm::Align): Assertion `getNonZeroMemAlign() == A && "bitfield overflow"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
That is presumably because of the <8192 x float> noundef %vec argument for @foo. If it was loaded from a pointer, does it still fail? It will be a pretty huge expansion, but should work OK.
That is presumably because of the <8192 x float> noundef %vec argument for @foo. If it was loaded from a pointer, does it still fail? It will be a pretty huge expansion, but should work OK.
It doesn't fail when <8192 x float> %vec is loaded from a pointer.
A question: why is the max alignment is not 1<<15 if we have 4 bits?