I can add the other test cases from PR34021 if required?
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
might be a bit unrelated - but do we've got a hint regarding why is this even an issue?
by all means - it doesn't seems right for an empty ms inline-asm statement to affect successful compilation, without even mentioning the involvement of the encapsulating function's return type.
Another semi-adopted MS legacy issue?
I apologize for the lack of detail for this comment, but I didn't seem to capture this at the time. When I investigated this initially, I came up with this solution and it was insufficient. I don't remember WHAT this ends up not fixing however.
When discussing it with @rnk we decided that (according to my notes):
"so, addReturnRegisterOutputs needs to look at the LLVM return type, and choose between EAX:EDX for integer / struct return types,
ST0 for small FP types, and XMM0/YMM0 etc for vectors/big FP types"
The issue is more that the *TargetInfo::addReturnRegisterOutputs doesn't properly configure things, it simply works with integers. It needs to be extended to work with int/struct types, as well as vector/FP types.
@erichkeane OK, are you in a position to take on this bug? Either commandeering this patch or I abandon it and you start your own?
I'm not sure I have time for it unfortunately. It IS on my personal backlog, I just haven't had time lately. Has this affected someone else (besides my company's internal tests), or become more important lately?
Ah, I seem to remember that the problem here is that we don't know whether this value should be signed or unsigned at this point, so it could be CreateZExtOrTrunc OR CreateSignExtOrTrunc? However, there is no information as to whether the integer is signed here as far as I remember...
I think this is a reasonable stop-gap fix since the code isn't trying to return EAX:EDX or XMM0 from the inline asm blob. This affects any function that contains inline asm and returns a vector, which is potentially a lot of stuff.
test/CodeGen/pr34021.c | ||
---|---|---|
2 ↗ | (On Diff #113785) | Please FileCheck the IR. We should see a pattern like this: define <4 x float> @rep() %retval = alloca <4 x float> call i64 inlineasm sideeffect {{.*}} store {{.*}}, <4 x float>* %retval %v = load {{.*}}, <4 x float>* %res store <4 x float> %v, <4 x float>* %retval %rv = load <4 x float>* %retval ret <4 x float> %rv That IR pattern makes it clear that the store of the asm output is immediately killed by the store of res to retval. |
Actually... I've convinced myself that this is likely the correct fix for this bug. There is likely a separate bug for vector-types and struct return types, however I'd believe those aren't really things that anyone else supports.
According to our definition, v4si is NOT a "Vector" type, since a vector type requires it be a FP value. SO, we are converting an integer stored in eax/edx to a wider integer. Since assembly doesn't have a 'signed'-ness, I would assert that widening a register-stored variable to via zero-ext is the correct fix here.
In summary, I believe this is the correct fix for the bug as reported.
According to our definition, v4si is NOT a "Vector" type, since a vector type requires it be a FP value.
Umm, what? An integer vector is still a vector. The backend will return it in xmm0 on x86 (both 32 and 64-bit).
The actual problem here is that X86_32TargetCodeGenInfo::addReturnRegisterOutputs is adding outputs which don't make any sense; 32-bit x86 only returns integers and pointers (and maybe a few other weird things like complex integers and tiny vectors?) in EAX:EDX. So we shouldn't add a constraint which involves those registers if the function returns a 128-bit vector (which goes in xmm0), or a struct (which is returned in memory), or a float (which is returned in an x87 register), etc.
Adding the "right" constraint for stuff that isn't returned in EAX would be nice, but probably isn't necessary unless we actually run into code which depends on it.
Ugg.. you're right. I missed that addReturnRegisterOutputs was creating a return type as an integer, so when it was being checked in this if/else tree (edited above) you're not actually looking at the return type anymore....
lgtm
I was hoping for a test case that didn't require assertions, but this is enough to land the fix.
My mistake, I should be able to remove that as well - I'll do it as part of the commit. Thanks.