This is an archive of the discontinued LLVM Phabricator instance.

Fix cast assertion on MS inline assembly with vector spills (PR34021)
ClosedPublic

Authored by RKSimon on Sep 4 2017, 12:10 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Sep 4 2017, 12:10 PM
coby added a subscriber: coby.Sep 5 2017, 4:46 AM

might be a bit unrelated - but do we've got a hint regarding why is this even an issue?
by all means - it doesn't seems right for an empty ms inline-asm statement to affect successful compilation, without even mentioning the involvement of the encapsulating function's return type.
Another semi-adopted MS legacy issue?

I apologize for the lack of detail for this comment, but I didn't seem to capture this at the time. When I investigated this initially, I came up with this solution and it was insufficient. I don't remember WHAT this ends up not fixing however.

When discussing it with @rnk we decided that (according to my notes):

"so, addReturnRegisterOutputs needs to look at the LLVM return type, and choose between EAX:EDX for integer / struct return types,
ST0 for small FP types, and XMM0/YMM0 etc for vectors/big FP types"

The issue is more that the *TargetInfo::addReturnRegisterOutputs doesn't properly configure things, it simply works with integers. It needs to be extended to work with int/struct types, as well as vector/FP types.

I apologize for the lack of detail for this comment, but I didn't seem to capture this at the time. When I investigated this initially, I came up with this solution and it was insufficient. I don't remember WHAT this ends up not fixing however.

@erichkeane OK, are you in a position to take on this bug? Either commandeering this patch or I abandon it and you start your own?

I apologize for the lack of detail for this comment, but I didn't seem to capture this at the time. When I investigated this initially, I came up with this solution and it was insufficient. I don't remember WHAT this ends up not fixing however.

@erichkeane OK, are you in a position to take on this bug? Either commandeering this patch or I abandon it and you start your own?

I'm not sure I have time for it unfortunately. It IS on my personal backlog, I just haven't had time lately. Has this affected someone else (besides my company's internal tests), or become more important lately?

Ah, I seem to remember that the problem here is that we don't know whether this value should be signed or unsigned at this point, so it could be CreateZExtOrTrunc OR CreateSignExtOrTrunc? However, there is no information as to whether the integer is signed here as far as I remember...

rnk edited edge metadata.Sep 5 2017, 10:08 AM

I think this is a reasonable stop-gap fix since the code isn't trying to return EAX:EDX or XMM0 from the inline asm blob. This affects any function that contains inline asm and returns a vector, which is potentially a lot of stuff.

test/CodeGen/pr34021.c
2 ↗(On Diff #113785)

Please FileCheck the IR. We should see a pattern like this:

define <4 x float> @rep()
  %retval = alloca <4 x float>
  call i64 inlineasm sideeffect {{.*}}
  store {{.*}}, <4 x float>* %retval
  %v = load {{.*}}, <4 x float>* %res
  store <4 x float> %v, <4 x float>* %retval
  %rv = load <4 x float>* %retval
  ret <4 x float> %rv

That IR pattern makes it clear that the store of the asm output is immediately killed by the store of res to retval.

erichkeane edited edge metadata.Sep 5 2017, 1:13 PM
In D37448#861211, @rnk wrote:

I think this is a reasonable stop-gap fix since the code isn't trying to return EAX:EDX or XMM0 from the inline asm blob. This affects any function that contains inline asm and returns a vector, which is potentially a lot of stuff.

Actually... I've convinced myself that this is likely the correct fix for this bug. There is likely a separate bug for vector-types and struct return types, however I'd believe those aren't really things that anyone else supports.

According to our definition, v4si is NOT a "Vector" type, since a vector type requires it be a FP value. SO, we are converting an integer stored in eax/edx to a wider integer. Since assembly doesn't have a 'signed'-ness, I would assert that widening a register-stored variable to via zero-ext is the correct fix here.

In summary, I believe this is the correct fix for the bug as reported.

According to our definition, v4si is NOT a "Vector" type, since a vector type requires it be a FP value.

Umm, what? An integer vector is still a vector. The backend will return it in xmm0 on x86 (both 32 and 64-bit).

The actual problem here is that X86_32TargetCodeGenInfo::addReturnRegisterOutputs is adding outputs which don't make any sense; 32-bit x86 only returns integers and pointers (and maybe a few other weird things like complex integers and tiny vectors?) in EAX:EDX. So we shouldn't add a constraint which involves those registers if the function returns a 128-bit vector (which goes in xmm0), or a struct (which is returned in memory), or a float (which is returned in an x87 register), etc.

Adding the "right" constraint for stuff that isn't returned in EAX would be nice, but probably isn't necessary unless we actually run into code which depends on it.

According to our definition, v4si is NOT a "Vector" type, since a vector type requires it be a FP value.

Umm, what? An integer vector is still a vector. The backend will return it in xmm0 on x86 (both 32 and 64-bit).

Ugg.. you're right. I missed that addReturnRegisterOutputs was creating a return type as an integer, so when it was being checked in this if/else tree (edited above) you're not actually looking at the return type anymore....

RKSimon updated this revision to Diff 114569.Sep 11 2017, 5:09 AM

Added checks to test case

rnk accepted this revision.Sep 11 2017, 10:26 AM

lgtm

I was hoping for a test case that didn't require assertions, but this is enough to land the fix.

This revision is now accepted and ready to land.Sep 11 2017, 10:26 AM
In D37448#866700, @rnk wrote:

lgtm

I was hoping for a test case that didn't require assertions, but this is enough to land the fix.

My mistake, I should be able to remove that as well - I'll do it as part of the commit. Thanks.

This revision was automatically updated to reflect the committed changes.