This is an archive of the discontinued LLVM Phabricator instance.

Handle the case of live 16-bit subregisters in X86FixupBWInsts
ClosedPublic

Authored by andrew.w.kaylor on Nov 27 2017, 2:45 PM.

Details

Summary

This fixes Bugzilla report 35240 (https://bugs.llvm.org/show_bug.cgi?id=35240).

What was happening in that case is we had an instruction the was copying R10B into R9B and we wanted to promote this to a 32-bit move. This potentially leaves the upper 24 bits of R9D as undefined. That's OK if those bits are never used, but if they are then we need to prevent the transformation.

The Machine IR before the transformation in the failing test case looks like this:

BB#3: derived from LLVM BB %entry
    Live Ins: %EAX %EDX %ESI %R8B **%R10B %R9D**
    Predecessors according to CFG: BB#2 BB#1
  %CL<def> = MOV8rr %R8B<kill>
  %EDX<def,tied1> = SHR32rCL %EDX<kill,tied0>, %EFLAGS<imp-def,dead>, %CL<imp-use>
  MOV32mr %RSP, 1, %noreg, 24, %noreg, %EDX<kill>; mem:Volatile ST4[%k]
**  %R9B<def> = MOV8rr %R10B<kill>, %R9D<imp-use,kill>, %R9D<imp-def>**
  %CX<def> = MOV16rm %RSP, 1, %noreg, 18, %noreg; mem:Volatile LD2[%f](dereferenceable)
**  %CX<def,tied1> = OR16rr %CX<kill,tied0>, %R9W, %EFLAGS<imp-def,dead>**
 <snip>

As LivePhysRegs is stepping backward, it adds R9W to the live set, but when X86FixupBWInsts checks for liveness, it only checks R8D. This patch fixes that.

When a live register is added using LivePhysRegs::addReg() all of its sub-registers are also marked as live, but not super registers. In theory, we could take advantage of this behavior and only check the 16-bit sub-register for liveness (since if the 32-bit subregister is live the 16-bit subregister will be also), but the check is cheap and it seems preferable to have the code explicitly checking the conditions it requires.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper edited edge metadata.Nov 27 2017, 2:53 PM

Do you have a reduced test case to include?

MatzeB edited edge metadata.

FWIW At a first glance this feels to me as we should have the fixes in isLive() instead as that seems to roughly be "LivePhysRegs::contains() with tweaks" where the LivePhysRegs::contains() would check super registers.

FWIW At a first glance this feels to me as we should have the fixes in isLive() instead as that seems to roughly be "LivePhysRegs::contains() with tweaks" where the LivePhysRegs::contains() would check super registers.

Yeah, that makes sense. The reason I didn't do that was that this code already had special case handling for the sub_8bit_hi subregister. I suppose a case could be made that that also should be in isLive(), but since this is the only caller of isLive() the special logic of isLive() could just as well be sunk here.

Either way, you're right that it's more distributed than it ought to be. I'll clean it up.

-Refactored getSuperRegDestIfDead() to consolidate the liveness checking logic.
-Added a test case.

a.elovikov added inline comments.Nov 28 2017, 8:16 AM
test/CodeGen/X86/fixup-bw-inst.mir
195 ↗(On Diff #124496)

I'm surprised not to see "%R9D<imp-use,kill>, %R9D<imp-def>" from the Bugzilla here. IIIC the use below does not prevent anything because it might be use-undef so I'm not sure what exactly in this test prevents the transformation.

test/CodeGen/X86/fixup-bw-inst.mir
195 ↗(On Diff #124496)

As the test is written, it hits the case where the super-register is detected as live after the instruction (because LivePhysRegs sees the use of r9w and so adds it to the live set) but we don't see either an implicit use or an implicit def of the register in the instruction, so we assume it must have been live before the instruction. I think that this will be true in actual Machine IR that has been produced normally. In this case, the MIR in the test is just incomplete. I suppose the use on line 197 isn't considered as "possibly undef" just because the MIR in the test doesn't say it is undef.

I'll update this to make the test look like the actual failure that was seen in the wild.

BTW, I did run this test without my patch and it fails (because r9d isn't live and we weren't checking r9w). The test also fails (even with my change) if I add 'implicit-def %r9d' but not 'implicit kill %r9d'.

Updated the test case.

craig.topper added inline comments.Nov 29 2017, 10:39 AM
lib/Target/X86/X86FixupBWInsts.cpp
195 ↗(On Diff #124496)

Drop the curlies?

204 ↗(On Diff #124496)

Drop the curlies?

lib/Target/X86/X86FixupBWInsts.cpp
204 ↗(On Diff #124496)

I put the curlies here (and above) because they're inside a scope that needs curlies. Someone suggested to me once that I should use or not use curlies consistently throughout a given scope, but it may be that I'm applying that too broadly. clang-format seems to be happy either way. I'm also happy either way.

Death to the curlies!

This revision is now accepted and ready to land.Nov 29 2017, 4:46 PM
a.elovikov edited edge metadata.Nov 30 2017, 4:06 AM

I'm not sure if the following is possible/legal but it fails even with this patch:

body:             |
  bb.0:
    successors:
    liveins: %ch, %bl

    %cl = MOV8rr %bl, implicit-def %cx, implicit killed %ch, implicit-def %eflags
    ; CHECK-NOT: MOV32rr
    RETQ %cx

If that's a valid testcase than "a little more checking to do." (isLive before this patch) has to be fixed. If it's invalid - we'd need an assert for that too (or some mir-verify?). However, that should not be a part of this change, probably. Hence, just asking what's your opinion about that testcase.

a.elovikov added inline comments.Nov 30 2017, 4:12 AM
test/CodeGen/X86/fixup-bw-inst.mir
197 ↗(On Diff #124800)

And it would be better to have an additional testcase based on this one but with %r9 changed to some register that has 8bit_hi subregister.

I'm not sure if the following is possible/legal but it fails even with this patch:

body:             |
  bb.0:
    successors:
    liveins: %ch, %bl

    %cl = MOV8rr %bl, implicit-def %cx, implicit killed %ch, implicit-def %eflags
    ; CHECK-NOT: MOV32rr
    RETQ %cx

If that's a valid testcase than "a little more checking to do." (isLive before this patch) has to be fixed. If it's invalid - we'd need an assert for that too (or some mir-verify?). However, that should not be a part of this change, probably. Hence, just asking what's your opinion about that testcase.

Hi Andrei,

I apologize for the long delay in responding. I've been distracted with other things and haven't committed this change yet.

I'm not terribly familiar with MIR as it appears in a test like this, but I can't see any reason that your proposed testcase would be invalid. I do think it's extremely unlikely to arise in actual generated Machine IR, since we hardly ever use the 8-bit high registers. What I'd like to do is commit the change from this review as is, and file a new Bugzilla report to track the 8bit_hi issue. Does that sound reasonable to you?

In D40524#951832, @andrew.w.kaylor wrote:
I'm not terribly familiar with MIR as it appears in a test like this, but I can't see any reason that your proposed testcase would be invalid. I do think it's extremely unlikely to arise in actual generated Machine IR, since we hardly ever use the 8-bit high registers. What I'd like to do is commit the change from this review as is, and file a new Bugzilla report to track the 8bit_hi issue. Does that sound reasonable to you?

Yes, these are two separate issues, IMO.

@andrew.w.kaylor What is happening with this patch, will you be able to commit it for Jan 3 to fix PR35240? What about the high 8-bit register issue - is there a bugzilla (and repro) for this?

@andrew.w.kaylor What is happening with this patch, will you be able to commit it for Jan 3 to fix PR35240? What about the high 8-bit register issue - is there a bugzilla (and repro) for this?

Yes. I apologize. I've had some other priorities that pushed this to the back burner, but I should be able to commit it today assuming a clean merge with the latest code base. I have not investigated the high 8-it issue, but I believe Andrei's example will demonstrate a problem. I think it is very unlikely to occur during actual compilation.

This revision was automatically updated to reflect the committed changes.

@andrew.w.kaylor What is happening with this patch, will you be able to commit it for Jan 3 to fix PR35240? What about the high 8-bit register issue - is there a bugzilla (and repro) for this?

Yes. I apologize. I've had some other priorities that pushed this to the back burner, but I should be able to commit it today assuming a clean merge with the latest code base. I have not investigated the high 8-it issue, but I believe Andrei's example will demonstrate a problem. I think it is very unlikely to occur during actual compilation.

Cheers - did you create the bug report in the end? I can't seem to find it.

Cheers - did you create the bug report in the end? I can't seem to find it.

So I never did get around to filing the bug report, but @a.elovikov has already fixed this. Thanks, Andrei!