This is an archive of the discontinued LLVM Phabricator instance.

Fix subreg value numbers in handleMoveUp
ClosedPublic

Authored by rampitec on Mar 2 2017, 4:21 PM.

Details

Summary

The problem can occur in presence of subregs. If we are swapping two
instructions defining different subregs of the same register we will
get a new liveout from a block. We need to preserve value number for
block's liveout for successor block's livein to match.

To illustrate the problem exposed by that test: before scheduling we had:

%vreg12 [48r,96r:0)[96r,112r:1)[112r,400r:2)[464B,560r:2)  0@48r 1@96r 2@112r L00000001 [48r,208r:0)  0@48r L00000002 [48r,208r:0)  0@48r L00000004 [112r,384r:0)[464B,544r:0)  0@112r L000000
 
0B      BB#0: derived from LLVM BB %entry
            Live Ins: %SGPR0_SGPR1
16B             %vreg1<def> = COPY %SGPR0_SGPR1; SGPR_64:%vreg1
32B             %vreg20:sub0_sub1<def,read-undef> = S_LOAD_DWORDX2_IMM %vreg1, 9, 0; mem:LD8[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_128:%vreg20 SGPR_64:%vreg1
48B             %vreg12:sub0_sub1<def,read-undef> = S_LOAD_DWORDX2_IMM %vreg1, 11, 0; mem:LD8[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_128:%vreg12 SGPR_64:%vreg1
96B             %vreg12:sub3<def> = S_MOV_B32 61440; SReg_128:%vreg12
112B            %vreg12:sub2<def> = S_MOV_B32   -1; SReg_128:%vreg12
…
 
464B    BB#2: derived from LLVM BB %two
            Predecessors according to CFG: BB#0
544B            %vreg20:sub2<def> = COPY %vreg12:sub2; SReg_128:%vreg20,%vreg12
560B            %vreg20:sub3<def> = COPY %vreg12:sub3; SReg_128:%vreg20,%vreg12

%vreg12 is liveout from BB#0, segment with value #2: 112r,400r:2. Livein into #BB2 has value #2: [464B,560r:2)

After the scheduling:

%vreg12 [24r,28r:1)[28r,48r:2)[48r,400r:0)[464B,584r:2)  0@48r 1@24r 2@28r L00000001 [48r,208r:0)  0@48r L00000002 [48r,208r:0)  0@48r L00000004 [28r,384r:0)[464B,580r:0)  0@28r L00000008 [24r,400r:0)[464B,584r:0)  0@24r
 
0B      BB#0: derived from LLVM BB %entry
            Live Ins: %SGPR0_SGPR1
16B             %vreg1<def> = COPY %SGPR0_SGPR1; SGPR_64:%vreg1
24B             %vreg12:sub3<def,read-undef> = S_MOV_B32 61440; SReg_128:%vreg12
28B             %vreg12:sub2<def> = S_MOV_B32 -1; SReg_128:%vreg12
48B             %vreg12:sub0_sub1<def> = S_LOAD_DWORDX2_IMM %vreg1, 11, 0; mem:LD8[undef(addrspace=2)](nontemporal)(dereferenceable)(invariant) SReg_128:%vreg12 SGPR_64:%vreg1
…
464B    BB#2: derived from LLVM BB %two
            Predecessors according to CFG: BB#0
576B            %vreg21<def> = V_CVT_F16_F32_e32 %vreg0, %EXEC<imp-use>; VGPR_32:%vreg21,%vreg0
580B            %vreg20:sub2<def> = COPY %vreg12:sub2; SReg_128:%vreg20,%vreg12
584B            %vreg20:sub3<def> = COPY %vreg12:sub3; SReg_128:%vreg20,%vreg12

We have switched %vreg12 subreg definitions, so the new liveout from BB#0 is [48r,400r:0), value number 0. BB#2's livein is not updated: [464B,584r:2).
At this point liveout != livein and LIS is broken. Verifier will report the error.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision.Mar 2 2017, 4:21 PM
MatzeB edited edge metadata.Mar 2 2017, 4:37 PM

I remember that handleMove() tries to reuse the existing VNInfo* for the last liverange. Sounds like on of those cases is wrong, and should be fixed rather than doing a postpass expensively patching around liveranges.

The review does not have a dedicated test case since that is very
difficult to force scheduler to reorder two certain instructions, but
when I submit D30557 it will break the test CodeGen/AMDGPU/br_cc.f16.ll
without this patch.

That's why we have unittests/MI/LiveIntervalTest.cpp, please add a test.

I remember that handleMove() tries to reuse the existing VNInfo* for the last liverange. Sounds like on of those cases is wrong, and should be fixed rather than doing a postpass expensively patching around liveranges.

That was my first thought, so I debugged handleMove for this case. I did not find a place like this, but maybe I'm just not familiar enough with this piece of code. What's interesting subranges are updated properly.

That's why we have unittests/MI/LiveIntervalTest.cpp, please add a test.

Thank you for the pointer, I will look into this.

MatzeB added a comment.Mar 2 2017, 5:19 PM

I remember that handleMove() tries to reuse the existing VNInfo* for the last liverange. Sounds like on of those cases is wrong, and should be fixed rather than doing a postpass expensively patching around liveranges.

That was my first thought, so I debugged handleMove for this case. I did not find a place like this, but maybe I'm just not familiar enough with this piece of code. What's interesting subranges are updated properly.

handleMoveUp/handleMoveDown are pretty complicated, and this is certainly not the first we fine a bug in them. The last times I tried to construct simple testcase and then stepping around with a debugger to see in which of the many cases you end up and then trying to understand/fix it from there...

rampitec updated this revision to Diff 90512.Mar 3 2017, 11:18 AM
rampitec edited the summary of this revision. (Show Details)

Added test. It looks like only handleMoveUp has a problem. If the same test swaps same two instructions in other direction LIS is correct.

That is a co-incidence that handleMoveDown does not have this problem.

The LR before handleMove: [16r,32r:0)[32r,96r:1)[112B,128r:1).
After handleMoveDown: [8r,32r:0)[32r,96r:1)[112B,128r:1).
After handleMoveUp: [8r,16r:1)[16r,96r:0)[112B,128r:1).

handleMoveUp managed to change original value numbers 0 and 1 at BB#0 but did not attempt to update a successor. handleMoveDown did not brake just because it has preserved original value number order.
Neither of two have any loops which would propagate a new value number to successors. That is not sufficient to just properly connect one successor, all of them need to be updated recursively.
So as long as original value number order cannot be preserved an update loop is needed.

rampitec updated this revision to Diff 90522.Mar 3 2017, 12:57 PM

Successors only updated in a very rare case liveout value number has changed. The fast check should keep compile time as before.

Thanks for putting up the testcase!

I am still not convinced this is the best fix. Before invoking handleMove there must have been 1 value number living out of the block, we may create new value number in the process, however we should always be able to implement handleMove in a way that we re-use the value number that was live-out previous for whatever new value is live-out now. This means we have to fix handleMoveUp() and not add more core to work around a bug that really is in handleMoveUp().

The LR before handleMove: [16r,32r:0)[32r,96r:1)[112B,128r:1).
After handleMoveDown: [8r,32r:0)[32r,96r:1)[112B,128r:1).
After handleMoveUp: [8r,16r:1)[16r,96r:0)[112B,128r:1).

Yep, this is a bug, it should look like:

[8r,16r:0)[16r,96r:1)[112B,128r:1) after handleMoveUp and I see no reason why it couldn't.

Thanks for putting up the testcase!

I am still not convinced this is the best fix. Before invoking handleMove there must have been 1 value number living out of the block, we may create new value number in the process, however we should always be able to implement handleMove in a way that we re-use the value number that was live-out previous for whatever new value is live-out now. This means we have to fix handleMoveUp() and not add more core to work around a bug that really is in handleMoveUp().

It is not only move up, move down can do the same I guess. It worth nothing I see no actual case when it happens. So I think it needs to be fixed in handleMove itself.
Do you think we can just swap value numbers of old and new liveouts instead of propagating a new value number to successors?

Thanks for putting up the testcase!

I am still not convinced this is the best fix. Before invoking handleMove there must have been 1 value number living out of the block, we may create new value number in the process, however we should always be able to implement handleMove in a way that we re-use the value number that was live-out previous for whatever new value is live-out now. This means we have to fix handleMoveUp() and not add more core to work around a bug that really is in handleMoveUp().

It is not only move up, move down can do the same I guess. It worth nothing I see no actual case when it happens. So I think it needs to be fixed in handleMove itself.
Do you think we can just swap value numbers of old and new liveouts instead of propagating a new value number to successors?

I have fixed similar bugs to this in the past, there is an expectations for handleMoveUp/handleMoveDown to maintain the same value number for the live-out value. The code there already juggles value number to achieve this probably gets mixed up in of the many cases.

Thanks for putting up the testcase!

I am still not convinced this is the best fix. Before invoking handleMove there must have been 1 value number living out of the block, we may create new value number in the process, however we should always be able to implement handleMove in a way that we re-use the value number that was live-out previous for whatever new value is live-out now. This means we have to fix handleMoveUp() and not add more core to work around a bug that really is in handleMoveUp().

It is not only move up, move down can do the same I guess. It worth nothing I see no actual case when it happens. So I think it needs to be fixed in handleMove itself.
Do you think we can just swap value numbers of old and new liveouts instead of propagating a new value number to successors?

I have fixed similar bugs to this in the past, there is an expectations for handleMoveUp/handleMoveDown to maintain the same value number for the live-out value. The code there already juggles value number to achieve this probably gets mixed up in of the many cases.

In fact if I change my fix to swap liveout VN if changed instead of iterating successors that should be fast and safe if I understand it right. Also less cryptic source.

rampitec updated this revision to Diff 91429.Mar 10 2017, 3:28 PM
rampitec retitled this revision from Fix value numbers in successor blocks if liveout number has changed to Fix subreg value numbers in handleMoveUp.
rampitec edited the summary of this revision. (Show Details)

Removed post-processing and updated handleMoveUp.
Test renamed to SubRegMoveUp.

MatzeB accepted this revision.Mar 10 2017, 3:39 PM

This looks good, thanks!

This revision is now accepted and ready to land.Mar 10 2017, 3:39 PM
This revision was automatically updated to reflect the committed changes.