{ARM} IfConversion does not handle un-analyzable branch correctly
Needs ReviewPublic

Authored by gael.jobin on Thu, Oct 12, 8:48 AM.

Details

Reviewers
efriedma
iteratee
Summary

The IfConversion pass does not handle diamond correctly when both TrueBB and FalseBB have IsBrAnalyzable set to false (contains indirect branch for example) and share at least one common instruction from the end.

For my detailed investigation see below.


I got a silly bug when compiling our project with the latest Clang. Here's the outputted assembly:

tst r3, #255
strbeq r6, [r7]
ldreq r6, [r4, r6, lsl #2]
strne r6, [r7, #4]
ldr r6, [r4, r6, lsl #2]
bx r6

For the code to execute correctly, either the ldr should be a ldrne instruction or the ldreq instruction should be removed. The error seems to come from the IfConvertion MachinePass. Here's is what it looks like before and after.

#BEFORE IfConversion MachinePass

BB#7:
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#5 BB#6
   STRBi12 %R5, %R6<kill>, 0, pred:14, pred:%noreg; mem:ST1[%cond.i23.i.i.i] 
   %R6<def> = LDRBi12 %R7, 0, pred:14, pred:%noreg; mem:LD1[%15](align=4) 
   %R3<def> = EORri %R6, 254, pred:14, pred:%noreg, opt:%noreg
   %R3<def> = ANDrr %R3<kill>, %R6<kill>, pred:14, pred:%noreg, opt:%noreg
   %R6<def> = MOVi 0, pred:14, pred:%noreg, opt:%noreg
   TSTri %R3<kill>, 255, pred:14, pred:%noreg, %CPSR<imp-def>; 
   Bcc <BB#9>, pred:0, pred:%CPSR<kill>;


BB#8:
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#7
   STRi12 %R6, %R7<kill>, 4, pred:14, pred:%noreg; mem:ST4[%__size_.i3.i.i.i.i] 
   %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:14, pred:%noreg; mem:LD4[%0]
   BX %R6<kill>

BB#9:
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#7
   STRBi12 %R6, %R7<kill>, 0, pred:14, pred:%noreg; mem:ST1[%21](align=4)  
   %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:14, pred:%noreg; mem:LD4[%0]
   BX %R6<kill>

#AFTER IfConversion MachinePass

BB#7:
   ...
   TSTri %R3<kill>, 255, pred:14, pred:%noreg, %CPSR<imp-def>; 
   STRBi12 %R6, %R7, 0, pred:0, pred:%CPSR; mem:ST1[%21](align=4) 
   %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:0, pred:%CPSR, %R6<imp-use,kill>; mem:LD4[%0]
   STRi12 %R6, %R7<kill>, 4, pred:1, pred:%CPSR<kill>; mem:ST4[%__size_.i3.i.i.i.i]  
   %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:14, pred:%noreg; mem:LD4[%0]
   BX %R6<kill>

Inside the IfConvertDiamondCommon() function of IfConversion.cpp, the function is called with NumDups2=1, which makes sense because BB#8 and BB#9 share the same LDRrs instruction with the same operands. The problem is the call to TTI->removeBranch() function that does not remove the BX instruction. Thus, when removing the common instructions, the BX is removed instead of the LDRs instruction.

# Before removeBranch call on MBB1
BB#9: derived from LLVM BB %if.else.i.i.i.i
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#7
   STRBi12 %R6, %R7<kill>, 0, pred:14, pred:%noreg; mem:ST1[%21](align=4) 
   %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:14, pred:%noreg; mem:LD4[%0]
   BX %R6<kill>

# After removeBranch call on MBB1, returned value:0
BB#9: derived from LLVM BB %if.else.i.i.i.i
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#7
   STRBi12 %R6, %R7<kill>, 0, pred:14, pred:%noreg; mem:ST1[%21](align=4) 
   %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:14, pred:%noreg; mem:LD4[%0]
   BX %R6<kill>

#After removing common instructions (NumDups2=1) on MBB1
BB#9: derived from LLVM BB %if.else.i.i.i.i
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#7
   STRBi12 %R6, %R7<kill>, 0, pred:14, pred:%noreg; mem:ST1[%21](align=4) 
  %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:14, pred:%noreg; mem:LD4[%0]

#After predicated on MBB1
BB#9: derived from LLVM BB %if.else.i.i.i.i
   Live Ins: %LR %R0 %R1 %R2 %R4 %R5 %R6 %R7 %R8 %R9 %R10 %R12
   Predecessors according to CFG: BB#7
   STRBi12 %R6, %R7<kill>, 0, pred:0, pred:%CPSR; mem:ST1[%21](align=4) 
  %R6<def> = LDRrs %R4, %R6<kill>, 16386, pred:0, pred:%CPSR; mem:LD4[%0]

IfConvertDiamondCommon() support different scenarios for a given EntryBB. A quick summary:

  1. FalseBB and TrueBB contains both an analyzable branch. SkipUnconditionalBranches=true meaning the branches are not checked if they are identical. ValidDiamond() checks that they end up on the same TailBB. If TailBB is either nullptr or not the same for both TrueBB and FalseBB, the diamond is not valid. IfConvertDiamondCommon() is called with RemoveBranch=true. Branches are successfully removed in both blocks, thus common instructions are correctly removed based on Dups2 value. At the end, TailBB is merged or a conditional branch is inserted. IfConversion successful.
  2. FalseBB not analyzable and TrueBB analyzable or inversely (inducing a different branch instruction). ValidDiamond() is going to fail since FalseBB/TrueBB.TrueBB will be equal to nullptr for the not analyzable one hence TT =! FT will trigger.
  3. FalseBB and TrueBB both not analyzable (indirect branch for example on ARM). So SkipUnconditionalBranches=false.
    1. If the branches are not identical, CountDuplicatedInstruction will set Dups2 to 0. Then IfConvertDiamondCommon() is called with RemoveBranch=false. The removeBranch() on TrueBB/BBI1 will silently fail. TrueBB will be predicated completely (branch instruction included). FalseBB on the other hand will be predicated excluding the branch instruction (due to the loop on DI2 when RemoveBranch=false). IfConversion successful.
    2. If the branches are identical. CountDuplicatedInstruction will set Dups2 accordingly.
      1. If there's no common instruction, same behaviour as in 3.A
      2. If there's some common instructions, this is my case !

In my case: The EntryBB is analyzable but TrueBB/FalseBB are not (AnalyzeBranches()). TrueBB/FalseBB instructions can be predicated (ScanInstruction(). feasibleDiamond()). TrueBB/FalseBB cannot SkipUnconditionalBranches (ValidDiamond()). CountDuplicatedInstruction() compute Dups1=0 as there's no common instructions found at the beginning and Dups2=1 as it found one common instruction at the end. During their computation, neither debug instructions (skipDebugInstructionsForward()) nor branch instructions (TIB->isBranch()) are counted. We note that because TrueBB and FalseBB are not analyzable, CountDuplicatedInstruction() also ensure that the branch instruction is exactly the same in both blocks. If not, it would not fail but simply set Dups2 to 0.

Now when the blocks are prepared to be merged in IfConvertDiamondCommon(), removeBranch() fails silently on TrueBB (BBI1) as indirect branches are not supported. Hence the erasing of the common instructions based on Dups2=1 value will produce an invalid block because it will remove the branch instead of the actual common instruction.

In the patch below, I manually delete the branch if and only if the block is un-analyzable and both branches are identical in both block TrueBB/FalseBB. I also removed the call to verifySameBranchInstructions() as the blocks do not have to have identical branches as explained in point 3.A above. I also removed the code related to "two predicated terminators" because the only way for DI2 to be equal to MBB2.end() is with RemoveBranch=true and NumDups2=0 which means that the branch have been removed and getFirstTerminator() will always return end() and never go through the second if statement.

Diff Detail

gael.jobin created this revision.Thu, Oct 12, 8:48 AM
gael.jobin edited the summary of this revision. (Show Details)Thu, Oct 12, 8:50 AM
gael.jobin added reviewers: efriedma, iteratee.

In general, please follow existing style for the file you're editing.

Also, next time, please upload a complete diff of the file, rather than just 3 lines of context.

lib/CodeGen/IfConversion.cpp
1753

space after if

1756

Please use reverse iterators for anything that must stop at the beginning.

1760

space after brace, and after while

Updating the diff according with the comments.

Remove undesired tab.

iteratee added inline comments.Mon, Oct 16, 9:52 AM
lib/CodeGen/IfConversion.cpp
1803

This is backwards. Reverse iterators start at the last element and go 1 past the first.

RTIE should be const, and you should be incrementing RTIB

gael.jobin updated this revision to Diff 119470.EditedWed, Oct 18, 7:25 AM

Instead of trying to manually handle the case of branches not removed by the removeBranch(), I have chosen another approach.

The CountDuplicatedInstructions() set Dups1 and Dups2 respectively. The first one is the number of common instructions from the beginning between two blocks (TrueBB and FalseBB) and the second is the number of common instructions from the end. Dups2 counts any instruction except debug info instruction or branches (isBranch()).

The real problem si when Dups2 is used in DiamondCommon(). The loop that remove common instructions is missing a condition to ignore branch instructions (it was maybe assumed that no branch instructions can exist because of the foregoing removeBranch() call ). As we have seen, removeBranch() does not remove all kind of branches. Then, we need to add a condition into the loop.

This patch simply apply the same logic inside CountDuplicatedInstructions() into DiamondCommon() when dealing with Dups2 variable, which makes sense.