Page MenuHomePhabricator

[ARM] Make sure that the constant pool does not keep in the middle of an IT block.
ClosedPublic

Authored by simpal01 on Jul 12 2019, 2:25 AM.

Details

Summary

This change make sure that llvm does not emit an invalid IT block by putting the constant pool in the middle of an IT block. We have code to try to avoid putting a constant island in the middle of an IT block, but it only works if we see an IT between the one currently referencing CPE and possible insertion point. If the first instruction we look at is the VLDRD after the IT , we never see the IT and does not realize that the instruction doing the load could be in an IT block itself.

Diff Detail

Event Timeline

simpal01 created this revision.Jul 12 2019, 2:25 AM

Hello. Can you add some more context? It makes the reviews easier to read.

simpal01 edited reviewers, added: dmgreen; removed: greened.Jul 12 2019, 2:38 AM
simpal01 updated this revision to Diff 209457.Jul 12 2019, 4:06 AM

Added more context.

dmgreen added inline comments.Jul 12 2019, 9:56 AM
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1393

I'm not sure I understand this (existing) code completely. Would it be possible for this new loop to walk _past_ the end of BaseInsertOffset, making a CPE out of range of the user? Because each iteration of this outer loop can now increase Offset by more than just 2/4 (getInstSizeInBytes), and the --MI at the end of the loop might not get us back in-range.

Or would it be possible to miss instructions that should be handled in the "if (CPUIndex < NumCPUsers && CPUsers[CPUIndex].MI == &*MI)" block above because they are inside IT blocks?

1400

This assert shouldn't be in a LLVM_DEBUG. Can you move it out whilst you are here?

1403

Would it be possible to put the loop here? So if we didn't see a LastIT, but are still in a IT block, we need to get out of it one way or another. My understanding it that would only happen if we started in an IT block (but may be mistaken).

efriedma added inline comments.Jul 12 2019, 1:23 PM
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1393

An IT block is at most 4 instructions plus the "it" itself (18 bytes). The code normally tries to choose the last possible point to split, and all the relevant instructions support much larger offsets, so we should never choose a split point inside the same it block as the constant pool reference.

But we're hitting the "This could point off the end of the block" case here; for reasons I don't really understand, this cuts off the iteration well before the end of the block. Probably the right fix is to change the computation of BaseInsertOffset so it doesn't cut off the loop before the first legal split point.

1400

I think it's written this way to avoid an unused variable warning for PredReg. But yes, it should be using #ifndef NDEBUG, not LLVM_DEBUG.

llvm/test/CodeGen/ARM/constant-islands-split-IT.mir
94

Do you know why we're trying to split this block in the first place? It should be possible to place all the necessary constant pool entries after the call to __stack_chk_fail.

simpal01 marked 2 inline comments as done.Jul 16 2019, 7:57 AM
simpal01 added inline comments.
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1403

Yes. It happened to notice only when we did not see a LastIT, but are still in a IT block

llvm/test/CodeGen/ARM/constant-islands-split-IT.mir
94

When this pass iteratively placing the constant pool placements,

  1. If the CPE is placed after the water then that water get removed from the WaterList.
  2. There is a NewWaterList which will be updated with the NewIsland created.
  3. There is something called HighWaterMark which records the highest basic block where a CPEntry is placed.

In this particular test case, after the first iteration of constant pool placements, the block structure will be looking like below.The CONSTPOOL_ENTRY_6, %const.0 is added after the UserBB "BB5" because that is the only place where it can find the water in range.

BBO:

Referring %const.0

BB1:

CONSTPOOL_ENTRY_3 (%const.0)

BB2:

CONSTPOOL_ENTRY_4 (%const.1)

BB3:

CONSTPOOL_ENTRY_5 (%const.2)

BB4: SPACE 790

BB5:

Referring %const.1
Referring %const.2
Referring %const.0

BB6:

CONSTPOOL_ENTRY_6 (%const.0)

When the second iteration starts, the current placement for the CP entry CONSTPOOL_ENTRY_4 (%const.1) for the CPUser in BB5 will be out of range.Then it tries to look for water where it can place this CPE. It can see BB6 is in range. But it not only checks water is in range but also checks if it is either at a lower address than the high water mark (this is not true here.The current high watermark for this CPE is 2 which is greater than BB6) or a new water block that was created (BB6 is the new water block that was created in the end of previous iteration,but the NewWaterList gets cleared before the second iteration started) . Hence it can not find any water where it can place CONSTPOOL_ENTRY_4 (%const.1). Then it tries to split in the middle of the UserBB BB5.

simpal01 marked an inline comment as done.Jul 16 2019, 8:20 AM
simpal01 added inline comments.
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1393

Yes. It is possible for this new loop to walk past the end of BaseInsertOffset and can make a CPE out of range of the user. But i thought since this pass iteratively place or move around the constant pools untill all the CPE is in range with the corresponding CPE user, this out of range can catch and make in range in the next iteration. Not sure if this is an optimal solution though!!

simpal01 marked an inline comment as done.Jul 16 2019, 9:19 AM
simpal01 added inline comments.
llvm/test/CodeGen/ARM/constant-islands-split-IT.mir
94

just a small correction in the above writings.

if it is either at a lower address than the high water mark (this is not true here.The current high watermark for this CPE is 2 which is greater than BB6)

It should be,

if it is either at a lower address than the high water mark (this is not true here.The current HighWaterMark for this CPE is at BB2 which is already at lower address than BB6)

simpal01 updated this revision to Diff 210844.EditedJul 19 2019, 8:56 AM

Uploaded a new patch. I think this is more sensible way of fixing than the previous one.

In this particular test case , the userBB referring CPE (VLDRD %const.2, 0, 0, $cpsr) is nearly at the end of block .

t2IT 0, 1, implicit-def $itstate
 renamable $d0 = VLDRD %const.1, 0, 0, $cpsr, implicit $itstate :: (load 8 from constant-pool)
 renamable $d1 = VLDRD %const.2, 0, 0, $cpsr, implicit $itstate :: (load 8 from constant-pool)
 renamable $d2 = VLDRD %const.0, 0, 0, $cpsr, implicit $itstate :: (load 8 from constant-pool)
 $r0 = t2SUBri $r0, 12, 0, $cpsr, $noreg, implicit killed $itstate
 t2IT 0, 4, implicit-def $itstate
 $sp = tMOVr $r0, 0, $cpsr, implicit $itstate
 $sp = t2LDMIA_RET $sp, 0, killed $cpsr, def $r4, def $r5, def $r6, def $r7, def $r8, def $r9, def $r10, def $r11, def $pc, implicit killed $d0, implicit killed $d1, implicit killed $d2, implicit $sp, implicit killed $itstate
 tBL 14, $noreg, &__stack_chk_fail, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp

So the initial BaseInsertOffset calculated (iBaseInsertOffset = UserOffset + U.getMaxDisp() - UPad ) should be way down and and will at the offset after this block here .
Hence this particular condition (BaseInsertOffset + 8 >= UserBBI.postOffset()) becomes true here and does the recalculation of baseinsertoffset and make the new BaseInsertOffset just after the UserOffset ( UserOffset + TII->getInstSizeInBytes(*UserMI) + 1) . This is the one actually blowing off here

I think we can check here if this recalculated BaseInsertOffset is in the middle of IT block.If it is in the middle, change the computation of BaseInsertOffset to after the IT block.

I think only in this case IT can be spitted to keep the constant pools. In all other cases as @eli.friedman told the code never choose a split point inside the same it block as the constant pool reference.Because it always tries to choose the last possible point to split, and all the relevant instructions support much larger offsets.

Any thoughts/comments?

This makes sense to me... but I'm not sure this handles all possible cases. In particular, consider the case where both an instruction that needs a constant pool and the block's terminator are inside the it block. Then the split point is after a terminator, so the code still doesn't work. This might be hard to trigger, though... we don't normally put terminators inside it blocks. A predicated unconditional branch generally turns into a conditional branch, not an unconditional branch in an it block. And we don't support predicating jump table branches. Actually, I think the only branch we predicate using an it block is tBRIND (which is generated from an IR indirectbr). But it's still possible to hit that case, and we could add more similar cases in the future. (A quick example where an it block contains both a terminator and non-terminator instruction: echo "int f(void *g(void*,void*), void **q) { void*p = g(&&X, &&Y); if (p){*q=p; goto *p;} return 7; X: return 5; Y: return 6; }" | clang -x c - -o - -S -mllvm -stop-after=arm-cp-islands --target=armv7-eabi -O2 -mthumb.)

So to make sure this works in general, I see two options:

  1. Split the it block by inserting a new it instruction.
  2. Fix the algorithm so it doesn't force splitting a block when it clearly doesn't need to be split. I haven't dug into the algorithm here really carefully, but it doesn't make sense to split a block when the constant pool reference is close to the end of the block. (I'm not sure exactly how to define "close", but it probably includes constant pool references within 50 bytes of the end of the block, given all pc-relative loads can reach data over 500 bytes away.)

We could merge the current patch anyway, though; even if it doesn't fix everything, it shouldn't break any constructs that aren't already broken.

llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1328

I think moving the variable here is making the code less clear; better to just use a separate variable inside the if statement.

1359

It's probably more readable to refactor the logic to compute the "next" possible offset, then perform the std::max() afterwards.

simpal01 updated this revision to Diff 211108.Jul 22 2019, 8:21 AM
simpal01 marked 2 inline comments as done.

Addressed @efriedma suggestions.

carwil added a subscriber: carwil.Jul 23 2019, 6:07 AM

I think that so long as we don't end up iterating off the end of a block, trying this seems like an improvement over current trunk. Changing the way the Constant Island Pass iterates would likely be more disruptive, and enabling the assert here will let us see if there are problems elsewhere. We may find that splitting the IT block is the better option. @efriedma what do you think?

llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1358

It is probably worth copying the "Fell off end of block" assert from below to here, to make it very obvious what's wrong in case that comes up.

Changing the way the Constant Island Pass iterates would likely be more disruptive

This is definitely true.

I'm okay with the current approach for now.

simpal01 updated this revision to Diff 211464.EditedJul 24 2019, 4:18 AM

Adding assert statement in case "Fell off end of block" case comes

dmgreen accepted this revision.Jul 24 2019, 6:18 AM

OK thanks. LGTM. Lets give this a go and see how it does.

llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
1360

Better to move this up a line, so we hit the assert before trying to dereference I. Hopefully (I believe) it won't come up.

This revision is now accepted and ready to land.Jul 24 2019, 6:18 AM
This revision was automatically updated to reflect the committed changes.