This is an archive of the discontinued LLVM Phabricator instance.

[MergeICmp] Split blocks that do other work.
ClosedPublic

Authored by trentxintong on Mar 13 2018, 10:51 AM.

Details

Summary

We do not try to move the instructions and split the block till we
know the blocks can be split, i.e. BCE-cmp-insts can be separated from
non-BCE-cmp-insts.

Diff Detail

Event Timeline

trentxintong created this revision.Mar 13 2018, 10:51 AM

Add a DEBUG msg.

Update comment in a test

Thanks a lot for the patch.

lib/Transforms/Scalar/MergeICmps.cpp
147

Typo: extra "this".

153

Typo: Additionally.

179

Typo: side effects.

382

This will unconditionally split the block even if any subsequent condition for merging fails. This has following issues:

  • The pass will modify code even when it does nothing.
  • The splitting is not accounted for in the value returned by simplify(), so the analysis pass might not be correctly invalidated if we later decide to do nothing.

To fix this, you could add the block to the chain with a tag and do the actual splitting in simplify().

For the first point: What about adding a test with a single splittable block and check that the pass introduces no changes ?

trentxintong added inline comments.Mar 19 2018, 7:05 AM
lib/Transforms/Scalar/MergeICmps.cpp
382

Actually tryToSplitBCECmpBlock will not move instructions around if it can not split the block. i.e. it first tests whether all non-bce-cmp instructions can be separated from bce-cmp instructions.

The way I think about the problem is that we have a chain and within the chain there may be blocks that do work other than the BCE compare which stops us from collapsing the chain into a memcmp. This block can be the first block of the chain or in middle of the chain.

In case its the first block in the chain, we can choose to split it (or discard it in case we can not split it), which is what we do with this patch. NOTE: the way split is done will not affect what is recorded in BCECmpBlock. More specifically, a new block is created and all the non-bce-cmp instructions are moved to the new BB. The old BB stays what it is as far as bce-cmp instructions are concerned.

In case its in middle of the chain, its a bit more complicated, we need to terminate the chain, process what has been collected and restart anew (This has not been implemented, but i can see how we can implement this even with this patch. i.e. we can break out of the loop and process what has been collected. Then try to rerun the chain collecting/collapsing process again on the modified IR).

courbet added inline comments.Mar 19 2018, 7:14 AM
lib/Transforms/Scalar/MergeICmps.cpp
382

Sure, I understand what this fixes and I think it's worth fixing.
My point was that I don't think that we should eagerly split the first block if is can be split, but we should first collect this block along with other "normal" BCECmp blocks, and split it only when we know that the chain itself is mergeable (in simplify). Otherwise we might split a block just to realize later that the chain was not actually mergeable, in which case we've done useless work for nothing. That's why I was suggesting adding a test with a single block (i.e. a non-mergeable chain), to check that the pass would do nothing to it.

trentxintong added inline comments.Mar 19 2018, 7:32 AM
lib/Transforms/Scalar/MergeICmps.cpp
382

Make sense. Let me do that.

Address comment.

Update an assert.

Simplify how split instructions are moved.

Update checks in a test case.

Thanks, only stylistic comments left.

lib/Transforms/Scalar/MergeICmps.cpp
112

Add: "The block might do extra work besides the atom comparison, in which case doesOtherWork() returns true. Under some conditions, the block can be split into the atom comparison part and the "other work" part (see couldSplit()).

149

canSplit() ?

160

style: the function should start with lowercase.

160

I think you can call that just split() given that this is a member function of BCECmpBlock.

546

nit: C->splitBCECmpBlock() ?

test/Transforms/MergeICmps/X86/tuple-four-int8.ll
23

The comment is not very clear. Maybe something like:
"This is merged into a 3-byte memcmp for blocks land.elem1, land.elem2 and entry (the latter being split in the cmp part and the address computation part), and an extra 1-byte compare for land.elem0."

Address stylistic comments.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 9 2018, 6:17 AM
This revision was automatically updated to reflect the committed changes.