This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Trim duplicate basic blocks in switch cases
AbandonedPublic

Authored by dylanmckay on Oct 27 2015, 8:39 AM.

Details

Summary

Adds a case to SimplifyCFG which checks for switch case basic blocks
that are identical and prunes them.

This breaks several tests which I am not sure how to fix.

Broken tests:

LLVM :: CodeGen/AArch64/arm64-jumptable.ll
LLVM :: CodeGen/ARM/jump-table-islands-split.ll
LLVM :: Transforms/SimplifyCFG/CoveredLookupTable.ll
LLVM :: Transforms/SimplifyCFG/X86/switch-table-bug.ll
LLVM :: Transforms/SimplifyCFG/X86/switch_to_lookup_table.ll

X86/switch_to_lookup_table.ll is trivial to fix.

Motivated by Rust issue #26496.
Fixes LLVM bug 24220.

CC'ing Phillip because he filed the Bugzilla bug.

Adding Renato and Tim as this breaks ARM and AArch64 tests.

Diff Detail

Event Timeline

dylanmckay updated this revision to Diff 38548.Oct 27 2015, 8:39 AM
dylanmckay retitled this revision from to [SimplifyCFG] Trim duplicate basic blocks in switch cases.
dylanmckay updated this object.
dylanmckay added reviewers: rengolin, t.p.northover.
dylanmckay added a subscriber: reames.
t.p.northover edited edge metadata.Oct 27 2015, 9:32 AM

I think it's a side-effect of the CFG-tidying AArch64 and ARM do after expanding atomic accesses. I'd suggest passing "-aarch64-atomic-cfg-tidy=false" and "-arm-atomic-cfg-tidy=false" at the top of each test. It's irrelevant to what they're actually trying to do.

BTW, llvm-commits isn't a subscriber here yet. Is that intentional?

rengolin set the repository for this revision to rL LLVM.
rengolin edited subscribers, added: rengolin, t.p.northover, llvm-commits; removed: reames.
sanjoy requested changes to this revision.Oct 27 2015, 11:28 AM
sanjoy edited edge metadata.
sanjoy added inline comments.
lib/Transforms/Utils/SimplifyCFG.cpp
3404

Should this live as BasicBlock::isIdenticalTo?

3407

Nit: no braces needed here.

3416

No braces needed here.

3425

I'd change the assert's message to We should've handled this case earlier. and remove the comment.

3462

Don't you need to adjust PHI nodes in CurSuccessor, since it now has a new incoming block?

This revision now requires changes to proceed.Oct 27 2015, 11:28 AM
dylanmckay updated this revision to Diff 38618.Oct 27 2015, 7:20 PM
dylanmckay edited edge metadata.

Added flags to ARM and AArch64 tests as suggested by Tim - this fixes the tests.

Added a hidden command line parameter to disable duplicate basic block merging. This fixes the two broken X86 tests, which were trivially broken due to basic block renaming. This also fixes CoveredLookupTable.ll for the same reason.

  • Moved basic block comparison into a method on BasicBlock
  • Removed redundant braces
  • Changed assert message
dylanmckay marked 4 inline comments as done.Oct 27 2015, 7:21 PM
dylanmckay added inline comments.Oct 27 2015, 7:26 PM
lib/Transforms/Utils/SimplifyCFG.cpp
3461

Does it really have a new incoming block? The block that contains the switch statement will still be the successor because all case statements are apart of the same basic block?

It is highly likely that I am misunderstanding, however.

Note that all tests are fixed now - Tim's suggestion fixed the ARM and AArch64 tests.

The X86 tests and CoveredLookupTable.ll were broken solely because they assumed the names of the successors did not change. This is irrelevant to the test, so I simply disabled duplicate block merging for them.

sanjoy requested changes to this revision.Oct 28 2015, 5:00 PM
sanjoy edited edge metadata.
sanjoy added inline comments.
include/llvm/IR/BasicBlock.h
318

Prefix the sentence with a \brief like in the other methods.

lib/IR/BasicBlock.cpp
453

Nit: spacing.

lib/Transforms/Utils/SimplifyCFG.cpp
3411

Why do you need AC and DL?

3461

PHI nodes have to have an entry for each incoming edge -- if multiple
edges come from the same predecessor then you need multiple entries in
the PHI node for that predecessor (all of which should have the same
incoming value).

So if you start with

declare void @foo()

define void @main(i1 %c) {
 entry:
  br i1 %c, label %match_case2, label %sw

 sw:
  %value = alloca i32
  store i32 5, i32* %value
  %val = load i32, i32* %value
  
  switch i32 %val, label %join [
    i32 1, label %match_case
    i32 48, label %match_case1
    i32 92, label %match_case2
    i32 23, label %match_case3
  ]

match_case:
  call void @foo()
  unreachable

match_case1:
  call void @foo()
  unreachable

match_case2:
  %x = phi i32 [ 10, %entry ], [ 20, %sw ]
  call void @foo()
  unreachable

match_case3:
  call void @foo()
  unreachable

join:
  ret void
}

transforming it to

declare void @foo()

define void @main(i1 %c) {
 entry:
  br i1 %c, label %match_case2, label %sw

 sw:
  %value = alloca i32
  store i32 5, i32* %value
  %val = load i32, i32* %value
  
  switch i32 %val, label %join [
    i32 1, label %match_case
    i32 48, label %match_case1
    i32 92, label %match_case2
    i32 23, label %match_case2 ;; change here
  ]

match_case:
  call void @foo()
  unreachable

match_case1:
  call void @foo()
  unreachable

match_case2:
  %x = phi i32 [ 10, %entry ], [ 20, %sw ]
  call void @foo()
  unreachable

match_case3:
  call void @foo()
  unreachable

join:
  ret void
}

is incorrect (i.e. you'll fail verification) -- you also need to update %x to be phi i32 [ 10, %entry ], [ 20, %sw ], [ 20, %sw ].

This revision now requires changes to proceed.Oct 28 2015, 5:00 PM
dylanmckay updated this revision to Diff 38698.Oct 28 2015, 5:09 PM
dylanmckay edited edge metadata.

Removed unused type parameters.

They were remnants from a copy-paste of another type signature.

dylanmckay marked an inline comment as done.Oct 28 2015, 5:14 PM
dylanmckay added inline comments.
include/llvm/IR/BasicBlock.h
318

The AUTOBRIEF Doxygen option was enabled a few months ago - this is no longer necessary and presents visual noise.

I will however add it if you think consistency is worth it.

sanjoy added inline comments.Oct 28 2015, 5:17 PM
include/llvm/IR/BasicBlock.h
318

this is no longer necessary and presents visual noise.

SGTM (i.e. it is fine to not add \brief).

reames edited edge metadata.Oct 28 2015, 8:33 PM

A couple of minor comments.

Have you thought about approaching this as prefix merging? If you found two successors with identical prefixes, it might be worth merging them and inserting a branch at the end to distinguish between the distinct suffixes. This would subsume your current optimization. I'm not suggesting you drop this and do that instead, but it might be an interesting follow on depending on what you actual code patterns are.

lib/IR/BasicBlock.cpp
435

I suspect this pattern already exists elsewhere in SimplifyCFG. I'd encourage you to submit a refactoring to extract it separately. I won't requite this.

Adding a maximum instructions paramater might make this more useful. (i.e. is identical to, but conservative if longer than N instructions.)

437

I believe this is a linear search implementation. (i.e. this saves nothing and is strictly extra work.)

440

Cursor1? Cur is a not a common name in LLVM. I have seen Itr used which would convey the same intent.

448

This *looks* like it will walk off the end of BB 2 if it is shorter than the this. The isIdenticalTo check should fail first since instructions won't match, but please add an assert to make this obvious.

lib/Transforms/Utils/SimplifyCFG.cpp
3432

I'm worried about large basic blocks here. Possibly cap this around 20-30 instructions?

3434

Do you need to notify the other successor this block is no longer a predecessor? PHI updates, etc?

4457

I wouldn't bother with the option.

dylanmckay marked 2 inline comments as done.Oct 28 2015, 9:28 PM

Phillip's idea is a good one - it would be fairly easy to generalize this optimization so that it performes "prefix merging".

Do you think the new optimization should still live in SimplifyCFG? One problem with keeping it in SimplifyCFG is that block merging will interfere with tests even worse than with this patch (which is why the current patch adds a command line option to disable merging), causing them to wonder "where has my basic block gone?".

If the patch were to be split into a separate pass, it would be quite a small pass which would be overkill. Perhaps it would be better to keep the optimization in SimplifyCFG and keep the command line option for testing purposes.

Thoughts?

lib/Transforms/Utils/SimplifyCFG.cpp
3461

I understand this now - thanks for the explanation Sanjoy!

reames added a subscriber: reames.Oct 28 2015, 9:39 PM

If you decide to go forward with the prefix matching, SimplifyCFG is
absolutely the right place. I'm really bothered by the fact that target
specific tests are sensitive to a target independent transform pass at
all. I didn't follow that part of the review, but any way we can fix
that? Working around it further by introducing a new pass is definitely
not something we should do.

Philip

The problem is that they are doing something akin to this:

define void @foo() {
    switch i32 %val, label %join [
        ; CHECK_NEXT: match_1
        i32 1, label %match_1
        ; CHECK_NEXT: match_2
        i32 2, label %match_2
        ; CHECK_NEXT: match_3
        i32 3, label %match_3
    ]
    match_1: br label %join
    match_2: br label %join
    match_3: br label %join

    join:
        ret void
}

They are hardcoding names of basic blocks, which this optimization mangles, so for example, all match arms would be optimized to match_1.

We could use LIT's regexps to fix this (that's my only idea) - there are only three tests which fail because of this:

LLVM :: Transforms/SimplifyCFG/CoveredLookupTable.ll
LLVM :: Transforms/SimplifyCFG/X86/switch-table-bug.ll
LLVM :: Transforms/SimplifyCFG/X86/switch_to_lookup_table.ll

dylanmckay abandoned this revision.Oct 28 2015, 10:24 PM

Upon further reflection, I believe regular expressions are the best choice (as the x86 tests shouldn't care about the names of basic blocks - they should only care that i.e. the switch statements are converted to selects).

I will fix the tests properly, remove the extraneous command line param, generalise to a basic block prefix merge optimisation and open a new diff once this is done.

Thanks for the reviews Sanjoy and Phillip, I will make sure to address all points in the next patch.

I’m not sure if the prefix merging is possible - I’ve got it to the point
where SimplifyCFG recognizes common instructions in the basic block head,
factors it into a separate block, which is then directly branched to by the
switch instruction.

The problem is that a branch to the different tails needs to be made via a
PHI node. Using this, there needs to be distinct predecessors in order to
differentiate between the two head blocks. Of course, the problem is that
both code paths into the common block come from a single switch statement
in a single basic block.

In code (after the block prefix has been factored out).

declare void @foo();

define void @main() {
entry:

switch i32 %1, label %join [

    i32 1, label %common_head
    i32 2, label %common_head
]

; the factored out head common to tail1 and tail2
common_head:

call void foo()
; SEE HERE: How to distinguish between branches from same basic block
%label_to = phi label [%tail1, %entry], [%tail2, %entry]
br %label_to

tail1: unreachable ; stub
tail2: unreachable ; stub

join:

ret void

}

Am I missing something?