This is an archive of the discontinued LLVM Phabricator instance.

Lower consecutive select instructions correctly.
ClosedPublic

Authored by danielcdh on Sep 1 2016, 11:58 AM.

Details

Summary

If consecutive select instructions are lowered separately in CGP, it will introduce redundant condition check and branches that cannot be removed by later optimization phases. This patch lowers all consecutive select instructions at the same to to avoid inefficent code as demonstrated in https://llvm.org/bugs/show_bug.cgi?id=29095

Diff Detail

Event Timeline

danielcdh updated this revision to Diff 70041.Sep 1 2016, 11:58 AM
danielcdh retitled this revision from to Lower consecutive select instructions correctly..
danielcdh updated this object.
danielcdh added a reviewer: davidxl.
danielcdh added a subscriber: llvm-commits.
vsk added a subscriber: vsk.Sep 2 2016, 12:01 PM
vsk added inline comments.
lib/CodeGen/CodeGenPrepare.cpp
4545

Assert SI1->getParent() == SI2->getParent()?

4549

Use 'auto It', a const_iterator, and clang-format?

4553

Is SI2 not dependent on SI1 if, e.g SI2->getTrueValue() == (add SI 1)? I'm fuzzy on what 'dependent' means in this context, sorry!

4556

This would be better off as an llvm_unreachable; that should let you remove the return.

4571

I'm worried that this is quadratic in the number of SelectInsts in a BB. Is there a way to perform the isDependent check by looking at the users of a SelectInst instead of traversing the BB?

test/CodeGen/X86/codegen-prepare-select.ll
8 ↗(On Diff #70041)

Please move these checks closer to the IR which produces the expected assembly.

Similarly, please move the NOT tests closer to the IR which used to produce the corresponding assembly.

40 ↗(On Diff #70041)

It would help me follow along if you could reduce this test a bit more. E.g, besides the branch_weights, I'm not sure if any of the metadata here is really needed. Similarly, it's not clear if/why some of the operations in the loop are really needed (mul, ashr, sext). If they really are needed, it would help to have a small explanatory comment.

danielcdh updated this revision to Diff 70240.Sep 2 2016, 3:00 PM
danielcdh marked 4 inline comments as done.

update

lib/CodeGen/CodeGenPrepare.cpp
4553

function removed

4556

removed

test/CodeGen/X86/codegen-prepare-select.ll
8 ↗(On Diff #70041)

remove the entire test and added to some legacy tests.

vsk added a comment.Sep 2 2016, 3:30 PM

Thanks! I found this much easier to follow.

It might be worth adding negative tests (for >1 jae), where the conditions of adjacent SelectInst's differ, or where there is overlap between the true/false values.

danielcdh updated this revision to Diff 70246.Sep 2 2016, 3:51 PM

add more test

danielcdh updated this revision to Diff 70247.Sep 2 2016, 3:54 PM

correct comment

davidxl added inline comments.Sep 2 2016, 5:54 PM
lib/CodeGen/CodeGenPrepare.cpp
4549

Always push SI first and start from the next instruction?

4558

Early return true if ASI size is 1?

4559

Probably use a different variable name from the input parameter.

4560

I1 --> TI
I2 --> FI

4564

Add a comment here for explanation.

4613

Instead of skipping the optimization, It is better to keep the current behavior if those selects can not be grouped -- optimize them one by one. Skipping optimization completely need more discussion and can be done as follow up.

danielcdh updated this revision to Diff 70263.Sep 2 2016, 9:55 PM
danielcdh marked 4 inline comments as done.

update

lib/CodeGen/CodeGenPrepare.cpp
4558

code removed

4564

code removed

4613

Had a better fix to make sure that the PHI can be lowered even there is dependency.

davidxl added inline comments.Sep 9 2016, 12:16 PM
lib/CodeGen/CodeGenPrepare.cpp
4619

Perhaps use a local var LastSI to be ASI.back() ?

4734

Add a comment about using reverse iterator.

4741

Add a comment here describing the situation.

danielcdh updated this revision to Diff 70893.Sep 9 2016, 1:20 PM
danielcdh marked an inline comment as done.

update

danielcdh marked 2 inline comments as done.Sep 9 2016, 1:20 PM
davidxl added inline comments.Sep 9 2016, 4:30 PM
lib/CodeGen/CodeGenPrepare.cpp
4738

There is probably no need to do reverse iteration. I suggest create a helper function

Value *getTrueOrFalseValue(SelectInst *SI, bool isTrue, const SmallPtrSet<...>& Selects) {

Value *V; 
do {
   V = (isTrue? SI->getTrueValue() : SI->getFalseValue());
   VI = dyn_cast<SelectInst>(V);
   if (!VI || !Selects.count(VI))
      break;
   SI = VI;
 } while (true);
 return V;

}

Then the code below can be simplified a lot and easier to read:

PN->addIncoming(getTrueOrFalseValue(SI, true, INS), TrueBlock);
PN->addIncoming(getTrueOfFalseValue(SI, false, INS), FalseBlock);

danielcdh updated this revision to Diff 70935.Sep 9 2016, 5:30 PM

refactor the helper function

davidxl added inline comments.Sep 12 2016, 12:23 PM
lib/CodeGen/CodeGenPrepare.cpp
4586

is defined

4589

make it static.

4594

Before the loop save SI to SI0. Here an assert can be added to make SI and SI0 have same cond.

danielcdh updated this revision to Diff 71045.Sep 12 2016, 12:50 PM
danielcdh marked 2 inline comments as done.

update

davidxl accepted this revision.Sep 12 2016, 12:58 PM
davidxl edited edge metadata.

ltgm

This revision is now accepted and ready to land.Sep 12 2016, 12:58 PM
danielcdh closed this revision.Sep 12 2016, 1:31 PM