This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Generalize foldSelectCCToShiftAnd
Needs ReviewPublic

Authored by mgudim on Nov 4 2022, 6:47 AM.

Details

Summary

[DAGCombine] Generalize foldSelectCCToShiftAnd

There are many equivalent ways to express the pattern `select_cc setlt
X, 0, A, 0". In this PR we extend the code in foldSelectCCToShiftAnd
to work on all such patterns.

By the time we call foldSelectCCToShiftAnd, the non-strict integer
inequalities must have been canonicalized to strict ones. But we can't
expect that the zero constant can only appear as the "false" argument of
the select. In this patch we extend foldSelectCCToShiftAnd to handle
these "inverted" cases.

Diff Detail

Event Timeline

mgudim created this revision.Nov 4 2022, 6:47 AM
mgudim requested review of this revision.Nov 4 2022, 6:47 AM
mgudim added inline comments.Nov 4 2022, 6:51 AM
llvm/test/CodeGen/RISCV/select-to-shift-and.ll
2

This is a target-independent optimization, so I am not sure how to properly test it. I included this file only to demonstrate how the optimization works.

Please let me know what you think a proper place / form for this test should be.

mgudim updated this revision to Diff 473614.Nov 7 2022, 4:31 AM
mgudim updated this revision to Diff 474537.Nov 10 2022, 6:46 AM
mgudim added a reviewer: amehsan.
mgudim edited reviewers, added: reames, tyomitch; removed: amehsan.Nov 10 2022, 9:12 AM

Do the SETLE and SETGE cases occur? I thought DAGCombine always changes those to SETLT and SETGE by changing the constant?

mgudim added a comment.EditedNov 13 2022, 12:52 AM

Do the SETLE and SETGE cases occur? I thought DAGCombine always changes those to SETLT and SETGE by changing the constant?

@craig.topper
Thanks for taking a look! Yes, you're right. I worked on this a month ago and some cases didn't work. Today I ran the latest master and everything worked. The simplification happend in SimpifySetCC. I am not sure what changed exactly. Anyway, the approach with canonicalizing SETCC is much better than making all other combines handle all the posibilities.

I'll see why the optimization didn't happen in the AArch64 test case.

mgudim updated this revision to Diff 475094.Nov 14 2022, 3:31 AM

@craig.topper

When I previously said that now optimization works in all the cases I was wrong. You're right that we don't need to consider the non-strict inequalities. But we still need to extend the logic to work on the "inverted" patterns where the zero is the true value in the select instruction.

mgudim edited the summary of this revision. (Show Details)Nov 14 2022, 3:33 AM
mgudim updated this revision to Diff 475099.Nov 14 2022, 3:58 AM
mgudim updated this revision to Diff 475133.Nov 14 2022, 6:47 AM
craig.topper added inline comments.Nov 16 2022, 9:36 PM
llvm/test/CodeGen/RISCV/select-to-shift-and.ll
2

Why did you stop on MIR instead of assembly?

mgudim added inline comments.Nov 17 2022, 1:45 AM
llvm/test/CodeGen/RISCV/select-to-shift-and.ll
2

Since we are testing something in ISel I thought it would make sense to run only ISel.

I do see that most of other tests go all the way to assembly. I am fine with either way.

mgudim added a comment.EditedDec 8 2022, 1:47 AM

@craig.topper For the test, do you want to go all the way to assembly or stopping after isel is OK? Since the optimization is target-independent, where should the test go? Do you have further comments?

reames requested changes to this revision.Dec 8 2022, 7:50 AM
reames added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
24476

A style, and process suggestion.

These first patterns can be handled via the common idiom of reading the operands, and then swapping the operand values before the main transform if required.

This would stand along well as it's own change. This is the part which currently matches your change description anyways.

24484

This part relies on an additional observation that X == 0, the two sides of the select return the same value.

There's an interesting question about whether we want to canonicalize these towards the other form. 0 and -1 are likely cheaper constants on at least some architectures. Might be worth an experiment to see how that plays out.

Assuming we don't, this can be a separate change with a clarified comment to emphasize the additional fact stated at the top of this comment.

llvm/test/CodeGen/RISCV/select-to-shift-and.ll
2

For codegen tests, it is expected to show assembly. That's normal practice, and you should follow that. Please replace update_mir with update_llc_test_checks.py

This revision now requires changes to proceed.Dec 8 2022, 7:50 AM
mgudim updated this revision to Diff 482823.Dec 14 2022, 5:39 AM

Implemented approach suggested by @reames : instead of considering all the possible cases, assume that the zero constant always appears as the "false" option of select and if not, invert the comparison, switch the true and false option and call foldSelectCCToShiftAnd recursively.

Updated test to run all the way to final assembly.

mgudim marked 3 inline comments as done.Dec 14 2022, 5:52 AM
mgudim added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
24484

Ok, let's leave at as a separate change for later.

24499

I thought about doing this reversal all the way up in SimplifySelectCC. This may enable other transformations that previously assumed the canonicalized form. For example:

// Fold select_cc setgt X, -1, C, ~C -> xor (ashr X, BW-1), C
// Fold select_cc setlt X, 0, C, ~C -> xor (ashr X, BW-1), ~C

I haven't tried it yet, but do you think this is worth exploring?

llvm/test/CodeGen/AArch64/arm64-fmax.ll
61–62

I need to update 59 tests for these change. In some tests the checks were not auto-generated. What is the usual procedure to update many tests? Is it ok to insert auto-generated checks in the tests that didn't have them before?

mgudim marked an inline comment as done.Dec 14 2022, 8:20 AM
mgudim updated this revision to Diff 483109.Dec 15 2022, 3:00 AM

Added necessary checks missing in previous version. This fixed the failing tests.

mgudim marked an inline comment as not done.Dec 15 2022, 4:10 AM
mgudim updated this revision to Diff 483123.Dec 15 2022, 4:15 AM

Removed sve-select.s which was added by accident.

@reames Friendly ping.