This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Use branchless form for selects with -1 in either arm
ClosedPublic

Authored by reames on Oct 5 2022, 1:33 PM.

Details

Summary

We can lower these as an or with the negative of the condition value. This appears to result in significantly less branch-y code on multiple common idioms (as seen in tests).

Reviewers - I'm assuming the result of a setcc is [0,1] in this patch. This is correct right?

Diff Detail

Event Timeline

reames created this revision.Oct 5 2022, 1:33 PM
reames requested review of this revision.Oct 5 2022, 1:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2022, 1:33 PM
reames retitled this revision from [RISCV] Fold selects with -1 in either arm to [RISCV] Use branchless form for selects with -1 in either arm.Oct 5 2022, 1:34 PM
craig.topper added inline comments.Oct 5 2022, 1:47 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9420

Swapping operands is not the inverse.

reames updated this revision to Diff 465567.Oct 5 2022, 3:03 PM

Fix bug pointed out by Craig

craig.topper added inline comments.Oct 5 2022, 3:20 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
9414

Not -> Neg

llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

Is the seqz+addi equivalent to neg since a0 is [0,1]?

reames added inline comments.Oct 5 2022, 3:31 PM
llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

It is. Not sure what's causing this to be formed for RV32. We get the not form for RV64.

(I literally just worked through the two cases by hand if you want to double check my reasoning.)

reames updated this revision to Diff 465578.Oct 5 2022, 3:34 PM

Address style comment.

reames added inline comments.Oct 5 2022, 3:55 PM
llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

Sorry, typo. We got the *neg* form for RV64.

craig.topper added inline comments.Oct 5 2022, 4:00 PM
llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

Weird ordering quirk.

On RV32, we expand a select_cc that has a uaddo condition. This gives a select_cc with a compare with. The uaddo is expanded later and introduces a setcc. Now we have a select_cc with a setcc as the condition LHS. Your select_cc optimization runs before we get a chance to fold the setcc into the select_cc. This creates a seteq with a setcc input. A combine on the neg creates the addi. We don't have any combine to fix the two setccs together.

On RV64, the uaddo is expanded during type legalizaton instead of LegalizeDAG so we form a different select_cc in LegalizeDAG.

What happens if you move your new combine below the call to combine_CC?

reames added inline comments.Oct 6 2022, 1:35 PM
llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

It isn't clear to me why moving the code below another transform in what is presumably a fixed point loop would cause any difference. I tried it, and confirmed there was no change in this test. Unless maybe I misunderstood you here?

reames added inline comments.Oct 6 2022, 1:44 PM
llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

Er ignore for moment. I was on the wrong branch, and dealing with the wrong patch entirely..

craig.topper added inline comments.Oct 6 2022, 1:48 PM
llvm/test/CodeGen/RISCV/uadd_sat_plus.ll
19

I expect it will change because (select_cc (setult X, Y), 0, ne, A, B) will get combined to (select_cc X, Y, ult, A, B) by combine_CC before your combine processes the select_cc.

reames updated this revision to Diff 465869.Oct 6 2022, 1:57 PM

Apply @craig.topper's suggestion - he was right about the effect.

This revision is now accepted and ready to land.Oct 6 2022, 2:27 PM
This revision was landed with ongoing or failed builds.Oct 6 2022, 3:19 PM
This revision was automatically updated to reflect the committed changes.