This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GISel] Select llvm.amdgcn.fcmp intrinsics
ClosedPublic

Authored by Pierre-vh on Oct 24 2022, 4:37 AM.

Details

Summary

Adds FP CCs opcodes/selection logic, including src mods selection

Depends on D136591, D136448
Resolves #58326 (https://github.com/llvm/llvm-project/issues/58326)

Diff Detail

Event Timeline

Pierre-vh created this revision.Oct 24 2022, 4:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2022, 4:37 AM
Pierre-vh requested review of this revision.Oct 24 2022, 4:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2022, 4:37 AM
Pierre-vh added inline comments.Oct 24 2022, 4:42 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1179

Note: I used the ops used in VOPCInstructions.td:1071 - not sure if it's correct or if we want different codegen for these intrinsics.

1376

Does omod need to be selected too? How?

foad added a comment.Oct 24 2022, 5:37 AM

Add a RegBankCombine to transform the TRUE/FALSE FP CCs into simple constants for less complicated codegen.

Isn't this already done at the IR level in GCNTTIImpl::instCombineIntrinsic?

Pierre-vh updated this revision to Diff 470124.Oct 24 2022, 5:41 AM

Indeed that's done at the IR level, woops

  • Remove combine
Pierre-vh edited the summary of this revision. (Show Details)Oct 24 2022, 5:42 AM
foad added inline comments.Oct 24 2022, 5:48 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1376

This operand is clamp, not omod, and for V_CMP it enables "signalling compare". But I'm not sure if we ever make use of it in the compiler.

Pierre-vh updated this revision to Diff 470143.Oct 24 2022, 7:29 AM

DAG->SDAG prefix for filecheck

arsenm added inline comments.Oct 24 2022, 1:16 PM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1221

I think the TRUE/FALSE cases should trivially select through, no fold to constant. Although these intrinsics are deprecated

Pierre-vh added inline comments.Oct 25 2022, 1:25 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1221

I can add logic to select them if needed, but it's never going to be hit unless I write a MIR test. I initially did that but it also had a lot of copy/pasted code from selectG_CONSTANT.
Another idea would be to fold them to a constant using a RegBankSelect combine.
What do you prefer? Adding selection logic + mir test?

arsenm added inline comments.Oct 25 2022, 2:28 PM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1221

Yes, MIR test. You don't need to fold to a constant here, I expect these to literally pass through as all the other instructions here. The instructions do exist

Pierre-vh updated this revision to Diff 470776.Oct 26 2022, 4:24 AM

Add true/false pred selection

Pierre-vh marked an inline comment as done.Oct 26 2022, 4:24 AM
arsenm added inline comments.Oct 27 2022, 12:02 PM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1352

This has different behavior with exec. This doesn't and exec. For some reason, we do have V_CMP_T_* and V_CMP_F_* you can directly select to, without having to write special case code for these

Pierre-vh updated this revision to Diff 471450.Oct 28 2022, 2:16 AM
Pierre-vh marked an inline comment as done.

Emit V_CMP_TRU/F

foad added a comment.Oct 28 2022, 6:16 AM

Summary needs updating. Nits inline. LGTM apart from that.

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
1354

This is clamp not omod.

1356

Need braces around this block.

Pierre-vh edited the summary of this revision. (Show Details)Oct 28 2022, 6:25 AM
Pierre-vh updated this revision to Diff 471522.Oct 28 2022, 6:25 AM
Pierre-vh marked 2 inline comments as done.

Comments

foad accepted this revision.Oct 28 2022, 6:32 AM
This revision is now accepted and ready to land.Oct 28 2022, 6:32 AM
arsenm accepted this revision.Oct 28 2022, 8:59 AM
Pierre-vh updated this revision to Diff 472621.Nov 2 2022, 8:22 AM

Add wave32/wave64 tests

Pierre-vh requested review of this revision.Nov 2 2022, 8:23 AM

Please re-review, I added wave32 tests so it's thoroughly tested.

arsenm accepted this revision.Nov 2 2022, 8:28 AM
This revision is now accepted and ready to land.Nov 2 2022, 8:28 AM
Pierre-vh updated this revision to Diff 477096.Nov 22 2022, 1:00 AM

Rebase + had to make some changes D138044

Pierre-vh requested review of this revision.Nov 22 2022, 1:01 AM

Please take another look as I had to make some minor changes due to the rebase

arsenm accepted this revision.Nov 22 2022, 6:11 AM
This revision is now accepted and ready to land.Nov 22 2022, 6:11 AM
This revision was automatically updated to reflect the committed changes.
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.w32.ll