This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Better legalization of ctlz/cttz
ClosedPublic

Authored by foad on Aug 4 2021, 9:05 AM.

Diff Detail

Event Timeline

foad created this revision.Aug 4 2021, 9:05 AM
foad requested review of this revision.Aug 4 2021, 9:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 4 2021, 9:05 AM
arsenm accepted this revision.Aug 4 2021, 1:40 PM
This revision is now accepted and ready to land.Aug 4 2021, 1:40 PM
hliao added a comment.Aug 4 2021, 3:21 PM

Could you add SelectionDAG support as well?

foad added a comment.EditedAug 5 2021, 2:41 AM

Could you add SelectionDAG support as well?

D107546 for the 64-bit cases, D107566 for the 32-bit cases.

foad updated this revision to Diff 364445.Aug 5 2021, 6:38 AM

Rebase.

foad added inline comments.Aug 5 2021, 6:44 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/cvt_f32_ubyte.ll
1126–1136

This is unfortunate. The problem is that when CTLZ is expanded using FFBH, AMDGPUPostLegalizerCombinerHelper::matchUCharToFloat can no longer see that CTLZ of the high half of %masked = and i64 %arg0, 255 is known to be 32. It seems like we would need a whole bunch of extra constant folds and/or known bits logic to make this work again.

This revision was landed with ongoing or failed builds.Aug 6 2021, 1:43 AM
This revision was automatically updated to reflect the committed changes.
RKSimon added inline comments.
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
2542

@foad Coverity is complaining that you've repeated the 'Opc == AMDGPU::G_CTLZ_ZERO_UNDEF' check (which is dead code) - should the second one be 'Opc == AMDGPU::G_CTTZ_ZERO_UNDEF' ?

foad added inline comments.Aug 22 2021, 12:44 PM
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
2542

Yes, and I see you fixed it in D108210 - thanks!