This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Divergence driven selection for fused bitlogic
Closed · Public

Authored by rampitec on Oct 15 2021, 1:15 PM.

Details

Summary

This change adds divergence predicates for fused logical operations.
The problem with selecting a scalar fused op such as S_NOR_B32 for a
divergent value is that it has no VALU counterpart and will be split
in moveToVALU. At the same time, it prevents selection of a better
opcode on the VALU side (such as V_OR3_B32), which has no
counterpart on the SALU side.
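
To illustrate the idea, here is a toy C++ model of the selection decision. It is only a sketch: `Pattern` and `selectFusedOp` are hypothetical names and not part of LLVM, the real patch expresses this with divergence predicates on TableGen patterns, and the model assumes a target where V_OR3_B32 is available.

```cpp
#include <string>
#include <vector>

// Hypothetical toy model, not LLVM's API: two DAG shapes from the summary.
enum class Pattern {
  Nor, // not(or(a, b))
  Or3  // or(or(a, b), c)
};

// Returns the opcodes a divergence-aware selector would be expected to pick.
// Assumption: the target has V_OR3_B32 and no fused NOR on the VALU.
std::vector<std::string> selectFusedOp(Pattern P, bool Divergent) {
  switch (P) {
  case Pattern::Nor:
    if (!Divergent)
      return {"S_NOR_B32"};            // uniform: fused scalar op is fine
    return {"V_OR_B32", "V_NOT_B32"};  // divergent: no fused VALU NOR exists
  case Pattern::Or3:
    if (Divergent)
      return {"V_OR3_B32"};            // divergent: fused VALU op
    return {"S_OR_B32", "S_OR_B32"};   // uniform: no scalar three-input OR
  }
  return {};
}
```

With this model, `selectFusedOp(Pattern::Nor, true)` yields two VALU opcodes directly instead of an S_NOR_B32 that would later be split in moveToVALU.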

XNOR opcodes are left as is and selected as scalar to take advantage
of the SIInstrInfo::lowerScalarXnor() code, which can commute
operations to keep one of the two opcodes on the SALU if possible. See
the xnor.ll test for this.
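
The commuting mentioned above relies on a bitwise identity: the NOT in an XNOR can be applied to either input instead of to the result. The C++ sketch below only demonstrates that identity. The function names are hypothetical, and it does not reproduce what lowerScalarXnor() actually emits.

```cpp
#include <cstdint>

// XNOR written as NOT applied after the XOR.
uint32_t xnorNotAfter(uint32_t A, uint32_t B) { return ~(A ^ B); }

// Same result with the NOT moved onto the first operand. If that operand
// is uniform, the NOT could stay on the SALU while the XOR goes to the VALU.
uint32_t xnorNotFirst(uint32_t A, uint32_t B) { return (~A) ^ B; }

// By symmetry, the NOT can equally be moved onto the second operand.
uint32_t xnorNotSecond(uint32_t A, uint32_t B) { return A ^ (~B); }
```

All three functions return the same value for any pair of inputs, which is why the NOT can be placed on whichever operand happens to be uniform.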

Event Timeline

rampitec created this revision. Oct 15 2021, 1:15 PM
rampitec requested review of this revision. Oct 15 2021, 1:15 PM
Herald added a project: Restricted Project. Oct 15 2021, 1:15 PM
Herald added a subscriber: wdng.
foad added a comment. Oct 15 2021, 1:32 PM

Seems fine, but could you precommit the test so we can see the effect of the patch on it?

rampitec updated this revision to Diff 380097. Oct 15 2021, 1:58 PM

Rebased on the precommitted test.

> Seems fine, but could you precommit the test so we can see the effect of the patch on it?

The only real change is v_or3_b32, which is what started this patch. The rest selects differently but results in the same code, so we are just skipping moveToVALU.

foad accepted this revision. Oct 18 2021, 12:20 AM

LGTM.

This revision is now accepted and ready to land. Oct 18 2021, 12:20 AM
This revision was automatically updated to reflect the committed changes.