This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Use S_BITCMP1_* to replace AND in optimizeCompareInstr
ClosedPublic

Authored by rampitec on Sep 1 2021, 12:19 PM.

Diff Detail

Event Timeline

rampitec created this revision.Sep 1 2021, 12:19 PM
rampitec requested review of this revision.Sep 1 2021, 12:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2021, 12:19 PM
Herald added a subscriber: wdng. · View Herald Transcript
rampitec updated this revision to Diff 370029.Sep 1 2021, 12:45 PM

Updated comment describing the transformation.

rampitec updated this revision to Diff 370030.Sep 1 2021, 12:49 PM

Updated comment again.

rampitec updated this revision to Diff 370083.Sep 1 2021, 3:22 PM

Fixed typo in the comment.

arsenm accepted this revision.Sep 1 2021, 3:26 PM

LGTM although this looks identical in terms of code size and cycles

This revision is now accepted and ready to land.Sep 1 2021, 3:26 PM

LGTM although this looks identical in terms of code size and cycles

Yes, but is saves one SGPR used for AND.

This revision was landed with ongoing or failed builds.Sep 1 2021, 4:10 PM
This revision was automatically updated to reflect the committed changes.
foad added a comment.Sep 2 2021, 1:08 AM

Shouldn't we have normal instructions selection patterns for s_bitcmp0/1? Would this patch still be required if we had that?

Shouldn't we have normal instructions selection patterns for s_bitcmp0/1? Would this patch still be required if we had that?

I don't think so. First you cannot probably cover all scenarios with selection patterns, including patterns introduced by user's code. Second the moment you will introduce these instructions you will have to handle it everywhere, starting with moveToVALU. My idea it is better to stick with normal compares and replace it later when you done with potential transformations.

One thing which can be done is to also add this to SIShrinkInstructions, so it does not depend on the presence of a compare. That however will not allow reversing of the condition like in the subsequent patch as it needs to work together with a compare.

Lastly this optimization is independent of the selection, so agnostic to the type of ISel.

I.e. to my view relying on an ISel is a layering violation. ISel is like an FE for MI passes, they shall handle whatever is dumped on them as good as they can unless information is lost in the selection, just like a BE shall handle whatever an unoptimized FE creates unless it loses semantics info.

foad added a comment.Sep 2 2021, 1:30 AM

I.e. to my view relying on an ISel is a layering violation.

In my view ISel is responsible for selection of instructions, and anything which changes those instructions to different ones later on is a bit of a hack :)

But I take your point about moveToVALU. That does sound a bit awkward.

I.e. to my view relying on an ISel is a layering violation.

In my view ISel is responsible for selection of instructions, and anything which changes those instructions to different ones later on is a bit of a hack :)

But I take your point about moveToVALU. That does sound a bit awkward.

Everything about VALU to SALU relationship is awkward. We basically have two separate engines tied together and that does not make compiler life easy. For example x86 has finally killed their x87 coprocessor because it is awkward to deal with.