This is an archive of the discontinued LLVM Phabricator instance.

[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl
ClosedPublic

Authored by RKSimon on Nov 22 2021, 3:07 AM.

Details

Summary

If we only demand bits from one half of a rotation pattern, see if we can simplify to a logical shift.

For the ARM rev16 pattern, I had to drop a fold to prevent srl(bswap()) -> rotr(bswap) -> srl(bswap) infinite loops. I've replaced this with an isel PatFrag which should do the same task.

Diff Detail

Event Timeline

RKSimon created this revision.Nov 22 2021, 3:07 AM
RKSimon requested review of this revision.Nov 22 2021, 3:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 22 2021, 3:07 AM

I don't understand the changes to the SystemZ test cases. The new behavior seems to be simply wrong?

Note that the tests you change are specifically constructed to require bits from both arms of the rotate pattern, so they cannot be optimized to a simple shift - if this optimization does that in these cases, something seems to be wrong here.

lebedev.ri requested changes to this revision.Nov 22 2021, 9:51 AM
lebedev.ri added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1749–1755

Are you sure this is right?
https://alive2.llvm.org/ce/z/EJDzlC

This revision now requires changes to proceed.Nov 22 2021, 9:51 AM
RKSimon planned changes to this revision.Nov 22 2021, 10:27 AM

run out of time, will look at this later

lebedev.ri added inline comments.Nov 22 2021, 2:44 PM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1749–1755

Perhaps this is what you had in mind:

craig.topper added inline comments.Nov 22 2021, 4:41 PM
llvm/lib/Target/ARM/ARMISelLowering.cpp
17052

Can this be done in isel pattern? There's an existing top16Zero PatFrag that calls MaskedValueIsZero in ARMInstrThumb2.td

dmgreen added inline comments.Nov 23 2021, 1:45 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
17052

That does sound good if it will work. It would be a good way to keep the same pattern working, and I would say a tablegen pattern is preferable to a new node type.

I noticed that llvm.bswap.i16 would no longer generate a rev16, which would be a shame to see. The same thing didn't seem to happen on AArch64 though, it was still fine. I'm not entirely sure what the difference was.

RKSimon updated this revision to Diff 389127.Nov 23 2021, 2:09 AM
RKSimon edited the summary of this revision. (Show Details)

Fixed dumb typo in original variant (sorry, I was focussing too much on the SSE codegen and nothing else)

Added PatFrag to handled ARM rev16 instruction pattern.

RKSimon added inline comments.Nov 23 2021, 2:12 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1749–1755

Thanks @lebedev.ri !

llvm/lib/Target/ARM/ARMISelLowering.cpp
17052

Sorry - I missed these comments before doing my own version, which handles the bswap_upperzero specific case.

lebedev.ri accepted this revision.Nov 23 2021, 3:22 AM

LGTM, thank you.

This revision is now accepted and ready to land.Nov 23 2021, 3:22 AM
dmgreen added inline comments.Nov 23 2021, 6:18 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
17052

Thanks. This seems OK. although reusing top16Zero might be a little cleaner.

We probably need patterns for tREV16 and t2REV16 too (and some tests to go with them!) I'll update the existing tests with some more triples. I think the patterns should be the same.

RKSimon updated this revision to Diff 389322.Nov 23 2021, 2:33 PM

Move existing top16Zero pat leaf to ARMInstrInfo.td and reuse for ARM6/Thumb/Thumb2 rev16 patterns

dmgreen accepted this revision.Nov 23 2021, 2:35 PM

Thanks! LGTM

This revision was landed with ongoing or failed builds.Nov 24 2021, 3:28 AM
This revision was automatically updated to reflect the committed changes.