Page MenuHomePhabricator

[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl

Authored by RKSimon on Mon, Nov 22, 3:07 AM.



If we only demand bits from one half of a rotation pattern, see if we can simplify to a logical shift.

For the ARM rev16 pattern, I had to drop a fold to prevent srl(bswap()) -> rotr(bswap) -> srl(bswap) infinite loops. I've replaced this with an isel PatFrag which should do the same task.

Diff Detail

Event Timeline

RKSimon created this revision.Mon, Nov 22, 3:07 AM
RKSimon requested review of this revision.Mon, Nov 22, 3:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptMon, Nov 22, 3:07 AM

I don't understand the changes to the SystemZ test cases. The new behavior seems to be simply wrong?

Note that the tests you change are specifically constructed to require bits from both arms of the rotate pattern, so they cannot be optimized to a simple shift - if this optimization does that in these cases, something seems to be wrong here.

lebedev.ri requested changes to this revision.Mon, Nov 22, 9:51 AM
lebedev.ri added inline comments.

Are you sure this is right?

This revision now requires changes to proceed.Mon, Nov 22, 9:51 AM
RKSimon planned changes to this revision.Mon, Nov 22, 10:27 AM

run out of time, will look at this later

lebedev.ri added inline comments.Mon, Nov 22, 2:44 PM

Perhaps this is what you had in mind:

craig.topper added inline comments.Mon, Nov 22, 4:41 PM

Can this be done in isel pattern? There's an existing top16Zero PatFrag that calls MaskedValueIsZero in

dmgreen added inline comments.Tue, Nov 23, 1:45 AM

That does sound good if it will work. It would be a good way to keep the same pattern working, and I would say a tablegen pattern is preferable to a new node type.

I noticed that llvm.bswap.i16 would no longer generate a rev16, which would be a shame to see. The same thing didn't seem to happen on AArch64 though, it was still fine. I'm not entirely sure what the difference was.

RKSimon updated this revision to Diff 389127.Tue, Nov 23, 2:09 AM
RKSimon edited the summary of this revision. (Show Details)

Fixed dumb typo in original variant (sorry, I was focussing too much on the SSE codegen and nothing else)

Added PatFrag to handled ARM rev16 instruction pattern.

RKSimon added inline comments.Tue, Nov 23, 2:12 AM

Thanks @lebedev.ri !


Sorry - I missed these comments before doing my own version, which handles the bswap_upperzero specific case.

lebedev.ri accepted this revision.Tue, Nov 23, 3:22 AM

LGTM, thank you.

This revision is now accepted and ready to land.Tue, Nov 23, 3:22 AM
dmgreen added inline comments.Tue, Nov 23, 6:18 AM

Thanks. This seems OK. although reusing top16Zero might be a little cleaner.

We probably need patterns for tREV16 and t2REV16 too (and some tests to go with them!) I'll update the existing tests with some more triples. I think the patterns should be the same.

RKSimon updated this revision to Diff 389322.Tue, Nov 23, 2:33 PM

Move existing top16Zero pat leaf to and reuse for ARM6/Thumb/Thumb2 rev16 patterns

dmgreen accepted this revision.Tue, Nov 23, 2:35 PM

Thanks! LGTM

This revision was landed with ongoing or failed builds.Wed, Nov 24, 3:28 AM
This revision was automatically updated to reflect the committed changes.