Page MenuHomePhabricator

[DAG] Add SimplifyDemandedBits support for BSWAP
AcceptedPublic

Authored by RKSimon on Feb 10 2019, 11:34 AM.

Details

Summary

This affects a couple of changes that I need some target-specific advice on:

aarch64 - we're losing this as the zext is being simplified to aext, so the canonicalization fails to confirm that the upper bits are zero. I can try adding a zext(bswap(trunc(x))) variant if that'd be useful as an alternative?

amdgpu - I think these are all benign.

powerpc - this one is annoying, I think simplifying the shifts is enough to cause DAGCombiner::MatchLoadCombine to fail. It might be possible to improve this by adding BSWAP support to calculateByteProvider.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision.Feb 10 2019, 11:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2019, 11:34 AM

and+rev16/rev32 isn't really any better than rev+lsr; that's fine as far as it goes. But please make sure we have coverage for cases where the zero-extension is free (e.g. the operand is a load, or a zeroext value, or the result of i32 arithmetic).

Hmm, this is one of those cases where it'd be awesome to have a Godbolt for the tests.

AMDGPU/bitreverse.ll looks good to me. AMDGPU/bswap.ll is _probably_ fine but I'd feel more comfortable seeing the assembly in full.

arsenm added inline comments.Feb 22 2019, 7:06 AM
test/CodeGen/AMDGPU/bitreverse.ll
117–118 ↗(On Diff #186158)

This increased the instruction count?

RKSimon updated this revision to Diff 188350.Feb 26 2019, 5:33 AM

updated with the new aarch64 tests

  • the zext(load()) test fails to recognise the upper bits are zero as it becomes anyext(load())
  • the arithmetic ubfx tests do still exercise the rotr(bswap()) combine

just need to address the powerpc regression now

arsenm accepted this revision.Feb 26 2019, 7:40 AM

LGTM

This revision is now accepted and ready to land.Feb 26 2019, 7:40 AM

@nemanjai Do you have any comments on the powerpc issue? You created the original test case at rL347288.

efriedma added inline comments.Feb 26 2019, 1:27 PM
test/CodeGen/AArch64/arm64-rev.ll
102

This is a regression... but it's sort of orthogonal to this patch, I guess. It's okay for now.

RKSimon marked an inline comment as done.Feb 27 2019, 3:13 AM
RKSimon added inline comments.
test/CodeGen/AArch64/arm64-rev.ll
102

I've raised https://bugs.llvm.org/show_bug.cgi?id=40881 to cover this - its a common issue as we make SimplifyDemandedBits more capable.

RKSimon updated this revision to Diff 199159.May 11 2019, 2:07 PM
RKSimon retitled this revision from [DAG] Add SimplifyDemandedBits support for BSWAP/BITREVERSE to [DAG] Add SimplifyDemandedBits support for BSWAP.

rebased now that BITREVERSE support has been committed in rL360534