Page MenuHomePhabricator

[SelectionDAG] Remove special call to LHS computeKnownBits for ANDs with constant RHS.
AbandonedPublic

Authored by craig.topper on Apr 5 2017, 1:56 PM.

Details

Summary

This code seems largely unnecessary. Most of what it was currently fixing were differences between computeKnownBits and SimplifyDemandedBits for SETCC and AssertZExt. I've submitted a separate patch to clean that up.

This leaves us with the few test issues seen here.

The AArch64/fast-isel-select.ll test is because we now try to create -1 as the constant for the XOR when we descend through SimplifyDemandedBits for the LHS of the AND. Previously we descended down the LHS through computeKnownBits and then came back up and determined the AND was useless and deleted it. Now we go through SimplifyDemandedBits first and the XOR sees that the upper bits aren't demanded so it creates a -1. This gets folded into an ORN(or not) with the normal isel. The fast isel case doesn't go through SimplifyDemandedBits so doesn't get optimized. The AND instruction was always there in the fastisel case before this patch it just wasn't being checked for by FileCheck.

The PowerPC/rlwimi-and.ll test appears to be a case where there are multiple ANDs separated by other operations in a chain. Previously we removed a late AND in the chain because the earlier AND made it redundant. But now we remove the AND earlier in the chain because we see the later AND made it redundant.

The AMDGPU/fneg.f16.ll is similar to the PowerPC test in that there are multiple ANDs in a chain. In the original code we removed the late AND and kept an earlier AND. The earlier AND got folded with an anyext load to create a zext load. Now we remove the earlier AND and fail to create the ZEXT load because there is an XOR between the load and the AND. We could recover this if we had a DAG combine to move AND with a constant above an XOR with a constant if all the bits in the XOR constant are set in the AND constant.

Diff Detail

Event Timeline

craig.topper created this revision.Apr 5 2017, 1:56 PM
RKSimon edited edge metadata.

Adding target specialists as reviewers

nemanjai added inline comments.May 4 2017, 5:22 PM
test/CodeGen/PowerPC/rlwimi-and.ll
33

It's hard to tell exactly what's going on without context (i.e. being able to produce the code with this patch). I tried applying this patch to do so but it appears this patch may depend on the previous patches (not upstream yet).

Semantically these instruction sequences are the same. Take bits 23 and 31 from two different inputs (with the former input being shifted right 8 bits). Performance wise there isn't any real difference either. So this LGTM.

craig.topper added inline comments.May 4 2017, 5:30 PM
test/CodeGen/PowerPC/rlwimi-and.ll
33

I think I may need to rebase the patch do to a lot of KnownBits related changed lately. I'll do that tonight or tomorrow

Rebase for recent changes.

I got a new regression failure in X86 for combine-and.ll. I suspect there's some missing known bits support on some target node, but I haven't looked into it yet.

Correction, the combine-and problem seems to be because SimplifyDemandedBits only considers scalar constants for shift amount, not constant splats from a build_vector.

spatel edited edge metadata.May 12 2017, 2:09 PM

Correction, the combine-and problem seems to be because SimplifyDemandedBits only considers scalar constants for shift amount, not constant splats from a build_vector.

This is an efficiency improvement, but shouldn't we fix that first to avoid the regression?

Yeah I think we should. That failure only showed up when I was rebasing. Is that something I should look into?

Yeah I think we should. That failure only showed up when I was rebasing. Is that something I should look into?

Sure. I fixed some of the opcodes to be splat-aware a few weeks ago, but I'll take any help I can get to complete the job. :)
Beware: I think I saw a miscompile from one of those opcodes when I changed it to use 'isConstOrConstSplat'. There may be something ugly going on here. I'll see if I left any notes for myself about what went wrong.

craig.topper planned changes to this revision.May 22 2017, 11:50 AM
aemerson resigned from this revision.Sep 13 2017, 11:49 AM

@craig.topper Is this patch still relevant?

It's producing more test changes now on X86 and other targets. It's producing several regressions in the combine-sdiv.ll test by failing to delete some and instructions. The other X86 changes look pretty neutral. I haven't looked at the other targets yet.

@craig.topper Is this still relevant? At least some of these changes have been fixed by improvements to SimplifyDemandedBits

craig.topper abandoned this revision.Fri, Sep 6, 9:43 AM

I still think its odd that we do something different than what InstCombine does, but its probably not worth fixing.