This is an archive of the discontinued LLVM Phabricator instance.

[DAG] SimplifyDemandedBits - don't early-out for multiple use values
ClosedPublic

Authored by RKSimon on Jul 14 2022, 5:43 AM.

Details

Summary

SimplifyDemandedBits currently early-outs for multi-use values beyond the root node for simplification (just returning the knownbits), which is missing a number of optimizations as there are plenty of cases where we can still simplify when initially demanding all elements/bits.

@lenary has confirmed that the test cases in aea-erratum-fix.ll need refactoring and the current increase codegen is not a major concern.

Diff Detail

Event Timeline

RKSimon created this revision.Jul 14 2022, 5:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2022, 5:43 AM
RKSimon requested review of this revision.Jul 14 2022, 5:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2022, 5:43 AM
foad added a comment.Jul 14 2022, 6:38 AM

I currently have an assertion in mve-sext-masked-load.ll when '-early-live-intervals' is set - @foad added this to the test last year - can you remember why?

I did that as a regression test for a fix to how TwoAddressInstruction updates LiveIntervals, in the hope that one day we can enable -early-live-intervals by default. I will take a look at the failure.

@lenary Any thoughts on what is going so wrong in aea-erratum-fix.ll? SimplifyDemandedBits does tend to mess with specific patterns that this might be depending on.

A quick look suggests you shouldn't be worried about it. I need to revisit the tests to simplify them, because the test is just checking the inputs to the aese/aesd instructions are from particular instructions, rather than all the prior logic, and I had failed to notice how complex that test had become.

RKSimon updated this revision to Diff 444685.Jul 14 2022, 8:52 AM

Fix sad.ll regression - if the multiple-use "all demanded elts" mode failed to find any knownbits, then try again with just the original demanded elts.

foad added a comment.Jul 14 2022, 9:02 AM

I currently have an assertion in mve-sext-masked-load.ll when '-early-live-intervals' is set - @foad added this to the test last year - can you remember why?

I did that as a regression test for a fix to how TwoAddressInstruction updates LiveIntervals, in the hope that one day we can enable -early-live-intervals by default. I will take a look at the failure.

D110182 fixes the assertion failure.

Thanks! Do you think that mve failure could be reduced into something that could be used as a unit test?

or we remove the -early-live-intervals test case from mve-sext-masked-load.ll here and re-add it as part of D110182?

The code still looks OK as far as I can see. Slightly different order but still doing the same thing. Subreg liveness can be pretty nice for dealing with the nested register classes under MVE, it would be a shame to loose it.

RKSimon added inline comments.Jul 14 2022, 10:31 AM
llvm/test/CodeGen/AArch64/parity.ll
85 ↗(On Diff #444685)

I'm going to take a look at this, but for reference I'm surprised it doesn't get performed on the neon unit like ctpop: https://github.com/llvm/llvm-project/issues/56531

or we remove the -early-live-intervals test case from mve-sext-masked-load.ll here and re-add it as part of D110182?

Given that a large number of other tests already fail in trunk if -early-live-intervals is enabled - I'm going to remove it to unblock this patch.

RKSimon updated this revision to Diff 445775.Jul 19 2022, 4:13 AM
RKSimon edited the summary of this revision. (Show Details)

rebase - @foad has landed D110182 which fixes the assertion in mve-sext-masked-load.ll

llvm/test/CodeGen/AArch64/parity.ll
85 ↗(On Diff #444685)

This looks like its fixed by D127115

RKSimon retitled this revision from [DAG] SimplifyDemandedBits - don't early-out for multiple use values (WIP) to [DAG] SimplifyDemandedBits - don't early-out for multiple use values.Jul 21 2022, 8:19 AM
RKSimon edited the summary of this revision. (Show Details)
RKSimon added inline comments.
llvm/test/CodeGen/AArch64/parity.ll
85 ↗(On Diff #444685)

D130246 converts i64/i128 parity to use the neon unit

deadalnix added inline comments.Jul 22 2022, 8:14 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1120–1126

Why not start with the early returns?

RKSimon added inline comments.Jul 22 2022, 8:49 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1120–1126

I'll clean it up - we can move the max depth limit to the top now, but the multi-use case still has to come before the no demanded case.

RKSimon updated this revision to Diff 446882.Jul 22 2022, 9:57 AM

Test for multi-use checks first - this allows us to check for no demanded bits/elts before the max depth limit which avoids a regression in the x86 avg tests

RKSimon added inline comments.Jul 22 2022, 9:58 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1120–1126

This actually caused regressions because we had a couple of cases where we had no demanded bits/elts but were at the depth limit - returning UNDEF allowed for further simplifications later on.

deadalnix accepted this revision.Jul 24 2022, 6:17 PM
This revision is now accepted and ready to land.Jul 24 2022, 6:17 PM
This revision was landed with ongoing or failed builds.Jul 27 2022, 2:54 AM
This revision was automatically updated to reflect the committed changes.

Hi,

I'm seeing a crash

13:07:13 llc: ../lib/CodeGen/SelectionDAG/TargetLowering.cpp:1878: bool llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, const llvm::APInt &, const llvm::APInt &, llvm::KnownBits &, llvm::TargetLowering::TargetLoweringOpt &, unsigned int, bool) const: Assertion `!Known.hasConflict() && "Bits known to be one AND zero?"' failed.

with this patch. I'll see if I can extract a reproducer.

Hi,

I'm seeing a crash

13:07:13 llc: ../lib/CodeGen/SelectionDAG/TargetLowering.cpp:1878: bool llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, const llvm::APInt &, const llvm::APInt &, llvm::KnownBits &, llvm::TargetLowering::TargetLoweringOpt &, unsigned int, bool) const: Assertion `!Known.hasConflict() && "Bits known to be one AND zero?"' failed.

with this patch. I'll see if I can extract a reproducer.

Cheers, its almost certainly exposed an existing bug

Hi,

I'm seeing a crash

13:07:13 llc: ../lib/CodeGen/SelectionDAG/TargetLowering.cpp:1878: bool llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, const llvm::APInt &, const llvm::APInt &, llvm::KnownBits &, llvm::TargetLowering::TargetLoweringOpt &, unsigned int, bool) const: Assertion `!Known.hasConflict() && "Bits known to be one AND zero?"' failed.

with this patch. I'll see if I can extract a reproducer.

Ok, like this:

llc -march=x86-64 -mcpu=corei7 -o /dev/null sim.ll

Thanks - I see the problem (as I said, we've just exposed something not really caused it).