This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] Improve SimplifyDemandedBits for AND and OR
Needs ReviewPublic

Authored by foad on Sep 10 2020, 9:03 AM.

Details

Summary

Simplify each operand of AND and OR using demanded bits based on the
known bits of the other operand.

Diff Detail

Event Timeline

foad created this revision.Sep 10 2020, 9:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2020, 9:03 AM
foad requested review of this revision.Sep 10 2020, 9:03 AM
foad added a subscriber: nikic.Sep 10 2020, 9:04 AM

@nikic this could perhaps increase compile time. Could you test it please?

RKSimon added inline comments.Sep 10 2020, 9:40 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1217

If we're worried about the cost of the extra SimplifyDemandedBits, we might be able to just create Op0DemandedBits/Op1DemandedBits from the knowbits and only use these on the SimplifyMultipleUseDemandedBits cases below? The calls currently only use the original DemandedBits.

foad added a comment.Sep 11 2020, 1:39 AM

@nikic this could perhaps increase compile time. Could you test it please?

Not seeing any impact from this change (https://llvm-compile-time-tracker.com/compare.php?from=a5168bdb4a25485ac62e18bdc538b4842bc9fbd9&to=c23aa3c1c163fc5343b1df5254b8c0782d4324cb&stat=instructions).

Thanks!

foad added inline comments.Sep 11 2020, 7:46 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1217

I did try using Op0DemandedBits/Op1DemandedBits for the calls to SimplifyMultipleUseDemandedBits, in addition to the current patch. It only affected one test case and the diff isn't very compelling:

diff --git a/llvm/test/CodeGen/X86/shift-parts.ll b/llvm/test/CodeGen/X86/shift-parts.ll
index da00f377020d..ef82065ad12a 100644
--- a/llvm/test/CodeGen/X86/shift-parts.ll
+++ b/llvm/test/CodeGen/X86/shift-parts.ll
@@ -12,11 +12,12 @@ define i32 @int87(i32 %uint64p_8, i1 %cond) nounwind {
 ; CHECK-NEXT:    movq g_144+{{.*}}(%rip), %rax
 ; CHECK-NEXT:    movq g_144+{{.*}}(%rip), %rcx
 ; CHECK-NEXT:    movzbl %sil, %edx
+; CHECK-NEXT:    andl $1, %edx
 ; CHECK-NEXT:    shll $6, %edx
 ; CHECK-NEXT:    .p2align 4, 0x90
 ; CHECK-NEXT:  .LBB0_1: # %for.cond
 ; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
-; CHECK-NEXT:    testb $64, %dl
+; CHECK-NEXT:    testb %dl, %dl
 ; CHECK-NEXT:    movq %rcx, %rsi
 ; CHECK-NEXT:    cmovneq %rax, %rsi
 ; CHECK-NEXT:    orl $0, %esi
craig.topper added inline comments.Sep 11 2020, 1:56 PM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1268

Can we trust Known2.One's value is correct for the bits that weren't checked due to Known.One being used to filter the DemandedBits for the previous call?

foad added inline comments.Sep 14 2020, 7:07 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1268

No we can't according to the comment on SimplifyDemandedBits:

/// expression (used to simplify the caller).  The KnownZero/One bits may only
/// be accurate for those bits in the Demanded masks.

This was a pre-existing problem with the call on line 1264.

foad added inline comments.Sep 14 2020, 7:38 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1268

It seems to me that there are various pre-existing problems in this area.

This should be a no-op but causes test case diffs: https://github.com/jayfoad/llvm-project/commit/dfdc519d44ac4f5ad00d9d5322d641e2fcc5881a

This fixes the pre-existing problem I pointed out on line 1264, but causes various code quality regressions: https://github.com/jayfoad/llvm-project/commit/90a5c28ba29edf54fb22ea36bf1ba2625ce557a4

I'm not sure how to proceed now. Should we try to get to a state where known bits information is trustworthy for undemanded bits?

RKSimon added inline comments.Sep 14 2020, 7:49 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1268

If you're willing to try to do the audit then by all means - we should at least be able to get to a position where all the KnownBits One/Zero bits will not be set unless they are definitely known - but some cases (e.g. bitcasts through vectors, partial demanded elts etc.) we might need to play safe.

RKSimon added inline comments.Jan 2 2021, 8:04 AM
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1203–1204

Pull out

APInt Op0DemandedBits = ~Known.Zero & DemandedBits;
1222

Op0DemandedBits ?

1224

Op1DemandedBits ?

yubing added a subscriber: yubing.Jan 12 2021, 7:09 PM
foad updated this revision to Diff 316872.Jan 15 2021, 1:40 AM

Rebase. Pass Op0/1DemandedBits into SimplifyMultipleUseDemandedBits.

foad marked 3 inline comments as done.Jan 15 2021, 1:43 AM
foad added inline comments.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
1222

Done. The only effect on the tests is to add an instruction outside the loop in X86/shift-parts.ll which does not seem like a significant regression to me.

foad marked an inline comment as done.Jan 15 2021, 1:49 AM

Reping

The status is: I'm still nervous about this patch because of the issue @craig.topper pointed out with SimplifyDemandedBits not necessarily returning accurate "known" information for bits that are not demanded.

foad updated this revision to Diff 330597.Mar 15 2021, 3:45 AM

Rebase.