This is an archive of the discontinued LLVM Phabricator instance.

[KnownBits] Factor out and improve the lowbit computation for {u,s}div
ClosedPublic

Authored by goldstein.w.n on May 18 2023, 5:38 PM.

Details

Summary

There are some new cases if the division is exact:

1: If `TZ(LHS) == TZ(RHS)` then the result is always Odd
2: If `TZ(LHS) > TZ(RHS)` then the `TZ(LHS)-TZ(RHS)` bits of the
   result are zero.

Proofs: https://alive2.llvm.org/ce/z/3rAZqF

As well, return zero in known poison cases to be consistent rather
than just working about the bits we are changing.

Diff Detail

Event Timeline

goldstein.w.n created this revision.May 18 2023, 5:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2023, 5:38 PM
goldstein.w.n requested review of this revision.May 18 2023, 5:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2023, 5:38 PM
foad added inline comments.May 19 2023, 2:19 AM
llvm/lib/Support/KnownBits.cpp
757

What does the comment mean? What are we skipping?

759

Seems weird to pass this in by const ref and then immediately copy it. Why not pass it in by value?

763–774

Can I suggest a slightly more unified way of handling both of these cases?

int MinTZ = (int)LHS.countMinTrailingZeros() - (int)RHS.countMaxTrailingZeros();
int MaxTZ = (int)LHS.countMaxTrailingZeros() - (int)RHS.countMinTrailingZeros();
if (MinTZ >= 0) {
  // Result has at least MinTZ trailing zeros.
  KnownOut.Zero.setLowBits(MinTZ);
  if (MinTZ == MaxTZ) { // might also need to check this is < BitWidth ???
    // Result has exactly MinTZ trailing zeros.
    KnownOut.One.setBit(MinTZ);
  }
}
840

Nit: could use intersectWith instead of passing in Known.

goldstein.w.n marked 2 inline comments as done.May 19 2023, 12:14 PM
goldstein.w.n added inline comments.
llvm/lib/Support/KnownBits.cpp
763–774

Can I suggest a slightly more unified way of handling both of these cases?

int MinTZ = (int)LHS.countMinTrailingZeros() - (int)RHS.countMaxTrailingZeros();
int MaxTZ = (int)LHS.countMaxTrailingZeros() - (int)RHS.countMinTrailingZeros();
if (MinTZ >= 0) {
  // Result has at least MinTZ trailing zeros.
  KnownOut.Zero.setLowBits(MinTZ);
  if (MinTZ == MaxTZ) { // might also need to check this is < BitWidth ???
    // Result has exactly MinTZ trailing zeros.
    KnownOut.One.setBit(MinTZ);
  }
}

Nice! Just need to add:

if (LHS.One[0])
  Known.One.setBit(0);

Because the rest of the logic relies on knowing LHS and RHS whereas LHS Odd alone is enough to imply the lowbit.

840

Not unionWith?
We need to check conflict as the tests test alot of poison values. I thought it would be cleaner (and less bugprone) to isolate that check to where its needed rather than put it in the common path which would be required if we made unionWith.

Simplify + improve logic (thanks foad)

nikic accepted this revision.Jun 6 2023, 5:36 AM

LGTM

This revision is now accepted and ready to land.Jun 6 2023, 5:36 AM
chapuni added a subscriber: chapuni.Jun 7 2023, 6:20 AM

I am still investigating, though, is this intended as NFC?

llvm/lib/Support/KnownBits.cpp
869

Is it a compatible change?

nikic added a comment.Jun 7 2023, 6:22 AM

I am still investigating, though, is this intended as NFC?

No, this change is not NFC.

FYI, I met different behavior between amd64-ubuntu and aarch64-ubuntu. Not sure if this triggers.

FYI, I met different behavior between amd64-ubuntu and aarch64-ubuntu. Not sure if this triggers.

Are you seeing a failure potentially as a result of this? If so can you link? Do you want me to revert?

goldstein.w.n added inline comments.Jun 7 2023, 9:43 AM
llvm/lib/Support/KnownBits.cpp
869

What do you mean by compatible? This change should allow us to compute more low bits than before if that's what you're asking.

@goldstein.w.n
I am sorry. As far as I have been investigating, this is not the cause but 1st commit to trigger. The failure was hidden when I added debug codes, heisenbug.
The failure can be seen in my internal builders.

The failure is;

  • Observable for targeting aarch64. PerfReader.cpp sometimes differs.
  • I have not seen for targeting x86-64.
  • I have seen for targeting aarch64, on aarch64-linux and cross on x86_64-linux.
    • Differs between aarh64-host and x86_64-host (for targeting aarch64)
    • Differs between stage2 and stage3 on aarch64-linux.
goldstein.w.n added a comment.EditedJun 8 2023, 10:32 AM

@goldstein.w.n
I am sorry. As far as I have been investigating, this is not the cause but 1st commit to trigger. The failure was hidden when I added debug codes, heisenbug.
The failure can be seen in my internal builders.

The failure is;

  • Observable for targeting aarch64. PerfReader.cpp sometimes differs.
  • I have not seen for targeting x86-64.
  • I have seen for targeting aarch64, on aarch64-linux and cross on x86_64-linux.
    • Differs between aarh64-host and x86_64-host (for targeting aarch64)
    • Differs between stage2 and stage3 on aarch64-linux.

I see, so probably some misuse of knownbits in the aarch64 backend? Are you sure its this exact commit and not the prior one? Also it might be an incorrect "exact" flag on a division.

@goldstein.w.n I found the issue was due to D141712 and fixed (not by me) in rG282324aa4a6c29d5ce31c66f8def15d9bd8e84e4
Sorry for the noise.