[PPC CodeGen] Fix the bitreverse.i64 intrinsic.
ClosedPublic

Authored by MaskRay on Mon, Oct 9, 1:38 PM.

Details

Summary

The two 32-bit words were swapped.

Diff Detail

Repository
rL LLVM
MaskRay created this revision.Mon, Oct 9, 1:38 PM
Carrot added a comment.Mon, Oct 9, 1:40 PM

It's better to add a test case for it.

MaskRay updated this revision to Diff 118269.Mon, Oct 9, 1:58 PM

Update test

So the bitreverse code appears to be functionally broken. Test case projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c was added to test this functionally. How does it miss the fact that the words are swapped in the register?
Actually, looking at the test case, there is no real test for functionality there (not obviously verifiable anyway). Can you please also add a straightforward test in there that will reverse a bit pattern that can be visually confirmed (of course ensuring that the optimizer doesn't just get rid of the code and emit a constant)?

For example:
0x1122334455667788 -> 11EE66AA22CC4488 (I think I got that right, but it's just a quick back-of-the-envelope conversion). Of course, similarly for 32-bit.

I'm also adding the author of the original code.

lib/Target/PowerPC/PPCInstrInfo.td
4709 ↗(On Diff #118269)

If this is the right orientation, I think a name change is in order somewhere. It doesn't make sense to do a bitwise or of something called ToHi32 and something called ExtendHi32.To64Bit (i.e. the names suggest we're or-ing the high word with the high word).

nemanjai added a subscriber: jtony.Mon, Oct 9, 2:28 PM

Where to put the test?

For projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c, the incorrect output was:

Sum1 = 807fffff800000, Sum2 = ff80000000000000

With this patch, it is:

Sum1 = 0, Sum2 = feff800000800000

The same as projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.reference_output

MaskRay updated this revision to Diff 118286.Mon, Oct 9, 3:50 PM

Name change.

MaskRay marked an inline comment as done.Mon, Oct 9, 3:50 PM

Where to put the test?

For projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c, the incorrect output was:

Sum1 = 807fffff800000, Sum2 = ff80000000000000

With this patch, it is:

Sum1 = 0, Sum2 = feff800000800000

The same as projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.reference_output

What I'm getting at is that the test case is written in a way that we can clearly miss this functional problem and still pass the test case. Which in my mind makes it a useless test case. I think it's meant do run for a certain amount of time to make sure that the fast algorithm is used rather than the obvious one, but a check that the final sum is something like 0xfeff8... is not meaningful as a reader isn't going to trace through the program to verify it is correct.
So what I'm getting at is that the program does the following before the loop:

  • Get a 32 bit known value (that isn't an integral constant expression so the optimizer doesn't get rid of it)
  • Get a 64 bit known value (same conditions as above)
  • Reverse the bits in both (both through __builtin_bitreverse() if available and through the idiom-recognition path) and print them

One possibility for where the two input values can come from is from the command (through atoi/atoll or similar).

All this because we currently don't have any testing that a reviewer an look at and say "Yup, this does the right thing." Tracing through the SDAG patterns, the generated code or the bit pattern after a a huge number of reversals isn't feasible for a human reviewer.

MaskRay removed a subscriber: jtony.
jtony added a comment.Sun, Oct 15, 7:04 PM

This is my omission in the implementation. All the 1-bit, 2-bit, 4-bit and bytes are swapped, except the high word and low word is not swapped, which is the root cause for the problem. The reason why the test case : projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c didn't catch it was probably LLVM OPT optimizes ReverseBits64(__builtin_bitreverse64(i)) to just i, so it never tests the code I added (Note I retest it, it generates the same result with gcc at -O2 but generates wrong result without optimization). I also did some other functional testing on my own machine before committing, but I used the number 0x5555555555555555, which was not general enough, so it also didn't catch this subtle bug. I should have used a more general test case. I suggest modifying the test case in the Benchmark (https://reviews.llvm.org/D35188) to be stronger to catch this (like instead of add ReverseBits64(__builtin_bitreverse64) to sum, add ReverseBits64 and __builtin_bitreverse64 separately to the sum so LLVM OPT couldn't recognize it is a reversal of a reversal.

lib/Target/PowerPC/PPCInstrInfo.td
4703 ↗(On Diff #118286)

I would add a comment like "Now byte-swap within the high word and low word both are done, next swap the high word and low word."

4709 ↗(On Diff #118286)

I would change ExtendHi32.ToLo32 to MoveHi32.ToLo32 to be more clear.

MaskRay updated this revision to Diff 119173.Mon, Oct 16, 10:20 AM

Add comment suggested by jtony

Added comment suggest by jtony.

The test can be updated to something like

for (int i = 0; i < NUM; ++i) {
  sum1 += ReverseBits32(i);
  sum2 += ReverseBits64(i);
}
for (int i = 0; i < NUM; ++i) {
  sum1 -= __builtin_bitreverse32(i);
  sum2 -= __builtin_bitreverse64(i);
}
return sum1 == 0 && sum2 == 0 ? 0 : 1;

But I use git and the document does not mention how to use arc diff to update a file that is in another repo (./projects/test-suite/)

% find . -name .git
./.git
./projects/libcxx/.git
./projects/test-suite/.git
./projects/compiler-rt/.git
./projects/libcxxabi/.git
./tools/clang/.git
./tools/clang/tools/extra/.git

jtony added inline comments.Mon, Oct 16, 12:28 PM
lib/Target/PowerPC/PPCInstrInfo.td
4711 ↗(On Diff #119173)

Can you also address this comment MaskRay? Since the verb extend is no longer suitable now. Thank you very much for all your effort put in fixing this issue.

MaskRay updated this revision to Diff 119208.Mon, Oct 16, 1:53 PM
MaskRay marked 2 inline comments as done.

Renamed ExtendHi32.ToLo32 to MoveHi32.ToLo32

This comment was removed by MaskRay.
MaskRay marked an inline comment as done.Tue, Oct 17, 9:48 AM
jtony added a comment.Wed, Oct 18, 8:32 AM

Added comment suggest by jtony.

The test can be updated to something like

for (int i = 0; i < NUM; ++i) {
  sum1 += ReverseBits32(i);
  sum2 += ReverseBits64(i);
}
for (int i = 0; i < NUM; ++i) {
  sum1 -= __builtin_bitreverse32(i);
  sum2 -= __builtin_bitreverse64(i);
}
return sum1 == 0 && sum2 == 0 ? 0 : 1;

But I use git and the document does not mention how to use arc diff to update a file that is in another repo (./projects/test-suite/)

% find . -name .git
./.git
./projects/libcxx/.git
./projects/test-suite/.git
./projects/compiler-rt/.git
./projects/libcxxabi/.git
./tools/clang/.git
./tools/clang/tools/extra/.git

The test-suite/ update should be put in a different patch, since that is in a different project.

jtony accepted this revision.Wed, Oct 18, 8:34 AM

This patch now LGTM.

This revision is now accepted and ready to land.Wed, Oct 18, 8:34 AM
MaskRay added a comment.EditedWed, Oct 18, 8:54 AM

This patch now LGTM.

"This revision is now accepted and ready to land."

Waiting for an account

This revision was automatically updated to reflect the committed changes.

Could you please fix or revert the patch?
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/8989/steps/check-llvm%20asan/logs/stdio

FAIL: LLVM :: CodeGen/PowerPC/pr33093.ll (9007 of 22445)
******************** TEST 'LLVM :: CodeGen/PowerPC/pr33093.ll' FAILED ********************
Script:
--
/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr8 < /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll
/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 < /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll
--
Exit Code: 1

Command Output (stderr):
--
/b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll:94:15: error: expected string not found in input
; CHECK-NEXT: oris 12, 5, 52428
              ^
<stdin>:90:2: note: scanning from here
 oris 9, 5, 52428
 ^

--

Created a refix D39163