The two 32-bit words were swapped.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
So the bitreverse code appears to be functionally broken. Test case projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c was added to test this functionally. How does it miss the fact that the words are swapped in the register?
Actually, looking at the test case, there is no real test for functionality there (not obviously verifiable anyway). Can you please also add a straightforward test in there that will reverse a bit pattern that can be visually confirmed (of course ensuring that the optimizer doesn't just get rid of the code and emit a constant)?
For example:
0x1122334455667788 -> 11EE66AA22CC4488 (I think I got that right, but it's just a quick back-of-the-envelope conversion). Of course, similarly for 32-bit.
I'm also adding the author of the original code.
lib/Target/PowerPC/PPCInstrInfo.td | ||
---|---|---|
4709 ↗ | (On Diff #118269) | If this is the right orientation, I think a name change is in order somewhere. It doesn't make sense to do a bitwise or of something called ToHi32 and something called ExtendHi32.To64Bit (i.e. the names suggest we're or-ing the high word with the high word). |
Where to put the test?
For projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c, the incorrect output was:
Sum1 = 807fffff800000, Sum2 = ff80000000000000
With this patch, it is:
Sum1 = 0, Sum2 = feff800000800000
The same as projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.reference_output
What I'm getting at is that the test case is written in a way that we can clearly miss this functional problem and still pass the test case. Which in my mind makes it a useless test case. I think it's meant do run for a certain amount of time to make sure that the fast algorithm is used rather than the obvious one, but a check that the final sum is something like 0xfeff8... is not meaningful as a reader isn't going to trace through the program to verify it is correct.
So what I'm getting at is that the program does the following before the loop:
- Get a 32 bit known value (that isn't an integral constant expression so the optimizer doesn't get rid of it)
- Get a 64 bit known value (same conditions as above)
- Reverse the bits in both (both through __builtin_bitreverse() if available and through the idiom-recognition path) and print them
One possibility for where the two input values can come from is from the command (through atoi/atoll or similar).
All this because we currently don't have any testing that a reviewer an look at and say "Yup, this does the right thing." Tracing through the SDAG patterns, the generated code or the bit pattern after a a huge number of reversals isn't feasible for a human reviewer.
This is my omission in the implementation. All the 1-bit, 2-bit, 4-bit and bytes are swapped, except the high word and low word is not swapped, which is the root cause for the problem. The reason why the test case : projects/test-suite/SingleSource/Benchmarks/Misc/revertBits.c didn't catch it was probably LLVM OPT optimizes ReverseBits64(__builtin_bitreverse64(i)) to just i, so it never tests the code I added (Note I retest it, it generates the same result with gcc at -O2 but generates wrong result without optimization). I also did some other functional testing on my own machine before committing, but I used the number 0x5555555555555555, which was not general enough, so it also didn't catch this subtle bug. I should have used a more general test case. I suggest modifying the test case in the Benchmark (https://reviews.llvm.org/D35188) to be stronger to catch this (like instead of add ReverseBits64(__builtin_bitreverse64) to sum, add ReverseBits64 and __builtin_bitreverse64 separately to the sum so LLVM OPT couldn't recognize it is a reversal of a reversal.
lib/Target/PowerPC/PPCInstrInfo.td | ||
---|---|---|
4703 ↗ | (On Diff #118286) | I would add a comment like "Now byte-swap within the high word and low word both are done, next swap the high word and low word." |
4709 ↗ | (On Diff #118286) | I would change ExtendHi32.ToLo32 to MoveHi32.ToLo32 to be more clear. |
Added comment suggest by jtony.
The test can be updated to something like
for (int i = 0; i < NUM; ++i) { sum1 += ReverseBits32(i); sum2 += ReverseBits64(i); } for (int i = 0; i < NUM; ++i) { sum1 -= __builtin_bitreverse32(i); sum2 -= __builtin_bitreverse64(i); } return sum1 == 0 && sum2 == 0 ? 0 : 1;
But I use git and the document does not mention how to use arc diff to update a file that is in another repo (./projects/test-suite/)
% find . -name .git
./.git
./projects/libcxx/.git
./projects/test-suite/.git
./projects/compiler-rt/.git
./projects/libcxxabi/.git
./tools/clang/.git
./tools/clang/tools/extra/.git
lib/Target/PowerPC/PPCInstrInfo.td | ||
---|---|---|
4711 ↗ | (On Diff #119173) | Can you also address this comment MaskRay? Since the verb extend is no longer suitable now. Thank you very much for all your effort put in fixing this issue. |
The test-suite/ update should be put in a different patch, since that is in a different project.
Could you please fix or revert the patch?
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/8989/steps/check-llvm%20asan/logs/stdio
FAIL: LLVM :: CodeGen/PowerPC/pr33093.ll (9007 of 22445) ******************** TEST 'LLVM :: CodeGen/PowerPC/pr33093.ll' FAILED ******************** Script: -- /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr8 < /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 < /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll -- Exit Code: 1 Command Output (stderr): -- /b/sanitizer-x86_64-linux-fast/build/llvm/test/CodeGen/PowerPC/pr33093.ll:94:15: error: expected string not found in input ; CHECK-NEXT: oris 12, 5, 52428 ^ <stdin>:90:2: note: scanning from here oris 9, 5, 52428 ^ --
I've reverted in r316276 as this broke multiple build bots.
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951