This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Fix the rotate direction used in countl_zero()
ClosedPublic

Authored by ldionne on Sep 25 2022, 11:15 PM.

Details

Summary

This "bug" was probably not noticed because it doesn't affect any integer
type we currently support. It requires integers with more than 2x the
size of unsigned long long. However, with such types, the algorithm
used to break down the large integer into groups of size unsigned long long
didn't work because we rotated in the wrong direction.

For example, the 256 bit number (1 << 255) would yield the wrong answer
when used with the algorithm before this patch.

In particular, note that the current rotation happens to work for 128 bit
integers because it just swaps the halves in this case.

Diff Detail

Event Timeline

Delta-dev-99 created this revision.Sep 25 2022, 11:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2022, 11:15 PM
Delta-dev-99 requested review of this revision.Sep 25 2022, 11:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2022, 11:15 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
jloser added a subscriber: jloser.Sep 26 2022, 5:19 AM

Is there a test we can add to exhibit this bug and verify this fixes the problem?

Is there a test we can add to exhibit this bug and verify this fixes the problem?

Indeed, let's add a test. Also, how did you encounter this bug?

I was playing with the core libraries and details of C++ and tried compiling and running programs without standard libraries and with custom versions of standard library.
I was implementing bit operations and decided to take a look at llvm's implementation to compare with my own. I had a very similar idea, but the rotation direction was different on libcxx.
That's how I found it. I didn't had a test case.

Delta-dev-99 added a comment.EditedSep 26 2022, 2:48 PM

Possible test case: apply the operation to 256-bit integer 0x8000000000000000000000000000000000000000000000000000000000000000 (first bit from the left is set, all other bits cleared)
Correct answer: 0. The first bit from the left is set.
Current implementation's answer (probably, not tested yet): 128. It counts the bits cleared in the 2 blocks (64 bits each) in the middle of the number (bits 64 to 191) before finding the bit set in the left-most block (block 3).

Another possible test case: 256-bit integer 0x0000000000000000000000000000000080000000000000000000000000000000 (bit 127 is set, which is the left-most bit of block 1, all other bits cleared)
Correct answer: 128
Current implementation's answer (probably, not tested yet): 0

To clarify, all indexes are taken from the right, starting from 0.
Less significant bits/blocks get lower indexes

Mordante requested changes to this revision.Sep 4 2023, 10:00 AM
Mordante added a subscriber: Mordante.

We're moving to GitHub PRs it would be great when this patch can be finished before moving. Are you interested in finishing it? If not I'll take over. Please start by rebasing the patch to see whether it passes the CI.

This revision now requires changes to proceed.Sep 4 2023, 10:00 AM
ldionne commandeered this revision.Sep 12 2023, 7:40 AM
ldionne added a reviewer: Delta-dev-99.

[Github PR transition cleanup]

I revisited this in the context of cleaning up our review queue. I think the author of the patch is right in theory, however this can't be tested because we don't support types larger than 128 bits (and for these types both before and after the patch are equivalent). But if we did support e.g. 256 bit integers, our current algorithm would give the wrong answer. Here's a walkthrough with a 256 bit integer.

Current algorithm:

// We have the following 256 bit number. The answer to countl_zero should be 0 because the leftmost bit is set:
// 10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

// We start by doing __rotr(__t, unsigned long long digits == 64), which gives us the following number:
// 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

// Now countl_zero of that number (casted to unsigned long long) is:
// countl_zero(00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000) == 64

// Since that is equal to numeric_limits<unsigned long long>::digits, don't get into the if-then-break, we set __ret = 64 and we keep going.

// Now we do __rotr(__t, unsigned long long digits) again, which gives us the following number:
// 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

// We countl_zero again, which gives us 64 and we set __ret = 128. We keep going again.

// We do __rotr(__t, unsigned long long digits) again, which gives us the following number:
// 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

// This time countl_zero(10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000) is 0, which is not equal to numeric_limits<unsigned long long>::digits. So we enter the if-then-break and we return 128.

If instead we used __rotl(__t, unsigned long long digits) as proposed by this patch, we'd get this:

// This is the number we start with, same as above:
// 10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

// We now do __rotl(__t, unsigned long long digits == 64), which gives us the following number:
// 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000   10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

// Now countl_zero of that number (casted to unsigned long long) is:
// countl_zero(10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000) == 0

// That is equal to 0, which is not the same as __ulldigits, so we enter the if-then-break and we return __ret which is 0.

I also tried with the number 0x0000000000000000000000000000000080000000000000000000000000000000 as suggested and the same thing happens, we give the wrong answer. When you think about it it really makes sense, we need to "eat" the 0 digits to the left of the first digit that's 1 first, that's the only correct way to break down a large width integer into unsigned long long chunks. So I'll commandeer this patch and rebase it to finish it up.

ldionne updated this revision to Diff 556572.Sep 12 2023, 7:50 AM
ldionne retitled this revision from Summary: bug fix! Rotation direction on `__countl_zero()` probably unnoticed because only affects rare cases (does not affect 128 bit ints because rotation is effectively swap) (does not affect integers of sizes less or equal to 64bits) to [libc++] Fix the rotate direction used in countl_zero().
ldionne edited the summary of this revision. (Show Details)

Rebase and fix CI. I can't add tests because this isn't technically a "live bug" -- but the bug would trigger if we started supporting 256 bit integers.

ldionne accepted this revision.Sep 12 2023, 3:19 PM

Simd failures are unrelated, shipping.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 12 2023, 3:20 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.