This is an archive of the discontinued LLVM Phabricator instance.

[Support] Fix what I think is an off by 1 bug in UnsignedDivisionByConstantInfo.
ClosedPublic

Authored by craig.topper on Dec 23 2022, 1:55 PM.

Details

Summary

The code in Hacker's Delight says
nc = -1 - (-d)%d;

But we have
NC = AllOnes - (AllOnes-D)%D

The Hacker's Delight code is written for the LeadingZeros==0 case.
AllOnes - D is not the same as -d from Hacker's Delight.

This patch changes the code to
NC = AllOnes - (AllOnes+1-D)%D

This will increment AllOnes to 0 in the LeadingZeros==0 case. This
will make it equivalent to -D. I believe this is also correct for
LeadingZeros>0.

I've been unable to find a case that changes with the new code.
I've checked all divisors for i8 and i16 udiv and the resulting
code is the same. i32 is a much larger search space to check.

Diff Detail

Event Timeline

craig.topper created this revision.Dec 23 2022, 1:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 23 2022, 1:55 PM
craig.topper requested review of this revision.Dec 23 2022, 1:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 23 2022, 1:55 PM

Those test cases test i8 udiv by 78 and 116. The tests for those are still in test/CodeGen/X86/div-by-constant.ll and did not change with this patch.

I'd like to know what, exactly, changed between 3aab6a86 and now to make this correct. Otherwise this seems sort of suspicious, even if it doesn't appear to have any effect on the current code generation.

lebedev.ri edited the summary of this revision. (Show Details)Dec 24 2022, 1:24 PM
lebedev.ri added a subscriber: lebedev.ri.

Let me add some tests here.

barannikov88 added a comment.EditedDec 24 2022, 2:43 PM

Let me add some tests here.

In case that helps, I've tried a few bitwidth variants.
With LeadingZeros = 0, I get different magics iff the divisor has LSB and MSB ones and all other bits zeros. (I.e. 0b1000...0001 for any bitwidth.)
Example: APInt(8, 0x81).

With LeadingZeros != 0, the old algorithm asserts iff the divisor is a certain power of two. Examples:
D = APInt(8, 0x80) and LeadingZeros = 1
D = APInt(8, 0x40) and LeadingZeros = 2.
The fixed version does not crash and returns what seems to be correct values (can that be enough for a unittest?)

Add a comment and assert to describe what NC is supposed to be.

craig.topper added a comment.EditedDec 24 2022, 3:07 PM

Let me add some tests here.

In case that helps, I've tried a few bitwidth variants.
With LeadingZeros = 0, I get different magics iff the divisor has LSB and MSB ones and all other bits zeros. (I.e. 0b1000...0001 for any bitwidth.)
Example: APInt(8, 0x81).

With LeadingZeros != 0, the old algorithm asserts iff the divisor is a certain power of two. Examples:
D = APInt(8, 0x80) and LeadingZeros = 1
D = APInt(8, 0x40) and LeadingZeros = 2.
The fixed version does not crash and returns what seems to be correct values (can that be enough for a unittest?)

Thanks. I thought I checked all of the i8 cases. But you're right there is a difference with 0x81.

I think what happened is that I checked by writing the code in C and InstCombine has a replacement for udiv with a divisor that large. So I didn't test udiv for that value.

I think what happened is that I checked by writing the code in C and InstCombine has a replacement for udiv with a divisor that large. So I didn't test udiv for that value.

Yeah, that's probably the reason the bug hasn't been reported yet (I found this case by directly calling the function from a simple GTest).
It won't assert with APInt(8, 0x80) on a C code either, since division by a power of two is optimized into a shift.
We probably just need a targeted test in llvm/unittests/Support

lebedev.ri accepted this revision.Dec 24 2022, 5:16 PM

I've added exhaustive test coverage for this division expansion: 066b492b747a7e00f537eab9f0196575522ec285 (that now always tests bit widths 2..12)
And additionally brute-forced it for i16, without and with this patch, using: https://godbolt.org/z/3xGzTM881.
This change does not cause any changes, our existing expansion is correct regardless.
TargetLowering::BuildUDIV() seen a lot of changes since then.

I'm going to go ahead and stamp this.

This revision is now accepted and ready to land.Dec 24 2022, 5:16 PM

@lebedev.ri
Thank you for volunteering adding the tests. Rare developer is willing to do so.

I have some concerns about the quality of the tests though.
The tests do not actually test the algorithm in question. They test a division algorithm
using the magic numbers, which is another, and huge, failure point.
Because of this failure point, the tests do not catch a real bug,
(e.g. the generated magic numbers are invalid for APInt(8, 0x81)).
Also, the tests should also cover non-zero LeadingZeros case.

Iterating over the whole search space may be nice. However, the most interesting cases of 32- and 64-bit divisor are not covered
(and can't be covered for performance reasons).

Would you mind replacing them with a few targeted tests that only call the 'get' method and check the returned result?
If you find this request inappropriate given the time you've already spent, that's absolutely OK.

Let me add some tests here.

In case that helps, I've tried a few bitwidth variants.
With LeadingZeros = 0, I get different magics iff the divisor has LSB and MSB ones and all other bits zeros. (I.e. 0b1000...0001 for any bitwidth.)
Example: APInt(8, 0x81).

With LeadingZeros != 0, the old algorithm asserts iff the divisor is a certain power of two. Examples:
D = APInt(8, 0x80) and LeadingZeros = 1
D = APInt(8, 0x40) and LeadingZeros = 2.
The fixed version does not crash and returns what seems to be correct values (can that be enough for a unittest?)

I'm not sure I expect the algorithm to work if the divisor does not have at least as many leading zeros as the dividend. The AllOnes value the algorithm creates should be >= D. Norrmally when LeadingZeros is used both the dividend and the divisor have been shifted right by that many bits.

Let me add some tests here.

In case that helps, I've tried a few bitwidth variants.
With LeadingZeros = 0, I get different magics iff the divisor has LSB and MSB ones and all other bits zeros. (I.e. 0b1000...0001 for any bitwidth.)
Example: APInt(8, 0x81).

With LeadingZeros != 0, the old algorithm asserts iff the divisor is a certain power of two. Examples:
D = APInt(8, 0x80) and LeadingZeros = 1
D = APInt(8, 0x40) and LeadingZeros = 2.
The fixed version does not crash and returns what seems to be correct values (can that be enough for a unittest?)

I'm not sure I expect the algorithm to work if the divisor does not have at least as many leading zeros as the dividend. The AllOnes value the algorithm creates should be >= D. Norrmally when LeadingZeros is used both the dividend and the divisor have been shifted right by that many bits.

ACK. It's not a nice API design, but you can't just pass random LeadingZeros there.