Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our lshr implementation so it can operate in-place.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Can we just merge lshrNear fully into lshrInPlace and just use lshrInPlace in the other place that uses lshrNear?
lib/Support/APInt.cpp | ||
---|---|---|
776 ↗ | (On Diff #94920) | Should we add an assert that Shift is less than APINT_BITS_PER_WORD? |
783 ↗ | (On Diff #94920) | Use APINT_WORD_SIZE instead of 8. |
790 ↗ | (On Diff #94920) | Use APINT_BITS_PER_WORD instead of 64. |
1221 ↗ | (On Diff #94920) | Can we divide by APINT_BITS_PER_WORD and let the compiler optimize to shift? |
1225 ↗ | (On Diff #94920) | Multiply by APINT_BITS_PER_WORD instead of shifting by 6. |
Merging lshrNear into lshrInPlace makes the code significantly less clear (relabeling the variables in the call helps a lot).
I switched the other caller (APInt::byteSwap) to use lshrInPlace. We can actually just remove that function if you prefer, since it is unused.
lib/Support/APInt.cpp | ||
---|---|---|
783 ↗ | (On Diff #94920) | Done, I also changed the parameter type from uint64_t to APInt::WordType to avoid specifying "64" there but not here. |
Thanks for using WordType. At some point I'll continue my cleanup of APInt and use that everwhere.
LGTM with the one question in the test cases.
I guess I didn't submit the comment. Should be there now.
unittests/ADT/APIntTest.cpp | ||
---|---|---|
2016 ↗ | (On Diff #95032) | Is this equivalent to APInt HugePrime(APInt::getLowBitsSet(4450, 4423))? |
unittests/ADT/APIntTest.cpp | ||
---|---|---|
2016 ↗ | (On Diff #95032) | Looks like it is :) Switched to using that prior to commit. |