This is an archive of the discontinued LLVM Phabricator instance.

Differential D31968

Remove all allocation and divisions from GreatestCommonDivisor
ClosedPublic

Authored by rsmith on Apr 11 2017, 6:17 PM.

Details

Reviewers

dblaikie
craig.topper

Commits

rG55bd375b69c6: Remove all allocation and divisions from GreatestCommonDivisor
rL300252: Remove all allocation and divisions from GreatestCommonDivisor

Summary

Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our lshr implementation so it can operate in-place.
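
For readers unfamiliar with Stein's (binary) GCD, the idea in the summary can be sketched on plain 64-bit integers. This is only an illustration under stated assumptions: `steinGcd` and the portable `countTrailingZeros` helper are hypothetical names, not part of the patch, and unlike the patch this sketch strips the common power of two up front and restores it at the end.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Portable stand-in for APInt::countTrailingZeros(); V must be nonzero.
static unsigned countTrailingZeros(std::uint64_t V) {
  assert(V != 0 && "undefined for zero");
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Binary (Stein's) GCD: uses only comparison, subtraction and shifts,
// so no division or remainder is ever computed.
std::uint64_t steinGcd(std::uint64_t A, std::uint64_t B) {
  // If either operand is zero, the other is the gcd.
  if (A == 0) return B;
  if (B == 0) return A;

  // Remember the common power of two, then make both operands odd.
  unsigned ShiftA = countTrailingZeros(A);
  unsigned ShiftB = countTrailingZeros(B);
  unsigned Common = ShiftA < ShiftB ? ShiftA : ShiftB;
  A >>= ShiftA;
  B >>= ShiftB;

  // Invariant: A and B are odd, so
  //   gcd(A, B) = gcd(|A - B| / 2^k, min(A, B)).
  while (A != B) {
    if (A < B) std::swap(A, B);
    A -= B;                      // even and nonzero
    A >>= countTrailingZeros(A); // odd again
  }
  return A << Common;
}
```

Each loop iteration shrinks the larger operand, and the only per-iteration work is a subtraction, a trailing-zero count and a shift, which is what makes the approach attractive for multi-word values.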

Diff Detail

Repository: rL LLVM

Event Timeline

rsmith created this revision. Apr 11 2017, 6:17 PM

Can we just merge lshrNear fully into lshrInPlace and just use lshrInPlace in the other place that uses lshrNear?

lib/Support/APInt.cpp
776	Should we add an assert that Shift is less than APINT_BITS_PER_WORD?
783	Use APINT_WORD_SIZE instead of 8.
790	Use APINT_BITS_PER_WORD instead of 64.
1221	Can we divide by APINT_BITS_PER_WORD and let the compiler optimize to shift?
1225	Multiply by APINT_BITS_PER_WORD instead of shifting by 6.
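
The point of these inline comments can be sketched as follows: derive the word size from the word type once, then express the whole-word and intra-word parts of a shift with division and multiplication, which a compiler will normally lower to shifts for a power-of-two constant. `WordType`, `WordSize`, `BitsPerWord`, `ShiftParts` and `splitShift` are hypothetical stand-ins chosen for this sketch, not the names used in APInt.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstdint>

using WordType = std::uint64_t;                       // stand-in for APInt's word type
constexpr unsigned WordSize = sizeof(WordType);       // bytes per word
constexpr unsigned BitsPerWord = WordSize * CHAR_BIT; // bits per word

struct ShiftParts {
  unsigned FullWords; // whole words shifted out (clamped to NumWords)
  unsigned BitShift;  // leftover shift; < BitsPerWord whenever FullWords < NumWords
};

// Split a shift amount into whole words plus a sub-word remainder,
// without hard-coding 6 or 64.
ShiftParts splitShift(unsigned ShiftAmt, unsigned NumWords) {
  unsigned FullWords = std::min(ShiftAmt / BitsPerWord, NumWords);
  unsigned BitShift = ShiftAmt - FullWords * BitsPerWord;
  return {FullWords, BitShift};
}
```

For example, a shift by 70 bits on a three-word value splits into one whole word and a remaining shift of 6 bits.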

In D31968#724522, @craig.topper wrote:

Can we just merge lshrNear fully into lshrInPlace and just use lshrInPlace in the other place that uses lshrNear?

Merging lshrNear into lshrInPlace makes the code significantly less clear (relabeling the variables in the call helps a lot).

I switched the other caller (APInt::byteSwap) to use lshrInPlace. We can actually just remove that function if you prefer, since it is unused.

lib/Support/APInt.cpp
783	Done, I also changed the parameter type from `uint64_t` to `APInt::WordType` to avoid specifying "64" there but not here.
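
A minimal sketch of the in-place word shift under discussion, assuming little-endian word order (index 0 is least significant) and a destination that starts at or before the source. `shiftRightWords` is a hypothetical name modelled on the patch's lshrNear; it is not the committed code.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <cstring>

using WordType = std::uint64_t; // stand-in for APInt's word type
constexpr unsigned BitsPerWord = sizeof(WordType) * CHAR_BIT;

// Logical right shift of Words words from Src to Dst by Shift bits.
// The ranges may overlap as long as Src >= Dst: every source word is read
// before the destination word that could alias it is written.
void shiftRightWords(WordType *Dst, const WordType *Src, unsigned Words,
                     unsigned Shift) {
  assert(Shift < BitsPerWord && "shift must be smaller than a word");
  if (Words == 0)
    return;

  // A zero shift is a plain (possibly overlapping) copy; handling it here
  // also avoids shifting a word by BitsPerWord below.
  if (Shift == 0) {
    std::memmove(Dst, Src, Words * sizeof(WordType));
    return;
  }

  WordType Low = Src[0];
  for (unsigned I = 1; I != Words; ++I) {
    WordType High = Src[I];
    Dst[I - 1] = (Low >> Shift) | (High << (BitsPerWord - Shift));
    Low = High;
  }
  Dst[Words - 1] = Low >> Shift;
}
```

Calling it with `Dst == Src` shifts by less than a word in place; calling it with `Src == Dst + k` additionally drops `k` whole low words, after which the caller zeroes the vacated high words.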

rsmith updated this revision to Diff 95032. Apr 12 2017, 2:41 PM

rsmith marked an inline comment as done.

Thanks for using WordType. At some point I'll continue my cleanup of APInt and use that everywhere.

LGTM with the one question in the test cases.

This revision is now accepted and ready to land. Apr 12 2017, 2:52 PM

In D31968#725369, @craig.topper wrote:

LGTM with the one question in the test cases.

What's the question? Phab doesn't seem to be showing it.

I guess I didn't submit the comment. Should be there now.

unittests/ADT/APIntTest.cpp
2016	Is this equivalent to APInt HugePrime(APInt::getLowBitsSet(4450, 4423))?

rsmith marked an inline comment as done. Apr 13 2017, 1:41 PM

rsmith added inline comments.

unittests/ADT/APIntTest.cpp
2016	Looks like it is :) Switched to using that prior to commit.

Closed by commit rL300252: Remove all allocation and divisions from GreatestCommonDivisor (authored by rsmith). Apr 13 2017, 1:42 PM

This revision was automatically updated to reflect the committed changes.

rsmith marked an inline comment as done.

Accidentally committed a couple of extra files, reverted in r300253.

Revision Contents

Path

Size

include/

llvm/

ADT/

APInt.h

11 lines

lib/

Support/

APInt.cpp

139 lines

unittests/

ADT/

APIntTest.cpp

44 lines

Diff 94920

include/llvm/ADT/APInt.h

[...]	public:
/// \brief Arithmetic right-shift function.		/// \brief Arithmetic right-shift function.
///		///
/// Arithmetic right-shift this APInt by shiftAmt.		/// Arithmetic right-shift this APInt by shiftAmt.
APInt ashr(unsigned shiftAmt) const;		APInt ashr(unsigned shiftAmt) const;

/// \brief Logical right-shift function.		/// \brief Logical right-shift function.
///		///
/// Logical right-shift this APInt by shiftAmt.		/// Logical right-shift this APInt by shiftAmt.
APInt lshr(unsigned shiftAmt) const;		APInt lshr(unsigned shiftAmt) const {
		APInt R(*this);
		R.lshrInPlace(shiftAmt);
		return R;
		}

		/// Logical right-shift this APInt by shiftAmt in place.
		void lshrInPlace(unsigned shiftAmt);

/// \brief Left-shift function.		/// \brief Left-shift function.
///		///
/// Left-shift this APInt by shiftAmt.		/// Left-shift this APInt by shiftAmt.
APInt shl(unsigned shiftAmt) const {		APInt shl(unsigned shiftAmt) const {
assert(shiftAmt <= BitWidth && "Invalid shift amount");		assert(shiftAmt <= BitWidth && "Invalid shift amount");
if (isSingleWord()) {		if (isSingleWord()) {
if (shiftAmt >= BitWidth)		if (shiftAmt >= BitWidth)
[...]	inline const APInt &umin(const APInt &A, const APInt &B) {
return A.ult(B) ? A : B;		return A.ult(B) ? A : B;
}		}

/// \brief Determine the larger of two APInts considered to be unsigned.		/// \brief Determine the larger of two APInts considered to be unsigned.
inline const APInt &umax(const APInt &A, const APInt &B) {		inline const APInt &umax(const APInt &A, const APInt &B) {
return A.ugt(B) ? A : B;		return A.ugt(B) ? A : B;
}		}

/// \brief Compute GCD of two APInt values.		/// \brief Compute GCD of two unsigned APInt values.
///		///
/// This function returns the greatest common divisor of the two APInt values		/// This function returns the greatest common divisor of the two APInt values
/// using Euclid's algorithm.		/// using Euclid's algorithm.
///		///
/// \returns the greatest common divisor of A and B.		/// \returns the greatest common divisor of A and B.
APInt GreatestCommonDivisor(APInt A, APInt B);		APInt GreatestCommonDivisor(APInt A, APInt B);

/// \brief Converts the given APInt to a double value.		/// \brief Converts the given APInt to a double value.
[...]

lib/Support/APInt.cpp

[...]

unsigned APInt::countPopulationSlowCase() const {		unsigned APInt::countPopulationSlowCase() const {
unsigned Count = 0;		unsigned Count = 0;
for (unsigned i = 0; i < getNumWords(); ++i)		for (unsigned i = 0; i < getNumWords(); ++i)
Count += llvm::countPopulation(pVal[i]);		Count += llvm::countPopulation(pVal[i]);
return Count;		return Count;
}		}

/// Perform a logical right-shift from Src to Dst, which must be equal or		/// Perform a logical right-shift from Src to Dst of Words words, by Shift,
/// non-overlapping, of Words words, by Shift, which must be less than 64.		/// which must be less than 64. If the source and destination ranges overlap,
		/// we require that Src >= Dst (put another way, we require that the overall
		/// operation is a right shift).
		craig.topper: Should we add an assert that Shift is less than APINT_BITS_PER_WORD?
static void lshrNear(uint64_t *Dst, uint64_t *Src, unsigned Words,		static void lshrNear(uint64_t *Dst, uint64_t *Src, unsigned Words,
unsigned Shift) {		unsigned Shift) {
uint64_t Carry = 0;		if (!Words)
for (int I = Words - 1; I >= 0; --I) {		return;
uint64_t Tmp = Src[I];
Dst[I] = (Tmp >> Shift) | Carry;		if (Shift == 0) {
Carry = Tmp << (64 - Shift);		std::memmove(Dst, Src, Words * 8);
		craig.topper: Use APINT_WORD_SIZE instead of 8.
		rsmith: Done, I also changed the parameter type from `uint64_t` to `APInt::WordType` to avoid specifying "64" there but not here.
		return;
		}

		uint64_t Low = Src[0];
		for (unsigned I = 1; I != Words; ++I) {
		uint64_t High = Src[I];
		Dst[I - 1] = (Low >> Shift) | (High << (64 - Shift));
		craig.topper: Use APINT_BITS_PER_WORD instead of 64.
		Low = High;
}		}
		Dst[Words - 1] = Low >> Shift;
}		}

APInt APInt::byteSwap() const {		APInt APInt::byteSwap() const {
assert(BitWidth >= 16 && BitWidth % 16 == 0 && "Cannot byteswap!");		assert(BitWidth >= 16 && BitWidth % 16 == 0 && "Cannot byteswap!");
if (BitWidth == 16)		if (BitWidth == 16)
return APInt(BitWidth, ByteSwap_16(uint16_t(VAL)));		return APInt(BitWidth, ByteSwap_16(uint16_t(VAL)));
if (BitWidth == 32)		if (BitWidth == 32)
return APInt(BitWidth, ByteSwap_32(unsigned(VAL)));		return APInt(BitWidth, ByteSwap_32(unsigned(VAL)));
[...]	for ((Val = Val.lshr(1)); Val != 0; (Val = Val.lshr(1))) {
--S;		--S;
}		}

Reversed <<= S;		Reversed <<= S;
return Reversed;		return Reversed;
}		}

APInt llvm::APIntOps::GreatestCommonDivisor(APInt A, APInt B) {		APInt llvm::APIntOps::GreatestCommonDivisor(APInt A, APInt B) {
while (!!B) {		// Fast-path a common case.
APInt R = A.urem(B);		if (A == B) return A;
A = std::move(B);
B = std::move(R);		// Corner cases: if either operand is zero, the other is the gcd.
		if (!A) return B;
		if (!B) return A;

		// Count common powers of 2 and remove all other powers of 2.
		unsigned Pow2;
		{
		unsigned Pow2_A = A.countTrailingZeros();
		unsigned Pow2_B = B.countTrailingZeros();
		if (Pow2_A > Pow2_B) {
		A.lshrInPlace(Pow2_A - Pow2_B);
		Pow2 = Pow2_B;
		} else if (Pow2_B > Pow2_A) {
		B.lshrInPlace(Pow2_B - Pow2_A);
		Pow2 = Pow2_A;
		} else {
		Pow2 = Pow2_A;
		}
}		}

		// Both operands are odd multiples of 2^Pow_2:
		//
		// gcd(a, b) = gcd(\|a - b\| / 2^i, min(a, b))
		//
		// This is a modified version of Stein's algorithm, taking advantage of
		// efficient countTrailingZeros().
		while (A != B) {
		if (A.ugt(B)) {
		A -= B;
		A.lshrInPlace(A.countTrailingZeros() - Pow2);
		} else {
		B -= A;
		B.lshrInPlace(B.countTrailingZeros() - Pow2);
		}
		}

return A;		return A;
}		}

APInt llvm::APIntOps::RoundDoubleToAPInt(double Double, unsigned width) {		APInt llvm::APIntOps::RoundDoubleToAPInt(double Double, unsigned width) {
union {		union {
double D;		double D;
uint64_t I;		uint64_t I;
} T;		} T;
[...]
/// Logical right-shift this APInt by shiftAmt.		/// Logical right-shift this APInt by shiftAmt.
/// @brief Logical right-shift function.		/// @brief Logical right-shift function.
APInt APInt::lshr(const APInt &shiftAmt) const {		APInt APInt::lshr(const APInt &shiftAmt) const {
return lshr((unsigned)shiftAmt.getLimitedValue(BitWidth));		return lshr((unsigned)shiftAmt.getLimitedValue(BitWidth));
}		}

/// Logical right-shift this APInt by shiftAmt.		/// Logical right-shift this APInt by shiftAmt.
/// @brief Logical right-shift function.		/// @brief Logical right-shift function.
APInt APInt::lshr(unsigned shiftAmt) const {		void APInt::lshrInPlace(unsigned shiftAmt) {
if (isSingleWord()) {		if (isSingleWord()) {
if (shiftAmt >= BitWidth)		if (shiftAmt >= BitWidth)
return APInt(BitWidth, 0);		VAL = 0;
else		else
return APInt(BitWidth, this->VAL >> shiftAmt);		VAL >>= shiftAmt;
}		return;

// If all the bits were shifted out, the result is 0. This avoids issues
// with shifting by the size of the integer type, which produces undefined
// results. We define these "undefined results" to always be 0.
if (shiftAmt >= BitWidth)
return APInt(BitWidth, 0);

// If none of the bits are shifted out, the result is *this. This avoids
// issues with shifting by the size of the integer type, which produces
// undefined results in the code below. This is also an optimization.
if (shiftAmt == 0)
return *this;

// Create some space for the result.
uint64_t * val = new uint64_t[getNumWords()];

// If we are shifting less than a word, compute the shift with a simple carry
if (shiftAmt < APINT_BITS_PER_WORD) {
lshrNear(val, pVal, getNumWords(), shiftAmt);
APInt Result(val, BitWidth);
Result.clearUnusedBits();
return Result;
}

// Compute some values needed by the remaining shift algorithms
unsigned wordShift = shiftAmt % APINT_BITS_PER_WORD;
unsigned offset = shiftAmt / APINT_BITS_PER_WORD;

// If we are shifting whole words, just move whole words
if (wordShift == 0) {
for (unsigned i = 0; i < getNumWords() - offset; ++i)
val[i] = pVal[i+offset];
for (unsigned i = getNumWords()-offset; i < getNumWords(); i++)
val[i] = 0;
APInt Result(val, BitWidth);
Result.clearUnusedBits();
return Result;
}		}

// Shift the low order words		// Don't bother performing a no-op shift.
unsigned breakWord = getNumWords() - offset -1;		if (!shiftAmt)
for (unsigned i = 0; i < breakWord; ++i)		return;
val[i] = (pVal[i+offset] >> wordShift) |
(pVal[i+offset+1] << (APINT_BITS_PER_WORD - wordShift));
// Shift the break word.
val[breakWord] = pVal[breakWord+offset] >> wordShift;

// Remaining words are 0		// Find number of complete words being shifted out and zeroed.
for (unsigned i = breakWord+1; i < getNumWords(); ++i)		const unsigned Words = getNumWords();
val[i] = 0;		const unsigned ShiftFullWords = std::min(shiftAmt >> 6, Words);
		craig.topper: Can we divide by APINT_BITS_PER_WORD and let the compiler optimize to shift?
APInt Result(val, BitWidth);
Result.clearUnusedBits();		// Fill in first Words - ShiftFullWords by shifting.
return Result;		lshrNear(pVal, pVal + ShiftFullWords, Words - ShiftFullWords,
		shiftAmt - (ShiftFullWords << 6));
		craig.topper: Multiply by APINT_BITS_PER_WORD instead of shifting by 6.

		// The remaining high words are all zero.
		for (unsigned I = Words - ShiftFullWords; I != Words; ++I)
		pVal[I] = 0;
}		}

/// Left-shift this APInt by shiftAmt.		/// Left-shift this APInt by shiftAmt.
/// @brief Left-shift function.		/// @brief Left-shift function.
APInt APInt::shl(const APInt &shiftAmt) const {		APInt APInt::shl(const APInt &shiftAmt) const {
// It's undefined behavior in C to shift by BitWidth or greater.		// It's undefined behavior in C to shift by BitWidth or greater.
return shl((unsigned)shiftAmt.getLimitedValue(BitWidth));		return shl((unsigned)shiftAmt.getLimitedValue(BitWidth));
}		}
[...]

unittests/ADT/APIntTest.cpp

	[...]
	TEST(APIntTest, getHiBits) {			TEST(APIntTest, getHiBits) {
	APInt i32(32, 0xfa);			APInt i32(32, 0xfa);
	i32.setHighBits(2);			i32.setHighBits(2);
	EXPECT_EQ(0xc, i32.getHiBits(4));			EXPECT_EQ(0xc, i32.getHiBits(4));
	APInt i128(128, 0xfa);			APInt i128(128, 0xfa);
	i128.setHighBits(2);			i128.setHighBits(2);
	EXPECT_EQ(0xc, i128.getHiBits(4));			EXPECT_EQ(0xc, i128.getHiBits(4));
	}			}

				TEST(APIntTest, GCD) {
				using APIntOps::GreatestCommonDivisor;

				for (unsigned Bits : {1, 2, 32, 63, 64, 65}) {
				// Test some corner cases near zero.
				APInt Zero(Bits, 0), One(Bits, 1);
				EXPECT_EQ(GreatestCommonDivisor(Zero, Zero), Zero);
				EXPECT_EQ(GreatestCommonDivisor(Zero, One), One);
				EXPECT_EQ(GreatestCommonDivisor(One, Zero), One);
				EXPECT_EQ(GreatestCommonDivisor(One, One), One);

				if (Bits > 1) {
				APInt Two(Bits, 2);
				EXPECT_EQ(GreatestCommonDivisor(Zero, Two), Two);
				EXPECT_EQ(GreatestCommonDivisor(One, Two), One);
				EXPECT_EQ(GreatestCommonDivisor(Two, Two), Two);

				// Test some corner cases near the highest representable value.
				APInt Max(Bits, 0);
				Max.setAllBits();
				EXPECT_EQ(GreatestCommonDivisor(Zero, Max), Max);
				EXPECT_EQ(GreatestCommonDivisor(One, Max), One);
				EXPECT_EQ(GreatestCommonDivisor(Two, Max), One);
				EXPECT_EQ(GreatestCommonDivisor(Max, Max), Max);

				APInt MaxOver2 = Max.udiv(Two);
				EXPECT_EQ(GreatestCommonDivisor(MaxOver2, Max), One);
				// Max - 1 == Max / 2 * 2, because Max is odd.
				EXPECT_EQ(GreatestCommonDivisor(MaxOver2, Max - 1), MaxOver2);
				}
				}

				// Compute the 20th Mersenne prime.
				APInt HugePrime(4423, 0);
				HugePrime.setAllBits();
				HugePrime = HugePrime.zext(4450);
				craig.topper: Is this equivalent to APInt HugePrime(APInt::getLowBitsSet(4450, 4423))?
				rsmith: Looks like it is :) Switched to using that prior to commit.

				// 9931 and 123456 are coprime.
				APInt A = HugePrime * APInt(4450, 9931);
				APInt B = HugePrime * APInt(4450, 123456);
				APInt C = GreatestCommonDivisor(A, B);
				EXPECT_EQ(C, HugePrime);
				}
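
The last test case relies on the identity gcd(p·a, p·b) = p·gcd(a, b): multiplying two coprime numbers by the same prime yields a pair whose GCD is exactly that prime. A scaled-down sketch of the same check on 64-bit integers, assuming the smaller Mersenne prime 2^31 - 1 in place of the 4423-bit one and a plain Euclid reference implementation (`gcdEuclid`, `SmallMersennePrime` and `scaledGcdRecoversPrime` are hypothetical names for this sketch):

```cpp
#include <cassert>
#include <cstdint>

// Reference GCD using Euclid's algorithm (division-based).
std::uint64_t gcdEuclid(std::uint64_t A, std::uint64_t B) {
  while (B != 0) {
    std::uint64_t R = A % B;
    A = B;
    B = R;
  }
  return A;
}

// 2^31 - 1, a Mersenne prime small enough that the products below
// still fit in 64 bits.
constexpr std::uint64_t SmallMersennePrime = (std::uint64_t{1} << 31) - 1;

// Mirrors the structure of the unit test: 9931 and 123456 are coprime,
// so the GCD of the two scaled values is the prime itself.
bool scaledGcdRecoversPrime() {
  std::uint64_t A = SmallMersennePrime * 9931;
  std::uint64_t B = SmallMersennePrime * 123456;
  return gcdEuclid(9931, 123456) == 1 &&
         gcdEuclid(A, B) == SmallMersennePrime;
}
```

Using a large prime as the common factor exercises many loop iterations on multi-word operands while keeping the expected result easy to state.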