This is an archive of the discontinued LLVM Phabricator instance.

[ConstantRange] Remove costly udivrem from ConstantRange::truncate
Closed, Public

Authored by craig.topper on Apr 29 2017, 12:07 PM.

Details

Reviewers

spatel
sanjoy
davide
pete

Commits

rGf6e138d79497: [ConstantRange] Remove costly udivrem from ConstantRange::truncate
rL304733: [ConstantRange] Remove costly udivrem from ConstantRange::truncate

Summary

ConstantRange::truncate currently uses a udivrem call, which is slow, particularly for bit widths larger than 64.

As far as I can tell, all we were trying to do was reduce LowerDiv modulo (MaxValue+1) and make sure that whatever value was effectively subtracted from LowerDiv was also subtracted from UpperDiv.

This patch recognizes that MaxValue+1 is a power of 2, so a bitwise AND is enough both to perform the modulo operation and to isolate the upper bits.
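
The equivalence described in the summary can be sketched outside of LLVM with plain 64-bit integers. This is only an illustration under stated assumptions: `chopWithDivRem` and `chopWithMask` are hypothetical helper names, `uint64_t` stands in for `APInt`, and `DstBits` plays the role of `DstTySize`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-ins for the APInt arithmetic in the patch. Values are
// held in a uint64_t, so DstBits must be smaller than 64.

// Old approach: divide Lower by 2^DstBits, keep the remainder as the new
// Lower, and subtract quotient * 2^DstBits from Upper.
inline std::pair<uint64_t, uint64_t> chopWithDivRem(uint64_t Lower,
                                                    uint64_t Upper,
                                                    unsigned DstBits) {
  assert(DstBits < 64 && "shift amount out of range");
  const uint64_t MaxBitValue = uint64_t{1} << DstBits; // MaxValue + 1
  const uint64_t Div = Lower / MaxBitValue;
  const uint64_t Rem = Lower % MaxBitValue;
  return {Rem, Upper - MaxBitValue * Div};
}

// New approach: the bits of Lower above DstBits are exactly
// quotient * 2^DstBits, so isolate them with a mask and subtract them from
// both bounds. No division is needed.
inline std::pair<uint64_t, uint64_t> chopWithMask(uint64_t Lower,
                                                  uint64_t Upper,
                                                  unsigned DstBits) {
  assert(DstBits < 64 && "shift amount out of range");
  const uint64_t MaxValue = (uint64_t{1} << DstBits) - 1; // low DstBits set
  const uint64_t Adjust = Lower & ~MaxValue;
  return {Lower - Adjust, Upper - Adjust};
}
```

For example, with `DstBits` equal to 8, a lower bound of 0x1234 and an upper bound of 0x1250, both helpers subtract 0x1200 from each bound. The mask version does this with one AND and two subtractions, which is the saving the summary is after for wide `APInt` values.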

Event Timeline

craig.topper created this revision. Apr 29 2017, 12:07 PM

Refine this even more to remove another temporary APInt. Now we count bits to replace the compares with MaxValue.

craig.topper added inline comments. Apr 29 2017, 7:16 PM

lib/IR/ConstantRange.cpp
Line 582: Note this countTrailingOnes check isn't strictly necessary. If Upper is equal to MaxValue(DstTy), Union will be FullSet. We could continue on and trust unionWith to return FullSet.
Lines 593–596: Note the original code used uge, but I believe ugt was sufficient. Thus I've only implemented the equivalent of ugt with the getActiveBits check here.
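
The bit-counting comparisons mentioned in these inline comments can be illustrated with plain integers. The diff reproduced on this page does not contain those checks, so this is a sketch of the equivalences the comments describe rather than the patch's code: `activeBits` and `trailingOnes` are hypothetical helpers mirroring `APInt::getActiveBits` and `APInt::countTrailingOnes`, and `ugtMaxValue` and `ugeMaxValue` are illustrative names.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent X (0 when X == 0).
inline unsigned activeBits(uint64_t X) {
  unsigned N = 0;
  while (X != 0) {
    ++N;
    X >>= 1;
  }
  return N;
}

// Number of consecutive one bits starting at bit 0.
inline unsigned trailingOnes(uint64_t X) {
  unsigned N = 0;
  while (X & 1) {
    ++N;
    X >>= 1;
  }
  return N;
}

// X > MaxValue, where MaxValue has exactly the low DstBits bits set:
// true precisely when X has a set bit at position DstBits or above.
inline bool ugtMaxValue(uint64_t X, unsigned DstBits) {
  assert(DstBits < 64 && "MaxValue must fit in 64 bits");
  return activeBits(X) > DstBits;
}

// X >= MaxValue: either X > MaxValue, or X fits in DstBits bits and all of
// those bits are ones (that is, X == MaxValue).
inline bool ugeMaxValue(uint64_t X, unsigned DstBits) {
  assert(DstBits < 64 && "MaxValue must fit in 64 bits");
  return activeBits(X) > DstBits || trailingOnes(X) == DstBits;
}
```

Neither predicate needs a materialized `MaxValue`, which is how counting bits can replace the compares and avoid a temporary.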

Ping

Ping x2

Ping x3.

Do we have (or can you add) unit tests to make sure this is behaving as expected?

We had some unit tests. I added a missing EXPECT to one of them in r304693 and added a few more tests in r304694.

LGTM.

This revision is now accepted and ready to land. Jun 5 2017, 1:17 PM

Closed by commit rL304733: [ConstantRange] Remove costly udivrem from ConstantRange::truncate (authored by ctopper). Jun 5 2017, 1:48 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: lib/IR/ConstantRange.cpp
Size: 11 lines

Diff 97195

lib/IR/ConstantRange.cpp

 ConstantRange ConstantRange::truncate(uint32_t DstTySize) const {
   assert(getBitWidth() > DstTySize && "Not a value truncation");
   if (isEmptySet())
     return ConstantRange(DstTySize, /*isFullSet=*/false);
   if (isFullSet())
     return ConstantRange(DstTySize, /*isFullSet=*/true);

   APInt MaxValue = APInt::getLowBitsSet(getBitWidth(), DstTySize);
-  APInt MaxBitValue = APInt::getOneBitSet(getBitWidth(), DstTySize);

   APInt LowerDiv(Lower), UpperDiv(Upper);
   ConstantRange Union(DstTySize, /*isFullSet=*/false);

   // Analyze wrapped sets in their two parts: [0, Upper) \/ [Lower, MaxValue]
   // We use the non-wrapped set code to analyze the [Lower, MaxValue) part, and
   // then we do the union with [MaxValue, Upper)
   if (isWrappedSet()) {
     // If Upper is greater than Max Value, it covers the whole truncated range.
     if (Upper.uge(MaxValue))
       return ConstantRange(DstTySize, /*isFullSet=*/true);

     Union = ConstantRange(APInt::getMaxValue(DstTySize),Upper.trunc(DstTySize));
     UpperDiv = APInt::getMaxValue(getBitWidth());

     // Union covers the MaxValue case, so return if the remaining range is just
     // MaxValue.
     if (LowerDiv == UpperDiv)
       return Union;
   }

   // Chop off the most significant bits that are past the destination bitwidth.
   if (LowerDiv.uge(MaxValue)) {
-    APInt Div(getBitWidth(), 0);
-    APInt::udivrem(LowerDiv, MaxBitValue, Div, LowerDiv);
-    UpperDiv -= MaxBitValue * Div;
+    // Mask to just the signficant bits and subtract from LowerDiv/UpperDiv.
+    APInt Adjust = LowerDiv & ~MaxValue;
+    LowerDiv -= Adjust;
+    UpperDiv -= Adjust;
   }

   if (UpperDiv.ule(MaxValue))
     return ConstantRange(LowerDiv.trunc(DstTySize),
                          UpperDiv.trunc(DstTySize)).unionWith(Union);

   // The truncated value wraps around. Check if we can do better than fullset.
-  UpperDiv -= MaxBitValue;
+  UpperDiv -= MaxValue;
+  UpperDiv -= 1;
   if (UpperDiv.ult(LowerDiv))
     return ConstantRange(LowerDiv.trunc(DstTySize),
                          UpperDiv.trunc(DstTySize)).unionWith(Union);

   return ConstantRange(DstTySize, /*isFullSet=*/true);
 }

 ConstantRange ConstantRange::zextOrTrunc(uint32_t DstTySize) const {