This is an archive of the discontinued LLVM Phabricator instance.

include/llvm/Support/MathExtras.h
727 ↗	(On Diff #42319)	Do we want the types of X and Y to be the same? For instance, the weight can be less than 1. Also, we may want an overloaded function for the common case where one of X or Y is a constant:
template <typename T, uint64_t Multiplier>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SaturatingMultiplyAdd(T X, T A, bool *ResultOverflowed = nullptr) {
  ...
}
It is probably just slightly more efficient, so probably not in this patch.
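A minimal sketch of what such a constant-multiplier overload could look like, assuming C++11 and a nonzero multiplier. The name SaturatingMultiplyAddConst is hypothetical, and the template parameters are reordered relative to the suggestion so that T can be deduced from the arguments. It is not part of the patch under review.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper for illustration only; not part of MathExtras.h.
// Computes X * Multiplier + A, clamped to the maximum value of T.
template <uint64_t Multiplier, typename T>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SaturatingMultiplyAddConst(T X, T A, bool *ResultOverflowed = nullptr) {
  static_assert(Multiplier != 0, "this sketch assumes a nonzero multiplier");
  bool Dummy;
  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;

  const T Max = std::numeric_limits<T>::max();
  // Largest X whose product with Multiplier still fits in T. Both operands
  // are constants, so the compiler can fold this division.
  const uint64_t Limit = static_cast<uint64_t>(Max) / Multiplier;
  if (X > Limit) {
    Overflowed = true;
    return Max;
  }

  // The product fits in T because X <= Limit.
  const T Product = static_cast<T>(X * Multiplier);
  // The addition overflows exactly when Product exceeds the headroom left by A.
  Overflowed = Product > Max - A;
  return Overflowed ? Max : static_cast<T>(Product + A);
}
```

With the multiplier known at compile time, the overflow test for the product reduces to a single comparison against a constant, which is where the small efficiency gain would come from.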

it is probably just slightly more efficient so probably not in this patch.

If we are looking to improve efficiency, the clearest improvement to make is to use __builtin_*_with_overflow which will dramatically reduce the cost since we can directly use the processor flags instead of using some explicit calculation.

Generally in LLVM for this kind of patch, we would add a couple motivating uses as well instead of as a separate patch. That helps show the context where it was intended to be used and that adding the helper is, in fact, a net win.

include/llvm/Support/MathExtras.h
724 ↗	(On Diff #42319)	I think it's worth adding a note indicating that because all these values are unsigned, there is no distinction between a "fused" operation and non-fused from the perspective of whether saturation occurred.
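The point about fused versus non-fused saturation can be checked exhaustively for 8-bit values. The sketch below uses hypothetical stand-in functions (satMul8, satAdd8, stepwise, fused), not the MathExtras.h templates. It compares a step-wise saturating multiply followed by a saturating add against a single clamp of the exact result computed in a wider type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative 8-bit stand-ins; none of these names come from the patch.
static uint8_t satMul8(uint8_t X, uint8_t Y, bool &Overflowed) {
  uint32_t P = uint32_t(X) * Y;
  Overflowed = P > 0xFF;
  return Overflowed ? 0xFF : uint8_t(P);
}

static uint8_t satAdd8(uint8_t X, uint8_t Y, bool &Overflowed) {
  uint32_t S = uint32_t(X) + Y;
  Overflowed = S > 0xFF;
  return Overflowed ? 0xFF : uint8_t(S);
}

// Step-wise: saturate after the multiply, then again after the add.
static uint8_t stepwise(uint8_t X, uint8_t Y, uint8_t A, bool &Overflowed) {
  uint8_t P = satMul8(X, Y, Overflowed);
  if (Overflowed)
    return P;
  return satAdd8(A, P, Overflowed);
}

// "Fused": compute X * Y + A exactly in 32 bits and clamp once.
static uint8_t fused(uint8_t X, uint8_t Y, uint8_t A, bool &Overflowed) {
  uint32_t R = uint32_t(X) * Y + A;
  Overflowed = R > 0xFF;
  return Overflowed ? 0xFF : uint8_t(R);
}

// Counts the (X, Y, A) triples where the two strategies disagree on either
// the result or the overflow flag.
std::size_t countMismatches() {
  std::size_t Mismatches = 0;
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned A = 0; A < 256; ++A) {
        bool O1, O2;
        uint8_t R1 = stepwise(uint8_t(X), uint8_t(Y), uint8_t(A), O1);
        uint8_t R2 = fused(uint8_t(X), uint8_t(Y), uint8_t(A), O2);
        if (R1 != R2 || O1 != O2)
          ++Mismatches;
      }
  return Mismatches;
}
```

Because A is never negative, a product that already exceeds the maximum cannot be brought back into range by the add, so the two strategies agree on every input.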
unittests/Support/MathExtrasTest.cpp
314 ↗	(On Diff #42319)	I don't think we should bother to repeatedly check the case without the `ResultOverflowed` (except maybe one case to make sure that it compiles (i.e. that the default argument does in fact have a default)).

slingn marked 2 inline comments as done. Dec 10 2015, 10:17 AM

slingn added inline comments.

include/llvm/Support/MathExtras.h
727 ↗	(On Diff #42319)	X and Y are the multiplicands. Why would they need to be different types for just the unsigned arithmetic case? I guess the addend (A) could usefully be a negative number but that's inconsistent with SaturatingAdd() as it is currently defined. The pending weighted profile change (D15306) doesn't support weight < 1.

davidxl added inline comments. Dec 10 2015, 10:23 AM

include/llvm/Support/MathExtras.h
727 ↗	(On Diff #42319)	Ok -- sounds reasonable.

In D15385#306733, @silvas wrote:

it is probably just slightly more efficient so probably not in this patch.

If we are looking to improve efficiency, the clearest improvement to make is to use __builtin_*_with_overflow which will dramatically reduce the cost since we can directly use the processor flags instead of using some explicit calculation.

That would be great, and would really simplify the implementation of saturating arithmetic. The only caveat is that the __builtin_*_overflow intrinsics aren't supported by every compiler that hosts LLVM. For example, GCC 4.x doesn't have __builtin_add_overflow(); GCC 5 is required (https://gcc.gnu.org/gcc-5/changes.html). So there would have to be fall-back implementations.

In D15385#307298, @slingn wrote:

In D15385#306733, @silvas wrote:

it is probably just slightly more efficient so probably not in this patch.

If we are looking to improve efficiency, the clearest improvement to make is to use __builtin_*_with_overflow which will dramatically reduce the cost since we can directly use the processor flags instead of using some explicit calculation.

That would be great, and would really simplify the implementation of saturating arithmetic. The only caveat is that the __builtin_*_overflow intrinsics aren't supported by every compiler that hosts LLVM. For example, GCC 4.x doesn't have __builtin_add_overflow(); GCC 5 is required (https://gcc.gnu.org/gcc-5/changes.html). So there would have to be fall-back implementations.

Yeah, it would require some ifdef's and would be extra work. Like I said, "if" we are looking to improve efficiency. I don't think this code at the moment has been measured to be a perf problem though. Also, for PS4, our shipping toolchain is currently compiled with MSVC so this would not affect us anyway (if we used clang-cl, then we could use the __builtin_*_overflow intrinsics, but while clang-cl would be great, I don't think we are planning to do that soon).
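A minimal sketch of the builtin-plus-fallback idea discussed above, assuming a compiler that provides __builtin_add_overflow (GCC 5 or later, or Clang) and using an explicit wraparound check everywhere else. The names SketchSaturatingAdd and SKETCH_HAS_ADD_OVERFLOW are hypothetical, and the feature test is only one plausible way to write the ifdefs. This is not the code in MathExtras.h.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Feature detection (an assumption of this sketch, not LLVM's own macro).
// __has_builtin is nested so that compilers lacking it never parse the call.
#if defined(__has_builtin)
#if __has_builtin(__builtin_add_overflow)
#define SKETCH_HAS_ADD_OVERFLOW 1
#endif
#endif
#if !defined(SKETCH_HAS_ADD_OVERFLOW) && defined(__GNUC__) && __GNUC__ >= 5
#define SKETCH_HAS_ADD_OVERFLOW 1
#endif

// Hypothetical stand-in for a saturating add on unsigned integers.
template <typename T>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SketchSaturatingAdd(T X, T Y, bool *ResultOverflowed = nullptr) {
  bool Dummy;
  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
  const T Max = std::numeric_limits<T>::max();
#ifdef SKETCH_HAS_ADD_OVERFLOW
  // The builtin reports overflow relative to the type of Sum.
  T Sum;
  Overflowed = __builtin_add_overflow(X, Y, &Sum);
#else
  // Portable fallback: unsigned addition wraps, so the sum overflowed
  // exactly when it ended up smaller than one of the operands.
  T Sum = static_cast<T>(X + Y);
  Overflowed = Sum < X;
#endif
  return Overflowed ? Max : Sum;
}

// Usage: the overflow flag is optional, as in the patch's helpers.
inline void sketchSaturatingAddExamples() {
  bool Overflowed = false;
  assert(SketchSaturatingAdd<uint8_t>(200, 100, &Overflowed) == 255);
  assert(Overflowed);
  assert(SketchSaturatingAdd<uint32_t>(2, 3) == 5u);
}
```

Both branches produce the same results; whether the builtin path is measurably faster for the profile-merging code is not something this review established.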

slingn mentioned this in D15547: [PGO] Handle and report overflow during profile merge for all types of data. Dec 16 2015, 11:17 AM

Updated for silvas comments.
-Apply SaturatingMultiplyAdd() to weighted PGO merging.

Update patch to apply cleanly to HEAD.

Very nice cleanups .. LGTM.

This revision is now accepted and ready to land. Jan 12 2016, 1:47 PM

Closed by commit rL257532: [Support] Add saturating multiply-add support function (authored by slingn). Jan 12 2016, 2:37 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/include/llvm/ProfileData/SampleProf.h	55 lines
llvm/trunk/include/llvm/Support/MathExtras.h	19 lines
llvm/trunk/lib/ProfileData/InstrProf.cpp	17 lines
llvm/trunk/unittests/Support/MathExtrasTest.cpp	54 lines

Diff 44677

llvm/trunk/include/llvm/ProfileData/SampleProf.h

[... 134 lines not shown ...] public:

   /// Increment the number of samples for this record by \p S.
   /// Optionally scale sample count \p S by \p Weight.
   ///
   /// Sample counts accumulate using saturating arithmetic, to avoid wrapping
   /// around unsigned integers.
   sampleprof_error addSamples(uint64_t S, uint64_t Weight = 1) {
     bool Overflowed;
-    if (Weight > 1) {
-      S = SaturatingMultiply(S, Weight, &Overflowed);
-      if (Overflowed)
-        return sampleprof_error::counter_overflow;
-    }
-    NumSamples = SaturatingAdd(NumSamples, S, &Overflowed);
-    if (Overflowed)
-      return sampleprof_error::counter_overflow;
-
-    return sampleprof_error::success;
+    NumSamples = SaturatingMultiplyAdd(S, Weight, NumSamples, &Overflowed);
+    return Overflowed ? sampleprof_error::counter_overflow
+                      : sampleprof_error::success;
   }

   /// Add called function \p F with samples \p S.
   /// Optionally scale sample count \p S by \p Weight.
   ///
   /// Sample counts accumulate using saturating arithmetic, to avoid wrapping
   /// around unsigned integers.
   sampleprof_error addCalledTarget(StringRef F, uint64_t S,
                                    uint64_t Weight = 1) {
     uint64_t &TargetSamples = CallTargets[F];
     bool Overflowed;
-    if (Weight > 1) {
-      S = SaturatingMultiply(S, Weight, &Overflowed);
-      if (Overflowed)
-        return sampleprof_error::counter_overflow;
-    }
-    TargetSamples = SaturatingAdd(TargetSamples, S, &Overflowed);
-    if (Overflowed)
-      return sampleprof_error::counter_overflow;
-
-    return sampleprof_error::success;
+    TargetSamples =
+        SaturatingMultiplyAdd(S, Weight, TargetSamples, &Overflowed);
+    return Overflowed ? sampleprof_error::counter_overflow
+                      : sampleprof_error::success;
   }

   /// Return true if this sample record contains function calls.
   bool hasCalls() const { return CallTargets.size() > 0; }

   uint64_t getSamples() const { return NumSamples; }
   const CallTargetMap &getCallTargets() const { return CallTargets; }

[... 28 lines not shown ...]
 /// within the body of the function.
 class FunctionSamples {
 public:
   FunctionSamples() : TotalSamples(0), TotalHeadSamples(0) {}
   void print(raw_ostream &OS = dbgs(), unsigned Indent = 0) const;
   void dump() const;
   sampleprof_error addTotalSamples(uint64_t Num, uint64_t Weight = 1) {
     bool Overflowed;
-    if (Weight > 1) {
-      Num = SaturatingMultiply(Num, Weight, &Overflowed);
-      if (Overflowed)
-        return sampleprof_error::counter_overflow;
-    }
-    TotalSamples = SaturatingAdd(TotalSamples, Num, &Overflowed);
-    if (Overflowed)
-      return sampleprof_error::counter_overflow;
-
-    return sampleprof_error::success;
+    TotalSamples =
+        SaturatingMultiplyAdd(Num, Weight, TotalSamples, &Overflowed);
+    return Overflowed ? sampleprof_error::counter_overflow
+                      : sampleprof_error::success;
   }
   sampleprof_error addHeadSamples(uint64_t Num, uint64_t Weight = 1) {
     bool Overflowed;
-    if (Weight > 1) {
-      Num = SaturatingMultiply(Num, Weight, &Overflowed);
-      if (Overflowed)
-        return sampleprof_error::counter_overflow;
-    }
-    TotalHeadSamples = SaturatingAdd(TotalHeadSamples, Num, &Overflowed);
-    if (Overflowed)
-      return sampleprof_error::counter_overflow;
-
-    return sampleprof_error::success;
+    TotalHeadSamples =
+        SaturatingMultiplyAdd(Num, Weight, TotalHeadSamples, &Overflowed);
+    return Overflowed ? sampleprof_error::counter_overflow
+                      : sampleprof_error::success;
   }
   sampleprof_error addBodySamples(uint32_t LineOffset, uint32_t Discriminator,
                                   uint64_t Num, uint64_t Weight = 1) {
     return BodySamples[LineLocation(LineOffset, Discriminator)].addSamples(
         Num, Weight);
   }
   sampleprof_error addCalledTargetSamples(uint32_t LineOffset,
                                           uint32_t Discriminator,
[... 138 lines not shown ...]

llvm/trunk/include/llvm/Support/MathExtras.h

[... 711 lines not shown ...] SaturatingMultiply(T X, T Y, bool *ResultOverflowed = nullptr) {
   }
   Z <<= 1;
   if (X & 1)
     return SaturatingAdd(Z, Y, ResultOverflowed);

   return Z;
 }

+/// \brief Multiply two unsigned integers, X and Y, and add the unsigned
+/// integer, A to the product. Clamp the result to the maximum representable
+/// value of T on overflow. ResultOverflowed indicates if the result is larger
+/// than the maximum representable value of type T.
+/// Note that this is purely a convenience function as there is no distinction
+/// where overflow occurred in a 'fused' multiply-add for unsigned numbers.
+template <typename T>
+typename std::enable_if<std::is_unsigned<T>::value, T>::type
+SaturatingMultiplyAdd(T X, T Y, T A, bool *ResultOverflowed = nullptr) {
+  bool Dummy;
+  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
+
+  T Product = SaturatingMultiply(X, Y, &Overflowed);
+  if (Overflowed)
+    return Product;
+
+  return SaturatingAdd(A, Product, &Overflowed);
+}
+
 extern const float huge_valf;
 } // End llvm namespace

 #endif

llvm/trunk/lib/ProfileData/InstrProf.cpp

[... 263 lines not shown ...] instrprof_error InstrProfValueSiteRecord::merge(InstrProfValueSiteRecord &Input,
   auto I = ValueData.begin();
   auto IE = ValueData.end();
   instrprof_error Result = instrprof_error::success;
   for (auto J = Input.ValueData.begin(), JE = Input.ValueData.end(); J != JE;
        ++J) {
     while (I != IE && I->Value < J->Value)
       ++I;
     if (I != IE && I->Value == J->Value) {
-      uint64_t JCount = J->Count;
       bool Overflowed;
-      if (Weight > 1) {
-        JCount = SaturatingMultiply(JCount, Weight, &Overflowed);
-        if (Overflowed)
-          Result = instrprof_error::counter_overflow;
-      }
-      I->Count = SaturatingAdd(I->Count, JCount, &Overflowed);
+      I->Count = SaturatingMultiplyAdd(J->Count, Weight, I->Count, &Overflowed);
       if (Overflowed)
         Result = instrprof_error::counter_overflow;
       ++I;
       continue;
     }
     ValueData.insert(I, *J);
   }
   return Result;
[... 35 lines not shown ...] instrprof_error InstrProfRecord::merge(InstrProfRecord &Other,
   // or a hash collision.
   if (Counts.size() != Other.Counts.size())
     return instrprof_error::count_mismatch;

   instrprof_error Result = instrprof_error::success;

   for (size_t I = 0, E = Other.Counts.size(); I < E; ++I) {
     bool Overflowed;
-    uint64_t OtherCount = Other.Counts[I];
-    if (Weight > 1) {
-      OtherCount = SaturatingMultiply(OtherCount, Weight, &Overflowed);
-      if (Overflowed)
-        Result = instrprof_error::counter_overflow;
-    }
-    Counts[I] = SaturatingAdd(Counts[I], OtherCount, &Overflowed);
+    Counts[I] =
+        SaturatingMultiplyAdd(Other.Counts[I], Weight, Counts[I], &Overflowed);
     if (Overflowed)
       Result = instrprof_error::counter_overflow;
   }

   for (uint32_t Kind = IPVK_First; Kind <= IPVK_Last; ++Kind)
     MergeResult(Result, mergeValueProfData(Kind, Other, Weight));

   return Result;
[... 268 lines not shown ...]

llvm/trunk/unittests/Support/MathExtrasTest.cpp

[... 298 lines not shown ...]

 TEST(MathExtras, SaturatingMultiply) {
   SaturatingMultiplyTestHelper<uint8_t>();
   SaturatingMultiplyTestHelper<uint16_t>();
   SaturatingMultiplyTestHelper<uint32_t>();
   SaturatingMultiplyTestHelper<uint64_t>();
 }

+template<typename T>
+void SaturatingMultiplyAddTestHelper()
+{
+  const T Max = std::numeric_limits<T>::max();
+  bool ResultOverflowed;
+
+  // Test basic multiply-add.
+  EXPECT_EQ(T(16), SaturatingMultiplyAdd(T(2), T(3), T(10)));
+  EXPECT_EQ(T(16), SaturatingMultiplyAdd(T(2), T(3), T(10), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  // Test multiply overflows, add doesn't overflow
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, T(0), &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  // Test multiply doesn't overflow, add overflows
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), T(1), Max, &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  // Test multiply-add with Max as operand
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), T(1), Max, &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), Max, T(1), &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, T(1), &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, Max, &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  // Test multiply-add with 0 as operand
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(1), T(1), T(0), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(1), T(0), T(1), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(0), T(0), T(1), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  EXPECT_EQ(T(0), SaturatingMultiplyAdd(T(0), T(0), T(0), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+}
+
+TEST(MathExtras, SaturatingMultiplyAdd) {
+  SaturatingMultiplyAddTestHelper<uint8_t>();
+  SaturatingMultiplyAddTestHelper<uint16_t>();
+  SaturatingMultiplyAddTestHelper<uint32_t>();
+  SaturatingMultiplyAddTestHelper<uint64_t>();
+}
+
 }

[Support] Add saturating multiply-add support function (Closed, Public)