This is an archive of the discontinued LLVM Phabricator instance.

include/llvm/Support/MathExtras.h
727	Do we want the types of X and Y to be the same? For instance, the weight can be less than 1. We may also want an overloaded function for the common case where one of X or Y is a constant:

    template <typename T, uint64_t Multiplier>
    typename std::enable_if<std::is_unsigned<T>::value, T>::type
    SaturatingMultiplyAdd(T X, T A, bool *ResultOverflowed = nullptr) { ... }

It is probably just slightly more efficient, so probably not in this patch.
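To make the constant-multiplier suggestion concrete, here is a minimal sketch of what such a helper might look like. It is not part of the patch: the name `SaturatingMultiplyAddConst` and the whole body are assumptions made for illustration, and only the template signature follows the comment above. The idea is that with a compile-time multiplier, the overflow threshold `Max / Multiplier` is a constant, so the multiply overflow check reduces to one comparison.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper, not from the patch: computes X * Multiplier + A,
// clamped to the maximum value of T.
template <typename T, uint64_t Multiplier>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SaturatingMultiplyAddConst(T X, T A, bool *ResultOverflowed = nullptr) {
  bool Dummy;
  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
  const T Max = std::numeric_limits<T>::max();
  Overflowed = false;

  // Guard against a zero template argument so we never divide by zero;
  // with Multiplier == 0 the product is always 0.
  const uint64_t Divisor = Multiplier ? Multiplier : 1;
  // Largest X for which X * Multiplier still fits in T (compile-time constant).
  const uint64_t Limit = static_cast<uint64_t>(Max) / Divisor;

  uint64_t Product = 0;
  if (Multiplier != 0) {
    if (X > Limit) {
      Overflowed = true;
      return Max;
    }
    Product = static_cast<uint64_t>(X) * Multiplier; // <= Max, cannot wrap
  }

  // The add overflows exactly when A exceeds the remaining headroom.
  if (A > Max - Product) {
    Overflowed = true;
    return Max;
  }
  return static_cast<T>(Product + A);
}
```

A call would look like `SaturatingMultiplyAddConst<uint8_t, 3>(X, A, &Overflowed)`. Whether this is measurably faster than the general three-operand form is untested here.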

> it is probably just slightly more efficient so probably not in this patch.

If we are looking to improve efficiency, the clearest improvement is to use the __builtin_*_overflow intrinsics, which would dramatically reduce the cost because they can use the processor's overflow flags directly instead of an explicit calculation.

Generally in LLVM, for this kind of patch we would add a couple of motivating uses in the same patch rather than in a separate one. That helps show the context where the helper is intended to be used and that adding it is, in fact, a net win.

include/llvm/Support/MathExtras.h
724	I think it's worth adding a note indicating that because all these values are unsigned, there is no distinction between a "fused" operation and non-fused from the perspective of whether saturation occurred.
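The fused versus non-fused observation can be checked exhaustively for a small type. The sketch below is not part of the patch; `TwoStep`, `Fused`, and `CountMismatches` are hypothetical names introduced here. It compares a two-step saturating multiply-add (clamp after the multiply, then after the add) with a single clamp of the exact result computed in a wider type. Because every operand is unsigned, adding A can never bring an overflowed product back into range, so the two should agree on both the value and the overflow flag.

```cpp
#include <cstdint>

// Two-step evaluation: saturate after the multiply, then after the add.
inline uint8_t TwoStep(uint8_t X, uint8_t Y, uint8_t A, bool &Overflowed) {
  const uint32_t Product = static_cast<uint32_t>(X) * Y;
  if (Product > 255) {
    Overflowed = true;
    return 255;
  }
  const uint32_t Sum = Product + A;
  Overflowed = Sum > 255;
  return Overflowed ? static_cast<uint8_t>(255) : static_cast<uint8_t>(Sum);
}

// "Fused" evaluation: compute X * Y + A exactly in a wider type, clamp once.
inline uint8_t Fused(uint8_t X, uint8_t Y, uint8_t A, bool &Overflowed) {
  const uint32_t Exact = static_cast<uint32_t>(X) * Y + A;
  Overflowed = Exact > 255;
  return Overflowed ? static_cast<uint8_t>(255) : static_cast<uint8_t>(Exact);
}

// Exhaustively compare both evaluations over all uint8_t operand triples and
// return the number of inputs on which they disagree.
inline unsigned CountMismatches() {
  unsigned Mismatches = 0;
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned A = 0; A < 256; ++A) {
        bool O1 = false, O2 = false;
        const uint8_t R1 = TwoStep(static_cast<uint8_t>(X),
                                   static_cast<uint8_t>(Y),
                                   static_cast<uint8_t>(A), O1);
        const uint8_t R2 = Fused(static_cast<uint8_t>(X),
                                 static_cast<uint8_t>(Y),
                                 static_cast<uint8_t>(A), O2);
        if (R1 != R2 || O1 != O2)
          ++Mismatches;
      }
  return Mismatches;
}
```

With signed operands this equivalence would not hold in general, since a negative addend could pull an out-of-range product back into range.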

unittests/Support/MathExtrasTest.cpp
314	I don't think we should bother to repeatedly check the case without `ResultOverflowed`, except maybe once to make sure that it compiles (i.e. that the parameter does in fact have a default argument).

slingn marked 2 inline comments as done. Dec 10 2015, 10:17 AM

slingn added inline comments.

include/llvm/Support/MathExtras.h
727	X and Y are the multiplicands. Why would they need to be different types for just the unsigned arithmetic case? I guess the addend (A) could usefully be a negative number but that's inconsistent with SaturatingAdd() as it is currently defined. The pending weighted profile change (D15306) doesn't support weight < 1.

davidxl added inline comments. Dec 10 2015, 10:23 AM

include/llvm/Support/MathExtras.h
727	Ok -- sounds reasonable.

In D15385#306733, @silvas wrote:

> > it is probably just slightly more efficient so probably not in this patch.
>
> If we are looking to improve efficiency, the clearest improvement is to use the __builtin_*_overflow intrinsics, which would dramatically reduce the cost because they can use the processor's overflow flags directly instead of an explicit calculation.

That would be great, and it would really simplify the implementation of saturating arithmetic. The only caveat is that the __builtin_*_overflow intrinsics aren't supported by every compiler that hosts LLVM. For example, GCC 4.x doesn't have __builtin_add_overflow(); GCC 5 is required (https://gcc.gnu.org/gcc-5/changes.html). So there would have to be fallback implementations.

In D15385#307298, @slingn wrote:

> That would be great, and it would really simplify the implementation of saturating arithmetic. The only caveat is that the __builtin_*_overflow intrinsics aren't supported by every compiler that hosts LLVM. For example, GCC 4.x doesn't have __builtin_add_overflow(); GCC 5 is required (https://gcc.gnu.org/gcc-5/changes.html). So there would have to be fallback implementations.

Yeah, it would require some ifdefs and would be extra work. Like I said, "if" we are looking to improve efficiency. I don't think this code has been measured to be a perf problem at the moment, though. Also, for PS4, our shipping toolchain is currently compiled with MSVC, so this would not affect us anyway. (If we used clang-cl, we could use the __builtin_*_overflow intrinsics, but while clang-cl would be great, I don't think we are planning to do that soon.)
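As a rough illustration of the ifdef approach discussed above, the sketch below selects `__builtin_add_overflow` and `__builtin_mul_overflow` when the host compiler provides them and otherwise falls back to explicit checks. It is not the code that landed. The `SKETCH_HAS_OVERFLOW_BUILTINS` macro and the `Sketch*` function names are assumptions made for this example, and the detection logic (Clang's `__has_builtin`, or GCC 5 and later) is one plausible way to write the guard.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical feature-test macro for this sketch; not an LLVM macro.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_add_overflow) && \
      __has_builtin(__builtin_mul_overflow)
#    define SKETCH_HAS_OVERFLOW_BUILTINS 1
#  endif
#endif
#if !defined(SKETCH_HAS_OVERFLOW_BUILTINS) && defined(__GNUC__) && \
    !defined(__clang__)
#  if __GNUC__ >= 5
#    define SKETCH_HAS_OVERFLOW_BUILTINS 1
#  endif
#endif
#ifndef SKETCH_HAS_OVERFLOW_BUILTINS
#  define SKETCH_HAS_OVERFLOW_BUILTINS 0
#endif

template <typename T>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SketchSaturatingAdd(T X, T Y, bool *ResultOverflowed = nullptr) {
  bool Dummy;
  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
  const T Max = std::numeric_limits<T>::max();
#if SKETCH_HAS_OVERFLOW_BUILTINS
  T Result;
  Overflowed = __builtin_add_overflow(X, Y, &Result);
  return Overflowed ? Max : Result;
#else
  // Fallback: an unsigned sum that wrapped is smaller than either operand.
  const T Result = static_cast<T>(X + Y);
  Overflowed = Result < X;
  return Overflowed ? Max : Result;
#endif
}

template <typename T>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SketchSaturatingMultiply(T X, T Y, bool *ResultOverflowed = nullptr) {
  bool Dummy;
  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
  const T Max = std::numeric_limits<T>::max();
#if SKETCH_HAS_OVERFLOW_BUILTINS
  T Result;
  Overflowed = __builtin_mul_overflow(X, Y, &Result);
  return Overflowed ? Max : Result;
#else
  // Fallback: X * Y overflows exactly when Y > Max / X (for X != 0).
  Overflowed = X != 0 && Y > Max / X;
  return Overflowed ? Max : static_cast<T>(X * Y);
#endif
}

// Same two-step structure as the patch: multiply, then add if no overflow.
template <typename T>
typename std::enable_if<std::is_unsigned<T>::value, T>::type
SketchSaturatingMultiplyAdd(T X, T Y, T A, bool *ResultOverflowed = nullptr) {
  bool Dummy;
  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
  const T Product = SketchSaturatingMultiply(X, Y, &Overflowed);
  if (Overflowed)
    return Product;
  return SketchSaturatingAdd(A, Product, &Overflowed);
}
```

On a host compiler without the builtins, such as the MSVC configuration mentioned above, only the fallback branches are compiled, so behavior should be identical and only the generated code differs.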

slingn mentioned this in D15547: [PGO] Handle and report overflow during profile merge for all types of data. Dec 16 2015, 11:17 AM

Updated for silvas's comments.
- Apply SaturatingMultiplyAdd() to weighted PGO merging.
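For context on what applying a saturating multiply-add to weighted merging could look like, here is a minimal sketch. It does not reproduce the actual PGO merge code: `MergeWeightedCounts` and the local `SketchSaturatingMultiplyAdd` are hypothetical names, and representing a profile as a plain vector of 64-bit counters is an assumption made to keep the example self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Stand-in for the helper under review: returns X * Y + A clamped to the
// maximum uint64_t value and reports whether clamping happened.
inline uint64_t SketchSaturatingMultiplyAdd(uint64_t X, uint64_t Y, uint64_t A,
                                            bool &Overflowed) {
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  Overflowed = X != 0 && Y > Max / X;
  if (Overflowed)
    return Max;
  const uint64_t Product = X * Y;
  Overflowed = A > Max - Product;
  return Overflowed ? Max : Product + A;
}

// Hypothetical merge: Dst[I] += Src[I] * Weight for each shared index, with
// saturation. Returns true if any counter was clamped.
inline bool MergeWeightedCounts(std::vector<uint64_t> &Dst,
                                const std::vector<uint64_t> &Src,
                                uint64_t Weight) {
  bool AnyOverflow = false;
  const std::size_t N = std::min(Dst.size(), Src.size());
  for (std::size_t I = 0; I < N; ++I) {
    bool Overflowed = false;
    Dst[I] = SketchSaturatingMultiplyAdd(Src[I], Weight, Dst[I], Overflowed);
    AnyOverflow |= Overflowed;
  }
  return AnyOverflow;
}
```

A single call per counter both scales and accumulates, and the returned flag gives the caller one place to report overflow.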

Update patch to apply cleanly to HEAD.

Very nice cleanups .. LGTM.

This revision is now accepted and ready to land. Jan 12 2016, 1:47 PM

Closed by commit rL257532: [Support] Add saturating multiply-add support function (authored by slingn). Jan 12 2016, 2:37 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
include/llvm/Support/MathExtras.h	18 lines
unittests/Support/MathExtrasTest.cpp	62 lines

Diff 42319

include/llvm/Support/MathExtras.h

 SaturatingMultiply(T X, T Y, bool *ResultOverflowed = nullptr) {
   ...
   }
   Z <<= 1;
   if (X & 1)
     return SaturatingAdd(Z, Y, ResultOverflowed);

   return Z;
 }

+/// \brief Multiply two unsigned integers, X and Y, of type T. Add the
+/// unsigned integer, A, also of type T, to the product.
+/// Clamp the result to the maximum representable value of T on overflow.
+/// ResultOverflowed indicates if the result is larger than the maximum
+/// representable value of type T.
+template <typename T>
+typename std::enable_if<std::is_unsigned<T>::value, T>::type
+SaturatingMultiplyAdd(T X, T Y, T A, bool *ResultOverflowed = nullptr) {
+  bool Dummy;
+  bool &Overflowed = ResultOverflowed ? *ResultOverflowed : Dummy;
+
+  T Product = SaturatingMultiply(X, Y, &Overflowed);
+  if (Overflowed)
+    return Product;
+
+  return SaturatingAdd(A, Product, &Overflowed);
+}
+
 extern const float huge_valf;
 } // End llvm namespace

 #endif

unittests/Support/MathExtrasTest.cpp

 ...
 TEST(MathExtras, SaturatingMultiply) {
   SaturatingMultiplyTestHelper<uint8_t>();
   SaturatingMultiplyTestHelper<uint16_t>();
   SaturatingMultiplyTestHelper<uint32_t>();
   SaturatingMultiplyTestHelper<uint64_t>();
 }

+template<typename T>
+void SaturatingMultiplyAddTestHelper()
+{
+  const T Max = std::numeric_limits<T>::max();
+  bool ResultOverflowed;
+
+  // Test basic multiply-add.
+  EXPECT_EQ(T(16), SaturatingMultiplyAdd(T(2), T(3), T(10)));
+  EXPECT_EQ(T(16), SaturatingMultiplyAdd(T(2), T(3), T(10), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  // Test multiply overflows, add doesn't overflow
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, T(0), &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  // Test multiply doesn't overflow, add overflows
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), T(1), Max, &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  // Test multiply-add with Max as operand
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), T(1), Max));
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), T(1), Max, &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), Max, T(1)));
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(T(1), Max, T(1), &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, T(1)));
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, T(1), &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, Max));
+  EXPECT_EQ(Max, SaturatingMultiplyAdd(Max, Max, Max, &ResultOverflowed));
+  EXPECT_TRUE(ResultOverflowed);
+
+  // Test multiply-add with 0 as operand
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(1), T(1), T(0)));
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(1), T(1), T(0), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(1), T(0), T(1)));
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(1), T(0), T(1), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(0), T(0), T(1)));
+  EXPECT_EQ(T(1), SaturatingMultiplyAdd(T(0), T(0), T(1), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+  EXPECT_EQ(T(0), SaturatingMultiplyAdd(T(0), T(0), T(0)));
+  EXPECT_EQ(T(0), SaturatingMultiplyAdd(T(0), T(0), T(0), &ResultOverflowed));
+  EXPECT_FALSE(ResultOverflowed);
+
+}
+
+TEST(MathExtras, SaturatingMultiplyAdd) {
+  SaturatingMultiplyAddTestHelper<uint8_t>();
+  SaturatingMultiplyAddTestHelper<uint16_t>();
+  SaturatingMultiplyAddTestHelper<uint32_t>();
+  SaturatingMultiplyAddTestHelper<uint64_t>();
+}
+
 }
