Details

Reviewers

craig.topper
sanjoy
efriedma

Commits

rG70dbd5fbd0a5: Infer lowest bits of an integer Multiply when the low bits of the operands are…
rL320269: Infer lowest bits of an integer Multiply when the low bits of the operands are…

Summary

When the lowest bits of the operands to an integer multiply are known, the low bits of the result are deducible.
Code to deduce known-zero bottom bits already existed, but this change improves on that by deducing known-ones.
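
As a hedged illustration of the summary (a standalone model, not code from the patch): the low K bits of a product depend only on the low K bits of its operands, which is why known low operand bits give known low result bits. The helper names below are hypothetical, and the check is an exhaustive sweep over 8-bit values.

```cpp
#include <cassert>
#include <cstdint>

// Mask selecting the low N bits of an 8-bit value (N in [0, 8]).
static uint8_t lowMask8(unsigned N) {
  assert(N <= 8 && "mask width out of range");
  return static_cast<uint8_t>((1u << N) - 1u);
}

// Returns true if, for every pair of 8-bit operands, the low K bits of A*B
// equal the low K bits of (A mod 2^K) * (B mod 2^K).
bool lowBitsOfProductDependOnlyOnLowBits(unsigned K) {
  const unsigned M = lowMask8(K);
  for (unsigned A = 0; A < 256; ++A) {
    for (unsigned B = 0; B < 256; ++B) {
      unsigned Full = (A * B) & M;
      unsigned FromLow = ((A & M) * (B & M)) & M;
      if (Full != FromLow)
        return false;
    }
  }
  return true;
}
```

Calling `lowBitsOfProductDependOnlyOnLowBits(K)` for each K from 0 to 8 exercises every prefix width of an i8 value.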

Diff Detail

Repository: rL LLVM

Event Timeline

PFerreira created this revision. Jun 8 2017, 4:50 AM

craig.topper added inline comments. Jun 9 2017, 12:28 PM

lib/Analysis/ValueTracking.cpp
382 ↗	(On Diff #101893)	I think this line is equivalent to bottomKnown.getActiveBits(). Also please capitalize all variable names per coding standards.
384 ↗	(On Diff #101893)	Shouldn't we be able to do something like this instead of a loop? BottomKnownOne = bottomKnown.getLoBits(trailKnown); BottomKnownZero = (~bottomKnown).getLoBits(trailKnown); Known.Zero |= BottomKnownZero; Known.One |= BottomKnownOne;
385 ↗	(On Diff #101893)	Use bottomKnown[bit]
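
The suggestion above can be sketched outside LLVM with plain 32-bit masks. This is a minimal model under stated assumptions: `KnownBits32`, `lowMask32`, `setLowKnownLoop`, and `setLowKnownMasks` are hypothetical names, and the loop is only a guess at the shape of the code in the first diff, which is not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits on a 32-bit value.
struct KnownBits32 {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1
};

// Mask selecting the low N bits (N in [0, 32]).
static uint32_t lowMask32(unsigned N) {
  assert(N <= 32 && "mask width out of range");
  return N == 32 ? ~0u : ((1u << N) - 1u);
}

// Per-bit version: record each of the low TrailKnown bits one at a time.
KnownBits32 setLowKnownLoop(uint32_t BottomKnown, unsigned TrailKnown) {
  assert(TrailKnown <= 32 && "too many bits");
  KnownBits32 K = {0u, 0u};
  for (unsigned Bit = 0; Bit < TrailKnown; ++Bit) {
    if ((BottomKnown >> Bit) & 1u)
      K.One |= 1u << Bit;
    else
      K.Zero |= 1u << Bit;
  }
  return K;
}

// Mask version: the same result with two bitwise operations.
KnownBits32 setLowKnownMasks(uint32_t BottomKnown, unsigned TrailKnown) {
  KnownBits32 K = {0u, 0u};
  K.One |= BottomKnown & lowMask32(TrailKnown);
  K.Zero |= ~BottomKnown & lowMask32(TrailKnown);
  return K;
}
```

Both functions produce disjoint `Zero` and `One` masks by construction, since each low bit lands in exactly one of them.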

Updated the diff with the suggestions. I feel a bit silly for having made the first patch with that for-loop, when I used equivalent bit-ops further up the function.

Also changed an "add" to an "or" in the unit test to exercise known-bits of the "or" operation as well; I already had an add there anyway.

The new patch addresses all of Craig's comments.

craig.topper added inline comments. Jun 19 2017, 10:57 AM

lib/Analysis/ValueTracking.cpp
356 ↗	(On Diff #102158)	Is this And necessary? The Zero bits should be mutex with the One bits here.

Update diff following Craig's comment.

PFerreira marked an inline comment as done. Jun 20 2017, 2:08 AM

PFerreira added inline comments.

lib/Analysis/ValueTracking.cpp
356 ↗	(On Diff #102158)	I replaced with an assertion, but I can remove it if you think it's not necessary.

Prodding the review, in case the emails regarding this slipped through.

Ping - maybe before the 5.0 branch?

Sorry I accidentally lost track of this.

LGTM

This revision is now accepted and ready to land. Jul 12 2017, 12:29 AM

Hm, I'm not sure what I'm supposed to do next - do I wait for someone with commit rights to put this in?

hfinkel added a reviewer: sanjoy. Jul 14 2017, 7:14 AM

@sanjoy , can you please take a look at this? I recall this coming up as something we may not be able to do in the face of undef, etc.

I can't see anything special here compared to, for example, computeKnownBitsAddSub.

lib/Analysis/ValueTracking.cpp
383 ↗	(On Diff #103176)	TrailKnown is wrong. http://rise4fun.com/Alive/s2b Conservatively, you could just use "TrailKnown = TrailBitsKnown". Or maybe you could get a little more aggressive with some trickery involving trailing zeros; I haven't worked out the exact math.
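
The rejected TrailKnown formula is not shown in this thread, so the following is only a hedged illustration of why the minimum of the operands' known trailing bits is the safe baseline. It is a standalone brute-force model over 8-bit values; `agreedLowBitsOfProduct` is a hypothetical helper, not an LLVM function. It counts how many low bits of the product are identical across all operands consistent with the given known low bits.

```cpp
#include <cassert>
#include <cstdint>

// LowA holds the KA known low bits of the first operand, LowB the KB known
// low bits of the second. Returns the number of low product bits that never
// change, whatever the unknown upper bits are.
unsigned agreedLowBitsOfProduct(uint8_t LowA, unsigned KA,
                                uint8_t LowB, unsigned KB) {
  assert(KA <= 8 && KB <= 8 && "operands are modelled as i8");
  const unsigned MaskA = (1u << KA) - 1u;
  const unsigned MaskB = (1u << KB) - 1u;
  assert(LowA <= MaskA && LowB <= MaskB && "low bits must fit their width");

  // The pair (LowA, LowB) itself is one consistent choice of operands.
  const unsigned Reference = (unsigned(LowA) * LowB) & 0xFFu;
  unsigned Differ = 0; // bits where some product disagrees with Reference
  for (unsigned A = 0; A < 256; ++A) {
    if ((A & MaskA) != LowA)
      continue;
    for (unsigned B = 0; B < 256; ++B) {
      if ((B & MaskB) != LowB)
        continue;
      Differ |= ((A * B) & 0xFFu) ^ Reference;
    }
  }
  unsigned N = 0;
  while (N < 8 && !((Differ >> N) & 1u))
    ++N;
  return N;
}
```

With three known bits `001` on one operand and a single known bit `1` on the other, only one product bit is fixed, so any bound larger than the minimum would be unsound for odd operands.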

This revision now requires changes to proceed. Jul 14 2017, 3:01 PM

In D34029#809603, @hfinkel wrote:

@sanjoy , can you please take a look at this? I recall this coming up as something we may not be able to do in the face of undef, etc.

This seems fine to me given the current definition of undef.

Use TrailBitsKnown instead.

Please also add testcases to cover these cases (the case where we miscompiled, and maybe a case where we could improve this implementation).

Added the test you asked me about, slightly modified. I wanted to make sure that the ComputeKnownBits picked the correct bitwidth. The sample test you mentioned had the two inputs with the same bit width, so I just added two more. I can put it back to "3" if you prefer.

I tried to figure out how this could be improved, and considered your idea of leading zeros. I think I understood what you were thinking, but while trying to figure out the math I couldn't see how it would be possible. Having an extra leading known-zero on one of the operands over another operand would not help to figure out more digits, as far as I could see.

Trailing zeros, not leading zeros.

Suppose you have multiply operands "a" and "b". The bottom three bits of "a" are 110, and the bottom two bits of "b" are 11. "a" is divisible by 2, so "a * b == 2 * ((a / 2) * b)". The bottom two bits of "a / 2" are 11, and the bottom two bits of "b" are 11, so the bottom two bits of "(a / 2) * b" are "01". Therefore the bottom three bits of "2 * ((a / 2) * b)" are "010".
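
The trailing-zero argument above can be sketched as a small standalone computation. This is a model under stated assumptions, not the patch itself: operands are 8-bit, and `LowBits`, `knownTrailingZeros`, and `knownLowBitsOfProduct` are hypothetical names. The count is the sum of the known trailing zeros plus the smaller number of remaining known bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct LowBits {
  unsigned Count; // how many low bits of the product are known
  uint8_t Value;  // the value of those bits
};

// Trailing zeros of Low, looking only at its K known bits.
static unsigned knownTrailingZeros(uint8_t Low, unsigned K) {
  unsigned N = 0;
  while (N < K && !((Low >> N) & 1u))
    ++N;
  return N;
}

// LowA holds the KA known low bits of "a"; LowB the KB known low bits of "b".
LowBits knownLowBitsOfProduct(uint8_t LowA, unsigned KA,
                              uint8_t LowB, unsigned KB) {
  assert(KA <= 8 && KB <= 8 && "operands are modelled as i8");
  const unsigned A = LowA & ((1u << KA) - 1u);
  const unsigned B = LowB & ((1u << KB) - 1u);
  const unsigned TZA = knownTrailingZeros(static_cast<uint8_t>(A), KA);
  const unsigned TZB = knownTrailingZeros(static_cast<uint8_t>(B), KB);
  // Divide out the known factors of two, multiply what is left, and shift
  // the factors of two back in.
  const unsigned Count =
      std::min(TZA + TZB + std::min(KA - TZA, KB - TZB), 8u);
  const unsigned Mask = (1u << Count) - 1u;
  LowBits R = {Count, static_cast<uint8_t>((A * B) & Mask)};
  return R;
}
```

For the operands in the comment above (low bits `110` and `11`), the model reports three known bits with value `010`.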

Thanks for the suggestion. This was fun!

I've updated the diff with your improvement suggestion, and changed the expected results of the unit tests to match.
I took the opportunity to refactor the trail-zero computation since this new method can compute the trailing zeros of the result.

efriedma added inline comments. Jul 28 2017, 5:28 PM

lib/Analysis/ValueTracking.cpp
377 ↗	(On Diff #107267)	Reusing the name ResultBitsKnown is confusing.
380 ↗	(On Diff #107267)	These little comments aren't that useful for understanding the logic; needs one big paragraph explaining the logic for computing ResultBitsKnown.

Added an explanation of what's being done.
Because I was the one writing it, it makes sense to me. Is it clear enough for others?

Just prodding for an update, in case the emails fell through.

Sorry, got buried in my inbox. Won't really have time to look again until next week.

Ping. I understand people might be busy with 5.0

I like to see a more formal proof for this sort of thing... but I don't have enough time to mess with alive. Got as far as http://rise4fun.com/Alive/a7O .

lib/Analysis/ValueTracking.cpp
365 ↗	(On Diff #108912)	Is this supposed to be "b/n", rather than "b/s"?

I have been fiddling with Alive, but it has been crashing on me (even on simple proofs). I guessed this was a temporary issue, but after 2 weeks it is still not working with "Oops, it seems that this tool encountered an issue."
The last version of the expression I got was

Name: mul_to_const
Pre: C1 >= 0 && C1 < 32 && C2 >= 0 && C2 < 64 && CLZ1 == countTrailingZeros(C1) && CLZ2 == countTrailingZeros(C2) && C3==((1 << (6+CLZ1+CLZ2))-1)
%aa = shl i32 %a, 5
%bb = shl i32 %b, 6
%aaa = or i32 %aa, C1
%bbb = or i32 %bb, C2
%aaaa = lshr i32 %aaa, CLZ1
%bbbb = lshr i32 %bbb, CLZ2
%mul = mul i32 %aaaa, %bbbb
%adjust = add i32 C1, C2
%result = shl %mul, %adjust
%mask = and i32 %result, C3

=>

%mask = i32 ((C1*C2)&C3)

Alive is up again.

I tried a bit more; got an alive proof working. Does this look right? (The precondition for C7 is kind of complicated, but I think it matches the computation in this patch.)

https://rise4fun.com/Alive/zCv

Name: mul_to_const
Pre: (C1 >= 0) && (C1 < (1 << C5)) && (C2 >= 0) && (C2 < (1 << C6)) && (C7 == (1 << (umin(countTrailingZeros(C1), C5) + umin(countTrailingZeros(C2), C6) + umin(C5 - umin(countTrailingZeros(C1), C5), C6 - umin(countTrailingZeros(C2), C6)))) - 1)

%aa = shl i8 %a, C5
%bb = shl i8 %b, C6
%aaa = or i8 %aa, C1
%bbb = or i8 %bb, C2
%mul = mul i8 %aaa, %bbb
%mask = and i8 %mul, C7

  =>

%mask = i8 ((C1*C2)&C7)
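
As a cross-check that does not rely on Alive, the same statement can be swept exhaustively for i8 in plain C++. This is a hedged sketch: `cttz8`, `lowMask8`, `maskC7`, and `checkMulToConstI8` are hypothetical helpers, `cttz8(0)` is taken to be 8, and the bit count is capped at 8 so that the mask stays within the 8-bit type (an assumption about how the precondition should be read at the boundary).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Count trailing zeros of an 8-bit value; returns 8 for zero.
static unsigned cttz8(uint8_t V) {
  unsigned N = 0;
  while (N < 8 && !((V >> N) & 1u))
    ++N;
  return N;
}

// Mask selecting the low N bits (N in [0, 8]).
static uint8_t lowMask8(unsigned N) {
  assert(N <= 8 && "mask width out of range");
  return static_cast<uint8_t>((1u << N) - 1u);
}

// C7 from the precondition: the mask of result bits claimed to be known.
uint8_t maskC7(uint8_t C1, unsigned C5, uint8_t C2, unsigned C6) {
  const unsigned TZ1 = std::min(cttz8(C1), C5);
  const unsigned TZ2 = std::min(cttz8(C2), C6);
  const unsigned Bits = TZ1 + TZ2 + std::min(C5 - TZ1, C6 - TZ2);
  return lowMask8(std::min(Bits, 8u));
}

// Every 8-bit X can be written as (a << C5) | C1 with C1 = X & lowMask8(C5),
// so sweeping X and Y covers all values of %aaa and %bbb.
bool checkMulToConstI8() {
  for (unsigned C5 = 0; C5 < 8; ++C5)
    for (unsigned C6 = 0; C6 < 8; ++C6)
      for (unsigned X = 0; X < 256; ++X)
        for (unsigned Y = 0; Y < 256; ++Y) {
          const uint8_t C1 = static_cast<uint8_t>(X & lowMask8(C5));
          const uint8_t C2 = static_cast<uint8_t>(Y & lowMask8(C6));
          const unsigned C7 = maskC7(C1, C5, C2, C6);
          const unsigned Mul = (X * Y) & 0xFFu;
          const unsigned Expected = (unsigned(C1) * C2) & 0xFFu;
          if ((Mul & C7) != (Expected & C7))
            return false;
        }
  return true;
}
```

The sweep visits about 4.2 million combinations, so it finishes quickly and can serve as a sanity check alongside the Alive proof.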

I've added Eli's proof to it, not sure exactly how to present it.

Took me a bit to digest it, but I agree that it matches what I'm trying to do in this patch. I can try to describe the definition of C7 in plain text (correlate it to the variables declared in the patch) if you prefer.

Ping.

LGTM.

Sorry about the delay. I ran some tests, and it looks like this doesn't cause any regressions. Granted, it didn't lead to any performance improvement either (I guess we don't really use this information in many cases). But having the information around could be helpful in the future.

Do you have commit access?

This revision is now accepted and ready to land. Nov 6 2017, 3:03 PM

I do not have commit access.

The changes behind this patch were developed internally (at my company) and provide improvements to specific modules. In particular, they allow us to infer the bottom bits of some address calculations (where we can use faster memory operations when the pointer has some specific alignment). I can't really go into details, unfortunately. I do hope that at some point in the future, someone else will benefit from this more openly :)

Ping. Nearly there!

ping

I can commit this on behalf of my former colleague if there are no objections. I'll wait a day or so.

Closed by commit rL320269: Infer lowest bits of an integer Multiply when the low bits of the operands are… (authored by sdardis). Dec 9 2017, 3:26 PM

This revision was automatically updated to reflect the committed changes.

Diff 126284

llvm/trunk/lib/Analysis/ValueTracking.cpp

@@ ... @@ if (Op0 == Op1) {
       if (!isKnownNonNegative)
         isKnownNegative = (isKnownNegativeOp1 && isKnownNonNegativeOp0 &&
                            isKnownNonZero(Op0, Depth, Q)) ||
                           (isKnownNegativeOp0 && isKnownNonNegativeOp1 &&
                            isKnownNonZero(Op1, Depth, Q));
     }
   }

-  // If low bits are zero in either operand, output low known-0 bits.
-  // Also compute a conservative estimate for high known-0 bits.
-  // More trickiness is possible, but this is sufficient for the
-  // interesting case of alignment computation.
-  unsigned TrailZ = Known.countMinTrailingZeros() +
-                    Known2.countMinTrailingZeros();
+  assert(!Known.hasConflict() && !Known2.hasConflict());
+  // Compute a conservative estimate for high known-0 bits.
   unsigned LeadZ = std::max(Known.countMinLeadingZeros() +
                             Known2.countMinLeadingZeros(),
                             BitWidth) - BitWidth;

-  TrailZ = std::min(TrailZ, BitWidth);
   LeadZ = std::min(LeadZ, BitWidth);

+  // The result of the bottom bits of an integer multiply can be
+  // inferred by looking at the bottom bits of both operands and
+  // multiplying them together.
+  // We can infer at least the minimum number of known trailing bits
+  // of both operands. Depending on number of trailing zeros, we can
+  // infer more bits, because (a*b) <=> ((a/m) * (b/n)) * (m*n) assuming
+  // a and b are divisible by m and n respectively.
+  // We then calculate how many of those bits are inferrable and set
+  // the output. For example, the i8 mul:
+  //  a = XXXX1100 (12)
+  //  b = XXXX1110 (14)
+  // We know the bottom 3 bits are zero since the first can be divided by
+  // 4 and the second by 2, thus having ((12/4) * (14/2)) * (2*4).
+  // Applying the multiplication to the trimmed arguments gets:
+  //    XX11 (3)
+  //    X111 (7)
+  // -------
+  //    XX11
+  //   XX11
+  //  XX11
+  // XX11
+  // -------
+  // XXXXX01
+  // Which allows us to infer the 2 LSBs. Since we're multiplying the result
+  // by 8, the bottom 3 bits will be 0, so we can infer a total of 5 bits.
+  // The proof for this can be described as:
+  // Pre: (C1 >= 0) && (C1 < (1 << C5)) && (C2 >= 0) && (C2 < (1 << C6)) &&
+  //      (C7 == (1 << (umin(countTrailingZeros(C1), C5) +
+  //                    umin(countTrailingZeros(C2), C6) +
+  //                    umin(C5 - umin(countTrailingZeros(C1), C5),
+  //                         C6 - umin(countTrailingZeros(C2), C6)))) - 1)
+  // %aa = shl i8 %a, C5
+  // %bb = shl i8 %b, C6
+  // %aaa = or i8 %aa, C1
+  // %bbb = or i8 %bb, C2
+  // %mul = mul i8 %aaa, %bbb
+  // %mask = and i8 %mul, C7
+  //   =>
+  // %mask = i8 ((C1*C2)&C7)
+  // Where C5, C6 describe the known bits of %a, %b
+  //       C1, C2 describe the known bottom bits of %a, %b.
+  //       C7 describes the mask of the known bits of the result.
+  APInt Bottom0 = Known.One;
+  APInt Bottom1 = Known2.One;
+
+  // How many times we'd be able to divide each argument by 2 (shr by 1).
+  // This gives us the number of trailing zeros on the multiplication result.
+  unsigned TrailBitsKnown0 = (Known.Zero | Known.One).countTrailingOnes();
+  unsigned TrailBitsKnown1 = (Known2.Zero | Known2.One).countTrailingOnes();
+  unsigned TrailZero0 = Known.countMinTrailingZeros();
+  unsigned TrailZero1 = Known2.countMinTrailingZeros();
+  unsigned TrailZ = TrailZero0 + TrailZero1;
+
+  // Figure out the fewest known-bits operand.
+  unsigned SmallestOperand = std::min(TrailBitsKnown0 - TrailZero0,
+                                      TrailBitsKnown1 - TrailZero1);
+  unsigned ResultBitsKnown = std::min(SmallestOperand + TrailZ, BitWidth);
+
+  APInt BottomKnown = Bottom0.getLoBits(TrailBitsKnown0) *
+                      Bottom1.getLoBits(TrailBitsKnown1);

   Known.resetAll();
-  Known.Zero.setLowBits(TrailZ);
   Known.Zero.setHighBits(LeadZ);
+  Known.Zero |= (~BottomKnown).getLoBits(ResultBitsKnown);
+  Known.One |= BottomKnown.getLoBits(ResultBitsKnown);

   // Only make use of no-wrap flags if we failed to compute the sign bit
   // directly. This matters if the multiplication always overflows, in
   // which case we prefer to follow the result of the direct computation,
   // though as the program is invoking undefined behaviour we can choose
   // whatever we like here.
   if (isKnownNonNegative && !Known.isNegative())
     Known.makeNonNegative();

llvm/trunk/unittests/Analysis/ValueTrackingTest.cpp

@@ ... @@
 #include "llvm/Analysis/ValueTracking.h"
 #include "llvm/AsmParser/Parser.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/InstIterator.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/KnownBits.h"
 #include "gtest/gtest.h"

 using namespace llvm;

 namespace {

 class MatchSelectPatternTest : public testing::Test {
 protected:
@@ ... @@ TEST(ValueTracking, ComputeNumSignBits_PR32045) {

   auto *F = M->getFunction("f");
   assert(F && "Bad assembly?");

   auto *RVal =
       cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);
   EXPECT_EQ(ComputeNumSignBits(RVal, M->getDataLayout()), 1u);
 }

+TEST(ValueTracking, ComputeKnownBits) {
+  StringRef Assembly = "define i32 @f(i32 %a, i32 %b) { "
+                       " %ash = mul i32 %a, 8 "
+                       " %aad = add i32 %ash, 7 "
+                       " %aan = and i32 %aad, 4095 "
+                       " %bsh = shl i32 %b, 4 "
+                       " %bad = or i32 %bsh, 6 "
+                       " %ban = and i32 %bad, 4095 "
+                       " %mul = mul i32 %aan, %ban "
+                       " ret i32 %mul "
+                       "} ";
+
+  LLVMContext Context;
+  SMDiagnostic Error;
+  auto M = parseAssemblyString(Assembly, Error, Context);
+  assert(M && "Bad assembly?");
+
+  auto *F = M->getFunction("f");
+  assert(F && "Bad assembly?");
+
+  auto *RVal =
+      cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);
+  auto Known = computeKnownBits(RVal, M->getDataLayout());
+  ASSERT_FALSE(Known.hasConflict());
+  EXPECT_EQ(Known.One.getZExtValue(), 10u);
+  EXPECT_EQ(Known.Zero.getZExtValue(), 4278190085u);
+}
+
+TEST(ValueTracking, ComputeKnownMulBits) {
+  StringRef Assembly = "define i32 @f(i32 %a, i32 %b) { "
+                       " %aa = shl i32 %a, 5 "
+                       " %bb = shl i32 %b, 5 "
+                       " %aaa = or i32 %aa, 24 "
+                       " %bbb = or i32 %bb, 28 "
+                       " %mul = mul i32 %aaa, %bbb "
+                       " ret i32 %mul "
+                       "} ";
+
+  LLVMContext Context;
+  SMDiagnostic Error;
+  auto M = parseAssemblyString(Assembly, Error, Context);
+  assert(M && "Bad assembly?");
+
+  auto *F = M->getFunction("f");
+  assert(F && "Bad assembly?");
+
+  auto *RVal =
+      cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);
+  auto Known = computeKnownBits(RVal, M->getDataLayout());
+  ASSERT_FALSE(Known.hasConflict());
+  EXPECT_EQ(Known.One.getZExtValue(), 32u);
+  EXPECT_EQ(Known.Zero.getZExtValue(), 95u);
+}
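
The expected values in the two tests above can be reproduced by hand with a standalone model of the committed computation. This is a sketch under stated assumptions, not LLVM code: `KnownBits32`, `trailingOnes`, `leadingOnes`, `lowMask32`, and `modelKnownBitsMul` are hypothetical names that mimic the `Known.Zero`/`Known.One` masks with `uint32_t`, and the operand known bits passed in for the first test are derived by hand rather than taken from LLVM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits on a 32-bit value.
struct KnownBits32 {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1
};

static unsigned trailingOnes(uint32_t V) {
  unsigned N = 0;
  while (N < 32 && ((V >> N) & 1u))
    ++N;
  return N;
}

static unsigned leadingOnes(uint32_t V) {
  unsigned N = 0;
  while (N < 32 && ((V >> (31 - N)) & 1u))
    ++N;
  return N;
}

static uint32_t lowMask32(unsigned N) {
  assert(N <= 32 && "mask width out of range");
  return N == 32 ? ~0u : ((1u << N) - 1u);
}

// Mirrors the variable names used in the patch.
KnownBits32 modelKnownBitsMul(KnownBits32 A, KnownBits32 B) {
  const unsigned BitWidth = 32;
  unsigned LeadZ =
      std::max(leadingOnes(A.Zero) + leadingOnes(B.Zero), BitWidth) - BitWidth;
  LeadZ = std::min(LeadZ, BitWidth);

  const unsigned TrailBitsKnown0 = trailingOnes(A.Zero | A.One);
  const unsigned TrailBitsKnown1 = trailingOnes(B.Zero | B.One);
  const unsigned TrailZero0 = trailingOnes(A.Zero);
  const unsigned TrailZero1 = trailingOnes(B.Zero);
  const unsigned TrailZ = TrailZero0 + TrailZero1;

  const unsigned SmallestOperand =
      std::min(TrailBitsKnown0 - TrailZero0, TrailBitsKnown1 - TrailZero1);
  const unsigned ResultBitsKnown = std::min(SmallestOperand + TrailZ, BitWidth);

  const uint32_t BottomKnown = (A.One & lowMask32(TrailBitsKnown0)) *
                               (B.One & lowMask32(TrailBitsKnown1));

  KnownBits32 R = {0u, 0u};
  R.Zero = ~lowMask32(BitWidth - LeadZ); // high LeadZ bits are zero
  R.Zero |= ~BottomKnown & lowMask32(ResultBitsKnown);
  R.One |= BottomKnown & lowMask32(ResultBitsKnown);
  return R;
}
```

For ComputeKnownMulBits, `%aaa` has low bits `11000` and `%bbb` has low bits `11100`, so the model is called with `{7, 24}` and `{3, 28}`. For ComputeKnownBits, the hand-derived inputs are `{0xFFFFF000, 7}` and `{0xFFFFF009, 6}`.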

This is an archive of the discontinued LLVM Phabricator instance.

Infer lowest bits of an integer Multiply when the low bits of the operands are known
ClosedPublic
