Details

Reviewers

craig.topper
sanjoy
efriedma

Commits

rG70dbd5fbd0a5: Infer lowest bits of an integer Multiply when the low bits of the operands are…
rL320269: Infer lowest bits of an integer Multiply when the low bits of the operands are…

Summary

When the lowest bits of the operands to an integer multiply are known, the low bits of the result are deducible.
Code to deduce known-zero bottom bits already existed, but this change improves on that by deducing known-ones.
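
The summary rests on the fact that the low K bits of a product depend only on the low K bits of its operands. A minimal standalone sketch of that property, assuming plain 8-bit integers rather than LLVM's APInt/KnownBits (the helper names lowBits and lowBitsOfProductAgree are hypothetical and not part of the patch):

```cpp
#include <cstdint>

// Keep only the low K bits of V (K may be anything from 0 to 8).
uint8_t lowBits(uint8_t V, unsigned K) {
  return K >= 8 ? V : static_cast<uint8_t>(V & ((1u << K) - 1));
}

// Returns true if, for every pair of 8-bit operands, the low K bits of A * B
// equal the low K bits of (A mod 2^K) * (B mod 2^K).
bool lowBitsOfProductAgree(unsigned K) {
  for (unsigned A = 0; A < 256; ++A) {
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t Full = static_cast<uint8_t>(A * B);
      uint8_t FromLow = static_cast<uint8_t>(lowBits(static_cast<uint8_t>(A), K) *
                                             lowBits(static_cast<uint8_t>(B), K));
      if (lowBits(Full, K) != lowBits(FromLow, K))
        return false;
    }
  }
  return true;
}
```

Calling lowBitsOfProductAgree(K) for any K from 0 to 8 checks all 65536 operand pairs, so it is cheap enough to run as a sanity check.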

Diff Detail

Event Timeline

PFerreira created this revision. Jun 8 2017, 4:50 AM

craig.topper added inline comments. Jun 9 2017, 12:28 PM

lib/Analysis/ValueTracking.cpp
395	I think this line is equivalent to bottomKnown.getActiveBits(). Also please capitalize all variable names per coding standards.
397	Shouldn't we be able to do something like this instead of a loop?
BottomKnownOne = bottomKnown.getLoBits(trailKnown);
BottomKnownZero = (~bottomKnown).getLoBits(trailKnown);
Known.Zero |= BottomKnownZero;
Known.One |= BottomKnownOne;
398	Use bottomKnown[bit]
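
The loop from the first patch is not shown on this page, so the bit-by-bit version below is only a guess at its shape. The sketch uses plain 32-bit integers instead of APInt, and the names Known32, recordLowBitsLoop and recordLowBitsMask are hypothetical. It illustrates why the masked form suggested above records the same known-zero and known-one bits:

```cpp
#include <cstdint>

// Stand-in for KnownBits on a 32-bit value (assumption: not LLVM's class).
struct Known32 {
  uint32_t Zero = 0;
  uint32_t One = 0;
};

// Bit-by-bit version: classify each of the low N bits of Bottom.
Known32 recordLowBitsLoop(uint32_t Bottom, unsigned N) {
  Known32 K;
  for (unsigned Bit = 0; Bit < N && Bit < 32; ++Bit) {
    if ((Bottom >> Bit) & 1u)
      K.One |= 1u << Bit;
    else
      K.Zero |= 1u << Bit;
  }
  return K;
}

// Masked version: the same result with two bitwise operations.
Known32 recordLowBitsMask(uint32_t Bottom, unsigned N) {
  uint32_t Mask = N >= 32 ? ~0u : (1u << N) - 1;
  Known32 K;
  K.One = Bottom & Mask;   // low bits that are set
  K.Zero = ~Bottom & Mask; // low bits that are clear
  return K;
}
```

By construction the two fields never overlap, which is the property the later inline comment about Zero and One being mutually exclusive relies on.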

Updated the diff with the suggestions. I feel a bit silly for having made the first patch with that for-loop, when I used equivalent bit-ops further up the function.

Also changed an "add" to an "or" in the unit test so that it exercises the known bits of the "or" operation as well; I already had an add there anyway.

New patch addresses all of Craig's comments.

craig.topper added inline comments. Jun 19 2017, 10:57 AM

lib/Analysis/ValueTracking.cpp
356	Is this And necessary? The Zero bits should be mutex with the One bits here.

Update diff following Craig's comment.

PFerreira marked an inline comment as done. Jun 20 2017, 2:08 AM

PFerreira added inline comments.

lib/Analysis/ValueTracking.cpp
356	I replaced with an assertion, but I can remove it if you think it's not necessary.

Prodding the review, in case the emails regarding this slipped through.

Ping - maybe before the 5.0 branch?

Sorry I accidentally lost track of this.

LGTM

This revision is now accepted and ready to land. Jul 12 2017, 12:29 AM

Hm, I'm not sure what I'm supposed to do next - do I wait for someone with commit rights to put this in?

hfinkel added a reviewer: sanjoy. Jul 14 2017, 7:14 AM

@sanjoy , can you please take a look at this? I recall this coming up as something we may not be able to do in the face of undef, etc.

> I recall this coming up as something we may not be able to do in the face of undef, etc.

I can't see anything special here compared to, for example, computeKnownBitsAddSub.

lib/Analysis/ValueTracking.cpp
395	TrailKnown is wrong. http://rise4fun.com/Alive/s2b Conservatively, you could just use "TrailKnown = TrailBitsKnown". Or maybe you could get a little more aggressive with some trickery involving trailing zeros; I haven't worked out the exact math.

This revision now requires changes to proceed. Jul 14 2017, 3:01 PM

In D34029#809603, @hfinkel wrote:

> @sanjoy , can you please take a look at this? I recall this coming up as something we may not be able to do in the face of undef, etc.

This seems fine to me given the current definition of undef.

Use TrailBitsKnown instead.

> Use TrailBitsKnown instead.

Please also add testcases to cover these cases (the case where we miscompiled, and maybe a case where we could improve this implementation).

Added the test you asked me about, slightly modified. I wanted to make sure that the ComputeKnownBits picked the correct bitwidth. The sample test you mentioned had the two inputs with the same bit width, so I just added two more. I can put it back to "3" if you prefer.

I tried to figure out how this could be improved, and considered your idea of leading zeros. I think I understood what you were thinking, but while trying to figure out the math I couldn't see how it would be possible. Having an extra leading known-zero on one of the operands over another operand would not help to figure out more digits, as far as I could see.

> I tried to figure out how this could be improved, and considered your idea of leading zeros. I think I understood what you were thinking, but while trying to figure out the math I couldn't see how it would be possible. Having an extra leading known-zero on one of the operands over another operand would not help to figure out more digits, as far as I could see.

Trailing zeros, not leading zeros.

Suppose you have multiply operands "a" and "b". The bottom three bits of "a" are 110, and the bottom two bits of "b" are 11. "a" is divisible by 2, so "a * b == 2 * ((a / 2) * b)". The bottom two bits of "a / 2" are 11, and the bottom two bits of "b" are 11, so the bottom two bits of "(a / 2) * b" are "01". Therefore the bottom three bits of "2 * ((a / 2) * b)" are "010".
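
The worked example above can be checked exhaustively for 8-bit operands. This is a minimal standalone sketch in plain C++, not code from the patch; the function names are hypothetical:

```cpp
#include <cstdint>

// Returns true if every 8-bit pair (A, B), with A ending in 110 and B ending
// in 11, produces a product whose low three bits are 010.
bool trailingZeroExampleHolds() {
  for (unsigned A = 0; A < 256; ++A) {
    if ((A & 0x7u) != 0x6u)
      continue;
    for (unsigned B = 0; B < 256; ++B) {
      if ((B & 0x3u) != 0x3u)
        continue;
      uint8_t Product = static_cast<uint8_t>(A * B);
      if ((Product & 0x7u) != 0x2u)
        return false;
    }
  }
  return true;
}

// Returns true if bit 3 of the product takes both values over the same set of
// operands, meaning three known bits is the most this example can yield.
bool fourthBitIsNotDetermined() {
  bool SawZero = false, SawOne = false;
  for (unsigned A = 0; A < 256; ++A) {
    if ((A & 0x7u) != 0x6u)
      continue;
    for (unsigned B = 0; B < 256; ++B) {
      if ((B & 0x3u) != 0x3u)
        continue;
      if ((A * B) & 0x8u)
        SawOne = true;
      else
        SawZero = true;
    }
  }
  return SawZero && SawOne;
}
```

Without the trailing-zero step, only min(3, 2) = 2 low bits could be claimed; factoring out the known factor of 2 from "a" recovers the third.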

Thanks for the suggestion. This was fun!

I've updated the diff with your improvement suggestion, and changed the expected results of the unit tests to match.
I took the opportunity to refactor the trail-zero computation since this new method can compute the trailing zeros of the result.

efriedma added inline comments. Jul 28 2017, 5:28 PM

lib/Analysis/ValueTracking.cpp
377	Reusing the name ResultBitsKnown is confusing.
380	These little comments aren't that useful for understanding the logic; needs one big paragraph explaining the logic for computing ResultBitsKnown.

Added an explanation of what's being done.
Because I was the one writing it, it makes sense to me. Is it clear enough for others?

Just prodding for an update, in case the emails fell through.

Sorry, got buried in my inbox. Won't really have time to look again until next week.

Ping. I understand people might be busy with 5.0

I like to see a more formal proof for this sort of thing... but I don't have enough time to mess with alive. Got as far as http://rise4fun.com/Alive/a7O .

lib/Analysis/ValueTracking.cpp
371	Is this supposed to be "b/n", rather than "b/s"?

I have been fiddling with Alive, but it has been crashing on me (even on simple proofs). I assumed this was a temporary issue, but after 2 weeks it still fails with "Oops, it seems that this tool encountered an issue."
The last version of the expression I got was:

Name: mul_to_const
Pre: C1 >= 0 && C1 < 32 && C2 >= 0 && C2 < 64 && CLZ1 == countTrailingZeros(C1) && CLZ2 == countTrailingZeros(C2) && C3==((1 << (6+CLZ1+CLZ2))-1)
%aa = shl i32 %a, 5
%bb = shl i32 %b, 6
%aaa = or i32 %aa, C1
%bbb = or i32 %bb, C2
%aaaa = lshr i32 %aaa, CLZ1
%bbbb = lshr i32 %bbb, CLZ2
%mul = mul i32 %aaaa, %bbbb
%adjust = add i32 C1, C2
%result = shl %mul, %adjust
%mask = and i32 %result, C3

=>

%mask = i32 ((C1*C2)&C3)

Alive is up again.

I tried a bit more; got an alive proof working. Does this look right? (The precondition for C7 is kind of complicated, but I think it matches the computation in this patch.)

https://rise4fun.com/Alive/zCv

Name: mul_to_const
Pre: (C1 >= 0) && (C1 < (1 << C5)) && (C2 >= 0) && (C2 < (1 << C6)) && (C7 == (1 << (umin(countTrailingZeros(C1), C5) + umin(countTrailingZeros(C2), C6) + umin(C5 - umin(countTrailingZeros(C1), C5), C6 - umin(countTrailingZeros(C2), C6)))) - 1)

%aa = shl i8 %a, C5
%bb = shl i8 %b, C6
%aaa = or i8 %aa, C1
%bbb = or i8 %bb, C2
%mul = mul i8 %aaa, %bbb
%mask = and i8 %mul, C7

  =>

%mask = i8 ((C1*C2)&C7)
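
The precondition for C7 in the proof above can be read as a count of known low result bits: the trailing zeros of each constant (capped at how many low bits are known), plus the smaller of the two remaining known widths. A minimal sketch of that reading, assuming plain integers instead of APInt; the function names are hypothetical, and the cap at Width is an addition that mirrors the std::min(..., BitWidth) in the patch rather than anything in the Alive text:

```cpp
#include <algorithm>
#include <cstdint>

// Trailing zero bits of C, treating C == 0 as having Width zeros.
unsigned countTrailingZeros(uint32_t C, unsigned Width) {
  unsigned N = 0;
  while (N < Width && ((C >> N) & 1u) == 0)
    ++N;
  return N;
}

// Number of low result bits determined when the low Known1 bits of one
// operand equal C1 and the low Known2 bits of the other equal C2.
unsigned knownLowBitsOfProduct(uint32_t C1, unsigned Known1, uint32_t C2,
                               unsigned Known2, unsigned Width) {
  unsigned TZ1 = std::min(countTrailingZeros(C1, Width), Known1);
  unsigned TZ2 = std::min(countTrailingZeros(C2, Width), Known2);
  unsigned Rest = std::min(Known1 - TZ1, Known2 - TZ2);
  return std::min(TZ1 + TZ2 + Rest, Width);
}

// Brute-force check over 8-bit operands: every A whose low Known1 bits are C1
// and every B whose low Known2 bits are C2 must agree with C1 * C2 on the
// claimed number of low bits. Assumes Known1, Known2 <= 8 and C1, C2 fit.
bool claimHoldsFor8Bit(uint32_t C1, unsigned Known1, uint32_t C2,
                       unsigned Known2) {
  unsigned N = knownLowBitsOfProduct(C1, Known1, C2, Known2, 8);
  uint32_t Mask = (1u << N) - 1;
  uint32_t M1 = (1u << Known1) - 1, M2 = (1u << Known2) - 1;
  for (uint32_t A = 0; A < 256; ++A) {
    if ((A & M1) != C1)
      continue;
    for (uint32_t B = 0; B < 256; ++B) {
      if ((B & M2) != C2)
        continue;
      if (((A * B) & Mask) != ((C1 * C2) & Mask))
        return false;
    }
  }
  return true;
}
```

For instance, knownLowBitsOfProduct(6, 3, 3, 2, 8) gives 3, matching the "110" times "11" example discussed earlier in the thread.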

I've added Eli's proof to it, not sure exactly how to present it.

Took me a bit to digest it, but I agree that it matches what I'm trying to do in this patch. I can try to describe the definition of C7 in plain text (correlate it to the variables declared in the patch) if you prefer.

Ping.

LGTM.

Sorry about the delay. I ran some tests, and it looks like this doesn't cause any regressions. Granted, it didn't lead to any performance improvement either (I guess we don't really use this information in many cases). But having the information around could be helpful in the future.

Do you have commit access?

This revision is now accepted and ready to land. Nov 6 2017, 3:03 PM

I do not have commit access.

The changes behind this patch were developed internally (at my company) and provide improvements to specific modules. In particular, it allows us to infer the bottom bits of some address calculations (where we can use faster memory operations when the pointer has some specific alignment). I can't really go into details unfortunately. I do hope that at some point in the future, someone else will benefit from this more openly :)

Ping. Nearly there!

ping

I can commit this on behalf of my former colleague if there are no objections. I'll wait a day or so.

Closed by commit rL320269: Infer lowest bits of an integer Multiply when the low bits of the operands are… (authored by sdardis). Dec 9 2017, 3:26 PM

This revision was automatically updated to reflect the committed changes.

Diff 107267

lib/Analysis/ValueTracking.cpp

		if (Op0 == Op1) {
if (!isKnownNonNegative)		if (!isKnownNonNegative)
isKnownNegative = (isKnownNegativeOp1 && isKnownNonNegativeOp0 &&		isKnownNegative = (isKnownNegativeOp1 && isKnownNonNegativeOp0 &&
isKnownNonZero(Op0, Depth, Q)) ||		isKnownNonZero(Op0, Depth, Q)) ||
(isKnownNegativeOp0 && isKnownNonNegativeOp1 &&		(isKnownNegativeOp0 && isKnownNonNegativeOp1 &&
isKnownNonZero(Op1, Depth, Q));		isKnownNonZero(Op1, Depth, Q));
}		}
}		}

// If low bits are zero in either operand, output low known-0 bits.		// The result of the bottom bits of an integer multiply can
// Also compute a conservative estimate for high known-0 bits.		// be inferred by looking at the bottom bits of both operands
// More trickiness is possible, but this is sufficient for the		// and multiplying them together.
// interesting case of alignment computation.		assert(!Known.hasConflict() && !Known2.hasConflict());
unsigned TrailZ = Known.countMinTrailingZeros() +		APInt Bottom0 = Known.One;
Known2.countMinTrailingZeros();		APInt Bottom1 = Known2.One;

		// Compute a conservative estimate for high known-0 bits.
unsigned LeadZ = std::max(Known.countMinLeadingZeros() +		unsigned LeadZ = std::max(Known.countMinLeadingZeros() +
Known2.countMinLeadingZeros(),		Known2.countMinLeadingZeros(),
BitWidth) - BitWidth;		BitWidth) - BitWidth;

TrailZ = std::min(TrailZ, BitWidth);
LeadZ = std::min(LeadZ, BitWidth);		LeadZ = std::min(LeadZ, BitWidth);

		// If there are trailing zeros on either operand, we can infer
		// extra bits of the multiplication result.
		// Find the last bit known on both operands.
		unsigned TrailBitsKnown0 = (Known.Zero | Known.One).countTrailingOnes();
		unsigned TrailBitsKnown1 = (Known2.Zero | Known2.One).countTrailingOnes();
		// How many times we'd be able to divide each argument by 2 (shr by 1).
		unsigned TrailZero0 = Known.countMinTrailingZeros();
		unsigned TrailZero1 = Known2.countMinTrailingZeros();
		// Number of trailing zeros on the multiplication result.
		unsigned TrailZ = TrailZero0 + TrailZero1;
		unsigned ResultBitsKnown = std::min(TrailBitsKnown0 - TrailZero0,
		TrailBitsKnown1 - TrailZero1);
		// We know at least the trailing zeros, plus any other known bits
		// of the operands.
		ResultBitsKnown = std::min(ResultBitsKnown + TrailZ, BitWidth);

		// Finally, these are the known bottom bits of the result.
		APInt BottomKnown = Bottom0.getLoBits(TrailBitsKnown0) *
		Bottom1.getLoBits(TrailBitsKnown1);

Known.resetAll();		Known.resetAll();
Known.Zero.setLowBits(TrailZ);
Known.Zero.setHighBits(LeadZ);		Known.Zero.setHighBits(LeadZ);
		Known.Zero |= (~BottomKnown).getLoBits(ResultBitsKnown);
		Known.One |= BottomKnown.getLoBits(ResultBitsKnown);

// Only make use of no-wrap flags if we failed to compute the sign bit		// Only make use of no-wrap flags if we failed to compute the sign bit
// directly. This matters if the multiplication always overflows, in		// directly. This matters if the multiplication always overflows, in
// which case we prefer to follow the result of the direct computation,		// which case we prefer to follow the result of the direct computation,
// though as the program is invoking undefined behaviour we can choose		// though as the program is invoking undefined behaviour we can choose
// whatever we like here.		// whatever we like here.
if (isKnownNonNegative && !Known.isNegative())		if (isKnownNonNegative && !Known.isNegative())
Known.makeNonNegative();		Known.makeNonNegative();
else if (isKnownNegative && !Known.isNonNegative())		else if (isKnownNegative && !Known.isNonNegative())
Known.makeNegative();		Known.makeNegative();
}		}

void llvm::computeKnownBitsFromRangeMetadata(const MDNode &Ranges,		void llvm::computeKnownBitsFromRangeMetadata(const MDNode &Ranges,
KnownBits &Known) {		KnownBits &Known) {
unsigned BitWidth = Known.getBitWidth();		unsigned BitWidth = Known.getBitWidth();
unsigned NumRanges = Ranges.getNumOperands() / 2;		unsigned NumRanges = Ranges.getNumOperands() / 2;
assert(NumRanges >= 1);		assert(NumRanges >= 1);

unittests/Analysis/ValueTrackingTest.cpp

#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/AsmParser/Parser.h"		#include "llvm/AsmParser/Parser.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/InstIterator.h"		#include "llvm/IR/InstIterator.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/SourceMgr.h"		#include "llvm/Support/SourceMgr.h"
		#include "llvm/Support/KnownBits.h"
#include "gtest/gtest.h"		#include "gtest/gtest.h"

using namespace llvm;		using namespace llvm;

namespace {		namespace {

class MatchSelectPatternTest : public testing::Test {		class MatchSelectPatternTest : public testing::Test {
protected:		protected:
		TEST(ValueTracking, ComputeNumSignBits_PR32045) {

auto *F = M->getFunction("f");		auto *F = M->getFunction("f");
assert(F && "Bad assembly?");		assert(F && "Bad assembly?");

auto *RVal =		auto *RVal =
cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);		cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);
EXPECT_EQ(ComputeNumSignBits(RVal, M->getDataLayout()), 1u);		EXPECT_EQ(ComputeNumSignBits(RVal, M->getDataLayout()), 1u);
}		}

		TEST(ValueTracking, ComputeKnownBits) {
		StringRef Assembly = "define i32 @f(i32 %a, i32 %b) { "
		" %ash = mul i32 %a, 8 "
		" %aad = add i32 %ash, 7 "
		" %aan = and i32 %aad, 4095 "
		" %bsh = shl i32 %b, 4 "
		" %bad = or i32 %bsh, 6 "
		" %ban = and i32 %bad, 4095 "
		" %mul = mul i32 %aan, %ban "
		" ret i32 %mul "
		"} ";

		LLVMContext Context;
		SMDiagnostic Error;
		auto M = parseAssemblyString(Assembly, Error, Context);
		assert(M && "Bad assembly?");

		auto *F = M->getFunction("f");
		assert(F && "Bad assembly?");

		auto *RVal =
		cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);
		auto Known = computeKnownBits(RVal, M->getDataLayout());
		ASSERT_FALSE(Known.hasConflict());
		EXPECT_EQ(Known.One.getZExtValue(), 10u);
		EXPECT_EQ(Known.Zero.getZExtValue(), 4278190085u);
		}

		TEST(ValueTracking, ComputeKnownMulBits) {
		StringRef Assembly = "define i32 @f(i32 %a, i32 %b) { "
		" %aa = shl i32 %a, 5 "
		" %bb = shl i32 %b, 5 "
		" %aaa = or i32 %aa, 24 "
		" %bbb = or i32 %bb, 28 "
		" %mul = mul i32 %aaa, %bbb "
		" ret i32 %mul "
		"} ";

		LLVMContext Context;
		SMDiagnostic Error;
		auto M = parseAssemblyString(Assembly, Error, Context);
		assert(M && "Bad assembly?");

		auto *F = M->getFunction("f");
		assert(F && "Bad assembly?");

		auto *RVal =
		cast<ReturnInst>(F->getEntryBlock().getTerminator())->getOperand(0);
		auto Known = computeKnownBits(RVal, M->getDataLayout());
		ASSERT_FALSE(Known.hasConflict());
		EXPECT_EQ(Known.One.getZExtValue(), 32u);
		EXPECT_EQ(Known.Zero.getZExtValue(), 95u);
		}
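
The expected One and Zero masks in the two new unit tests can be sanity-checked without LLVM by evaluating the same arithmetic on sampled inputs. The sketch below is a standalone assumption-laden check, not part of the patch: the function names are hypothetical, the xorshift generator is just a convenient deterministic source of inputs, and the check only shows that the masks are sound (never contradicted), not that they are the most precise masks possible.

```cpp
#include <cstdint>

// Mirrors the IR in ComputeKnownBits: returns %mul for inputs A and B.
uint32_t knownBitsTestMul(uint32_t A, uint32_t B) {
  uint32_t Aan = (A * 8u + 7u) & 4095u;
  uint32_t Ban = ((B << 4) | 6u) & 4095u;
  return Aan * Ban;
}

// Mirrors the IR in ComputeKnownMulBits: returns %mul for inputs A and B.
uint32_t knownMulBitsTestMul(uint32_t A, uint32_t B) {
  return ((A << 5) | 24u) * ((B << 5) | 28u);
}

// V is consistent with (One, Zero) if every bit in One is set in V and every
// bit in Zero is clear in V.
bool consistent(uint32_t V, uint32_t One, uint32_t Zero) {
  return (V & One) == One && (V & Zero) == 0;
}

// Samples input pairs and checks both tests' expected masks against them.
bool expectedMasksAreSound() {
  uint32_t S = 0x9E3779B9u; // fixed seed so the check is reproducible
  auto Next = [&S]() {
    S ^= S << 13;
    S ^= S >> 17;
    S ^= S << 5;
    return S;
  };
  for (int I = 0; I < 100000; ++I) {
    uint32_t A = Next(), B = Next();
    if (!consistent(knownBitsTestMul(A, B), 10u, 4278190085u))
      return false;
    if (!consistent(knownMulBitsTestMul(A, B), 32u, 95u))
      return false;
  }
  return true;
}
```

In the first test the high byte of Zero (4278190085 is 0xFF000005) comes from the leading-zero estimate: both operands are masked to 12 bits, so the product fits in 24 bits.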

This is an archive of the discontinued LLVM Phabricator instance.

Infer lowest bits of an integer Multiply when the low bits of the operands are known
Closed, Public
