This is an archive of the discontinued LLVM Phabricator instance.

Differential D99424

[BasicAA] Be more careful with modulo ops on VariableGEPIndex.
Closed · Public

Authored by fhahn on Mar 26 2021, 9:58 AM.

Details

Reviewers

asbirlea
jdoerfert
hfinkel
nikic

Commits

rG91fa3565da16: [BasicAA] Be more careful with modulo ops on VariableGEPIndex.

Summary

(V * Scale) % X may not produce the same result for every possible value
of V, e.g. if the multiplication overflows. This means we currently
incorrectly determine NoAlias in some cases.

This patch adjusts the code linearizing GEPs to try to track whether the
modulo semantics are preserved. There might still be GEPs and cases
that we miss, but it should address a few critical cases.
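
To make the overflow concern concrete, here is a minimal sketch in plain C++ (not LLVM code). It mirrors the `mul i64 %idx, 5` / `sub i64 %mul, 1` pattern from gep-modulo.ll using unsigned 64-bit arithmetic, which wraps the same way. The helper names `wrappedOffset` and `demoWrappedOffset` and the chosen index value are assumptions made for illustration; they do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes 5 * Idx - 1 with 64-bit wrap-around,
// mirroring `mul i64 %idx, 5` followed by `sub i64 %mul, 1`.
// Unsigned arithmetic is used because its wrap-around is well defined in C++.
constexpr std::uint64_t wrappedOffset(std::uint64_t Idx) {
  return Idx * 5u - 1u;
}

// Usage example for the sketch.
inline void demoWrappedOffset() {
  // Without wrapping, 5 * V - 1 is congruent to 4 modulo 5, so reasoning
  // purely on "offset % 5" would conclude the offset can never equal 3.
  assert(wrappedOffset(7) % 5 == 4);

  // Illustrative index (an assumption of this sketch): 5 * Idx exceeds 2^64
  // by 4, so the wrapped offset is exactly 3.
  const std::uint64_t Idx = 3689348814741910324ULL;
  assert(wrappedOffset(Idx) == 3);
}
```

With such an index, the variable access lands on the same byte as a constant offset of 3, which is why a NoAlias answer derived from the modulus alone would be wrong.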

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. · Mar 26 2021, 9:58 AM

Herald added subscribers: arphaman, hiraditya. · Mar 26 2021, 9:58 AM

fhahn requested review of this revision. · Mar 26 2021, 9:58 AM

Herald added a project: Restricted Project. · Mar 26 2021, 9:58 AM

fhahn mentioned this in D91027: [BasicAA] Generalize base offset modulus handling. · Mar 26 2021, 9:59 AM

Harbormaster completed remote builds in B95891: Diff 333578. · Mar 26 2021, 10:34 AM

I'm somewhat confused by the implementation approach in this patch. I think there's really two orthogonal cases: The first is if all operations are non-wrapping, in which case we can optimize for arbitrary modulos. The second is that we can always use power of two factors from the GCD, because arithmetic is over a power of two field.

I think handling for these two cases should be split. The first part can be handled via flag, and the second part can be handled by only extracting the power of two part if the flag is not set. I think this should both make the semantics of the flag clear and be more precise.

Rebased on top of latest changes.

In D99424#2653505, @nikic wrote:

I'm somewhat confused by the implementation approach in this patch. I think there's really two orthogonal cases: The first is if all operations are non-wrapping, in which case we can optimize for arbitrary modulos. The second is that we can always use power of two factors from the GCD, because arithmetic is over a power of two field.

I think handling for these two cases should be split. The first part can be handled via flag, and the second part can be handled by only extracting the power of two part if the flag is not set. I think this should both make the semantics of the flag clear and be more precise.

I'm not sure where the logic should be split? I tried to just have a may-wrap flag and then check if Scale is a power-of-2 when the scale is used for the modulo computations. But I think that wouldn't correctly handle cases like (a + x) * 2^y, if the multiply operand gets distributed. Or were you thinking about something else?

nikic added inline comments. · Apr 7 2021, 3:01 PM

llvm/test/Analysis/BasicAA/gep-modulo.ll
135	This is an example of the case I have in mind: This can be NoAlias (https://alive2.llvm.org/ce/z/yYFAAz) even though the mul+sub can wrap. Despite the wrapping, we are still guaranteed that the power-of-2 part of the GCD still holds.

nikic added inline comments. · Apr 7 2021, 3:14 PM

llvm/test/Analysis/BasicAA/gep-modulo.ll
135	Eh, maybe I'm wrong here regarding the general case. I'll have to think about it more carefully.

Harbormaster completed remote builds in B97599: Diff 335931. · Apr 7 2021, 3:17 PM

llvm/test/Analysis/BasicAA/gep-modulo.ll
135	I think it may hold for cases where we compute `% power-of-2`, but not if the operand is not a power-of-2? I'm not too familiar with the code that actually uses the modulo, but if that's the case we may be able to use this? If so, we should be able to do so by adjusting the check to `if (!DecompGEP1.VarIndices[i].PreservesModulo && !Scale.isPowerOf2())`

The impact of this seems quite limited. Below the AA stats for MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto:

aa.NumMayAlias
Program                                        base      basicaa   diff 
 test-suite...urce/Applications/aha/aha.test   271.00    292.00     7.7%
 test-suite...rks/FreeBench/pifft/pifft.test   14990.00  15689.00   4.7%
 test-suite...CFP2006/433.milc/433.milc.test   13793.00  14243.00   3.3%
 test-suite...-dbl/LinearDependence-dbl.test   522.00    525.00     0.6%
 test-suite...T2000/300.twolf/300.twolf.test   45884.00  46038.00   0.3%
 test-suite...TimberWolfMC/timberwolfmc.test   49643.00  49798.00   0.3%
 test-suite...-flt/LinearDependence-flt.test   1216.00   1219.00    0.2%
 test-suite...CFP2000/177.mesa/177.mesa.test   67899.00  68059.00   0.2%
 test-suite.../CINT2006/429.mcf/429.mcf.test   1134.00   1136.00    0.2%
 test-suite.../CINT2000/181.mcf/181.mcf.test   1173.00   1175.00    0.2%
 test-suite...libquantum/462.libquantum.test   3498.00   3502.00    0.1%
 test-suite...ications/JM/lencod/lencod.test   293097.00 293418.00  0.1%
 test-suite...nal/skidmarks10/skidmarks.test   16112.00  16126.00   0.1%
 test-suite...006/453.povray/453.povray.test   120925.00 121011.00  0.1%
 test-suite.../CINT2000/254.gap/254.gap.test   67251.00  67293.00   0.1%
 Geomean difference                                                 0.5%

aa.NumMustAlias
Program                                        base     basicaa  diff 
 test-suite...CFP2006/433.milc/433.milc.test   1932.00  1901.00  -1.6%
 test-suite.../Benchmarks/Bullet/bullet.test   25887.00 25875.00 -0.0%
 test-suite...CFP2000/177.mesa/177.mesa.test   8657.00  8653.00  -0.0%
 test-suite...006/453.povray/453.povray.test   28386.00 28374.00 -0.0%
 Geomean difference                                              -0.4%

aa.NumNoAlias
Program                                        base       basicaa    diff 
 test-suite...urce/Applications/aha/aha.test   4913.00    4832.00    -1.6%
 test-suite...rks/FreeBench/pifft/pifft.test   96670.00   95195.00   -1.5%
 test-suite...CFP2006/433.milc/433.milc.test   96144.00   95308.00   -0.9%
 test-suite...-dbl/LinearDependence-dbl.test   3488.00    3461.00    -0.8%
 test-suite...-flt/LinearDependence-flt.test   8855.00    8828.00    -0.3%
 test-suite...TimberWolfMC/timberwolfmc.test   306639.00  305968.00  -0.2%
 test-suite...T2000/300.twolf/300.twolf.test   458839.00  458169.00  -0.1%
 test-suite...000/183.equake/183.equake.test   75059.00   75009.00   -0.1%
 test-suite...libquantum/462.libquantum.test   7910.00    7906.00    -0.1%
 test-suite...CFP2000/177.mesa/177.mesa.test   607413.00  607138.00  -0.0%
 test-suite...oxyApps-C/miniAMR/miniAMR.test   120107.00  120076.00  -0.0%
 test-suite.../CINT2000/254.gap/254.gap.test   193489.00  193447.00  -0.0%
 test-suite.../CINT2000/181.mcf/181.mcf.test   10422.00   10420.00   -0.0%
 test-suite.../CINT2006/429.mcf/429.mcf.test   12104.00   12102.00   -0.0%
 test-suite...ications/JM/lencod/lencod.test   2077915.00 2077590.00 -0.0%
 Geomean difference                                                  -0.2%

And the size impact (binary changes to 9 out of 237 binaries)

Same hash: 228 (filtered out)
Remaining: 9
Metric: size.__text

Program                                        base     basicaa  diff
 test-suite...arks/mafft/pairlocalalign.test    233639   233927   0.1%
 test-suite...-dbl/LinearDependence-dbl.test    58195    58243    0.1%
 test-suite.../Benchmarks/Bullet/bullet.test    309159   309175   0.0%
 test-suite...006/453.povray/453.povray.test    1152978  1152994  0.0%
 test-suite...CFP2000/177.mesa/177.mesa.test    694753   694753   0.0%
 test-suite.../CINT2006/403.gcc/403.gcc.test    3238353  3238353  0.0%
 test-suite.../CINT2000/176.gcc/176.gcc.test    1429678  1429678  0.0%
 test-suite...CFP2006/433.milc/433.milc.test    119989   119861  -0.1%
 test-suite...-flt/LinearDependence-flt.test    54951    54743   -0.4%

There's no difference with special handling for % power-of-2.

ping :)

Okay, I've finally gotten around to looking into this in more detail. Because thinking about this really fries my brain, I ended up modelling this in alive2 (hopefully correctly...)

https://alive2.llvm.org/ce/z/HYBxGs This is the general case and shows that we only need nsw flags on the arithmetic, rather than both nuw and nsw. This makes sense, as all the arithmetic involved is signed.

https://alive2.llvm.org/ce/z/qXicaB This shows that it is indeed safe to always take the power-of-2 portion of the scale, even if the arithmetic is not nsw.
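
The second result can also be illustrated outside of LLVM. The sketch below is plain C++ with hypothetical helper names (`powerOfTwoPart`, `demoPowerOfTwoPart`), and it uses `uint64_t` as a stand-in for `APInt`. Since 64-bit arithmetic is arithmetic modulo 2^64, a wrapped product `V * Scale` stays divisible by the largest power of two dividing `Scale`, while divisibility by the full scale can be lost.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: largest power of two that divides Scale, i.e. its
// lowest set bit. Returns 0 for Scale == 0.
constexpr std::uint64_t powerOfTwoPart(std::uint64_t Scale) {
  return Scale & (~Scale + 1);
}

// Usage example for the sketch.
inline void demoPowerOfTwoPart() {
  const std::uint64_t Scale = 12;                 // 12 == 4 * 3
  const std::uint64_t V = 3689348814741910324ULL; // large enough to wrap
  const std::uint64_t Product = V * Scale;        // wraps modulo 2^64

  assert(powerOfTwoPart(Scale) == 4);
  // The power-of-two factor survives the wrap-around ...
  assert(Product % powerOfTwoPart(Scale) == 0);
  // ... but divisibility by the full scale does not.
  assert(Product % Scale != 0);
}
```

The value of `V` is only an illustrative choice; any index that makes the multiplication wrap can show the same effect for a scale with an odd factor.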

In D99424#2709964, @nikic wrote:

Okay, I've finally gotten around to looking into this in more detail. Because thinking about this really fries my brain, I ended up modelling this in alive2 (hopefully correctly...)

https://alive2.llvm.org/ce/z/HYBxGs This is the general case and shows that we only need nsw flags on the arithmetic, rather than both nuw and nsw. This makes sense, as all the arithmetic involved is signed.

https://alive2.llvm.org/ce/z/qXicaB This shows that it is indeed safe to always take the power-of-2 portion of the scale, even if the arithmetic is not nsw.

Thanks for checking with Alive. I've updated the code now to just track the NSW flag for LinearExpression/VariableGEPIndex and added a power-of-2 check for the scale.

Harbormaster completed remote builds in B108588: Diff 351122. · Jun 10 2021, 4:45 AM

ping :)

nikic added inline comments. · Jun 27 2021, 12:50 PM

llvm/lib/Analysis/BasicAliasAnalysis.cpp
313	Probably doesn't really matter, but this should be `true`. Note that the expression here is `Val * 0 + Const`.
377–378	This should be `E.IsNSW &= NSW`, to keep it as a plain NSW flag without further semantics.
1156	What I had in mind here is something like this:

    APInt Scale = DecompGEP1.VarIndices[i].Scale;
    if (!DecompGEP1.VarIndices[i].IsNSW)
      Scale = APInt::getOneBitSet(Scale.getBitWidth(),
                                  Scale.countTrailingZeros());

And then continue with the previous implementation. This preserves a power of two factor, even if the whole scale is not power of two.
1720	This doesn't appear to be tested.
llvm/test/Analysis/BasicAA/gep-alias.ll
164 ↗	(On Diff #351122)	With the suggested change to the power of two handling, this change shouldn't be needed anymore.
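
The flag handling requested for BasicAliasAnalysis.cpp in the inline comments above (`E.IsNSW &= NSW`) can be sketched in isolation. This is a simplified model, not the code from the patch: `LinExprSketch`, `foldMul`, `foldSub` and `demoFold` are hypothetical names, and `uint64_t` replaces `APInt`.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for LinearExpression: V * Scale + Offset, plus a flag
// that stays true only while every folded operation was nsw.
struct LinExprSketch {
  std::uint64_t Scale = 1;
  std::uint64_t Offset = 0;
  bool IsNSW = true;
};

// Fold `E * C`; NSW says whether the multiply carried the nsw flag.
inline LinExprSketch foldMul(LinExprSketch E, std::uint64_t C, bool NSW) {
  E.Scale *= C;
  E.Offset *= C;
  E.IsNSW &= NSW;
  return E;
}

// Fold `E - C`; NSW says whether the subtraction carried the nsw flag.
inline LinExprSketch foldSub(LinExprSketch E, std::uint64_t C, bool NSW) {
  E.Offset -= C;
  E.IsNSW &= NSW;
  return E;
}

// Usage example: (V * 5) - 1, where only the subtraction is nsw.
inline void demoFold() {
  LinExprSketch E =
      foldSub(foldMul(LinExprSketch{}, 5, /*NSW=*/false), 1, /*NSW=*/true);
  assert(E.Scale == 5);
  assert(E.Offset == static_cast<std::uint64_t>(-1)); // -1, two's complement
  assert(!E.IsNSW); // a single non-nsw operation clears the flag
}
```

Keeping the flag as a plain conjunction means it carries no extra meaning beyond "every operation folded so far was nsw", which is the property the later GCD logic relies on.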

fhahn mentioned this in rGef78325c1033: [BasicAA] Add test to cover GetIndexDifference change in D99424. · Jun 28 2021, 8:03 AM

Addressed comments, thanks!

fhahn added inline comments. · Jun 28 2021, 8:47 AM

llvm/lib/Analysis/BasicAliasAnalysis.cpp
313	Ah right, `Val` is multiplied by 0 here. Updated.
377–378	Yes, I also removed the outdated comment.
1156	Updated, thanks!
1720	I added a test in ef78325c1033 and also removed the outdated comment.
llvm/test/Analysis/BasicAA/gep-alias.ll
164 ↗	(On Diff #351122)	Correct, I removed the changes, thanks!

LGTM, thanks!

This revision is now accepted and ready to land. · Jun 28 2021, 8:54 AM

Harbormaster completed remote builds in B111302: Diff 354914. · Jun 28 2021, 9:28 AM

Closed by commit rG91fa3565da16: [BasicAA] Be more careful with modulo ops on VariableGEPIndex. (authored by fhahn). · Jun 29 2021, 1:24 AM

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rG91fa3565da16: [BasicAA] Be more careful with modulo ops on VariableGEPIndex..

I’ve bisected a miscompilation to this commit, FYI. Looking into a repro now.

The problem appears with https://martin.st/temp/pixlet-preproc.c, compiled with “clang -target aarch64-w64-mingw32 -c -O3 pixlet-preproc.c”. I haven’t pinpointed exactly what changes and whether that’s wrong or if the source itself relies on something undefined though, or if there’s some strict aliasing violation.

In D99424#2849458, @mstorsjo wrote:

The problem appears with https://martin.st/temp/pixlet-preproc.c, compiled with “clang -target aarch64-w64-mingw32 -c -O3 pixlet-preproc.c”. I haven’t pinpointed exactly what changes and whether that’s wrong or if the source itself relies on something undefined though, or if there’s some strict aliasing violation.

Thanks for the heads-up!

I suspect that the `Scale = APInt::getOneBitSet(...)` assignment may be at fault here. With the change below, which makes BasicAA more conservative for that case, I don't get any differences with your reproducer. Could you check whether that fixes the end-to-end miscompilation?

diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index da489b8d457f..81823b7ea778 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1149,8 +1149,7 @@ AliasResult BasicAAResult::aliasGEP(
     for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) {
       APInt Scale = DecompGEP1.VarIndices[i].Scale;
       if (!DecompGEP1.VarIndices[i].IsNSW)
-        Scale = APInt::getOneBitSet(Scale.getBitWidth(),
-                                    Scale.countTrailingZeros());
+        GCD = APInt(Scale.getBitWidth(), 1);

       if (i == 0)
         GCD = Scale.abs();

In D99424#2849581, @fhahn wrote:

In D99424#2849458, @mstorsjo wrote:

The problem appears with https://martin.st/temp/pixlet-preproc.c, compiled with “clang -target aarch64-w64-mingw32 -c -O3 pixlet-preproc.c”. I haven’t pinpointed exactly what changes and whether that’s wrong or if the source itself relies on something undefined though, or if there’s some strict aliasing violation.

If you manage to reduce it further down, that would be very helpful. Unfortunately it looks like the only difference comes from the MachineScheduler, so it is a bit tricky to extract a nice test case.

I've also bisected a miscompile to this patch. I don't know at all what is happening but the suggested fix

-        Scale = APInt::getOneBitSet(Scale.getBitWidth(),
-                                    Scale.countTrailingZeros());
+        GCD = APInt(Scale.getBitWidth(), 1);

makes the miscompile go away.

In D99424#2849817, @uabelho wrote:

I've also bisected a miscompile to this patch. I don't know at all what is happening but the suggested fix

Any chance you could share the reproducer?

In D99424#2849830, @fhahn wrote:

In D99424#2849817, @uabelho wrote:

I've also bisected a miscompile to this patch. I don't know at all what is happening but the suggested fix

Any chance you could share the reproducer?

I'll see. It's for my out-of-tree target so it takes a bit of fiddling to extract something that could be useful to you.

In D99424#2849581, @fhahn wrote:

In D99424#2849458, @mstorsjo wrote:

The problem appears with https://martin.st/temp/pixlet-preproc.c, compiled with “clang -target aarch64-w64-mingw32 -c -O3 pixlet-preproc.c”. I haven’t pinpointed exactly what changes and whether that’s wrong or if the source itself relies on something undefined though, or if there’s some strict aliasing violation.

Thanks for the heads-up!

I suspect that the `Scale = APInt::getOneBitSet(...)` assignment may be at fault here. With the change below, which makes BasicAA more conservative for that case, I don't get any differences with your reproducer. Could you check whether that fixes the end-to-end miscompilation?

Thanks! That does indeed fix my issue, and it also fixes another failed testcase that I hadn’t inspected closer yet.

In D99424#2849596, @fhahn wrote:

In D99424#2849581, @fhahn wrote:

In D99424#2849458, @mstorsjo wrote:

The problem appears with https://martin.st/temp/pixlet-preproc.c, compiled with “clang -target aarch64-w64-mingw32 -c -O3 pixlet-preproc.c”. I haven’t pinpointed exactly what changes and whether that’s wrong or if the source itself relies on something undefined though, or if there’s some strict aliasing violation.

If you manage to reduce it further down, that would be very helpful. Unfortunately it looks like the only difference comes from the MachineScheduler, so it is a bit tricky to extract a nice test case.

FWIW, if you’ve got access to aarch64 linux, I would expect that you can reproduce the same issue there. (I haven’t verified myself yet.) To try it out at runtime, you should be able to do this:

git clone git://source.ffmpeg.org/ffmpeg
cd ffmpeg
./configure --cc=clang --samples=/path/to/empty/dir
make fate-rsync # sync data samples for running tests, into the path specified above
make -j$(nproc) fate-pixlet-rgb fate-twinvq

The miscompilation happens in object files libavcodec/pixlet.o and libavcodec/twinvq.o.

(Sorry I'm not of more assistance in narrowing down the actual change itself at the moment.)

Ok, after another look, I think the problem was that setting Scale as in this patch may turn Scale from a negative into a positive number. That in turn means AllNonNegative may be set incorrectly; the sign check should be independent of the scale used for the GCD.

I pushed a test for that scenario (NoAlias returned based on all indices being non-negative) in f4ea6531e677, and a fix that avoids nuking the sign of the original Scale in e6d22d0174e0. This is likely only temporary, as relying on the sign of the scale is also problematic when the operations wrap.

@nikic it would be great if you could take a quick look at the commits. I'll probably post a follow-up in the next few days for review.

@mstorsjo @uabelho could you check if e6d22d0174e0 fixes the issues you are seeing? I checked with @mstorsjo's example and with the new patch we get the same results as with D99424 reverted.
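
The sign issue described in this comment can be sketched as follows. This is not the code from e6d22d0174e0; it is a minimal plain C++ model with hypothetical names (`ScaleSketch`, `splitScale`, `demoSplitScale`) that keeps the original signed scale for sign-based reasoning and derives a separate non-negative value that only feeds the GCD.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair: the original signed scale for sign-based reasoning and
// a separate non-negative value used only for the GCD computation.
struct ScaleSketch {
  std::int64_t ForSign;
  std::uint64_t ForGCD;
};

inline ScaleSketch splitScale(std::int64_t Scale, bool IsNSW) {
  // Magnitude of Scale, computed in unsigned arithmetic so that the most
  // negative value does not overflow.
  std::uint64_t Mag = static_cast<std::uint64_t>(Scale);
  if (Scale < 0)
    Mag = 0 - Mag;
  // If some operation may wrap, keep only the power-of-two factor.
  if (!IsNSW)
    Mag &= (~Mag + 1);
  return {Scale, Mag};
}

// Usage example for the sketch.
inline void demoSplitScale() {
  ScaleSketch S = splitScale(-12, /*IsNSW=*/false);
  assert(S.ForSign < 0); // sign information is preserved
  assert(S.ForGCD == 4); // only the power-of-two part feeds the GCD
}
```

Overwriting the single scale variable with the power-of-two part would instead turn -12 into 4, and a later non-negativity check on it would see a positive value.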

In D99424#2851015, @fhahn wrote:

@mstorsjo @uabelho could you check if e6d22d0174e0 fixes the issues you are seeing? I checked with @mstorsjo's example and with the new patch we get the same results as with D99424 reverted.

Thanks! It does seem like that commit fixes the issues I was observing in both those two testcases.

In D99424#2851015, @fhahn wrote:

Ok, after another look, I think the problem was that setting Scale as in this patch may turn Scale from a negative into a positive number. That in turn means AllNonNegative may be set incorrectly; the sign check should be independent of the scale used for the GCD.

I pushed a test for that scenario (NoAlias returned based on all indices being non-negative) in f4ea6531e677, and a fix that avoids nuking the sign of the original Scale in e6d22d0174e0. This is likely only temporary, as relying on the sign of the scale is also problematic when the operations wrap.

@nikic it would be great if you could take a quick look at the commits. I'll probably post a follow-up in the next few days for review.

@mstorsjo @uabelho could you check if e6d22d0174e0 fixes the issues you are seeing? I checked with @mstorsjo's example and with the new patch we get the same results as with D99424 reverted.

It helps in my case. The miscompile goes away with e6d22d0174e0. Thanks!

Thanks @mstorsjo & @uabelho !

Revision Contents

Path                                               Size
llvm/include/llvm/Analysis/BasicAliasAnalysis.h    3 lines
llvm/lib/Analysis/BasicAliasAnalysis.cpp           59 lines
llvm/test/Analysis/BasicAA/gep-modulo.ll           8 lines

Diff 355138

llvm/include/llvm/Analysis/BasicAliasAnalysis.h

[unchanged lines omitted]	struct VariableGEPIndex {
unsigned ZExtBits;		unsigned ZExtBits;
unsigned SExtBits;		unsigned SExtBits;

APInt Scale;		APInt Scale;

// Context instruction to use when querying information about this index.		// Context instruction to use when querying information about this index.
const Instruction *CxtI;		const Instruction *CxtI;

		/// True if all operations in this expression are NSW.
		bool IsNSW;

void dump() const {		void dump() const {
print(dbgs());		print(dbgs());
dbgs() << "\n";		dbgs() << "\n";
}		}
void print(raw_ostream &OS) const {		void print(raw_ostream &OS) const {
OS << "(V=" << V->getName()		OS << "(V=" << V->getName()
<< ", zextbits=" << ZExtBits		<< ", zextbits=" << ZExtBits
<< ", sextbits=" << SExtBits		<< ", sextbits=" << SExtBits
[unchanged lines omitted]

llvm/lib/Analysis/BasicAliasAnalysis.cpp

[unchanged lines omitted]
};		};

/// Represents zext(sext(V)) * Scale + Offset.		/// Represents zext(sext(V)) * Scale + Offset.
struct LinearExpression {		struct LinearExpression {
ExtendedValue Val;		ExtendedValue Val;
APInt Scale;		APInt Scale;
APInt Offset;		APInt Offset;

		/// True if all operations in this expression are NSW.
		bool IsNSW;

LinearExpression(const ExtendedValue &Val, const APInt &Scale,		LinearExpression(const ExtendedValue &Val, const APInt &Scale,
const APInt &Offset)		const APInt &Offset, bool IsNSW)
: Val(Val), Scale(Scale), Offset(Offset) {}		: Val(Val), Scale(Scale), Offset(Offset), IsNSW(IsNSW) {}

LinearExpression(const ExtendedValue &Val) : Val(Val) {		LinearExpression(const ExtendedValue &Val) : Val(Val), IsNSW(true) {
unsigned BitWidth = Val.getBitWidth();		unsigned BitWidth = Val.getBitWidth();
Scale = APInt(BitWidth, 1);		Scale = APInt(BitWidth, 1);
Offset = APInt(BitWidth, 0);		Offset = APInt(BitWidth, 0);
}		}
};		};
}		}

/// Analyzes the specified value as a linear expression: "A*V + B", where A and		/// Analyzes the specified value as a linear expression: "A*V + B", where A and
/// B are constant integers.		/// B are constant integers.
static LinearExpression GetLinearExpression(		static LinearExpression GetLinearExpression(
const ExtendedValue &Val, const DataLayout &DL, unsigned Depth,		const ExtendedValue &Val, const DataLayout &DL, unsigned Depth,
AssumptionCache *AC, DominatorTree *DT) {		AssumptionCache *AC, DominatorTree *DT) {
// Limit our recursion depth.		// Limit our recursion depth.
if (Depth == 6)		if (Depth == 6)
return Val;		return Val;

if (const ConstantInt *Const = dyn_cast<ConstantInt>(Val.V))		if (const ConstantInt *Const = dyn_cast<ConstantInt>(Val.V))
return LinearExpression(Val, APInt(Val.getBitWidth(), 0),		return LinearExpression(Val, APInt(Val.getBitWidth(), 0),
Val.evaluateWith(Const->getValue()));		Val.evaluateWith(Const->getValue()), true);

if (const BinaryOperator *BOp = dyn_cast<BinaryOperator>(Val.V)) {		if (const BinaryOperator *BOp = dyn_cast<BinaryOperator>(Val.V)) {
if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {		if (ConstantInt *RHSC = dyn_cast<ConstantInt>(BOp->getOperand(1))) {
APInt RHS = Val.evaluateWith(RHSC->getValue());		APInt RHS = Val.evaluateWith(RHSC->getValue());
// The only non-OBO case we deal with is or, and only limited to the		// The only non-OBO case we deal with is or, and only limited to the
// case where it is both nuw and nsw.		// case where it is both nuw and nsw.
bool NUW = true, NSW = true;		bool NUW = true, NSW = true;
if (isa<OverflowingBinaryOperator>(BOp)) {		if (isa<OverflowingBinaryOperator>(BOp)) {
NUW &= BOp->hasNoUnsignedWrap();		NUW &= BOp->hasNoUnsignedWrap();
NSW &= BOp->hasNoSignedWrap();		NSW &= BOp->hasNoSignedWrap();
}		}
if (!Val.canDistributeOver(NUW, NSW))		if (!Val.canDistributeOver(NUW, NSW))
return Val;		return Val;

		LinearExpression E(Val);
switch (BOp->getOpcode()) {		switch (BOp->getOpcode()) {
default:		default:
// We don't understand this instruction, so we can't decompose it any		// We don't understand this instruction, so we can't decompose it any
// further.		// further.
return Val;		return Val;
case Instruction::Or:		case Instruction::Or:
// X|C == X+C if all the bits in C are unset in X. Otherwise we can't		// X|C == X+C if all the bits in C are unset in X. Otherwise we can't
// analyze it.		// analyze it.
if (!MaskedValueIsZero(BOp->getOperand(0), RHSC->getValue(), DL, 0, AC,		if (!MaskedValueIsZero(BOp->getOperand(0), RHSC->getValue(), DL, 0, AC,
BOp, DT))		BOp, DT))
return Val;		return Val;

LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case Instruction::Add: {		case Instruction::Add: {
LinearExpression E = GetLinearExpression(		E = GetLinearExpression(Val.withValue(BOp->getOperand(0)), DL,
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);		Depth + 1, AC, DT);
E.Offset += RHS;		E.Offset += RHS;
return E;		E.IsNSW &= NSW;
		break;
}		}
case Instruction::Sub: {		case Instruction::Sub: {
LinearExpression E = GetLinearExpression(		E = GetLinearExpression(Val.withValue(BOp->getOperand(0)), DL,
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);		Depth + 1, AC, DT);
E.Offset -= RHS;		E.Offset -= RHS;
return E;		E.IsNSW &= NSW;
		break;
}		}
case Instruction::Mul: {		case Instruction::Mul: {
LinearExpression E = GetLinearExpression(		E = GetLinearExpression(Val.withValue(BOp->getOperand(0)), DL,
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);		Depth + 1, AC, DT);
E.Offset *= RHS;		E.Offset *= RHS;
E.Scale *= RHS;		E.Scale *= RHS;
return E;		E.IsNSW &= NSW;
		break;
}		}
case Instruction::Shl:		case Instruction::Shl:
// We're trying to linearize an expression of the kind:		// We're trying to linearize an expression of the kind:
// shl i8 -128, 36		// shl i8 -128, 36
// where the shift count exceeds the bitwidth of the type.		// where the shift count exceeds the bitwidth of the type.
// We can't decompose this further (the expression would return		// We can't decompose this further (the expression would return
// a poison value).		// a poison value).
if (RHS.getLimitedValue() > Val.getBitWidth())		if (RHS.getLimitedValue() > Val.getBitWidth())
return Val;		return Val;

LinearExpression E = GetLinearExpression(		E = GetLinearExpression(Val.withValue(BOp->getOperand(0)), DL,
Val.withValue(BOp->getOperand(0)), DL, Depth + 1, AC, DT);		Depth + 1, AC, DT);
E.Offset <<= RHS.getLimitedValue();		E.Offset <<= RHS.getLimitedValue();
E.Scale <<= RHS.getLimitedValue();		E.Scale <<= RHS.getLimitedValue();
return E;		E.IsNSW &= NSW;
		break;
}		}
		return E;
}		}
}		}

if (isa<ZExtInst>(Val.V))		if (isa<ZExtInst>(Val.V))
return GetLinearExpression(		return GetLinearExpression(
Val.withZExtOfValue(cast<CastInst>(Val.V)->getOperand(0)),		Val.withZExtOfValue(cast<CastInst>(Val.V)->getOperand(0)),
DL, Depth + 1, AC, DT);		DL, Depth + 1, AC, DT);

[unchanged lines omitted]	for (User::const_op_iterator I = GEPOp->op_begin() + 1, E = GEPOp->op_end();
}		}
}		}

// Make sure that we have a scale that makes sense for this target's		// Make sure that we have a scale that makes sense for this target's
// pointer size.		// pointer size.
Scale = adjustToPointerSize(Scale, PointerSize);		Scale = adjustToPointerSize(Scale, PointerSize);

if (!!Scale) {		if (!!Scale) {
VariableGEPIndex Entry = {LE.Val.V, LE.Val.ZExtBits, LE.Val.SExtBits,		VariableGEPIndex Entry = {
Scale, CxtI};		LE.Val.V, LE.Val.ZExtBits, LE.Val.SExtBits, Scale, CxtI, LE.IsNSW};
Decomposed.VarIndices.push_back(Entry);		Decomposed.VarIndices.push_back(Entry);
}		}
}		}

// Take care of wrap-arounds		// Take care of wrap-arounds
if (GepHasConstantOffset)		if (GepHasConstantOffset)
Decomposed.Offset = adjustToPointerSize(Decomposed.Offset, PointerSize);		Decomposed.Offset = adjustToPointerSize(Decomposed.Offset, PointerSize);

[unchanged lines omitted]	if (DecompGEP1.Offset != 0 && DecompGEP1.VarIndices.empty()) {
}		}
}		}

if (!DecompGEP1.VarIndices.empty()) {		if (!DecompGEP1.VarIndices.empty()) {
APInt GCD;		APInt GCD;
bool AllNonNegative = DecompGEP1.Offset.isNonNegative();		bool AllNonNegative = DecompGEP1.Offset.isNonNegative();
bool AllNonPositive = DecompGEP1.Offset.isNonPositive();		bool AllNonPositive = DecompGEP1.Offset.isNonPositive();
for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) {		for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) {
const APInt &Scale = DecompGEP1.VarIndices[i].Scale;		APInt Scale = DecompGEP1.VarIndices[i].Scale;
		if (!DecompGEP1.VarIndices[i].IsNSW)
		Scale = APInt::getOneBitSet(Scale.getBitWidth(),
		Scale.countTrailingZeros());

if (i == 0)		if (i == 0)
GCD = Scale.abs();		GCD = Scale.abs();
else		else
GCD = APIntOps::GreatestCommonDivisor(GCD, Scale.abs());		GCD = APIntOps::GreatestCommonDivisor(GCD, Scale.abs());

if (AllNonNegative || AllNonPositive) {		if (AllNonNegative || AllNonPositive) {
// If the Value could change between cycles, then any reasoning about		// If the Value could change between cycles, then any reasoning about
// the Value this cycle may not hold in the next cycle. We'll just		// the Value this cycle may not hold in the next cycle. We'll just
// give up if we can't determine conditions that hold for every cycle:		// give up if we can't determine conditions that hold for every cycle:
const Value *V = DecompGEP1.VarIndices[i].V;		const Value *V = DecompGEP1.VarIndices[i].V;
[unchanged lines omitted]	for (unsigned i = 0, e = Src.size(); i != e; ++i) {
// than a few variable indexes.		// than a few variable indexes.
for (unsigned j = 0, e = Dest.size(); j != e; ++j) {		for (unsigned j = 0, e = Dest.size(); j != e; ++j) {
if (!isValueEqualInPotentialCycles(Dest[j].V, V) ||		if (!isValueEqualInPotentialCycles(Dest[j].V, V) ||
Dest[j].ZExtBits != ZExtBits || Dest[j].SExtBits != SExtBits)		Dest[j].ZExtBits != ZExtBits || Dest[j].SExtBits != SExtBits)
continue;		continue;

// If we found it, subtract off Scale V's from the entry in Dest. If it		// If we found it, subtract off Scale V's from the entry in Dest. If it
// goes to zero, remove the entry.		// goes to zero, remove the entry.
if (Dest[j].Scale != Scale)		if (Dest[j].Scale != Scale) {
Dest[j].Scale -= Scale;		Dest[j].Scale -= Scale;
else		Dest[j].IsNSW = false;
		} else
Dest.erase(Dest.begin() + j);		Dest.erase(Dest.begin() + j);
Scale = 0;		Scale = 0;
break;		break;
}		}

// If we didn't consume this entry, add it to the end of the Dest list.		// If we didn't consume this entry, add it to the end of the Dest list.
if (!!Scale) {		if (!!Scale) {
VariableGEPIndex Entry = {V, ZExtBits, SExtBits, -Scale, Src[i].CxtI};		VariableGEPIndex Entry = {V, ZExtBits, SExtBits,
		-Scale, Src[i].CxtI, Src[i].IsNSW};
Dest.push_back(Entry);		Dest.push_back(Entry);
}		}
}		}
}		}

bool BasicAAResult::constantOffsetHeuristic(		bool BasicAAResult::constantOffsetHeuristic(
const SmallVectorImpl<VariableGEPIndex> &VarIndices,		const SmallVectorImpl<VariableGEPIndex> &VarIndices,
LocationSize MaybeV1Size, LocationSize MaybeV2Size, const APInt &BaseOffset,		LocationSize MaybeV1Size, LocationSize MaybeV2Size, const APInt &BaseOffset,
(110 lines not shown)

llvm/test/Analysis/BasicAA/gep-modulo.ll

(64 lines not shown)
}		}

; %gep.idx and %gep.3 must-alias if %mul overflows		; %gep.idx and %gep.3 must-alias if %mul overflows
; (e.g. %idx == 3689348814741910324).		; (e.g. %idx == 3689348814741910324).
define void @may_overflow_mul_sub_i64([16 x i8]* %ptr, i64 %idx) {		define void @may_overflow_mul_sub_i64([16 x i8]* %ptr, i64 %idx) {
; CHECK-LABEL: Function: may_overflow_mul_sub_i64: 3 pointers, 0 call sites		; CHECK-LABEL: Function: may_overflow_mul_sub_i64: 3 pointers, 0 call sites
; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.idx		; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.idx
; CHECK-NEXT: PartialAlias (off 3): [16 x i8]* %ptr, i8* %gep.3		; CHECK-NEXT: PartialAlias (off 3): [16 x i8]* %ptr, i8* %gep.3
; CHECK-NEXT: NoAlias: i8* %gep.3, i8* %gep.idx		; CHECK-NEXT: MayAlias: i8* %gep.3, i8* %gep.idx
;		;
%mul = mul i64 %idx, 5		%mul = mul i64 %idx, 5
%sub = sub i64 %mul, 1		%sub = sub i64 %mul, 1
%gep.idx = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub		%gep.idx = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub
store i8 0, i8* %gep.idx, align 1		store i8 0, i8* %gep.idx, align 1
%gep.3 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 3		%gep.3 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 3
store i8 1, i8* %gep.3, align 1		store i8 1, i8* %gep.3, align 1
ret void		ret void
(28 lines not shown)
;		;
store i8 1, i8* %gep.3, align 1		store i8 1, i8* %gep.3, align 1
ret void		ret void
}		}

define void @only_nuw_mul_sub_i64([16 x i8]* %ptr, i64 %idx) {		define void @only_nuw_mul_sub_i64([16 x i8]* %ptr, i64 %idx) {
; CHECK-LABEL: Function: only_nuw_mul_sub_i64: 3 pointers, 0 call sites		; CHECK-LABEL: Function: only_nuw_mul_sub_i64: 3 pointers, 0 call sites
; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.idx		; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.idx
; CHECK-NEXT: PartialAlias (off 3): [16 x i8]* %ptr, i8* %gep.3		; CHECK-NEXT: PartialAlias (off 3): [16 x i8]* %ptr, i8* %gep.3
; CHECK-NEXT: NoAlias: i8* %gep.3, i8* %gep.idx		; CHECK-NEXT: MayAlias: i8* %gep.3, i8* %gep.idx
;		;
%mul = mul nuw i64 %idx, 5		%mul = mul nuw i64 %idx, 5
%sub = sub nuw i64 %mul, 1		%sub = sub nuw i64 %mul, 1
%gep.idx = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub		%gep.idx = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub
store i8 0, i8* %gep.idx, align 1		store i8 0, i8* %gep.idx, align 1
%gep.3 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 3		%gep.3 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 3
store i8 1, i8* %gep.3, align 1		store i8 1, i8* %gep.3, align 1
ret void		ret void
}		}
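
To see why `NoAlias` was unsound for `may_overflow_mul_sub_i64`, the i64 arithmetic can be modelled with wrapping unsigned integers. The sketch below is an illustration only: `gepIndexTimes5` and `WrappingIdx` are names made up for this example, and the constant is an assumption computed here (4 times the multiplicative inverse of 5 modulo 2^64), not a value taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models `%sub = sub i64 (mul i64 %idx, 5), 1` with wrapping i64 semantics.
// uint64_t is used because unsigned overflow is well defined in C++.
uint64_t gepIndexTimes5(uint64_t Idx) { return Idx * 5 - 1; }

// Hypothetical witness: 5 * 0x3333333333333334 == 2^64 + 4, which wraps to 4.
constexpr uint64_t WrappingIdx = 0x3333333333333334ULL; // 3689348814741910324
```

Without wrapping, `5 * idx - 1` is always congruent to 4 modulo 5 and so can never equal 3. Once the multiplication wraps, `gepIndexTimes5(WrappingIdx)` evaluates to 3, which is exactly the offset used by `%gep.3`.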

		; Even though the mul and sub may overflow, %gep.idx and %gep.3 cannot alias
		; because we multiply by a power-of-2.
define void @may_overflow_mul_pow2_sub_i64([16 x i8]* %ptr, i64 %idx) {		define void @may_overflow_mul_pow2_sub_i64([16 x i8]* %ptr, i64 %idx) {
; CHECK-LABEL: Function: may_overflow_mul_pow2_sub_i64: 3 pointers, 0 call sites		; CHECK-LABEL: Function: may_overflow_mul_pow2_sub_i64: 3 pointers, 0 call sites
; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.idx		; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.idx
; CHECK-NEXT: PartialAlias (off 3): [16 x i8]* %ptr, i8* %gep.3		; CHECK-NEXT: PartialAlias (off 3): [16 x i8]* %ptr, i8* %gep.3
; CHECK-NEXT: NoAlias: i8* %gep.3, i8* %gep.idx		; CHECK-NEXT: NoAlias: i8* %gep.3, i8* %gep.idx
		nikic: This is an example of the case I have in mind: This can be NoAlias (https://alive2.llvm.org/ce/z/yYFAAz) even though the mul+sub can wrap. Despite the wrapping, we are still guaranteed that the power-of-2 part of the GCD still holds.
		nikic: Eh, maybe I'm wrong here regarding the general case. I'll have to think about it more carefully.
		fhahn (author): I think it may hold for cases where we compute `% power-of-2`, but not if the operand is not a power-of-2? I'm not too familiar with the code that actually uses the modulo, but if that's the case we may be able to use this? If so, we should be able to do so by adjusting the check to `if (!DecompGEP1.VarIndices[i].PreservesModulo && !Scale.isPowerOf2())`
;		;
%mul = mul i64 %idx, 8		%mul = mul i64 %idx, 8
%sub = sub i64 %mul, 1		%sub = sub i64 %mul, 1
%gep.idx = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub		%gep.idx = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub
store i8 0, i8* %gep.idx, align 1		store i8 0, i8* %gep.idx, align 1
%gep.3 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 3		%gep.3 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 3
store i8 1, i8* %gep.3, align 1		store i8 1, i8* %gep.3, align 1
ret void		ret void
(112 lines not shown)
}		}
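
The power-of-2 argument discussed for `may_overflow_mul_pow2_sub_i64` can be checked with a small model. This is a sketch under the assumption that wrapping i64 arithmetic behaves like `uint64_t` arithmetic; `gepIndexTimes8` and `residueMod8` are hypothetical helper names. Because 8 divides 2^64, reducing modulo 2^64 cannot change the residue modulo 8.

```cpp
#include <cassert>
#include <cstdint>

// Models `%sub = sub i64 (mul i64 %idx, 8), 1` with wrapping i64 semantics.
uint64_t gepIndexTimes8(uint64_t Idx) { return Idx * 8 - 1; }

// Residue of the index modulo 8. Wrapping modulo 2^64 leaves it unchanged
// because 8 divides 2^64.
uint64_t residueMod8(uint64_t Idx) { return gepIndexTimes8(Idx) % 8; }
```

For every `idx`, including values where the multiplication or the subtraction wraps, the residue is 7, whereas the constant index 3 has residue 3. For one-byte accesses the two indices therefore cannot coincide, which matches the `NoAlias` result kept by the test.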

; %mul.1 and %sub.2 are equal, if %idx = 9, because %mul.1 overflows. Hence		; %mul.1 and %sub.2 are equal, if %idx = 9, because %mul.1 overflows. Hence
; %gep.mul.1 and %gep.sub.2 may alias.		; %gep.mul.1 and %gep.sub.2 may alias.
define void @may_overflow_pointer_diff([16 x i8]* %ptr, i64 %idx) {		define void @may_overflow_pointer_diff([16 x i8]* %ptr, i64 %idx) {
; CHECK-LABEL: Function: may_overflow_pointer_diff: 3 pointers, 0 call sites		; CHECK-LABEL: Function: may_overflow_pointer_diff: 3 pointers, 0 call sites
; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.mul.1		; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.mul.1
; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.sub.2		; CHECK-NEXT: MayAlias: [16 x i8]* %ptr, i8* %gep.sub.2
; CHECK-NEXT: NoAlias: i8* %gep.mul.1, i8* %gep.sub.2		; CHECK-NEXT: MayAlias: i8* %gep.mul.1, i8* %gep.sub.2
;		;
%mul.1 = mul i64 %idx, 6148914691236517207		%mul.1 = mul i64 %idx, 6148914691236517207
%gep.mul.1 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %mul.1		%gep.mul.1 = getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %mul.1
store i8 1, i8* %gep.mul.1, align 1		store i8 1, i8* %gep.mul.1, align 1
%mul.2 = mul nsw i64 %idx, 3		%mul.2 = mul nsw i64 %idx, 3
%sub.2 = sub nsw i64 %mul.2, 12		%sub.2 = sub nsw i64 %mul.2, 12
%gep.sub.2= getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub.2		%gep.sub.2= getelementptr [16 x i8], [16 x i8]* %ptr, i32 0, i64 %sub.2
store i8 0, i8* %gep.sub.2, align 1		store i8 0, i8* %gep.sub.2, align 1

ret void		ret void
}		}
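
The claim in the comment of `may_overflow_pointer_diff` can be reproduced with wrapping unsigned arithmetic. The helpers `mul1` and `sub2` are names invented for this sketch; they model `%mul.1` and `%sub.2` under the assumption that i64 wrapping matches `uint64_t` wrapping.

```cpp
#include <cassert>
#include <cstdint>

// Models `%mul.1 = mul i64 %idx, 6148914691236517207` (may wrap).
// Note that 3 * 6148914691236517207 == 2^64 + 5.
uint64_t mul1(uint64_t Idx) { return Idx * 6148914691236517207ULL; }

// Models `%sub.2 = sub nsw i64 (mul nsw i64 %idx, 3), 12`.
uint64_t sub2(uint64_t Idx) { return Idx * 3 - 12; }
```

For `idx = 9` the first product is `3 * (2^64 + 5)`, which wraps to 15, and `9 * 3 - 12` is also 15, so both GEPs address the same byte and `MayAlias` is the sound answer.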

Revision Contents

Diff 355138

llvm/include/llvm/Analysis/BasicAliasAnalysis.h

llvm/lib/Analysis/BasicAliasAnalysis.cpp

llvm/test/Analysis/BasicAA/gep-modulo.ll
