This is an archive of the discontinued LLVM Phabricator instance.


Differential D75238

[DAGCombine] Fix alias analysis for unaligned accesses
ClosedPublic

Authored by dmgreen on Feb 27 2020, 4:00 AM.

Details

Reviewers

efriedma
niravd
courbet
kristof.beyls
ostannard
yabinc
nickdesaulniers

Commits

rG1de10705594c: [DAGCombine] Fix alias analysis for unaligned accesses

Summary

The alias analysis in DAG Combine looks at the BaseAlign, the Offset and the Size of two accesses, and determines whether they are known to access different parts of memory from the fact that they sit at different offsets inside the same "alignment window". It does not seem to account for accesses whose offsets are not a multiple of the size, which may overflow from one alignment window into the next.

For example, in the test case we have a 19-byte memset that is split into a 16-byte NEON store and an unaligned 4-byte store with a 15-byte offset. This 15-byte offset (with a base align of 8) wraps around into the next alignment window. When compared to an access at a 16-byte offset (with the same 4-byte size and 8-byte base align), the two accesses are said not to alias.

I've fixed this here by just ensuring that the offsets are a multiple of the size, ensuring that they don't overlap by wrapping. Fixes PR45035, which was exposed by the UseAA changes in the arm backend.
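
The wrap-around described in the summary can be reproduced with a small standalone sketch. The `Access` struct and both function names below are hypothetical stand-ins rather than LLVM code, and `windowSaysNoAlias` only approximates the condition in `DAGCombiner::isAlias`; the `RequireSizeMultiple` flag is an illustrative switch for comparing the check with and without the offset-multiple requirement.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the values isAlias() reads from each memory
// operand; the field names are illustrative, not LLVM API.
struct Access {
  int64_t Offset;    // byte offset from the common base
  int64_t Size;      // access size in bytes
  int64_t BaseAlign; // alignment of the base, in bytes
};

// Ground truth: do the byte ranges [Offset, Offset + Size) intersect?
bool rangesOverlap(const Access &A, const Access &B) {
  return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}

// Alignment-window test. Returns true when it claims "no alias".
// With RequireSizeMultiple == false this approximates the check as it
// stood before the patch; with true it adds the offset-multiple guard.
bool windowSaysNoAlias(const Access &A, const Access &B,
                       bool RequireSizeMultiple) {
  assert(A.Size > 0 && B.Size > 0 && A.BaseAlign > 0 && B.BaseAlign > 0);
  if (A.BaseAlign != B.BaseAlign || A.Offset == B.Offset ||
      A.Size != B.Size || A.BaseAlign <= A.Size)
    return false; // preconditions not met, so nothing is claimed
  if (RequireSizeMultiple &&
      (A.Offset % A.Size != 0 || B.Offset % B.Size != 0))
    return false; // an access could straddle two windows
  int64_t OffAlign0 = A.Offset % A.BaseAlign;
  int64_t OffAlign1 = B.Offset % B.BaseAlign;
  return OffAlign0 + A.Size <= OffAlign1 || OffAlign1 + B.Size <= OffAlign0;
}
```

With `Access A{15, 4, 8}` and `Access B{16, 4, 8}`, the byte ranges [15, 19) and [16, 20) overlap, yet the unguarded window test reports no alias because 15 % 8 = 7 and 16 % 8 = 0. Requiring each offset to be a multiple of the size rejects the pair, since 15 % 4 != 0.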

Event Timeline

dmgreen created this revision. Feb 27 2020, 4:00 AM

Herald added a project: Restricted Project. Feb 27 2020, 4:00 AM

Herald added a subscriber: hiraditya.

nickdesaulniers added a subscriber: nickdesaulniers. Feb 27 2020, 11:04 AM

manojgupta added subscribers: manojgupta, denik. Feb 27 2020, 11:09 AM

nickdesaulniers marked an inline comment as done. Feb 27 2020, 11:15 AM

nickdesaulniers added inline comments.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21135	Probably should hoist `MUC0.NumBytes` and `MUC1.NumBytes` above the if into their own locals, like the offsets and alignments; that would help make this condition and block more readable. Also, was mod'ing `SrcValOffset1` by `MUC0.NumBytes` intentional? Should it be `MUC1.NumBytes`?
llvm/test/CodeGen/ARM/memset-align.ll
3	What is going on with this target triple? LOL

srhines marked an inline comment as done. Feb 27 2020, 11:40 AM

srhines added a subscriber: srhines.

srhines added inline comments.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21135	`MUC0.NumBytes == MUC1.NumBytes` from earlier in this conditional.
llvm/test/CodeGen/ARM/memset-align.ll
3	The "10000" changes the Android target API level to 10000 (which basically means everything is available). I think it is safe to remove it for the purpose of this test, but @yabinc 's example used this in the command line because of how it works on the platform build for Android.

Thanks for the fix! I have verified that it fixes tests broken by using AA.

nickdesaulniers added inline comments. Feb 27 2020, 12:27 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21135	Still...

In D75238#1896424, @yabinc wrote:

Thanks for the fix! I have verified that it fixes tests broken by using AA.

Brilliant, thanks!

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21135	Yeah, it was intentional from the fact that they are equal. I can change it though, it sounds cleaner that way.
llvm/test/CodeGen/ARM/memset-align.ll
3	Righteo. I've changed it to thumbv8-unknown-linux-android. Let me know if something else would be better.

Added variables and new triple.

nickdesaulniers added inline comments. Feb 27 2020, 3:31 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21131	The hunk you modify on L21156-L21157 highlights what looks like maybe a pre-existing incorrect usage of `Optional::operator*`; there's no check that `MUC0.NumBytes` `hasValue` before making use of the value (which may not exist). In particular, I'm concerned what the value of `-1` might do in those expressions. Or you could do: `auto& Size0 = MUC0.NumBytes; auto& Size1 = MUC1.NumBytes; if ( ... && Size0.hasValue() && Size1.hasValue() && ...) foo(*Size0, *Size1)`. Though I'm not sure whether the lower hunk should have a guard as well vs use a value of `0`. Using `-1` for a possible value of the `Optional<int64_t>` might be problematic, though I assume that value for the number of bytes in a "memory use characteristic" is unlikely. (I wish the "LLVM Programmers Manual" had a section on `llvm::Optional`). So maybe it's ok to keep as is, but then I really think the lower hunk should also have some kind of guard.

dmgreen marked an inline comment as done. Feb 28 2020, 5:07 AM

dmgreen added inline comments.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
21131	Good point. I chose -1 as it's the same as MemoryLocation::UnknownSize (and only didn't use that directly here as `(int64_t)MemoryLocation::UnknownSize` makes things difficult to follow with how verbose it is). That Size is not passed straight through to AA->alias below though, so it might cause weirdness if there was ever a case where MUC0.MMO had a value but Size didn't. I've changed it to how you suggest, with the references, and added the checks below too, just in case.
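
The guard discussed in this exchange can be sketched in isolation. This is a minimal illustration that uses C++17 `std::optional` as a stand-in for `llvm::Optional` (so `has_value()` replaces `hasValue()`), and the function name `sameKnownSize` is hypothetical; it is not the code that landed.

```cpp
#include <cstdint>
#include <optional>

// std::optional stands in for llvm::Optional; the function name is
// hypothetical and exists only for this illustration.
bool sameKnownSize(const std::optional<int64_t> &Size0,
                   const std::optional<int64_t> &Size1) {
  // Confirm that both operands hold a value before dereferencing either,
  // so no sentinel such as -1 is needed to represent "unknown".
  return Size0.has_value() && Size1.has_value() && *Size0 == *Size1;
}
```

Because `&&` short-circuits, `*Size0` and `*Size1` are only evaluated once both `has_value()` checks have passed, which is the property the reviewer asks for in both hunks.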

dmgreen updated this revision to Diff 247231. Feb 28 2020, 5:09 AM

Right on, sorry my suggestion turned into a bit of a yak shave, but I think the added guard may have fixed another pre-existing bug. Assuming that change didn't trip up any existing tests, LGTM. Thanks for the quick fix!

This revision is now accepted and ready to land. Feb 28 2020, 10:23 AM

Thanks. The tests seem fine and the benchmarks/codesize tests I ran didn't show any change.

I presume this is one for the branch. @hans if this doesn't have any trouble with the buildbots, do you mind?

Closed by commit rG1de10705594c: [DAGCombine] Fix alias analysis for unaligned accesses (authored by dmgreen). Feb 28 2020, 10:51 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                             Size
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp    16 lines
llvm/test/CodeGen/ARM/memset-align.ll            39 lines

Diff 247109

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

 bool DAGCombiner::isAlias(SDNode *Op0, SDNode *Op1) const {
 ...
   // but it only matters for memory nodes other than load /store.
   if ((MUC0.MMO->isInvariant() && MUC1.MMO->isStore()) ||
       (MUC1.MMO->isInvariant() && MUC0.MMO->isStore()))
     return false;

   // If we know required SrcValue1 and SrcValue2 have relatively large
   // alignment compared to the size and offset of the access, we may be able
   // to prove they do not alias. This check is conservative for now to catch
-  // cases created by splitting vector types.
+  // cases created by splitting vector types, it only works when the offsets are
+  // multiples of the size of the data.
   int64_t SrcValOffset0 = MUC0.MMO->getOffset();
   int64_t SrcValOffset1 = MUC1.MMO->getOffset();
   unsigned OrigAlignment0 = MUC0.MMO->getBaseAlignment();
   unsigned OrigAlignment1 = MUC1.MMO->getBaseAlignment();
+  int64_t Size0 = MUC0.NumBytes.hasValue() ? *MUC0.NumBytes : -1;
+  int64_t Size1 = MUC1.NumBytes.hasValue() ? *MUC1.NumBytes : -1;
   if (OrigAlignment0 == OrigAlignment1 && SrcValOffset0 != SrcValOffset1 &&
-      MUC0.NumBytes.hasValue() && MUC1.NumBytes.hasValue() &&
-      MUC0.NumBytes == MUC1.NumBytes && OrigAlignment0 > *MUC0.NumBytes) {
+      Size0 != -1 && Size1 != -1 && Size0 == Size1 && OrigAlignment0 > Size0 &&
+      SrcValOffset0 % Size0 == 0 && SrcValOffset1 % Size1 == 0) {
     int64_t OffAlign0 = SrcValOffset0 % OrigAlignment0;
     int64_t OffAlign1 = SrcValOffset1 % OrigAlignment1;

     // There is no overlap between these relatively aligned accesses of
     // similar size. Return no alias.
-    if ((OffAlign0 + *MUC0.NumBytes) <= OffAlign1 ||
-        (OffAlign1 + *MUC1.NumBytes) <= OffAlign0)
+    if ((OffAlign0 + Size0) <= OffAlign1 || (OffAlign1 + Size1) <= OffAlign0)
       return false;
   }

   bool UseAA = CombinerGlobalAA.getNumOccurrences() > 0
                    ? CombinerGlobalAA
                    : DAG.getSubtarget().useAA();
 #ifndef NDEBUG
   if (CombinerAAOnlyFunc.getNumOccurrences() &&
       CombinerAAOnlyFunc != DAG.getMachineFunction().getName())
     UseAA = false;
 #endif

   if (UseAA && AA && MUC0.MMO->getValue() && MUC1.MMO->getValue()) {
     // Use alias analysis information.
     int64_t MinOffset = std::min(SrcValOffset0, SrcValOffset1);
-    int64_t Overlap0 = *MUC0.NumBytes + SrcValOffset0 - MinOffset;
-    int64_t Overlap1 = *MUC1.NumBytes + SrcValOffset1 - MinOffset;
+    int64_t Overlap0 = Size0 + SrcValOffset0 - MinOffset;
+    int64_t Overlap1 = Size1 + SrcValOffset1 - MinOffset;
     AliasResult AAResult = AA->alias(
         MemoryLocation(MUC0.MMO->getValue(), Overlap0,
                        UseTBAA ? MUC0.MMO->getAAInfo() : AAMDNodes()),
         MemoryLocation(MUC1.MMO->getValue(), Overlap1,
                        UseTBAA ? MUC1.MMO->getAAInfo() : AAMDNodes()));
     if (AAResult == NoAlias)
       return false;
   }
 ...
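
The `Overlap0`/`Overlap1` arithmetic in the last hunk of `DAGCombiner.cpp` can be tried out on its own. The helper name `overlapSizes` below is hypothetical and the sketch only mirrors the three arithmetic lines from the diff; how `AA->alias` interprets the resulting sizes is not modelled here.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>

// Hypothetical helper mirroring the MinOffset/Overlap0/Overlap1 lines in the
// diff: each size is extended by that access's distance from the smaller of
// the two offsets.
std::pair<int64_t, int64_t> overlapSizes(int64_t Off0, int64_t Size0,
                                         int64_t Off1, int64_t Size1) {
  int64_t MinOffset = std::min(Off0, Off1);
  return {Size0 + Off0 - MinOffset, Size1 + Off1 - MinOffset};
}
```

For the two 4-byte accesses at offsets 15 and 16 from the test case, `MinOffset` is 15, so the sizes become 4 and 5. If a size were the `-1` sentinel instead, the same arithmetic would produce a value that no longer means "unknown", which appears to be the kind of weirdness raised in the review.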

llvm/test/CodeGen/ARM/memset-align.ll

This file was added.

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=thumbv8-unknown-linux-android -o - | FileCheck %s

%struct.af = type <{ i64, i64, i8, i8, i8, [5 x i8] }>

define void @test() {
; CHECK-LABEL: test:
; CHECK: @ %bb.0: @ %entry
; CHECK-NEXT: .save {r7, lr}
; CHECK-NEXT: push {r7, lr}
; CHECK-NEXT: .pad #24
; CHECK-NEXT: sub sp, #24
; CHECK-NEXT: mov r0, sp
; CHECK-NEXT: mov.w r1, #-1
; CHECK-NEXT: vmov.i32 q8, #0x0
; CHECK-NEXT: movs r2, #15
; CHECK-NEXT: mov r3, r0
; CHECK-NEXT: strd r1, r1, [sp, #8]
; CHECK-NEXT: strd r1, r1, [sp]
; CHECK-NEXT: str r1, [sp, #16]
; CHECK-NEXT: vst1.64 {d16, d17}, [r3], r2
; CHECK-NEXT: movs r2, #0
; CHECK-NEXT: str r2, [r3]
; CHECK-NEXT: str r1, [sp, #20]
; CHECK-NEXT: bl callee
; CHECK-NEXT: add sp, #24
; CHECK-NEXT: pop {r7, pc}
entry:
  %a = alloca %struct.af, align 8
  %0 = bitcast %struct.af* %a to i8*
  %1 = bitcast %struct.af* %a to i8*
  call void @llvm.memset.p0i8.i64(i8* align 8 %1, i8 -1, i64 24, i1 false)
  call void @llvm.memset.p0i8.i64(i8* align 8 %0, i8 0, i64 19, i1 false)
  call void @callee(%struct.af* %a)
  ret void
}

declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i1 immarg)
declare void @callee(%struct.af*) local_unnamed_addr #1