This is an archive of the discontinued LLVM Phabricator instance.

[GVN] Improve alias analysis for parallel accesses
Abandoned · Public

Authored by alban.bridonneau on Aug 18 2021, 2:15 AM.

Details

Summary

This patch helps to eliminate more loads in loops with the omp simd pragma,
as part of the GVN pass.
This pragma indicates groups of loads/stores that we know are not
aliasing. Through better alias analysis, we can more accurately find
a memory access that provides the same data as the loads we want to
eliminate.

Diff Detail

Unit Tests: Failed

Event Timeline

alban.bridonneau requested review of this revision. · Aug 18 2021, 2:15 AM
Herald added a project: Restricted Project. · View Herald Transcript · Aug 18 2021, 2:15 AM

The build error seems unrelated to this patch. There is another build next to mine, for a different patch, with the same unit test failing.

Based on the current definition of llvm.loop.parallel_accesses it's not clear that it can be used to draw conclusions about aliasing. That metadata only provides guarantees about the lack of loop-carried dependencies. The loads/stores could still alias if they have loop-independent (i.e. intra-iteration) dependencies.
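To illustrate in IR terms (a hypothetical sketch, not taken from the patch): both accesses below belong to the access group that the loop declares parallel, yet nothing prevents %p and %q from holding the same address within a single iteration.

body:
  %v = load i32, i32* %p, align 4, !llvm.access.group !0
  store i32 %v, i32* %q, align 4, !llvm.access.group !0
  br i1 %done, label %exit, label %body, !llvm.loop !1

!0 = distinct !{}                            ; access group
!1 = distinct !{!1, !2}                      ; loop metadata of the annotated loop
!2 = !{!"llvm.loop.parallel_accesses", !0}   ; no loop-carried dependencies among group !0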

llvm/include/llvm/Analysis/MemoryDependenceAnalysis.h
467

belongToSameLoopParallelAccessGroup might be a more appropriate name.
Is the BB parameter really needed? If so, a comment should be added to describe what it is.

Thanks for the review. I started looking at the metadata definition and previous messages from the mailing list. It seems you're right, and the metadata doesn't give the information I thought it did.
I'd love to get some confirmation if someone knows for sure that this is an abuse of this particular metadata.
I'll come back to the IR of the original test case and see if there is some other information I should have used.

Meinersbur requested changes to this revision. · Aug 18 2021, 9:04 AM

llvm.loop.parallel_accesses is only relative to the loop that it is attached to. The information is not useful unless the loop under consideration is also passed in. That is, an access may be "parallel" in one loop but not in another. For instance:

for (int i = 0; i < n; ++i) {
  for (int j = 0; j < n; ++j) {
    A[j] = ...; // DepInst
    use(A[j]); // QueryInst
  }
}

Clearly, DepInst and QueryInst do alias every time, yet the j-loop can be executed in parallel, so llvm.loop.parallel_accesses can be attached to it. However, the i-loop cannot be executed in parallel.
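For reference, a rough sketch of how the metadata for that nest could be attached (identifiers are illustrative, not from the patch): the parallel_accesses entry sits only on the j-loop's !llvm.loop node, so nothing follows for queries made with respect to the i-loop.

j.body:
  store double %v, double* %a.j, align 8, !llvm.access.group !3   ; A[j] = ... (DepInst)
  %u = load double, double* %a.j, align 8, !llvm.access.group !3  ; use(A[j]) (QueryInst)
  br i1 %j.done, label %i.latch, label %j.body, !llvm.loop !0

i.latch:
  br i1 %i.done, label %exit, label %i.header, !llvm.loop !2

!0 = distinct !{!0, !1}                      ; j-loop: accesses in group !3 are parallel
!1 = !{!"llvm.loop.parallel_accesses", !3}
!2 = distinct !{!2}                          ; i-loop: no parallel_accesses entry
!3 = distinct !{}                            ; access group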

This revision now requires changes to proceed. · Aug 18 2021, 9:04 AM

Could I have some more information with regards to the proposed changes?

  • With regards to having the correct loop, I am guessing what we want is to use the loop directly related to the basic block of the query instruction. I am envisioning something like
BasicBlock *BB1 = QueryInst->getParent();
Loop *L = Loops.getLoopFor(BB1); // Loops being the LoopInfo analysis result

In which case, we can remove the BB argument to the function created in this patch, and we can also ensure that both the QueryInst and the DepInst are from the same basic block and the same loop. Does that match what you had in mind?

  • With regards to using this metadata for better alias analysis, does that mean the particular case in this patch is alright with you?

In which case, we can remove the BB argument to the function created in this patch, and we can also ensure that both the QueryInst and the DepInst are from the same basic block and the same loop. Does that match what you had in mind?

I think any use of llvm.loop.parallel_accesses that does not take into account which loop it is attached to cannot be correct. A simple mitigation is to test whether the access is parallel in all loops that contain it. Still, there are irreducible control flow and intra-iteration dependencies, as in your test case.

  • With regards to using this metadata for better alias analysis, does that mean the particular case in this patch is alright with you?

It would help if the test mentioned what it is trying to achieve. A pseudocode equivalent would also be helpful.

I think the test is incorrect. %gepout may alias with %gep1 in the same iteration when %out == %in. From the llvm.loop.parallel_accesses metadata one could at most infer that %gep1 does not alias with %gepout from other iterations of the omp.inner.for.body loop. Even that is probably not valid to infer. Strictly, the metadata only states that the memory accesses are not in the way of parallelizing/vectorizing the loop. It may well be that the store is a trivial store (it always writes the value the memory already contains, e.g. because one of %0/%1 is always 0[*]) and hence aliasing still might have occurred.

[*] It is controversial whether hardware has well-defined behavior for such writes.
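To make the concern concrete, here is a hypothetical reduction of that situation (not the actual test): if %out happens to equal %in, the store and the second load touch the same address in the same iteration, and the parallel annotation says nothing about that.

omp.inner.for.body:
  %gep1 = getelementptr inbounds i32, i32* %in, i64 %iv
  %gepout = getelementptr inbounds i32, i32* %out, i64 %iv
  %x1 = load i32, i32* %gep1, align 4, !llvm.access.group !0
  store i32 %x1, i32* %gepout, align 4, !llvm.access.group !0
  %x2 = load i32, i32* %gep1, align 4, !llvm.access.group !0   ; GVN would like to reuse %x1 here

!0 = distinct !{}   ; access group declared parallel on the loop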

Thank you for the clarifications!
I see now that the llvm.loop.parallel_accesses metadata can't be used in the way that we wanted. We have also gone back to the original IR and found that there is a more appropriate way to tackle our issue, so I'll close this review. I'll also add a bit more information about our case below, in case you're interested.

Our case is basically the one shown in the unit test:

%x1 = load i32, i32* @x    ; first load of @x
store i32 %v, i32* @y      ; intervening store (%v is some value)
%x2 = load i32, i32* @x    ; candidate for elimination

We should be able to eliminate the second load, but we can only do that if we know it does not alias with the store.
The issue we tried to solve is that the load is eliminated when the loop is written plainly, but when we use a simd pragma and annotate the loop as parallel, the load is no longer eliminated. So the pragma that we use to get better performance actually has the opposite effect.

We're going to work next on the Loop Vectorizer. When the loop is annotated parallel, the vectorizer doesn't create any runtime checks and just vectorizes. Without the parallel annotation, we create runtime memory checks, version the loop, and the Loop Vectorizer puts new metadata on the memory accesses (noalias/alias.scope). This is what we want to use in order to do more load elimination. Even though no checks are needed for vectorization itself, we might still want some level of runtime checking in order to further optimize the vectorized loop.
That's the basic idea anyway; we'll need to work on it to figure out exactly what to do.
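As a hedged illustration of the metadata we would then rely on (scope names here are invented, not the exact form the vectorizer emits): once the versioned loop carries scoped-noalias information, the store is known not to touch the scope the loads belong to, and GVN can forward %x1 to %x2.

%x1 = load i32, i32* @x, align 4, !alias.scope !0
store i32 %v, i32* @y, align 4, !noalias !0          ; %v is whatever value gets stored
%x2 = load i32, i32* @x, align 4, !alias.scope !0    ; can now be replaced by %x1

!0 = !{!1}                                   ; scope list containing scope !1
!1 = distinct !{!1, !2}                      ; alias scope for the @x accesses
!2 = distinct !{!2, !"versioning domain"}    ; scope domain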

alban.bridonneau abandoned this revision. · Aug 23 2021, 1:08 AM

Thanks @alban.bridonneau for the explanation.