Details

Reviewers

fhahn
nikic
mkazantsev

Commits

rG220f6e5271f2: [SimplifyCFG] Ignore ephemeral values when counting insts for threading

Summary

Ignore ephemeral values (only feeding llvm.assume intrinsics) when
computing the instruction count to decide if a block is small enough for
threading. This is similar to the handling of these values in the
InlineCost computation. These instructions will eventually be removed
and shouldn't count against code size (similar to the existing ignoring
of phis).
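
To make the notion of "ephemeral" concrete, here is a minimal standalone sketch, not the LLVM implementation. `ToyInst`, `addOperand`, `collectEphemeral`, and `countForThreading` are hypothetical names invented for illustration. The assumption is that a value is ephemeral when it is an assume, or when it is speculatable and every one of its users is already ephemeral.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM class).
struct ToyInst {
  bool IsAssume = false;     // models a call to llvm.assume
  bool IsPhi = false;        // models a PHI node
  bool Speculatable = true;  // models "safe to speculatively execute"
  std::vector<ToyInst *> Operands;
  std::vector<ToyInst *> Users;
};

// Record that User reads Op (hypothetical helper).
inline void addOperand(ToyInst &User, ToyInst &Op) {
  User.Operands.push_back(&Op);
  Op.Users.push_back(&User);
}

// Upfront collection: seed with the assumes, then walk operands. An operand
// becomes ephemeral once all of its users are ephemeral.
inline std::unordered_set<const ToyInst *>
collectEphemeral(const std::vector<ToyInst *> &Block) {
  std::unordered_set<const ToyInst *> Eph;
  std::vector<const ToyInst *> Worklist;
  for (const ToyInst *I : Block) {
    if (!I->IsAssume)
      continue;
    Eph.insert(I);
    Worklist.insert(Worklist.end(), I->Operands.begin(), I->Operands.end());
  }
  while (!Worklist.empty()) {
    const ToyInst *V = Worklist.back();
    Worklist.pop_back();
    if (Eph.count(V) || !V->Speculatable)
      continue;
    bool AllUsersEph = true;
    for (const ToyInst *U : V->Users)
      if (!Eph.count(U)) {
        AllUsersEph = false;
        break;
      }
    if (!AllUsersEph)
      continue; // V is pushed again if another user becomes ephemeral later.
    Eph.insert(V);
    Worklist.insert(Worklist.end(), V->Operands.begin(), V->Operands.end());
  }
  return Eph;
}

// Size used for the "small enough to thread" decision in this toy model:
// PHIs and ephemeral values are not counted.
inline int countForThreading(const std::vector<ToyInst *> &Block) {
  std::unordered_set<const ToyInst *> Eph = collectEphemeral(Block);
  int Size = 0;
  for (const ToyInst *I : Block)
    if (!I->IsPhi && !Eph.count(I))
      ++Size;
  return Size;
}
```

For a block shaped like bitcast, type test, assume, store, branch, only the store and the branch count in this model. If the store also used the bitcast, the bitcast would stop being ephemeral and would count again.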

Without this change, when enabling -fwhole-program-vtables, which causes
type test / assume sequences to be inserted by clang, we can get
different threading decisions. In particular, when building with
instrumentation FDO it can affect the optimization decisions before FDO
matching, leading to some mismatches.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tejohnson created this revision. Apr 28 2021, 3:27 PM

Herald added subscribers: wenlei, hiraditya. Apr 28 2021, 3:27 PM

tejohnson requested review of this revision. Apr 28 2021, 3:27 PM

Herald added a project: Restricted Project. Apr 28 2021, 3:27 PM

tejohnson added inline comments. Apr 28 2021, 3:29 PM

llvm/test/Transforms/SimplifyCFG/unprofitable-pr.ll
3	The change to the max small block size is needed because the below test cases all include an llvm.assume sequence, which is now ignored.

Harbormaster completed remote builds in B101509: Diff 341331. Apr 28 2021, 4:57 PM

Unless I'm missing something, this has a cache invalidation issue and will likely lead to non-deterministic builds. Say you have a dead block with an assume, which gets added to ephemeral values. Then that block and the instructions in it are removed. Now EphValues contains dangling pointers. Then new instructions get allocated and reuse the same memory. Now EphValues claims that values are ephemeral that aren't ephemeral.
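
The hazard described here can be modelled without LLVM. The following is a sketch under stated assumptions: `ToyInst`, `ToyPool`, and `staleCacheMisclassifies` are hypothetical names, and the fixed-slot pool is a deliberately simplified stand-in for an allocator that reuses freed memory. It shows a long-lived set keyed by address reporting a brand-new instruction as ephemeral after the original one was erased.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy instruction (hypothetical, not an LLVM class).
struct ToyInst {
  bool IsAssume = false;
};

// Fixed-slot pool that hands back the most recently freed slot first,
// modelling address reuse by a real allocator.
class ToyPool {
  std::array<ToyInst, 8> Slots{};
  std::vector<std::size_t> Free;

public:
  ToyPool() {
    for (std::size_t I = Slots.size(); I-- > 0;)
      Free.push_back(I);
  }
  ToyInst *create(bool IsAssume) {
    assert(!Free.empty() && "toy pool exhausted");
    std::size_t Idx = Free.back();
    Free.pop_back();
    Slots[Idx].IsAssume = IsAssume;
    return &Slots[Idx];
  }
  void erase(ToyInst *I) {
    Free.push_back(static_cast<std::size_t>(I - Slots.data()));
  }
};

// Returns true when a cache that outlives the erased instruction wrongly
// classifies a new, unrelated instruction as ephemeral.
inline bool staleCacheMisclassifies() {
  ToyPool Pool;
  std::unordered_set<const ToyInst *> EphValues; // long-lived cache

  ToyInst *Assume = Pool.create(/*IsAssume=*/true);
  EphValues.insert(Assume);

  Pool.erase(Assume); // the block is deleted, the cache is not updated

  ToyInst *Store = Pool.create(/*IsAssume=*/false); // reuses the same slot
  return Store == Assume && EphValues.count(Store) != 0;
}
```

Building the set per query, scoped to a single block, sidesteps the problem in this model because no entry outlives the instructions it refers to.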

In D101494#2726074, @nikic wrote:

Unless I'm missing something, this has a cache invalidation issue and will likely lead to non-deterministic builds. Say you have a dead block with an assume, which gets added to ephemeral values. Then that block and the instructions in it are removed. Now EphValues contains dangling pointers. Then new instructions get allocated and reuse the same memory. Now EphValues claims that values are ephemeral that aren't ephemeral.

It's basically the same problem we have/had with LoopHeaders, yep.

In D101494#2726080, @lebedev.ri wrote:

In D101494#2726074, @nikic wrote:

Unless I'm missing something, this has a cache invalidation issue and will likely lead to non-deterministic builds. Say you have a dead block with an assume, which gets added to ephemeral values. Then that block and the instructions in it are removed. Now EphValues contains dangling pointers. Then new instructions get allocated and reuse the same memory. Now EphValues claims that values are ephemeral that aren't ephemeral.

It's basically the same problem we have/had with LoopHeaders, yep.

Ah, thanks for pointing that out. I see there is a loop scoped collectEphemeralValues, is that how it was solved for LoopHeaders? I should probably just add a BB scoped collectEphemeralValues and use it in BlockIsSimpleEnoughToThreadThrough. That would avoid this issue and also avoid the need to pass it around.

Sorry, I know nothing about ephemeral values, so I don't think I can give any useful feedback here.

Compute ephemeral values on per-BB basis when needed

Harbormaster completed remote builds in B102024: Diff 342045. Apr 30 2021, 4:56 PM

nikic added inline comments. May 3 2021, 2:31 PM

llvm/lib/Analysis/CodeMetrics.cpp
131	So there's two potential compile-time problems here: The first is that while this starts off with assume inside the block, it may scan uses outside the block as well. Additionally it scans over all assumes in order to find those in the block. I'm not sure whether this is really important in practice, but having seen pathological ephemeral value collection during inlining, I'm being a bit cautious here. I think for the particular case it is used for here, it might make the most sense to not collect ephemeral values upfront, instead compute them on the fly. We can do this by changing the direction of the instruction walk from end to start, and then doing something like:
	SmallPtrSet<const Value *, 32> EphValues;
	auto IsEphemeral = [&](const Value *V) {
	  if (isa<AssumeInst>(V))
	    return true;
	  return isSafeToSpeculativelyExecute(V) &&
	         all_of(V->users(), [&](const User *U) { return EphValues.count(U); });
	};
	for (Instruction &I : reverse()) {
	  if (IsEphemeral(&I))
	    EphValues.insert(&I);
	  // Otherwise normal code.
	}
	This also has the advantage that we don't need to compute any ephemeral values past the ten or so instructions we look at.
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2449	This comment is confusing, in that ephemeral values will not be deleted while threading (unlike phis). They will only be deleted during codegen.
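
The on-the-fly collection suggested in the CodeMetrics.cpp comment above can be tried outside LLVM. This is a minimal sketch, not the committed code: `ToyInst` and `countNonEphemeral` are hypothetical names, and the `Speculatable` flag stands in for `isSafeToSpeculativelyExecute`. The block is walked from the last instruction to the first, so by the time a value is visited, all of its in-block users have already been classified.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM class).
struct ToyInst {
  bool IsAssume = false;     // models a call to llvm.assume
  bool IsPhi = false;        // models a PHI node
  bool Speculatable = true;  // stand-in for isSafeToSpeculativelyExecute
  std::vector<const ToyInst *> Users;
};

// Counts non-PHI, non-ephemeral instructions while discovering ephemeral
// values during a single bottom-up walk of the block.
inline int countNonEphemeral(const std::vector<const ToyInst *> &Block) {
  std::unordered_set<const ToyInst *> EphValues;
  auto IsEphemeral = [&](const ToyInst *V) {
    if (V->IsAssume)
      return true;
    // Note: a speculatable value with no users also passes this test.
    return V->Speculatable &&
           std::all_of(V->Users.begin(), V->Users.end(),
                       [&](const ToyInst *U) { return EphValues.count(U) != 0; });
  };

  int Size = 0;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    const ToyInst *I = *It;
    if (IsEphemeral(I)) {
      EphValues.insert(I);
      continue;
    }
    if (!I->IsPhi)
      ++Size;
  }
  return Size;
}
```

Because the set is local to the call and only ever holds instructions from the block being scanned, this shape involves no AssumptionCache scan and no state that can go stale between queries.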

tejohnson marked an inline comment as done. May 4 2021, 6:35 PM

tejohnson added inline comments.

llvm/lib/Analysis/CodeMetrics.cpp
131	I was concerned about this too at first. I collected some stats for a large application build and found that on average there were very few assumptions being checked. That being said, pathological cases could occur, and I agree that it is straightforward to reverse the loop and collect on demand, so I changed it to do that.
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
2440	One issue with the reversed loop is that the Size is checked against the limit the iteration after it is incremented. When iterating in forward order, this means that the branch is not counted against the limit, since the loop exits before the subsequent check. But with the reversed order it gets counted and the test started failing since we no longer did the threading. I decided to consolidate the Size increment and check to make it more consistent, and simply bumped up the default limit and the one used in the test so that there is no change to the status quo in terms of non-ephemeral values.

Address comments

Harbormaster completed remote builds in B102660: Diff 342929. May 4 2021, 7:24 PM

LGTM, though personally I'd keep the current value of the limit and change the condition to Size++ > MaxSmallBlockSize instead (using post-increment instead of pre-increment). I think that should retain the behavior. I'm okay either way though.
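
The off-by-one at stake can be checked with a small standalone model. This is a sketch under the simplifying assumption that every instruction in the block is counted. The three function names are hypothetical. The first function mirrors the original shape (check at the top of the iteration, increment at the bottom), the second consolidates with pre-increment, and the third consolidates with post-increment.

```cpp
#include <cassert>

// Original shape: the limit is checked one iteration after the increment,
// so the last counted instruction never trips the check.
inline bool fitsCheckThenIncrement(int NumCounted, int Max) {
  int Size = 0;
  for (int I = 0; I < NumCounted; ++I) {
    if (Size > Max)
      return false;
    ++Size;
  }
  return true;
}

// Consolidated with pre-increment: the last instruction is now counted
// against the limit, so the effective limit drops by one.
inline bool fitsPreIncrement(int NumCounted, int Max) {
  int Size = 0;
  for (int I = 0; I < NumCounted; ++I)
    if (++Size > Max)
      return false;
  return true;
}

// Consolidated with post-increment: compares the old value, which matches
// the original check-then-increment behavior.
inline bool fitsPostIncrement(int NumCounted, int Max) {
  int Size = 0;
  for (int I = 0; I < NumCounted; ++I)
    if (Size++ > Max)
      return false;
  return true;
}
```

In this model, post-increment with the unchanged limit and pre-increment with the limit raised by one accept exactly the same block sizes, which is why either option preserves the status quo.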

This revision is now accepted and ready to land. May 8 2021, 1:23 PM

In D101494#2746229, @nikic wrote:

LGTM, though personally I'd keep the current value of the limit and change the condition to Size++ > MaxSmallBlockSize instead (using post-increment instead of pre-increment). I think that should retain the behavior. I'm okay either way though.

Yep, went ahead and switched to that approach.

Address comment

This revision was landed with ongoing or failed builds. May 9 2021, 7:08 PM

Closed by commit rG220f6e5271f2: [SimplifyCFG] Ignore ephemeral values when counting insts for threading (authored by tejohnson).

This revision was automatically updated to reflect the committed changes.

tejohnson added a commit: rG220f6e5271f2: [SimplifyCFG] Ignore ephemeral values when counting insts for threading.

Harbormaster completed remote builds in B103416: Diff 343952. May 9 2021, 8:05 PM

Diff 342045

llvm/include/llvm/Analysis/CodeMetrics.h

struct CodeMetrics {
/// or similar intrinsics in the loop).		/// or similar intrinsics in the loop).
static void collectEphemeralValues(const Loop *L, AssumptionCache *AC,		static void collectEphemeralValues(const Loop *L, AssumptionCache *AC,
SmallPtrSetImpl<const Value *> &EphValues);		SmallPtrSetImpl<const Value *> &EphValues);

/// Collect a functions's ephemeral values (those used only by an		/// Collect a functions's ephemeral values (those used only by an
/// assume or similar intrinsics in the function).		/// assume or similar intrinsics in the function).
static void collectEphemeralValues(const Function *L, AssumptionCache *AC,		static void collectEphemeralValues(const Function *L, AssumptionCache *AC,
SmallPtrSetImpl<const Value *> &EphValues);		SmallPtrSetImpl<const Value *> &EphValues);

		/// Collect a basic block's ephemeral values (those used only by an
		/// assume or similar intrinsics in the basic block).
		static void collectEphemeralValues(const BasicBlock *BB, AssumptionCache *AC,
		SmallPtrSetImpl<const Value *> &EphValues);
};		};

}		}

#endif		#endif

llvm/lib/Analysis/CodeMetrics.cpp

for (auto &AssumeVH : AC->assumptions()) {

if (EphValues.insert(I).second)		if (EphValues.insert(I).second)
appendSpeculatableOperands(I, Visited, Worklist);		appendSpeculatableOperands(I, Visited, Worklist);
}		}

completeEphemeralValues(Visited, Worklist, EphValues);		completeEphemeralValues(Visited, Worklist, EphValues);
}		}

		void CodeMetrics::collectEphemeralValues(
		const BasicBlock *BB, AssumptionCache *AC,
		SmallPtrSetImpl<const Value *> &EphValues) {
		SmallPtrSet<const Value *, 32> Visited;
		SmallVector<const Value *, 16> Worklist;

		for (auto &AssumeVH : AC->assumptions()) {
		if (!AssumeVH)
		continue;
		Instruction *I = cast<Instruction>(AssumeVH);
		if (I->getParent() != BB)
		continue;

		if (EphValues.insert(I).second)
		appendSpeculatableOperands(I, Visited, Worklist);
		}

		completeEphemeralValues(Visited, Worklist, EphValues);
		}

/// Fill in the current structure with information gleaned from the specified		/// Fill in the current structure with information gleaned from the specified
/// block.		/// block.
void CodeMetrics::analyzeBasicBlock(		void CodeMetrics::analyzeBasicBlock(
const BasicBlock *BB, const TargetTransformInfo &TTI,		const BasicBlock *BB, const TargetTransformInfo &TTI,
const SmallPtrSetImpl<const Value *> &EphValues, bool PrepareForLTO) {		const SmallPtrSetImpl<const Value *> &EphValues, bool PrepareForLTO) {
++NumBlocks;		++NumBlocks;
// Use a proxy variable for NumInsts of type InstructionCost, so that it can		// Use a proxy variable for NumInsts of type InstructionCost, so that it can
// use InstructionCost's arithmetic properties such as saturation when this		// use InstructionCost's arithmetic properties such as saturation when this

llvm/lib/Transforms/Utils/SimplifyCFG.cpp

#include "llvm/ADT/Sequence.h"		#include "llvm/ADT/Sequence.h"
#include "llvm/ADT/SetOperations.h"		#include "llvm/ADT/SetOperations.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
		#include "llvm/Analysis/CodeMetrics.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/EHPersonalities.h"		#include "llvm/Analysis/EHPersonalities.h"
#include "llvm/Analysis/GuardUtils.h"		#include "llvm/Analysis/GuardUtils.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/MemorySSA.h"		#include "llvm/Analysis/MemorySSA.h"
#include "llvm/Analysis/MemorySSAUpdater.h"		#include "llvm/Analysis/MemorySSAUpdater.h"
#include "llvm/Analysis/TargetTransformInfo.h"		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
bool SimplifyCFGOpt::SpeculativelyExecuteBB(BranchInst *BI, BasicBlock *ThenBB,
for (Instruction *I : SpeculatedDbgIntrinsics)		for (Instruction *I : SpeculatedDbgIntrinsics)
I->eraseFromParent();		I->eraseFromParent();

++NumSpeculations;		++NumSpeculations;
return true;		return true;
}		}

/// Return true if we can thread a branch across this block.		/// Return true if we can thread a branch across this block.
static bool BlockIsSimpleEnoughToThreadThrough(BasicBlock *BB) {		static bool BlockIsSimpleEnoughToThreadThrough(BasicBlock *BB,
		AssumptionCache *AC) {
int Size = 0;		int Size = 0;

		SmallPtrSet<const Value *, 32> EphValues;
		if (AC)
		CodeMetrics::collectEphemeralValues(BB, AC, EphValues);

for (Instruction &I : BB->instructionsWithoutDebug()) {		for (Instruction &I : BB->instructionsWithoutDebug()) {
if (Size > MaxSmallBlockSize)		if (Size > MaxSmallBlockSize)
return false; // Don't clone large BB's.		return false; // Don't clone large BB's.

// Can't fold blocks that contain noduplicate or convergent calls.		// Can't fold blocks that contain noduplicate or convergent calls.
if (CallInst *CI = dyn_cast<CallInst>(&I))		if (CallInst *CI = dyn_cast<CallInst>(&I))
if (CI->cannotDuplicate() || CI->isConvergent())		if (CI->cannotDuplicate() || CI->isConvergent())
return false;		return false;

// We will delete Phis while threading, so Phis should not be accounted in		// We will delete Phis while threading, so Phis should not be accounted in
// block's size		// block's size. Ditto for ephemeral values which will also be deleted.
if (!isa<PHINode>(I))		if (!isa<PHINode>(I) && !EphValues.count(&I))
++Size;		++Size;

// We can only support instructions that do not define values that are		// We can only support instructions that do not define values that are
// live outside of the current basic block.		// live outside of the current basic block.
for (User *U : I.users()) {		for (User *U : I.users()) {
Instruction *UI = cast<Instruction>(U);		Instruction *UI = cast<Instruction>(U);
if (UI->getParent() != BB || isa<PHINode>(UI))		if (UI->getParent() != BB || isa<PHINode>(UI))
return false;		return false;
static bool FoldCondBranchOnPHI(BranchInst *BI, DomTreeUpdater *DTU,

// Degenerate case of a single entry PHI.		// Degenerate case of a single entry PHI.
if (PN->getNumIncomingValues() == 1) {		if (PN->getNumIncomingValues() == 1) {
FoldSingleEntryPHINodes(PN->getParent());		FoldSingleEntryPHINodes(PN->getParent());
return true;		return true;
}		}

// Now we know that this block has multiple preds and two succs.		// Now we know that this block has multiple preds and two succs.
if (!BlockIsSimpleEnoughToThreadThrough(BB))		if (!BlockIsSimpleEnoughToThreadThrough(BB, AC))
return false;		return false;

// Okay, this is a simple enough basic block. See if any phi values are		// Okay, this is a simple enough basic block. See if any phi values are
// constants.		// constants.
for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {		for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
ConstantInt *CB = dyn_cast<ConstantInt>(PN->getIncomingValue(i));		ConstantInt *CB = dyn_cast<ConstantInt>(PN->getIncomingValue(i));
if (!CB || !CB->getType()->isIntegerTy(1))		if (!CB || !CB->getType()->isIntegerTy(1))
continue;		continue;

/// If we have a conditional branch as a predecessor of another block,		/// If we have a conditional branch as a predecessor of another block,
/// this function tries to simplify it. We know		/// this function tries to simplify it. We know
/// that PBI and BI are both conditional branches, and BI is in one of the		/// that PBI and BI are both conditional branches, and BI is in one of the
/// successor blocks of PBI - PBI branches to BI.		/// successor blocks of PBI - PBI branches to BI.
static bool SimplifyCondBranchToCondBranch(BranchInst *PBI, BranchInst *BI,		static bool SimplifyCondBranchToCondBranch(BranchInst *PBI, BranchInst *BI,
DomTreeUpdater *DTU,		DomTreeUpdater *DTU,
const DataLayout &DL,		const DataLayout &DL,
const TargetTransformInfo &TTI) {		const TargetTransformInfo &TTI,
		AssumptionCache *AC) {
assert(PBI->isConditional() && BI->isConditional());		assert(PBI->isConditional() && BI->isConditional());
BasicBlock *BB = BI->getParent();		BasicBlock *BB = BI->getParent();

// If this block ends with a branch instruction, and if there is a		// If this block ends with a branch instruction, and if there is a
// predecessor that ends on a branch of the same condition, make		// predecessor that ends on a branch of the same condition, make
// this conditional branch redundant.		// this conditional branch redundant.
if (PBI->getCondition() == BI->getCondition() &&		if (PBI->getCondition() == BI->getCondition() &&
PBI->getSuccessor(0) != PBI->getSuccessor(1)) {		PBI->getSuccessor(0) != PBI->getSuccessor(1)) {
// Okay, the outcome of this conditional branch is statically		// Okay, the outcome of this conditional branch is statically
// knowable. If this block had a single pred, handle specially.		// knowable. If this block had a single pred, handle specially.
if (BB->getSinglePredecessor()) {		if (BB->getSinglePredecessor()) {
// Turn this into a branch on constant.		// Turn this into a branch on constant.
bool CondIsTrue = PBI->getSuccessor(0) == BB;		bool CondIsTrue = PBI->getSuccessor(0) == BB;
BI->setCondition(		BI->setCondition(
ConstantInt::get(Type::getInt1Ty(BB->getContext()), CondIsTrue));		ConstantInt::get(Type::getInt1Ty(BB->getContext()), CondIsTrue));
return true; // Nuke the branch on constant.		return true; // Nuke the branch on constant.
}		}

// Otherwise, if there are multiple predecessors, insert a PHI that merges		// Otherwise, if there are multiple predecessors, insert a PHI that merges
// in the constant and simplify the block result. Subsequent passes of		// in the constant and simplify the block result. Subsequent passes of
// simplifycfg will thread the block.		// simplifycfg will thread the block.
if (BlockIsSimpleEnoughToThreadThrough(BB)) {		if (BlockIsSimpleEnoughToThreadThrough(BB, AC)) {
pred_iterator PB = pred_begin(BB), PE = pred_end(BB);		pred_iterator PB = pred_begin(BB), PE = pred_end(BB);
PHINode *NewPN = PHINode::Create(		PHINode *NewPN = PHINode::Create(
Type::getInt1Ty(BB->getContext()), std::distance(PB, PE),		Type::getInt1Ty(BB->getContext()), std::distance(PB, PE),
BI->getCondition()->getName() + ".pr", &BB->front());		BI->getCondition()->getName() + ".pr", &BB->front());
// Okay, we're going to insert the PHI node. Since PBI is not the only		// Okay, we're going to insert the PHI node. Since PBI is not the only
// predecessor, compute the PHI'd conditional value for all of the preds.		// predecessor, compute the PHI'd conditional value for all of the preds.
// Any predecessor where the condition is not computable we keep symbolic.		// Any predecessor where the condition is not computable we keep symbolic.
for (pred_iterator PI = PB; PI != PE; ++PI) {		for (pred_iterator PI = PB; PI != PE; ++PI) {
if (PHINode *PN = dyn_cast<PHINode>(BI->getCondition()))
if (PN->getParent() == BI->getParent())		if (PN->getParent() == BI->getParent())
if (FoldCondBranchOnPHI(BI, DTU, DL, Options.AC))		if (FoldCondBranchOnPHI(BI, DTU, DL, Options.AC))
return requestResimplify();		return requestResimplify();

// Scan predecessor blocks for conditional branches.		// Scan predecessor blocks for conditional branches.
for (BasicBlock *Pred : predecessors(BB))		for (BasicBlock *Pred : predecessors(BB))
if (BranchInst *PBI = dyn_cast<BranchInst>(Pred->getTerminator()))		if (BranchInst *PBI = dyn_cast<BranchInst>(Pred->getTerminator()))
if (PBI != BI && PBI->isConditional())		if (PBI != BI && PBI->isConditional())
if (SimplifyCondBranchToCondBranch(PBI, BI, DTU, DL, TTI))		if (SimplifyCondBranchToCondBranch(PBI, BI, DTU, DL, TTI, Options.AC))
return requestResimplify();		return requestResimplify();

// Look for diamond patterns.		// Look for diamond patterns.
if (MergeCondStores)		if (MergeCondStores)
if (BasicBlock *PrevBB = allPredecessorsComeFromSameSource(BB))		if (BasicBlock *PrevBB = allPredecessorsComeFromSameSource(BB))
if (BranchInst *PBI = dyn_cast<BranchInst>(PrevBB->getTerminator()))		if (BranchInst *PBI = dyn_cast<BranchInst>(PrevBB->getTerminator()))
if (PBI != BI && PBI->isConditional())		if (PBI != BI && PBI->isConditional())
if (mergeConditionalStores(PBI, BI, DTU, DL, TTI))		if (mergeConditionalStores(PBI, BI, DTU, DL, TTI))

llvm/test/Transforms/SimplifyCFG/unprofitable-pr.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -simplifycfg-max-small-block-size=10 -S < %s | FileCheck %s			; RUN: opt -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -simplifycfg-max-small-block-size=6 -S < %s | FileCheck %s
	; RUN: opt -passes=simplify-cfg -simplifycfg-max-small-block-size=10 -S < %s | FileCheck %s			; RUN: opt -passes=simplify-cfg -simplifycfg-max-small-block-size=6 -S < %s | FileCheck %s

	target datalayout = "e-p:64:64-p5:32:32-A5"			target datalayout = "e-p:64:64-p5:32:32-A5"

	declare void @llvm.assume(i1)			declare void @llvm.assume(i1)
				declare i1 @llvm.type.test(i8*, metadata) nounwind readnone

	define void @test_01(i1 %c, i64* align 1 %ptr) local_unnamed_addr #0 {			define void @test_01(i1 %c, i64* align 1 %ptr) local_unnamed_addr #0 {
	; CHECK-LABEL: @test_01(			; CHECK-LABEL: @test_01(
	; CHECK-NEXT: br i1 [[C:%.*]], label [[TRUE2_CRITEDGE:%.*]], label [[FALSE1:%.*]]			; CHECK-NEXT: br i1 [[C:%.*]], label [[TRUE2_CRITEDGE:%.*]], label [[FALSE1:%.*]]
	; CHECK: false1:			; CHECK: false1:
	; CHECK-NEXT: store volatile i64 1, i64* [[PTR:%.*]], align 4			; CHECK-NEXT: store volatile i64 1, i64* [[PTR:%.*]], align 4
	; CHECK-NEXT: [[PTRINT:%.*]] = ptrtoint i64* [[PTR]] to i64			; CHECK-NEXT: [[PTRINT:%.*]] = ptrtoint i64* [[PTR]] to i64
	; CHECK-NEXT: [[MASKEDPTR:%.*]] = and i64 [[PTRINT]], 7			; CHECK-NEXT: [[MASKEDPTR:%.*]] = and i64 [[PTRINT]], 7
	true2: ; preds = %true1			true2: ; preds = %true1
	store volatile i64 2, i64* %ptr, align 8			store volatile i64 2, i64* %ptr, align 8
	ret void			ret void

	false2: ; preds = %true1			false2: ; preds = %true1
	store volatile i64 3, i64* %ptr, align 8			store volatile i64 3, i64* %ptr, align 8
	ret void			ret void
	}			}

				; Try the max block size for PRE again but with the bitcast/type test/assume
				; sequence used for whole program devirt.
				define void @test_04(i1 %c, i64* align 1 %ptr, [3 x i8]* %vtable) local_unnamed_addr #0 {
				; CHECK-LABEL: @test_04(
				; CHECK-NEXT: br i1 [[C:%.*]], label [[TRUE2_CRITEDGE:%.*]], label [[FALSE1:%.*]]
				; CHECK: false1:
				; CHECK-NEXT: store volatile i64 1, i64* [[PTR:%.*]], align 4
				; CHECK-NEXT: [[VTABLE:%.*]] = bitcast [3 x i8]* %vtable to i8*
				; CHECK-NEXT: [[P:%.*]] = call i1 @llvm.type.test(i8* [[VTABLE]], metadata !"foo")
				; CHECK-NEXT: tail call void @llvm.assume(i1 [[P]])
				; CHECK-NEXT: store volatile i64 0, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 3, i64* [[PTR]], align 8
				; CHECK-NEXT: ret void
				; CHECK: true2.critedge:
				; CHECK-NEXT: [[VTABLE:%.*]] = bitcast [3 x i8]* %vtable to i8*
				; CHECK-NEXT: [[P:%.*]] = call i1 @llvm.type.test(i8* [[VTABLE]], metadata !"foo")
				; CHECK-NEXT: tail call void @llvm.assume(i1 [[P]])
				; CHECK-NEXT: store volatile i64 0, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 -1, i64* [[PTR]], align 8
				; CHECK-NEXT: store volatile i64 2, i64* [[PTR]], align 8
				; CHECK-NEXT: ret void
				;
				br i1 %c, label %true1, label %false1

				true1: ; preds = %false1, %0
				%vtablei8 = bitcast [3 x i8]* %vtable to i8*
				%p = call i1 @llvm.type.test(i8* %vtablei8, metadata !"foo")
				tail call void @llvm.assume(i1 %p)
				store volatile i64 0, i64* %ptr, align 8
				store volatile i64 -1, i64* %ptr, align 8
				store volatile i64 -1, i64* %ptr, align 8
				store volatile i64 -1, i64* %ptr, align 8
				store volatile i64 -1, i64* %ptr, align 8
				store volatile i64 -1, i64* %ptr, align 8
				br i1 %c, label %true2, label %false2

				false1: ; preds = %0
				store volatile i64 1, i64* %ptr, align 4
				br label %true1

				true2: ; preds = %true1
				store volatile i64 2, i64* %ptr, align 8
				ret void

				false2: ; preds = %true1
				store volatile i64 3, i64* %ptr, align 8
				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] Ignore ephemeral values when counting insts for threading
ClosedPublic