This is an archive of the discontinued LLVM Phabricator instance.

Differential D27114

Preserve nonnull metadata on Loads through SROA & mem2reg.
ClosedPublic

Authored by luqmana on Nov 24 2016, 2:23 PM.

Details

Reviewers

chandlerc
efriedma

Commits

rG3f807c91dc2a: Preserve nonnull metadata on Loads through SROA & mem2reg.
rL298540: Preserve nonnull metadata on Loads through SROA & mem2reg.

Summary

https://llvm.org/bugs/show_bug.cgi?id=31142 :

SROA was dropping the nonnull metadata on loads from allocas that got optimized out. This patch simply preserves nonnull metadata on loads through SROA and mem2reg.

Diff Detail

Build Status

Buildable 4365
Build 4365: arc lint + arc unit

Event Timeline

luqmana updated this revision to Diff 79258. Nov 24 2016, 2:23 PM

luqmana retitled this revision to "Preserve nonnull metadata on Loads through SROA & mem2reg."

luqmana updated this object.

luqmana added a reviewer: chandlerc.

luqmana added a subscriber: llvm-commits.

I don't get it, this patch looks like it is *removing* SROA's existing logic to propagate metadata? And it doesn't add any new test cases, just deletes an existing test?

Whoops, that was my mistake. The diff was inverted and should be corrected now.

Are you always guaranteed that the range doesn't contain zero?

In D27114#605657, @davide wrote:

Are you always guaranteed that the range doesn't contain zero?

Is it valid for the range to contain 0 if the load was already marked as nonnull?

Could you also preserve !range metadata? It behaves similarly to nonzero, and inttoptr/ptrtoint interchanges them.
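The zero-containment question raised above can be made concrete with a small standalone sketch. It assumes a single !range pair read as a half-open, possibly wrapping interval [lo, hi) over 64-bit values. The helper names rangeContains and rangeExcludesZero are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// True when v lies in the half-open interval [lo, hi), which may wrap
// around the top of the 64-bit value space (lo > hi).
bool rangeContains(std::uint64_t lo, std::uint64_t hi, std::uint64_t v) {
  assert(lo != hi && "sketch assumes the range is neither full nor empty");
  if (lo < hi)
    return lo <= v && v < hi; // ordinary interval
  return v >= lo || v < hi;   // wrapped interval
}

// A pointer-sized integer with such a range can only stand in for a
// nonnull pointer when zero is excluded.
bool rangeExcludesZero(std::uint64_t lo, std::uint64_t hi) {
  return !rangeContains(lo, hi, 0);
}
```

Under these assumptions the wrapped pair (1, 0) excludes only zero, so it is the integer counterpart of nonnull, while a pair such as (0, 10) does not qualify.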

efriedma added a subscriber: efriedma. Dec 1 2016, 3:35 PM

efriedma added inline comments.

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
405	This isn't a legal transform: you're assigning "nonnull" to a completely unrelated instruction (which could have other users) just because it happens to be a LoadInst. I guess you could get away with this if you can prove there aren't any other uses of the value, though.

luqmana added inline comments. Dec 16 2016, 9:39 PM

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
405	Hmmm, I could just add a check for it being the only use, but is it illegal? Just walking through this code: we have an alloca with a single store (OnlyStore). The value stored was from a Load (ReplVal). Now, we have a Load (LI) from this alloca that's dominated by this store, and it has !nonnull. Adding !nonnull to ReplVal seems reasonable since (assuming) the !nonnull on LI was valid, we know that value must be non null. So, can we not just propagate it back to ReplVal?

efriedma added inline comments. Dec 19 2016, 10:31 AM

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
405	> (assuming) the !nonnull on LI was valid, we know that value must be non null

!nonnull basically means "if this load would return null, we can treat it as poison". So you can't propagate backwards unless you can prove that the load will actually execute (LI post-dominates ReplVal), and that the loaded value will be used in a way that causes undefined behavior (a branch or call which uses LI post-dominates ReplVal). In practice, this is essentially impossible to check in LLVM.

arielb1 added inline comments. Dec 19 2016, 11:40 AM

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
405	I am quite sure I was told that !nonnull causes UB rather than poison. Is there some real documentation for that?

I am quite sure I was told that !nonnull causes UB rather than poison. Is there some real documentation for that?

Err, no, you're probably right; I haven't really worked with it, and LangRef doesn't say anything, so "must" defaults to undefined behavior.

You still can't propagate backwards without proving post-dominance.
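The hazard can be illustrated with a toy model, written as plain C++ rather than against LLVM. It assumes the reading agreed on above (a !nonnull load that executes and yields null is undefined behavior) and a function with an unconditional early load followed by a !nonnull load that only runs when a branch is taken. TagSite and triggersUB are hypothetical names.

```cpp
#include <cassert>

// Which load carries the nonnull tag in the toy function:
//   v = load p                    ; EarlierLoad, always executes
//   if (cond) { l = load slot }   ; GuardedLoad, runs only when cond is true
enum class TagSite { GuardedLoad, EarlierLoad };

// Under the "null result of a tagged load is UB" reading, the program is
// undefined exactly when a tagged load executes and produces null.
bool triggersUB(TagSite site, bool branchTaken, bool valueIsNull) {
  if (!valueIsNull)
    return false;
  if (site == TagSite::EarlierLoad)
    return true;      // the early load runs on every path
  return branchTaken; // the guarded load runs only on the taken path
}
```

With a null value and the branch not taken, the original placement is well defined but the moved tag is not. That is why moving the tag backwards needs a guarantee that the guarded load always executes.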

In D27114#626971, @efriedma wrote:

I am quite sure I was told that !nonnull causes UB rather than poison. Is there some real documentation for that?

Err, no, you're probably right; I haven't really worked with it, and LangRef doesn't say anything, so "must" defaults to undefined behavior.

You still can't propagate backwards without proving post-dominance.

That should hold in this case no? Lines 361-381 assert that LI is dominated by the Store and the Store trivially dominates ReplVal?

Lines 361-381 assert that LI is dominated by the Store and the Store trivially dominates ReplVal?

Post-dominates, not dominates... they're sort of similar, but post-dominance involves following the control flow edges the other way. A compiler textbook should cover this.
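To make the distinction concrete, here is a minimal standalone sketch that computes both relations on a tiny control-flow graph with the textbook iterative dataflow algorithm. It is not LLVM's DominatorTree. The names Graph, dominatorSets, dominates and postDominates are hypothetical, and the sketch assumes every block is reachable from the chosen root.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// succs[b] lists the successor blocks of block b.
using Graph = std::vector<std::vector<int>>;

// Flip every edge, so successors become predecessors.
Graph reversed(const Graph &g) {
  Graph r(g.size());
  for (std::size_t from = 0; from < g.size(); ++from)
    for (int to : g[from])
      r[to].push_back(static_cast<int>(from));
  return r;
}

// Iterative dataflow: result[b][d] is true when d dominates b, i.e. every
// path from root to b passes through d.
std::vector<std::vector<bool>> dominatorSets(const Graph &g, int root) {
  const std::size_t n = g.size();
  assert(root >= 0 && static_cast<std::size_t>(root) < n);
  const Graph preds = reversed(g);
  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[root].assign(n, false);
  dom[root][root] = true;
  for (bool changed = true; changed;) {
    changed = false;
    for (std::size_t b = 0; b < n; ++b) {
      if (b == static_cast<std::size_t>(root))
        continue;
      // Intersect the dominator sets of all predecessors, then add b itself.
      std::vector<bool> next(n, true);
      for (int p : preds[b])
        for (std::size_t d = 0; d < n; ++d)
          next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// a dominates b: walk forward from the entry block.
bool dominates(const Graph &g, int entry, int a, int b) {
  return dominatorSets(g, entry)[b][a];
}

// a post-dominates b: the same computation on the reversed graph, rooted
// at the exit block.
bool postDominates(const Graph &g, int exitBlock, int a, int b) {
  return dominatorSets(reversed(g), exitBlock)[b][a];
}
```

For a diamond where block 0 branches to blocks 1 and 2, which both fall through to block 3, block 0 dominates block 1, yet block 1 does not post-dominate block 0 because the path through block 2 skips it.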

Address comments from review.

Preserve nonnull metadata on Loads through SROA & mem2reg.
Pass PostDominatorTree to PromoteMemoryToRegister.
Assert Post-Dominance of Load from alloca before propagating metadata.

In D27114#627071, @efriedma wrote:

Lines 361-381 assert that LI is dominated by the Store and the Store trivially dominates ReplVal?

Post-dominates, not dominates... they're sort of similar, but post-dominance involves following the control flow edges the other way. A compiler textbook should cover this.

Ah yes, missed that detail. Updated to also consider post-dominance.

Remove unnecessary changes.

PostDominatorTree doesn't provide the kind of post-dominance you need for this transformation: LLVM basic blocks have implicit edges which exit the function early, and those edges aren't reflected in the tree. For example, execution doesn't actually continue after a call to exit(). (See also the llvm::isGuaranteedToTransferExecutionToSuccessor() utility.)
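The implicit-exit problem shows up even inside a single basic block. The following toy model is a sketch only: ToyInst, its mayNotTransfer flag and executionReaches are hypothetical stand-ins and do not use the LLVM utility named above. A later instruction is treated as certain to run only if every instruction before it hands execution to its successor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One instruction of a toy basic block.
struct ToyInst {
  std::string text;
  bool mayNotTransfer; // may exit, throw, or never return (e.g. call @exit)
};

// True only if control entering at index `from` is certain to reach index
// `to` in the same block: every instruction in [from, to) must hand
// execution to its successor.
bool executionReaches(const std::vector<ToyInst> &block, std::size_t from,
                      std::size_t to) {
  assert(from <= to && to <= block.size());
  for (std::size_t i = from; i < to; ++i)
    if (block[i].mayNotTransfer)
      return false;
  return true;
}
```

In a block of the form "load, call that may exit, load !nonnull", the second load sits in the same block as the first but is not guaranteed to execute, and a CFG-only post-dominator query cannot see that.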

In D27114#627264, @efriedma wrote:

PostDominatorTree doesn't provide the kind of post-dominance you need for this transformation: LLVM basic blocks have implicit edges which exit the function early, and those edges aren't reflected in the tree. For example, execution doesn't actually continue after a call to exit(). (See also the llvm::isGuaranteedToTransferExecutionToSuccessor() utility.)

Hmm, is it possible then to do this sort of analysis in LLVM in this context? Or is this optimization just not possible currently?

I don't think there's any existing analysis which computes the sort of post-dominator tree you want.

Anyway, I'm not sure this is really what you want... have you experimented with inserting llvm.assume calls?

In D27114#628254, @efriedma wrote:

I don't think there's any existing analysis which computes the sort of post-dominator tree you want.

Anyway, I'm not sure this is really what you want... have you experimented with inserting llvm.assume calls?

The problem wasn't that. We do have llvm.assume which InstCombine turns assume( (load addr) != null ) to a load addr !nonnull. Then, SROA/mem2reg end up eating the !nonnull metadata.

Some more details at https://github.com/rust-lang/rust/issues/37945 which was linked in the bug link in the description.

The problem wasn't that. We do have llvm.assume which InstCombine turns assume( (load addr) != null ) to a load addr !nonnull. Then, SROA/mem2reg end up eating the !nonnull metadata.

You can do the reverse transform in mem2reg: load addr !nonnull -> assume( (load addr) != null ), then when the load gets erased the nonnull assumption is preserved.

In D27114#628901, @efriedma wrote:

The problem wasn't that. We do have llvm.assume which InstCombine turns assume( (load addr) != null ) to a load addr !nonnull. Then, SROA/mem2reg end up eating the !nonnull metadata.

You can do the reverse transform in mem2reg: load addr !nonnull -> assume( (load addr) != null ), then when the load gets erased the nonnull assumption is preserved.

Or we could stop instcombine from doing this in the first place.

I think we should canonicalize on the assume formulation as it is strictly more general than the nonnull metadata formulation.

Use assume to preserve nonnull-ness of Load.

In D27114#629295, @chandlerc wrote:

In D27114#628901, @efriedma wrote:

The problem wasn't that. We do have llvm.assume which InstCombine turns assume( (load addr) != null ) to a load addr !nonnull. Then, SROA/mem2reg end up eating the !nonnull metadata.

You can do the reverse transform in mem2reg: load addr !nonnull -> assume( (load addr) != null ), then when the load gets erased the nonnull assumption is preserved.

Or we could stop instcombine from doing this in the first place.

I think we should canonicalize on the assume formulation as it is strictly more general than the nonnull metadata formulation.

For this change I just implemented @efriedma's suggestion. I think changing how nonnull-ness is canonicalized is outside the scope of this.

Could you clean up the formatting to match LLVM coding standards? At the very least clang-format would help here (lots of lines over 80 columns, indent looks all wrong).

Run clang-format.

spatel mentioned this in D27855: try to extend nonnull-ness of arguments from a callsite back to its parent function. Jan 3 2017, 2:29 PM

spatel added a subscriber: spatel. Jan 3 2017, 2:37 PM

spatel added inline comments.

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
964–977	Please make a helper function so we don't have duplicated code.

Move duplicate code to helper function.

spatel added inline comments. Jan 4 2017, 8:11 AM

test/Transforms/SROA/preserve-nonnull.ll
2	I don't think you want to include -instcombine here and create a dependency on its canonicalization of the assume to metadata - especially since that may change based on the comments in this review and PR31518 ( https://llvm.org/bugs/show_bug.cgi?id=31518 ). You can use "utils/update_test_checks.py" to auto-generate exact CHECK lines for your test.

Simplify test and remove instcombine dependency.

Thanks for the changes.

Someone else can correct me if I'm wrong, but I think we're delaying improvements in nonnull optimization until we have a clang hack to avoid disastrous optimizations of libc functions:
http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html

In D27114#636348, @spatel wrote:

Thanks for the changes.

Someone else can correct me if I'm wrong, but I think we're delaying improvements in nonnull optimization until we have a clang hack to avoid disastrous optimizations of libc functions:
http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html

Yes, that thread seems largely in favor of getting something in place in Clang first.

If no one else works on a patch to Clang I may in a week or so, but sadly my queue is very full at the moment.

In D27114#636391, @chandlerc wrote:

In D27114#636348, @spatel wrote:

Thanks for the changes.

Someone else can correct me if I'm wrong, but I think we're delaying improvements in nonnull optimization until we have a clang hack to avoid disastrous optimizations of libc functions:
http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html

Yes, that thread seems largely in favor of getting something in place in Clang first.

If no one else works on a patch to Clang I may in a week or so, but sadly my queue is very full at the moment.

I don't think that's relevant to this patch. Once we have the nonnull in metadata form, we should preserve it. The particular problem with optimizing based on nonnull function-argument annotations is propagating the nonnull annotation from a callsite back into the caller. For that, we should wait on associated Clang updates (so that we can avoid doing it for memcpy and friends based on header-file annotations). This change should proceed.

Bunch of little comments below.

I'd like to see two high-level things before this goes in though:

1. I want to make sure Eli is happy with this approach w.r.t. any semantic differences between !nonnull metadata and the assume. Haven't seen an update from him since the switch of formulation.

2. I'd like at least a test case and a FIXME around the case where SROA rewrites the load and store to be integer loads and stores. This will happen fairly often due to things like unions, memcpy and other "bag of bits" interpretations of memory operations. We should work to not destroy !nonnull in the process by translating it to an assume. I'm happy for this to be handled in a subsequent patch, but I'd like to at least document the issue for anyone who ends up working on it.

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
41	clang-format's header sort is wrong here (because PromoteMemToReg != PromoteMemoryToRegister).
306–313	This doesn't look like the right formatting... was clang-format misconfigured?
967–968	Here as well.
test/Transforms/SROA/preserve-nonnull.ll
7–8	Do you need these?
10	I would at least add the @ and maybe the define to this so that random other things containing this string don't match confusingly down the road.
16	It's good to not use unnamed values in tests as they make future edits to the test brittle.

Fix some formatting and update test.

In D27114#637102, @chandlerc wrote:

Bunch of little comments below.

I'd like to see two high-level things before this goes in though:

1. I want to make sure Eli is happy with this approach w.r.t. any semantic differences between !nonnull metadata and the assume. Haven't seen an update from him since the switch of formulation.

2. I'd like at least a test case and a FIXME around the case where SROA rewrites the load and store to be integer loads and stores. This will happen fairly often due to things like unions, memcpy and other "bag of bits" interpretations of memory operations. We should work to not destroy !nonnull in the process by translating it to an assume. I'm happy for this to be handled in a subsequent patch, but I'd like to at least document the issue for anyone who ends up working on it.

Updated the patch to address the little comments. As for the second point, I think that's better handled in a separate patch. If Eli is happy with the current approach I'd like to move forward with just this patch.

I want to make sure Eli is happy with this approach w.r.t. any semantic differences between !nonnull metadata and the assume. Haven't seen an update from him since the switch of formulation.

Semantically, this seems consistent with existing code dealing with !nonnull. That said, creating calls to @llvm.assume has the potential to cause performance regressions; there are a bunch of places in the optimizer which don't handle them well, so it's probably worth checking that this change doesn't create more problems than it solves.

In D27114#637151, @efriedma wrote:

I want to make sure Eli is happy with this approach w.r.t. any semantic differences between !nonnull metadata and the assume. Haven't seen an update from him since the switch of formulation.

Semantically, this seems consistent with existing code dealing with !nonnull. That said, creating calls to @llvm.assume has the potential to cause performance regressions; there are a bunch of places in the optimizer which don't handle them well, so it's probably worth checking that this change doesn't create more problems than it solves.

Are there any existing performance tests so I can determine if this might cause a regression?

Well, you can run the LLVM testsuite (http://llvm.org/docs/lnt/quickstart.html)... but that's written entirely in C and C++, and I don't think clang produces nonnull metadata at all. Maybe you have some Rust performance tests you could use?

In D27114#637151, @efriedma wrote:

Semantically, this seems consistent with existing code dealing with !nonnull. That said, creating calls to @llvm.assume has the potential to cause performance regressions; there are a bunch of places in the optimizer which don't handle them well, so it's probably worth checking that this change doesn't create more problems than it solves.

Apologies if this isn't directly related to this patch, but one thing I noticed and mentioned in D28337: InstCombine is not calling into InstSimplify (and hence value tracking) with a context instruction in many cases. And so computeKnownBitsFromAssume() is not triggered in those cases. If we fix that (assuming that the lack of context instruction optional parameter is an oversight), we should make better use of the assumption machinery.

Apologies if this isn't directly related to this patch, but one thing I noticed and mentioned in D28337: InstCombine is not calling into InstSimplify (and hence value tracking) with a context instruction in many cases. And so computeKnownBitsFromAssume() is not triggered in those cases. If we fix that (assuming that the lack of context instruction optional parameter is an oversight), we should make better use of the assumption machinery.

I was more referring to the fact that m_OneUse etc. doesn't work the way you want it to in the presence of llvm.assume calls.

In D27114#637187, @efriedma wrote:

Well, you can run the LLVM testsuite (http://llvm.org/docs/lnt/quickstart.html)... but that's written entirely in C and C++, and I don't think clang produces nonnull metadata at all. Maybe you have some Rust performance tests you could use?

Hmmm, I recall us not using as much assume due to supposed LLVM slowdowns but don't have any on hand.

So at least running the test suite I didn't come across any performance regressions. Could we land this and observe the buildbots or something for any regressions and back out as necessary?

Ping

Chandler, are you planning to continue reviewing this?

Ping

hfinkel added inline comments. Feb 9 2017, 4:24 PM

lib/Transforms/Utils/PromoteMemoryToRegister.cpp
408	Even if we're going to do this in general, we shouldn't do this when we can otherwise prove that the address was nonnull. Can you call isKnownNonNullAt and only add the intrinsic when we need it? The same applies to other places where you call addAssumeNonNull.

Don't add assume if the value is already known to be nonnull.
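The shape of that guard can be sketched in isolation. This is a toy model with hypothetical types (ToyValue, ToyLoad) and a hypothetical helper (preserveNonNull). The knownNonNull flag stands in for whatever a nonnull analysis could prove about the replacement value, and the emitted strings are only illustrative IR text.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyValue {
  std::string name;
  bool knownNonNull; // stand-in for what a nonnull analysis could prove
};

struct ToyLoad {
  std::string name;
  bool hasNonNullMetadata;
};

// Returns the extra instructions emitted when `load` is about to be erased
// and its uses rewritten to `repl`. The fact is restated as an assume only
// when it would otherwise be lost.
std::vector<std::string> preserveNonNull(const ToyLoad &load,
                                         const ToyValue &repl) {
  std::vector<std::string> emitted;
  if (load.hasNonNullMetadata && !repl.knownNonNull) {
    emitted.push_back("%nn = icmp ne i8* " + repl.name + ", null");
    emitted.push_back("call void @llvm.assume(i1 %nn)");
  }
  return emitted;
}
```

A tagged load replaced by a value of unknown nullness yields the compare plus the assume. A tagged load replaced by a value already known to be nonnull, or an untagged load, yields nothing, which avoids adding assume calls that carry no new information.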

spatel mentioned this in D28204: [ValueTracking] use nonnull argument attribute to eliminate null checks. Feb 10 2017, 8:07 AM

Ping?

Ping

I'm not familiar with this code, but I'd like to see this functionality, so any assistance in reviewing is appreciated.
Other than the inline comment, this is good now?

lib/Transforms/Scalar/SROA.cpp
2391–2404	Is the intent to extend this for some subset of metadata types beyond just MD_nonnull? Either way, just make this one line for now?
	NewLI->copyMetadata(LI, LLVMContext::MD_nonnull);

Use copyMetadata method instead of manually copying metadata.
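The idea of copying only selected metadata kinds can be modelled with a plain map. This is a sketch under stated assumptions, not the LLVM implementation: MetadataMap and copyMetadataKinds are hypothetical names, and the convention that an empty kind list copies everything is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Metadata attached to one toy instruction: kind name -> payload.
using MetadataMap = std::map<std::string, std::string>;

// Copy entries from src to dst. An empty `kinds` list copies everything;
// otherwise only the listed kinds are copied.
void copyMetadataKinds(MetadataMap &dst, const MetadataMap &src,
                       const std::vector<std::string> &kinds) {
  for (const auto &entry : src) {
    const bool wanted =
        kinds.empty() ||
        std::find(kinds.begin(), kinds.end(), entry.first) != kinds.end();
    if (wanted)
      dst[entry.first] = entry.second;
  }
}
```

Passing a single kind such as "nonnull" carries over just that entry and leaves the rest of the old load's metadata behind, which matches the narrow intent of the one-line form suggested in the review.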

Harbormaster completed remote builds in B4365: Diff 90035. Feb 28 2017, 7:57 AM

Ping?

Hal's earlier comment said we don't need to hold this up for clang, but for reference that fix is proposed now:
D30806

@chandlerc / @davide / @efriedma / @hfinkel - can you continue your review? I don't know enough to approve.

I was hoping Chandler would continue to review, but I guess I can do it.

The test here is really bare-bones; it doesn't really cover all the relevant codepaths. It doesn't check the effect of the isKnownNonNullAt, and it doesn't cover one of the codepaths in PromoteMemoryToRegister.cpp at all.

Otherwise looks fine per earlier discussion.

In D27114#704478, @efriedma wrote:

I was hoping Chandler would continue to review, but I guess I can do it.

FWIW, I'm happy to continue it or for you to continue it. I was planning to GC these threads when the clang patch actually landed, but if you can get to them sooner, so much the better.

The test here is really bare-bones; it doesn't really cover all the relevant codepaths. It doesn't check the effect of the isKnownNonNullAt, and it doesn't cover one of the codepaths in PromoteMemoryToRegister.cpp at all.

FWIW, I think this really needs to be addressed before it lands.

I added another test for all the relevant codepaths in PromoteMemoryToRegister.
I also addressed one case I missed earlier: when the alloca in question has
multiple stores but all within the same basic block.
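For the same-block case, the core question is which store reaches a given load. A minimal sketch, assuming instructions are numbered by position within the block and the stores are kept sorted by that index; StoreAt and reachingStore are hypothetical names and this is not the mem2reg code itself.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// (instruction index within the block, name of the stored value)
using StoreAt = std::pair<unsigned, std::string>;

// Returns the store that reaches a load at position loadIdx, i.e. the last
// store with a smaller index, or nullptr when no store precedes the load.
// `stores` must be sorted by index.
const StoreAt *reachingStore(const std::vector<StoreAt> &stores,
                             unsigned loadIdx) {
  assert(std::is_sorted(stores.begin(), stores.end()));
  // First store whose index is not below loadIdx; the one before it reaches.
  auto it = std::lower_bound(
      stores.begin(), stores.end(), loadIdx,
      [](const StoreAt &s, unsigned idx) { return s.first < idx; });
  if (it == stores.begin())
    return nullptr;
  return &*(it - 1);
}
```

When the load being replaced carries !nonnull, the nonnull fact then has to be restated about the value of the reaching store, since the load itself disappears.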

LGTM.

Luqman, thanks for putting up with the long delay in reviewing this.

Do you have commit access, or should I commit this for you?

This revision is now accepted and ready to land. Mar 22 2017, 10:30 AM

In D27114#707685, @efriedma wrote:

LGTM.

Luqman, thanks for putting up with the long delay in reviewing this.

Do you have commit access, or should I commit this for you?

Thanks! No worries and I can commit it.

Closed by commit rL298540: Preserve nonnull metadata on Loads through SROA & mem2reg. (authored by luqmana). Mar 22 2017, 12:28 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                               Size
lib/Transforms/Scalar/SROA.cpp                     4 lines
lib/Transforms/Utils/PromoteMemoryToRegister.cpp   38 lines
test/Transforms/SROA/preserve-nonnull.ll           26 lines

Diff 90035

lib/Transforms/Scalar/SROA.cpp

... 2,381 lines not shown (context: if (VecTy) {) ...
                 NewEndOffset == NewAllocaEndOffset &&
                 (canConvertValue(DL, NewAllocaTy, TargetTy) ||
                  (IsLoadPastEnd && NewAllocaTy->isIntegerTy() &&
                   TargetTy->isIntegerTy()))) {
        LoadInst *NewLI = IRB.CreateAlignedLoad(&NewAI, NewAI.getAlignment(),
                                                LI.isVolatile(), LI.getName());
        if (LI.isVolatile())
          NewLI->setAtomic(LI.getOrdering(), LI.getSynchScope());
+
+       // Try to preserve nonnull metadata
+       if (TargetTy->isPointerTy())
+         NewLI->copyMetadata(LI, LLVMContext::MD_nonnull);
        V = NewLI;

        // If this is an integer load past the end of the slice (which means the
        // bytes outside the slice are undef or this load is dead) just forcibly
        // fix the integer size with correct handling of endianness.
        if (auto *AITy = dyn_cast<IntegerType>(NewAllocaTy))
          if (auto *TITy = dyn_cast<IntegerType>(TargetTy))
            if (AITy->getBitWidth() < TITy->getBitWidth()) {
              V = IRB.CreateZExt(V, TITy, "load.ext");
              if (DL.isBigEndian())
                V = IRB.CreateShl(V, TITy->getBitWidth() - AITy->getBitWidth(),
                                  "endian_shift");
            }
      } else {
        Type *LTy = TargetTy->getPointerTo();
        LoadInst *NewLI = IRB.CreateAlignedLoad(getNewAllocaSlicePtr(IRB, LTy),
                                                getSliceAlign(TargetTy),
                                                LI.isVolatile(), LI.getName());
        if (LI.isVolatile())
... 1,883 lines not shown ...

lib/Transforms/Utils/PromoteMemoryToRegister.cpp

... 9 lines not shown ...
  // This file promotes memory references to be register references. It promotes
  // alloca instructions which only have loads and stores as uses. An alloca is
  // transformed by using iterated dominator frontiers to place PHI nodes, then
  // traversing the function in depth-first order to rewrite loads and stores as
  // appropriate.
  //
  //===----------------------------------------------------------------------===//

- #include "llvm/Transforms/Utils/PromoteMemToReg.h"
  #include "llvm/ADT/ArrayRef.h"
  #include "llvm/ADT/DenseMap.h"
  #include "llvm/ADT/STLExtras.h"
  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/ADT/SmallVector.h"
  #include "llvm/ADT/Statistic.h"
  #include "llvm/Analysis/AliasSetTracker.h"
+ #include "llvm/Analysis/AssumptionCache.h"
  #include "llvm/Analysis/InstructionSimplify.h"
  #include "llvm/Analysis/IteratedDominanceFrontier.h"
  #include "llvm/Analysis/ValueTracking.h"
  #include "llvm/IR/CFG.h"
  #include "llvm/IR/Constants.h"
  #include "llvm/IR/DIBuilder.h"
  #include "llvm/IR/DebugInfo.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"
  #include "llvm/IR/IntrinsicInst.h"
  #include "llvm/IR/Metadata.h"
  #include "llvm/IR/Module.h"
  #include "llvm/Transforms/Utils/Local.h"
+ #include "llvm/Transforms/Utils/PromoteMemToReg.h"
  #include <algorithm>
  using namespace llvm;

  #define DEBUG_TYPE "mem2reg"

  STATISTIC(NumLocalPromoted, "Number of alloca's promoted within one block");
  STATISTIC(NumSingleStore, "Number of alloca's promoted with a single store");
  STATISTIC(NumDeadAlloca, "Number of dead alloca's removed");
... 247 lines not shown (context: private:) ...
    void RenamePass(BasicBlock *BB, BasicBlock *Pred,
                    RenamePassData::ValVector &IncVals,
                    std::vector<RenamePassData> &Worklist);
    bool QueuePhiNode(BasicBlock *BB, unsigned AllocaIdx, unsigned &Version);
  };

  } // end of anonymous namespace

+ /// Given a LoadInst LI this adds assume(LI != null) after it.
+ static void addAssumeNonNull(AssumptionCache *AC, LoadInst *LI) {
+   Function *AssumeIntrinsic =
+       Intrinsic::getDeclaration(LI->getModule(), Intrinsic::assume);
+   ICmpInst *LoadNotNull = new ICmpInst(ICmpInst::ICMP_NE, LI,
+                                        Constant::getNullValue(LI->getType()));
+   LoadNotNull->insertAfter(LI);
+   CallInst *CI = CallInst::Create(AssumeIntrinsic, {LoadNotNull});
+   CI->insertAfter(LoadNotNull);
+   AC->registerAssumption(CI);
+ }
+
  static void removeLifetimeIntrinsicUsers(AllocaInst *AI) {
    // Knowing that this alloca is promotable, we know that it's safe to kill all
    // instructions except for load and store.

    for (auto UI = AI->user_begin(), UE = AI->user_end(); UI != UE;) {
      Instruction *I = cast<Instruction>(*UI);
      ++UI;
      if (isa<LoadInst>(I) || isa<StoreInst>(I))
... 17 lines not shown ...
  ///
  /// When there is only a single store, we can use the domtree to trivially
  /// replace all of the dominated loads with the stored value. Do so, and return
  /// true if this has successfully promoted the alloca entirely. If this returns
  /// false there were some loads which were not dominated by the single store
  /// and thus must be phi-ed with undef. We fall back to the standard alloca
  /// promotion algorithm in that case.
  static bool rewriteSingleStoreAlloca(AllocaInst *AI, AllocaInfo &Info,
-                                      LargeBlockInfo &LBI,
-                                      DominatorTree &DT,
-                                      AliasSetTracker *AST) {
+                                      LargeBlockInfo &LBI, DominatorTree &DT,
+                                      AliasSetTracker *AST,
+                                      AssumptionCache *AC) {
    StoreInst *OnlyStore = Info.OnlyStore;
    bool StoringGlobalVal = !isa<Instruction>(OnlyStore->getOperand(0));
    BasicBlock *StoreBB = OnlyStore->getParent();
    int StoreIndex = -1;

    // Clear out UsingBlocks. We will reconstruct it here if needed.
    Info.UsingBlocks.clear();

Show All 34 Lines	for (auto UI = AI->user_begin(), E = AI->user_end(); UI != E;) {
}		}

// Otherwise, we can safely rewrite this load.		// Otherwise, we can safely rewrite this load.
Value *ReplVal = OnlyStore->getOperand(0);		Value *ReplVal = OnlyStore->getOperand(0);
// If the replacement value is the load, this must occur in unreachable		// If the replacement value is the load, this must occur in unreachable
// code.		// code.
if (ReplVal == LI)		if (ReplVal == LI)
ReplVal = UndefValue::get(LI->getType());		ReplVal = UndefValue::get(LI->getType());

		// If the load was marked as nonnull we don't want to lose
		// that information when we erase this Load. So we preserve
		efriedma: This isn't a legal transform: you're assigning "nonnull" to a completely unrelated instruction (which could have other users) just because it happens to be a LoadInst. I guess you could get away with this if you can prove there aren't any other uses of the value, though.
		luqmana: Hmmm, I could just add a check for it being the only use, but is it illegal? Just walking through this code: We have an alloca with a single store (OnlyStore). The value stored was from a Load (ReplVal). Now, we have a Load (LI) from this alloca that's dominated by this store, and it has !nonnull. Adding !nonnull to ReplVal seems reasonable since (assuming) the !nonnull on LI was valid, we know that value must be non null. So, can we not just propagate it back to ReplVal?
		efriedma:
		> (assuming) the !nonnull on LI was valid, we know that value must be non null
		!nonnull basically means "if this load would return null, we can treat it as poison". So you can't propagate backwards unless you can prove that the load will actually execute (LI post-dominates ReplVal), and that the loaded value will be used in a way that causes undefined behavior (a branch or call which uses LI post-dominates ReplVal). In practice, this is essentially impossible to check in LLVM.
		arielb1: I am quite sure I was told that !nonnull causes UB rather than poison. Is there some real documentation for that?
		// it with an assume.
		if (AC && LI->getMetadata(LLVMContext::MD_nonnull) &&
		!llvm::isKnownNonNullAt(ReplVal, LI, &DT))
		hfinkel: Even if we're going to do this in general, we shouldn't do this when we can otherwise prove that the address was nonnull. Can you call isKnownNonNullAt and only add the intrinsic when we need it? The same applies to other places where you call addAssumeNonNull.
		addAssumeNonNull(AC, LI);

LI->replaceAllUsesWith(ReplVal);		LI->replaceAllUsesWith(ReplVal);
if (AST && LI->getType()->isPointerTy())		if (AST && LI->getType()->isPointerTy())
AST->deleteValue(LI);		AST->deleteValue(LI);
LI->eraseFromParent();		LI->eraseFromParent();
LBI.deleteValue(LI);		LBI.deleteValue(LI);
}		}

// Finally, after the scan, check to see if the store is all that is left.		// Finally, after the scan, check to see if the store is all that is left.
[150 lines not shown]	for (unsigned AllocaNum = 0; AllocaNum != Allocas.size(); ++AllocaNum) {

// Calculate the set of read and write-locations for each alloca. This is		// Calculate the set of read and write-locations for each alloca. This is
// analogous to finding the 'uses' and 'definitions' of each variable.		// analogous to finding the 'uses' and 'definitions' of each variable.
Info.AnalyzeAlloca(AI);		Info.AnalyzeAlloca(AI);

// If there is only a single store to this value, replace any loads of		// If there is only a single store to this value, replace any loads of
// it that are directly dominated by the definition with the value stored.		// it that are directly dominated by the definition with the value stored.
if (Info.DefiningBlocks.size() == 1) {		if (Info.DefiningBlocks.size() == 1) {
if (rewriteSingleStoreAlloca(AI, Info, LBI, DT, AST)) {		if (rewriteSingleStoreAlloca(AI, Info, LBI, DT, AST, AC)) {
// The alloca has been processed, move on.		// The alloca has been processed, move on.
RemoveFromAllocasList(AllocaNum);		RemoveFromAllocasList(AllocaNum);
++NumSingleStore;		++NumSingleStore;
continue;		continue;
}		}
}		}

// If the alloca is only read and written in one basic block, just perform a		// If the alloca is only read and written in one basic block, just perform a
[370 lines not shown]	if (LoadInst *LI = dyn_cast<LoadInst>(I)) {
continue;		continue;

DenseMap<AllocaInst *, unsigned>::iterator AI = AllocaLookup.find(Src);		DenseMap<AllocaInst *, unsigned>::iterator AI = AllocaLookup.find(Src);
if (AI == AllocaLookup.end())		if (AI == AllocaLookup.end())
continue;		continue;

Value *V = IncomingVals[AI->second];		Value *V = IncomingVals[AI->second];

		// If the load was marked as nonnull we don't want to lose
		// that information when we erase this Load. So we preserve
		// it with an assume.
		if (AC && LI->getMetadata(LLVMContext::MD_nonnull) &&
		!llvm::isKnownNonNullAt(V, LI, &DT))
		chandlerc: Here as well.
		addAssumeNonNull(AC, LI);

// Anything using the load now uses the current value.		// Anything using the load now uses the current value.
LI->replaceAllUsesWith(V);		LI->replaceAllUsesWith(V);
if (AST && LI->getType()->isPointerTy())		if (AST && LI->getType()->isPointerTy())
AST->deleteValue(LI);		AST->deleteValue(LI);
BB->getInstList().erase(LI);		BB->getInstList().erase(LI);
} else if (StoreInst *SI = dyn_cast<StoreInst>(I)) {		} else if (StoreInst *SI = dyn_cast<StoreInst>(I)) {
// Delete this instruction and mark the name as the current holder of the		// Delete this instruction and mark the name as the current holder of the
		spatel: Please make a helper function so we don't have duplicated code.
// value		// value
AllocaInst *Dest = dyn_cast<AllocaInst>(SI->getPointerOperand());		AllocaInst *Dest = dyn_cast<AllocaInst>(SI->getPointerOperand());
if (!Dest)		if (!Dest)
continue;		continue;

DenseMap<AllocaInst *, unsigned>::iterator ai = AllocaLookup.find(Dest);		DenseMap<AllocaInst *, unsigned>::iterator ai = AllocaLookup.find(Dest);
if (ai == AllocaLookup.end())		if (ai == AllocaLookup.end())
continue;		continue;
[39 lines not shown]
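The two inline requests on this file (emit the assume only when the replacement value is not already provably non-null, and factor the duplicated guard into one helper) can be sketched together. What follows is a toy model, not the patch's code: `ToyValue`, `ToyLoad`, and `preserveNonNull` are hypothetical names invented for illustration, and the two boolean fields merely stand in for the `isKnownNonNullAt` and `getMetadata(LLVMContext::MD_nonnull)` checks that appear in the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an IR value (hypothetical, not llvm::Value).
struct ToyValue {
  std::string Name;
  bool KnownNonNull; // stands in for the isKnownNonNullAt(...) query
};

// Toy stand-in for a load that is about to be erased (hypothetical).
struct ToyLoad {
  std::string Name;
  bool HasNonNullMD; // stands in for getMetadata(LLVMContext::MD_nonnull)
};

// Hypothetical helper shared by both rewrite sites: record an assume only
// when the load carried !nonnull and the replacement value is not already
// provably non-null. Returns true if an assume was recorded.
bool preserveNonNull(const ToyLoad &LI, const ToyValue &Repl,
                     std::vector<std::string> &Assumes) {
  if (!LI.HasNonNullMD || Repl.KnownNonNull)
    return false;
  Assumes.push_back("assume(" + LI.Name + " != null)");
  return true;
}
```

With a helper of this shape, each call site reduces to a single call placed before `LI->replaceAllUsesWith(...)`, so the guard cannot drift between the single-store path and the renaming pass.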

test/Transforms/SROA/preserve-nonnull.ll

This file was added.

				; RUN: opt < %s -sroa -S | FileCheck %s
				;
				spatel: I don't think you want to include -instcombine here and create a dependency on its canonicalization of the assume to metadata, especially since that may change based on the comments in this review and PR31518 ( https://llvm.org/bugs/show_bug.cgi?id=31518 ). You can use "utils/update_test_checks.py" to auto-generate exact CHECK lines for your test.
				; Make sure that SROA doesn't lose nonnull metadata
				; on loads from allocas that get optimized out.

				; CHECK-LABEL: define float* @yummy_nonnull
				; CHECK: [[RETURN:%(.*)]] = load float*, float** %arg, align 8
				; CHECK: [[ASSUME:%(.*)]] = icmp ne float* {{.*}}[[RETURN]], null
				chandlerc: Do you need these?
				; CHECK: call void @llvm.assume(i1 {{.*}}[[ASSUME]])
				; CHECK: ret float* {{.*}}[[RETURN]]
				chandlerc: I would at least add the @ and maybe the define to this so that random other things containing this string don't match confusingly down the road.

				define float* @yummy_nonnull(float** %arg) {
				entry-block:
				%buf = alloca float*

				%_arg_i8 = bitcast float** %arg to i8*
				chandlerc: It's good to not use unnamed values in tests, as they make future edits to the test brittle.
				%_buf_i8 = bitcast float** %buf to i8*
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %_buf_i8, i8* %_arg_i8, i64 8, i32 8, i1 false)

				%ret = load float*, float** %buf, align 8, !nonnull !0
				ret float* %ret
				}

				declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1)

				!0 = !{}
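The test checks for an `llvm.assume` at the position of the erased load rather than for `!nonnull` on the earlier load. The inline thread on PromoteMemoryToRegister.cpp gives the reason: tagging the replacement value would also affect its other users. The sketch below is a toy model of that argument under efriedma's reading, in which a null result from a `!nonnull` load is poison (arielb1 questions that reading in the same thread). `std::nullopt` stands in for poison, and every name here is hypothetical.

```cpp
#include <cassert>
#include <optional>

using Ptr = const int *;
using MaybePoison = std::optional<Ptr>; // std::nullopt models poison

// Models a plain load: whatever was stored comes back unchanged.
MaybePoison loadPlain(Ptr Stored) { return Stored; }

// Models a load tagged !nonnull: a null result becomes poison.
MaybePoison loadNonNull(Ptr Stored) {
  if (Stored == nullptr)
    return std::nullopt;
  return Stored;
}

// Models a user of ReplVal that has nothing to do with the erased load,
// for example a null check on a path that never executed that load.
// TagReplVal == true models propagating !nonnull backward onto ReplVal.
std::optional<bool> otherUserSeesNull(Ptr Stored, bool TagReplVal) {
  MaybePoison ReplVal = TagReplVal ? loadNonNull(Stored) : loadPlain(Stored);
  if (!ReplVal.has_value())
    return std::nullopt; // poison flows into the unrelated user
  return ReplVal.value() == nullptr;
}
```

In this model the unrelated user gets a well-defined answer for a null pointer only while ReplVal stays untagged; once the tag is propagated backward, the same input yields poison for a user that never depended on the erased load. An assume placed where the load used to be constrains only the paths that reach that point.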

Revision Contents

Diff 90035

lib/Transforms/Scalar/SROA.cpp

lib/Transforms/Utils/PromoteMemoryToRegister.cpp

test/Transforms/SROA/preserve-nonnull.ll
